A Scanpy-based single-cell RNA sequencing analysis pipeline for B cells across multiple timepoints, covering quality control, normalization, clustering, and visualization.
- Cell type: B cells
- Timepoints: Day 2, Day 4, Day 6
- Format: 10x Genomics H5 (filtered feature-barcode matrix)
- Size: 36,306 cells × 36,601 genes (all samples combined)
| Sample | Cells |
|---|---|
| Day 2 | 15,285 |
| Day 4 | 11,127 |
| Day 6 | 9,894 |
- Data loading — Load 10x H5 files, standardize gene names, deduplicate, and merge samples with unique barcodes
- Quality control — Calculate mitochondrial, ribosomal, and hemoglobin gene fractions; filter low-quality cells
- Normalization — Normalize per cell, log1p transform
- Feature selection — Identify highly variable genes
- Dimensionality reduction — PCA, UMAP
- Clustering — Leiden clustering
- Visualization — UMAP plots colored by timepoint, cluster, and marker genes
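
The quality-control and normalization steps above come down to simple per-cell arithmetic. The sketch below shows that arithmetic in plain Python on a single toy cell, independent of the notebook's actual code. The gene-symbol prefixes (`MT-`, `RPS`/`RPL`, `HBA`/`HBB`), the target sum of 10,000, the toy counts, and all function and variable names are assumptions made for illustration; the pipeline itself may use different patterns and parameters.

```python
import math

# Hypothetical toy data: raw counts for one cell, keyed by gene symbol.
cell_counts = {"MT-CO1": 5, "RPS3": 10, "HBB": 0, "CD79A": 25, "MS4A1": 10}

# Assumed gene-family prefixes (human symbols); not specified in this README.
QC_PREFIXES = {
    "mito": ("MT-",),
    "ribo": ("RPS", "RPL"),
    "hb": ("HBA", "HBB"),
}


def qc_fractions(counts, prefixes=QC_PREFIXES):
    """Return the fraction of a cell's total counts that falls in each gene family."""
    total = sum(counts.values())
    if total == 0:
        # An empty cell has no meaningful fractions; report zeros.
        return {name: 0.0 for name in prefixes}
    return {
        name: sum(c for gene, c in counts.items() if gene.startswith(pfx)) / total
        for name, pfx in prefixes.items()
    }


def normalize_log1p(counts, target_sum=10_000):
    """Scale a cell's counts to sum to target_sum, then apply log(1 + x)."""
    total = sum(counts.values())
    scale = target_sum / total if total else 0.0
    return {gene: math.log1p(c * scale) for gene, c in counts.items()}


fractions = qc_fractions(cell_counts)
# fractions -> {'mito': 0.1, 'ribo': 0.2, 'hb': 0.0}

normalized = normalize_log1p(cell_counts)
# Genes with zero counts stay at 0.0 after the log1p transform.
```

A cell would then be flagged as low quality when, for example, its `mito` fraction exceeds a chosen cutoff. In Scanpy, the corresponding operations are commonly performed with `sc.pp.calculate_qc_metrics`, `sc.pp.normalize_total`, and `sc.pp.log1p`; whether the notebook uses them with these settings is not stated here.
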
scanpy==1.12.1
anndata==0.12.16
pandas==2.3.3
numpy==2.4.6
scipy==1.17.1
matplotlib
seaborn
gtfparse
h5py
Open and run the notebook:
jupyter notebook scRNAseq_small_pipeline.ipynb