windowCNV is a window-based tool for detecting copy number alterations (CNAs) from single-cell RNA-seq data. Inspired by infercnvpy, it extends that package's functionality with flexible CNA simulation, inference, and evaluation. The package supports cell type–aware analysis and event-level performance metrics.
Note: This package is experimental. Although it has been benchmarked against infercnvpy on simulated and real datasets, its CNA classification accuracy on real-world datasets may be limited. Feedback and contributions are welcome.
- Simulate CNAs globally or per cell type
- Infer CNAs from gene expression using customizable smoothing windows
- Assign CNA events to individual cells
- Evaluate precision, recall, and F1 score at the event level
- Visualize CNAs with heatmaps and summary tables
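To make "event-level" evaluation concrete, here is a minimal sketch of how precision, recall, and F1 can be computed over sets of CNA events. It is an illustration only and is not windowCNV's own implementation: the `CNAEvent` record, the `event_level_scores` function, and the rule that a predicted event counts as correct only on an exact match of group, chromosome, and type are all assumptions made for this example. The package may match events differently, for instance by genomic overlap.

```python
from typing import Iterable, NamedTuple


class CNAEvent(NamedTuple):
    """Hypothetical event record: one CNA call for one group of cells."""
    group: str        # e.g. a cell type or cluster label
    chromosome: str   # e.g. "chr7"
    kind: str         # "gain" or "loss"


def event_level_scores(truth: Iterable[CNAEvent],
                       predicted: Iterable[CNAEvent]) -> dict:
    """Precision, recall and F1 over events, using exact matching (an assumption)."""
    truth_set, predicted_set = set(truth), set(predicted)
    tp = len(truth_set & predicted_set)   # events called and truly present
    fp = len(predicted_set - truth_set)   # events called but not present
    fn = len(truth_set - predicted_set)   # events present but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy example with made-up events: one correct call, one call with the
# wrong type, and one true event that was missed entirely.
truth = {
    CNAEvent("T cell", "chr7", "gain"),
    CNAEvent("B cell", "chr9", "loss"),
    CNAEvent("Monocyte", "chr1", "gain"),
}
predicted = {
    CNAEvent("T cell", "chr7", "gain"),
    CNAEvent("B cell", "chr9", "gain"),
}
scores = event_level_scores(truth, predicted)
print(scores)
```

Scoring whole events rather than individual genes or cells means that a single miscalled gain or loss counts once, regardless of how many genes the event spans.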
We recommend using a dedicated conda environment:
conda create -n windowcnv python=3.10
conda activate windowcnv
Then install windowCNV and its required dependencies:
pip install infercnvpy scanpy matplotlib pandas
pip install git+https://github.com/Li-Jiabei/windowCNV.git
Import the required packages:
import numpy as np
import pandas as pd
import scanpy as sc
import infercnvpy as cnv
import matplotlib.pyplot as plt
import warnings
from collections.abc import Sequence
import windowCNV as wcnv
You can explore how to use windowCNV in the following notebooks:
- Original infercnvpy usage: Shows the baseline workflow using infercnvpy, enhanced with our new plotting and evaluation functions.
- windowCNV implementation: Demonstrates the core windowCNV pipeline and comparison with infercnvpy.
These notebooks use the benchmarking dataset: PBMC_simulated_cnas_041025.h5ad
- CNA simulation and windowCNV application: Shows how to simulate CNAs and apply windowCNV inference.
This notebook uses the dataset: pbmc_10k_v3_filtered_feature_bc_matrix.h5
Note: Many real-world datasets (including the one above) lack chromosome and genomic position annotations in AnnData.var. To address this, we provide a helper function for automatic annotation. The usage is shown in the notebook. However, you must supply a gene annotation file.
In our example, we use the following reference file: mart_export_GRCh38.p14.txt
This file contains:
- Gene stable ID
- Gene name
- Chromosome/scaffold name
- Gene start (bp)
- Gene end (bp)
The file was generated using Ensembl BioMart, which allows easy download of such annotations for a chosen genome build (here GRCh38.p14).
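The annotation step can be sketched with plain pandas. This is not the helper function shipped with windowCNV (see the notebook for that); `annotate_var` is a hypothetical name, the tiny table below is made-up toy data standing in for the BioMart export, and the output column names `chromosome`, `start`, and `end` with a `chr` prefix are an assumption based on the layout infercnvpy expects in `AnnData.var`.

```python
import io

import pandas as pd

# Made-up stand-in for a BioMart export (tab-separated, same five columns).
# Gene names, IDs and coordinates are toy values, not real annotations.
biomart_tsv = io.StringIO(
    "Gene stable ID\tGene name\tChromosome/scaffold name\t"
    "Gene start (bp)\tGene end (bp)\n"
    "TOY0001\tGENE_A\t1\t1000\t5000\n"
    "TOY0002\tGENE_B\tX\t20000\t26000\n"
    "TOY0003\tGENE_B\tX\t90000\t95000\n"   # duplicated gene name
)
biomart = pd.read_csv(
    biomart_tsv, sep="\t", dtype={"Chromosome/scaffold name": str}
)


def annotate_var(var: pd.DataFrame, biomart: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: add chromosome/start/end columns to a var table.

    `var` is assumed to be indexed by gene name, as AnnData.var usually is.
    Genes missing from the annotation get NaN in the new columns.
    """
    ann = biomart.rename(columns={
        "Chromosome/scaffold name": "chromosome",
        "Gene start (bp)": "start",
        "Gene end (bp)": "end",
    })
    # Keep the first record when a gene name occurs more than once.
    ann = ann.dropna(subset=["Gene name"]).drop_duplicates("Gene name")
    ann = ann.set_index("Gene name")[["chromosome", "start", "end"]]
    # Assumption: downstream tools expect names like "chr1", "chrX".
    ann["chromosome"] = "chr" + ann["chromosome"].astype(str)
    return var.join(ann, how="left")


# Stand-in for adata.var: a table indexed by gene name.
var = pd.DataFrame(index=["GENE_A", "GENE_B", "GENE_X"])
annotated = annotate_var(var, biomart)
print(annotated)
```

With a real dataset, the same function would be applied to `adata.var` and the result assigned back. Genes left unannotated (such as `GENE_X` here) would typically need to be filtered out before CNA inference.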
- windowCNV on PBMC-4k with CAS CNA Labels: Applies windowCNV to a 10x PBMC dataset with CAS-based high-confidence labels.
This notebook uses the dataset: SCP2745_high_conf_CAS_cell_types.h5ad
- windowCNV on TNBC iPSC-derived scRNA-seq data: Applies windowCNV to a triple-negative breast cancer (TNBC) iPSC dataset.
This notebook uses the dataset: GSM4476486_combined_UMIcount_CellTypes_TNBC1.txt.gz
- windowCNV on CRISPR-edited T cell iPSC data: Applies windowCNV to T cells derived from iPSCs edited with CRISPR.
This notebook uses the dataset GSM7744300_GUIDEvsNT_CHR14_RESULTS.txt together with the annotation file gencode.v38.annotation.gtf