AI-based hematological malignancy prediction from peripheral blood smears in a large diagnostic laboratory cohort
Leukemia
cAItomorph is an explainable AI model, trained to classify hematological malignancies based on peripheral blood cytomorphology. Our data comprises peripheral blood single-cell images from 6115 patients with a wide range of hematological malignancies, and 495 healthy controls, categorized into 8 coarse classes. cAItomorph leverages a hematology foundation model and aggregates image encodings with a transformer into a single vector. It achieves an overall accuracy of 0.72 in 8-disease classification, with F1 scores of 0.76 for acute leukemia, 0.80 for myeloproliferative neoplasms and 0.94 for healthy cases. It reaches an area under the curve of 0.97 for binary malignancy identification. Our model highlights clinically relevant diagnostic cells in both internal and external test sets.
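To make the reported metrics concrete, the following minimal sketch computes overall accuracy and per-class F1 from patient-level labels. The function names, class names and toy labels are illustrative assumptions and are not taken from the cAItomorph code base or the study data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of patients whose predicted class equals the true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """Return {class: F1}, where F1 = 2*TP / (2*TP + FP + FN) for each class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


# Toy labels for six hypothetical patients (not real results).
y_true = ["healthy", "healthy", "acute_leukemia", "acute_leukemia", "mpn", "mpn"]
y_pred = ["healthy", "healthy", "acute_leukemia", "mpn", "mpn", "mpn"]

if __name__ == "__main__":
    print("accuracy:", accuracy(y_true, y_pred))
    print("per-class F1:", per_class_f1(y_true, y_pred))
```

Per-class F1 is reported alongside accuracy because the classes are imbalanced, so a single overall number can hide weak performance on rare diseases.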
Test data is publicly available and includes 201,560 single-cell images from 409 patients spanning eight distinct hematologic conditions (seven disease classes and one healthy cohort). This data represents an isolated 20% testing split of a larger curated diagnostic laboratory cohort sourced from the Munich Leukemia Laboratory (MLL) between 2021 and 2022.
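A held-out test split of this kind is usually drawn at the patient level, so that all cells from one patient land on the same side of the split. The sketch below illustrates that idea under this assumption; the helper names, the seed and the patient identifiers are hypothetical, and the actual procedure used for the MLL cohort may differ.

```python
import random


def split_patients(patient_ids, test_fraction=0.2, seed=0):
    """Shuffle unique patient IDs and hold out a fraction of them for testing."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    n_test = round(len(ids) * test_fraction)
    return set(ids[n_test:]), set(ids[:n_test])  # (train, test)


def assign_images(image_to_patient, test_patients):
    """Route every image to the split of the patient it belongs to."""
    split = {"train": [], "test": []}
    for image, patient in image_to_patient.items():
        key = "test" if patient in test_patients else "train"
        split[key].append(image)
    return split


if __name__ == "__main__":
    # Hypothetical patient IDs with three cell images each.
    patients = [f"P{i:03d}" for i in range(10)]
    images = {f"{p}_cell{j}": p for p in patients for j in range(3)}
    train, test = split_patients(patients)
    split = assign_images(images, test)
    print(len(train), "train patients,", len(test), "test patients")
    print(len(split["train"]), "train images,", len(split["test"]), "test images")
```

Splitting by patient rather than by image avoids leakage, since cells from the same smear share staining and acquisition characteristics.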
cAItomorph leverages the DinoBloom-B hematology foundation model to encode single-cell image representations.
- Real-World Dataset: We assembled the first real-world dataset of peripheral blood smears for hematological malignancy diagnosis.
- Foundation Model Backbone: Built upon DinoBloom, a hematology foundation model, enabling robust and generalizable cytomorphological feature learning.
- Strong Diagnostic Performance: Achieves F1 scores of 0.76 for acute leukemia and 0.80 for myeloproliferative neoplasms.
- Clinical Relevance: Supports human experts by providing disease probabilities and cell-level attention, guiding downstream diagnostics.
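The idea of aggregating per-cell encodings into one patient vector while exposing cell-level attention can be illustrated with a deliberately simplified sketch. It uses a single dot-product attention query in plain Python, which is an assumption made for readability and not the transformer aggregator used by cAItomorph. The `query` vector stands in for a learned parameter, and the two-dimensional embeddings are toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(cell_embeddings, query):
    """Pool cell embeddings into one vector using scaled dot-product attention.

    Returns (pooled_vector, attention_weights), one weight per cell.
    """
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, emb)) / math.sqrt(dim)
        for emb in cell_embeddings
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * emb[d] for w, emb in zip(weights, cell_embeddings))
        for d in range(dim)
    ]
    return pooled, weights


def top_cells(weights, k=3):
    """Indices of the k cells with the highest attention weight."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:k]


if __name__ == "__main__":
    cells = [[1.0, 0.0], [0.0, 1.0], [3.0, 0.0]]  # toy per-cell embeddings
    query = [1.0, 0.0]  # hypothetical learned query
    pooled, weights = attention_pool(cells, query)
    print("patient vector:", pooled)
    print("most attended cells:", top_cells(weights, k=2))
```

In this toy setting, the attention weights play the role of the cell-level explanation: cells with larger weights contribute more to the patient-level vector.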
Follow the steps below to set up the environment and install dependencies.
conda create -n caitomorph python=3.10
conda activate caitomorph
pip install torch torchvision torchaudio
pip install numpy pandas h5py transformers Pillow einops
After cloning this repo, follow these steps:
cd cAItomorph
wget -O weights.zip "https://nefeli.helmholtz-munich.de/records/63wmp-ccp64/files/weights.zip?download=1"
unzip weights.zip
rm weights.zip
See the demo to start using our model.
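Before opening the demo, it can help to confirm that the packages installed above are importable in the active environment. The following sketch is not part of the repository; the helper name `missing_modules` is hypothetical, and the module list simply mirrors the `pip install` commands (Pillow is imported as `PIL`).

```python
import importlib.util

# Import names for the packages installed in the setup steps.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "numpy", "pandas", "h5py", "transformers", "PIL", "einops",
]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only locates the modules without importing them, so it runs quickly even when large packages such as `torch` are installed.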
- AML_Hehr: Patient-level single-cell images from 189 subjects, including four genetic AML subtypes and controls. https://doi.org/10.7937/6ppe-4020
- cAItomorph: Patient-level single-cell images from 409 subjects, including 7 different hematological conditions and healthy controls. https://nefeli.helmholtz-munich.de/records/9bv4e-3ag16
  DinoBloom embeddings are available at: Marrlab/DinoBloom_hemato_embeddings
Visit HematoVis, an interactive tool to visualize single cells, model predictions and more...
Please cite us if you use the model and data:
Dasdelen MF, Kukuljan I, Lienemann P, Ozlugedik F, Sadafi A, Hehr M, Spiekermann K, Pohlkamp C, Marr C. AI-based hematological malignancy prediction from peripheral blood smears in a large diagnostic laboratory cohort. Leukemia. 2026 Mar 23:1-5.
@article{dasdelen2026ai,
title={AI-based hematological malignancy prediction from peripheral blood smears in a large diagnostic laboratory cohort},
author={Dasdelen, Muhammed Furkan and Kukuljan, Ivan and Lienemann, Peter and Ozlugedik, Fatih and Sadafi, Ario and Hehr, Matthias and Spiekermann, Karsten and Pohlkamp, Christian and Marr, Carsten},
journal={Leukemia},
pages={1--5},
year={2026},
publisher={Nature Publishing Group UK London}
}
Prof. Dr. Carsten Marr
📧 carsten.marr@helmholtz-munich.de
🏛️ Institute of AI for Health, Helmholtz Munich

