Dataset: Leukemia and gene expression #3

@pbiecek

Problem

This is a binary classification problem.
Using historical data, develop models of varying complexity to predict the type of leukemia: acute lymphoblastic leukemia (ALL) or acute myeloid leukemia (AML).
The best models should be explained using XAI tools at both the instance level and the dataset level.
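As a rough illustration of a dataset-level explanation, here is a dependency-free sketch of permutation importance applied to a very simple model. Everything in it is an assumption for illustration: the nearest-centroid classifier stands in for "a simple model", the six samples in `X` are made-up toy values rather than real expression data, and the helper names are hypothetical. It is not the method this issue prescribes, and in practice importance should be measured on held-out data.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid 'model': the per-class mean of every feature."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, X, y):
    return sum(predict(centroids, r) == t for r, t in zip(X, y)) / len(y)

def permutation_importance(centroids, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the data with only column j replaced by its shuffled copy.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X = [[5.0, 0.1], [6.0, -0.2], [5.5, 0.0], [-5.0, 0.2], [-6.0, -0.1], [-5.5, 0.1]]
y = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]

centroids = fit_centroids(X, y)
importances = permutation_importance(centroids, X, y)
print(importances)
```

Shuffling the noise feature leaves accuracy unchanged, while shuffling the informative one lowers it, so the first importance is the larger of the two. The same idea carries over to more complex models, since it only needs a way to score predictions.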

Data

Source: Molecular Classification of Cancer by Gene Expression Monitoring. Gene expression dataset (Golub et al.)
https://www.kaggle.com/crawford/gene-expression#data_set_ALL_AML_independent.csv
The original authors used the data to classify the type of cancer in each patient from gene expression levels.
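A minimal sketch of reshaping the data for modelling, assuming the Kaggle CSV stores one gene per row, with two descriptive columns followed by alternating patient-number and `call` columns. That layout, the function name `genes_to_patient_rows`, and the toy values in `sample` are assumptions for illustration, not taken from this issue, so check them against the downloaded file first.

```python
import csv
import io

def genes_to_patient_rows(csv_text):
    """Transpose a genes-by-patients table into one feature vector per patient."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    # Assumed layout: columns 0-1 describe the gene; the rest alternate
    # between a patient number and a "call" column, which is skipped here.
    patient_cols = [i for i, name in enumerate(header)
                    if i >= 2 and not name.startswith("call")]
    patient_ids = [int(header[i]) for i in patient_cols]
    gene_names = []
    features = {pid: [] for pid in patient_ids}
    for row in reader:
        if not row:  # ignore blank lines
            continue
        gene_names.append(row[1])
        for pid, i in zip(patient_ids, patient_cols):
            features[pid].append(float(row[i]))
    return gene_names, features

# Toy input with made-up values, mimicking the assumed layout.
sample = """Gene Description,Gene Accession Number,1,call,2,call
geneA desc,GENE_A,-214,A,-139,A
geneB desc,GENE_B,88,P,283,P
"""

gene_names, features = genes_to_patient_rows(sample)
print(gene_names)
print(features)
```

Each patient then has one vector with a value per gene, which is the shape most classifiers expect. For the real file, the text could be read with `open(path, newline="").read()` and passed to the same function.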

Note

Due to the large number of features, this dataset will be more interesting for people who have some experience in med/bio applications.
