CANVAS

CANVAS: CANOPUS and SIRIUS Visualization & Analysis System


Overview

CANVAS is an interactive Dash application for visualization and interpretation of LC-MS metabolomics data.
It integrates aligned peak intensity data (e.g., from MSDIAL) with results from the SIRIUS workflow, including compound classification output from CANOPUS.

The tool enables researchers to:

  • Integrate SIRIUS and CANOPUS outputs with intensity-based measurements.
  • Explore metabolite classes dynamically using thresholds and ontology levels.
  • Visualize results with sunburst plots, bar charts, PCA, and a random forest classifier.
  • Compare metabolite distributions across samples or conditions.
  • Streamline hypothesis generation in untargeted metabolomics workflows.

Features

  • 📊 Interactive visualizations (sunburst, bar charts, PCA, random forest).
  • 🧭 Dynamic filtering by class confidence, SIRIUS annotation scores, hierarchy levels, and intensity levels.
  • 🔍 Exploration of SIRIUS & CANOPUS outputs with user-friendly controls.
  • 🧪 Example datasets provided (plant metabolomics, lipidomics, human cell metabolomics).

Installation

1. Clone the repository

git clone https://github.com/Fraximov/CANVAS.git
cd CANVAS

2. Install dependencies

pip install -r requirements.txt

3. Run CANVAS

python app.py

This should automatically open a browser window at http://127.0.0.1:8050.


Data Input Processing

CANVAS uses four input files:

  1. Integrated intensity features (.csv)

    • Typically exported from MSDIAL.
    • Other LC–MS/MS extraction tools may work if formatted correctly.
  2. SIRIUS output (structure file) (structure_identification.tsv)

    • Exported from the “Summaries” tab in SIRIUS (top 1 hit recommended).
  3. CANOPUS output (class annotations) (canopus_structure_identification.tsv)

    • Exported from SIRIUS after classification.
  4. Metadata file (.csv)

    • User-generated, containing sample annotations and experimental variables.

1. Integrated area peak file

The easiest way to start is with integrated features extracted in MSDIAL.
Tutorials can be found on the MSDIAL website.

Export both the aligned area peaks and the .mat MS/MS spectra file for further SIRIUS processing. Example export settings:

[Figure: MSDIAL export settings]

2. SIRIUS structure & CANOPUS output files

Run SIRIUS with the .mat file generated by MSDIAL.
Export the following from the “Summaries” panel (TSV format):

  • structure_identification.tsv
  • canopus_structure_identification.tsv
[Figure: SIRIUS export]

3. Metadata

The metadata .csv file must follow these rules:

  • First row = column names.
  • First column must be named name_file (sample names).
  • Sample names must match those from MSDIAL output.
  • Additional columns = experimental variables.
  • Blank samples must contain the keyword "Blank" for automatic blank subtraction.
[Figure: example metadata file]
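
The metadata rules above can be checked before upload. The following is a minimal sketch, not part of CANVAS: the function name `validate_metadata` is hypothetical, and it assumes the keyword "Blank" may appear in any cell of a blank sample's row, which the rules above do not specify.

```python
import csv
import io


def validate_metadata(csv_text, msdial_sample_names):
    """Return a list of problems found in a metadata CSV (empty list = OK)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows or not rows[0]:
        return ["file is empty or has no header row"]
    header, body = rows[0], rows[1:]
    problems = []

    # Rule: the first column must be named name_file.
    if header[0] != "name_file":
        problems.append(f"first column is {header[0]!r}, expected 'name_file'")

    # Rule: sample names must match those from the MSDIAL output.
    names = [row[0] for row in body if row]
    missing = sorted(set(names) - set(msdial_sample_names))
    if missing:
        problems.append(f"samples not found in MSDIAL table: {missing}")

    # Rule: blank samples carry the keyword "Blank".
    # Assumption: the keyword may sit in any cell of the row.
    if not any("Blank" in cell for row in body for cell in row):
        problems.append('no row contains the keyword "Blank"')

    return problems
```

An empty list means the file passes these three checks; it does not guarantee that CANVAS will accept the file.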

Starting CANVAS

1. Loading the data

After starting CANVAS, upload the four input files in the header section.

  • For first-time analysis: select “Files are raw”.
  • For reloading saved data: select “Load Files”.

Large datasets (>50 MB) may take several seconds to minutes to load.

[Figure: loading files]

2. Processing the data

Steps include:

  1. Blank removal – average blank samples, remove features below user-defined ratio (e.g., 0.1).
  2. Imputation – replace missing/zero values with small sampled values.
  3. Normalization – by TIC (Total Ion Chromatogram).
  4. Scaling – StandardScaler() (mean-centered, unit variance).
[Figure: processing]
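
The four processing steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the CANVAS implementation: the function name `process_intensities` is hypothetical, missing values are assumed to be encoded as 0, the blank ratio is assumed to compare mean blank intensity against mean sample intensity, and the imputation range (10 to 50 % of the smallest positive value of a feature) is an arbitrary choice. The scaling step uses the population standard deviation, as scikit-learn's `StandardScaler` does.

```python
import random
import statistics


def process_intensities(samples, blanks, ratio=0.1, seed=0):
    """samples/blanks: lists of rows (one row per sample, one column per feature)."""
    rng = random.Random(seed)
    n_features = len(samples[0])

    # 1. Blank removal: average the blanks and keep a feature only if its mean
    #    blank intensity is at most `ratio` times its mean sample intensity
    #    (assumed direction of the ratio).
    keep = []
    for j in range(n_features):
        blank_mean = statistics.fmean(row[j] for row in blanks)
        sample_mean = statistics.fmean(row[j] for row in samples)
        if blank_mean <= ratio * sample_mean:
            keep.append(j)
    data = [[row[j] for j in keep] for row in samples]

    # 2. Imputation: replace zeros with a small random value drawn below the
    #    smallest positive intensity of that feature (assumed scheme).
    for j in range(len(keep)):
        positives = [row[j] for row in data if row[j] > 0]
        floor = min(positives) if positives else 1.0
        for row in data:
            if row[j] <= 0:
                row[j] = rng.uniform(0.1 * floor, 0.5 * floor)

    # 3. TIC normalization: divide each sample by its summed intensity.
    normalized = []
    for row in data:
        tic = sum(row)
        normalized.append([value / tic for value in row])

    # 4. Scaling: mean-center each feature and scale it to unit variance.
    scaled_columns = []
    for column in zip(*normalized):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column) or 1.0  # guard against constant features
        scaled_columns.append([(value - mean) / std for value in column])

    # Return kept feature indices and the scaled matrix (rows = samples).
    return keep, [list(row) for row in zip(*scaled_columns)]
```

The order of the steps follows the list above; the input rows are copied, so the original intensity table is left unchanged.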

Visualization & Data Analysis

GUI overview

The GUI provides panels for:

  • Filters
  • Display options
  • Selection options
  • Feature browsing

Filters

Three sliders allow filtering by:

  • Intensity threshold
  • SIRIUS score threshold
  • CANOPUS score threshold
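
Conceptually, the three sliders act as a conjunction of minimum thresholds. A minimal sketch, assuming hypothetical field names (`intensity`, `sirius_score`, `canopus_score`) and assuming that higher scores are better for both SIRIUS and CANOPUS:

```python
from dataclasses import dataclass


@dataclass
class Feature:
    # Hypothetical record layout; CANVAS's internal column names may differ.
    name: str
    intensity: float
    sirius_score: float
    canopus_score: float


def apply_filters(features, min_intensity=0.0, min_sirius=0.0, min_canopus=0.0):
    """Keep only features that reach all three thresholds."""
    return [
        f
        for f in features
        if f.intensity >= min_intensity
        and f.sirius_score >= min_sirius
        and f.canopus_score >= min_canopus
    ]


features = [
    Feature("F1", intensity=1200.0, sirius_score=0.9, canopus_score=0.8),
    Feature("F2", intensity=50.0, sirius_score=0.95, canopus_score=0.9),
    Feature("F3", intensity=900.0, sirius_score=0.4, canopus_score=0.7),
]
kept = apply_filters(features, min_intensity=100.0, min_sirius=0.5, min_canopus=0.5)
```

With these example values only `F1` passes: `F2` fails the intensity threshold and `F3` fails the SIRIUS threshold.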

Visualizations

  • Sunburst plot
  • Bar chart
  • Boxplots

Multivariate Analysis

  • PCA – overview of sample variance, feature contributions.
  • Random Forest – supervised feature importance analysis.

Exporting Data

Use the “Export data” button to save the processed dataset.

  • With “Filtered” checked → exports only filtered data.
  • Without → exports full processed dataset.
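
The effect of the “Filtered” checkbox can be pictured as choosing which rows are written out. The sketch below is illustrative only: `export_rows` and its arguments are hypothetical names, and CSV is assumed as the output format.

```python
import csv
import io


def export_rows(processed, passes_filter, filtered_checked):
    """Return CSV text for the rows selected by the 'Filtered' checkbox state."""
    if not processed:
        return ""
    # Checked: only rows that pass the active filters. Unchecked: everything.
    rows = [r for r in processed if passes_filter(r)] if filtered_checked else processed
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(processed[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `export_rows(table, lambda r: r["intensity"] >= 5, True)` would write only the rows of `table` with an intensity of at least 5.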

Citation

If you use CANVAS in your research, please cite:

Lehr, F.-X., Paczia, N. CANVAS: An Interactive Dash Application for Visualization and Analysis of LC-MS Metabolomics Data using the SIRIUS Workflow. Year.


License

Distributed under the MIT License. See LICENSE for details.
