CANVAS: CANOPUS and SIRIUS Visualization & Analysis System
CANVAS is an interactive Dash application for visualization and interpretation of LC-MS metabolomics data.
It integrates aligned peak intensity data (e.g., from MSDIAL) with results from the SIRIUS workflow, including compound classification output from CANOPUS.
The tool enables researchers to:
- Integrate SIRIUS and CANOPUS outputs with intensity-based measurements.
- Explore metabolite classes dynamically using thresholds and ontology levels.
- Visualize results with sunburst plots, bar charts, PCA, and a random forest classifier.
- Compare metabolite distributions across samples or conditions.
- Streamline hypothesis generation in untargeted metabolomics workflows.
- 📊 Interactive visualizations (sunburst, bar charts, PCA, random forest).
- 🧭 Dynamic filtering by class confidence, SIRIUS annotation scores, hierarchy levels, and intensity levels.
- 🔍 Exploration of SIRIUS & CANOPUS outputs with user-friendly controls.
- 🧪 Example datasets provided (plant metabolomics, lipidomics, human cell metabolomics).
git clone https://github.com/Fraximov/CANVAS.git
cd CANVAS
pip install -r requirements.txt
python app.py
Running the app should automatically open a browser window at http://127.0.0.1:8050.
CANVAS uses four input files:
- Integrated intensity features (.csv)
  - Typically exported from MSDIAL.
  - Other LC–MS/MS extraction tools may work if formatted correctly.
- SIRIUS output (structure file) (structure_identification.tsv)
  - Exported from the “Summaries” tab in SIRIUS (top 1 hit recommended).
- CANOPUS output (class annotations) (canopus_structure_identification.tsv)
  - Exported from SIRIUS after classification.
- Metadata file (.csv)
  - User-generated, containing sample annotations and experimental variables.
The easiest way to start is with integrated features extracted in MSDIAL.
Tutorials can be found on the MSDIAL website.
Export both the aligned area peaks and the .mat MS/MS spectra file for further SIRIUS processing.
Run SIRIUS with the .mat file generated by MSDIAL.
Export the following from the “Summaries” panel (TSV format):
- structure_identification.tsv
- canopus_structure_identification.tsv
The metadata .csv file must follow these rules:
- First row = column names.
- First column must be named name_file (sample names).
- Sample names must match those from MSDIAL output.
- Additional columns = experimental variables.
- Blank samples must contain the keyword "Blank" for automatic blank subtraction.
After starting CANVAS, upload the four input files in the header section.
- For first-time analysis: select “Files are raw”.
- For reloading saved data: select “Load Files”.
Large datasets (>50 MB) may take several seconds to minutes to load.
Preprocessing steps include:
- Blank removal – average blank samples, remove features below user-defined ratio (e.g., 0.1).
- Imputation – replace missing/zero values with small sampled values.
- Normalization – by TIC (Total Ion Chromatogram).
- Scaling – StandardScaler() (mean-centered, unit variance).
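The four steps can be sketched in plain Python to show the order of operations. This is an illustration under stated assumptions, not the CANVAS implementation: the function `preprocess` is hypothetical, the blank rule is read as "keep a feature only if mean blank intensity divided by mean sample intensity is at most the ratio", and imputed values are drawn uniformly between 10% and 50% of the smallest observed intensity. Scaling uses the population standard deviation, which is what scikit-learn's `StandardScaler` uses by default.

```python
import random
import statistics


def preprocess(samples, blanks, ratio=0.1, seed=0):
    """Blank removal, imputation, TIC normalization and scaling.

    samples / blanks: dict mapping sample name -> list of feature intensities
    (same feature order everywhere; missing values given as None or 0).
    Returns (indices of kept features, dict of scaled values per sample).
    """
    rng = random.Random(seed)
    n_features = len(next(iter(samples.values())))

    def column_mean(table, j):
        return statistics.fmean((row[j] or 0.0) for row in table.values())

    # 1. Blank removal (assumed rule: mean blank / mean sample <= ratio).
    keep = []
    for j in range(n_features):
        sample_mean = column_mean(samples, j)
        if sample_mean > 0 and column_mean(blanks, j) / sample_mean <= ratio:
            keep.append(j)
    data = {name: [row[j] for j in keep] for name, row in samples.items()}

    # 2. Imputation: replace missing/zero values with small sampled values
    #    (assumed range: 10-50% of the smallest observed intensity).
    observed = [v for row in data.values() for v in row if v]
    floor = min(observed, default=1.0)
    data = {
        name: [v if v else rng.uniform(0.1 * floor, 0.5 * floor) for v in row]
        for name, row in data.items()
    }

    # 3. TIC normalization: divide each sample by its summed intensity.
    for name in list(data):
        total = sum(data[name])
        data[name] = [v / total for v in data[name]]

    # 4. Scaling per feature: zero mean, unit (population) variance.
    names = list(data)
    scaled_columns = []
    for column in zip(*(data[name] for name in names)):
        mean = statistics.fmean(column)
        std = statistics.pstdev(column) or 1.0  # constant features stay at 0
        scaled_columns.append([(v - mean) / std for v in column])
    scaled = {name: [col[i] for col in scaled_columns]
              for i, name in enumerate(names)}
    return keep, scaled


samples = {"S1": [100.0, 0.0, 50.0], "S2": [300.0, 40.0, 50.0]}
blanks = {"Blank_1": [10.0, 0.0, 40.0]}
kept, scaled = preprocess(samples, blanks)
print(kept)  # → [0, 1]  (the third feature is mostly blank signal)
```

Imputing before normalization keeps zero-filled features from distorting the per-sample totals; whether CANVAS applies the steps in exactly this way should be checked against its source.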
The GUI provides panels for:
- Filters
- Display options
- Selection options
- Feature browsing
Three sliders allow filtering by:
- Intensity threshold
- SIRIUS score threshold
- CANOPUS score threshold
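Taken together, the three sliders act as a conjunction of thresholds: a feature is shown only if it passes all of them. A minimal sketch, assuming each feature is a dict with the hypothetical keys `intensity`, `sirius_score`, and `canopus_score`, and assuming inclusive comparisons; the actual column names and comparison used by CANVAS may differ.

```python
def filter_features(features, min_intensity=0.0, min_sirius=0.0, min_canopus=0.0):
    """Keep only the features that pass all three thresholds."""
    return [
        f for f in features
        if f["intensity"] >= min_intensity
        and f["sirius_score"] >= min_sirius
        and f["canopus_score"] >= min_canopus
    ]


features = [
    {"id": "F1", "intensity": 1e5, "sirius_score": 0.8, "canopus_score": 0.9},
    {"id": "F2", "intensity": 1e3, "sirius_score": 0.9, "canopus_score": 0.95},
    {"id": "F3", "intensity": 1e6, "sirius_score": 0.2, "canopus_score": 0.99},
]
kept = filter_features(features, min_intensity=1e4, min_sirius=0.5, min_canopus=0.7)
print([f["id"] for f in kept])  # → ['F1']
```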
- Sunburst plot
- Bar chart
- Boxplots
- PCA – overview of sample variance, feature contributions.
- Random Forest – supervised feature importance analysis.
Use the “Export data” button to save the processed dataset.
- With “Filtered” checked → exports only filtered data.
- Without → exports full processed dataset.
If you use CANVAS in your research, please cite:
Lehr, F.-X., Paczia, N. CANVAS: An Interactive Dash Application for Visualization and Analysis of LC-MS Metabolomics Data using the SIRIUS Workflow. Year.
Distributed under the MIT License. See LICENSE for details.