feat: layers= arg routing h5ad layers to Seurat slots #22
Open
jackytamkc wants to merge 1 commit into
h5ad2seurat() gains an optional `layers=` named character vector mapping Seurat slots (counts/data/scale.data) to h5ad matrices (a /layers/<name> or the literal 'X'). NULL (default) preserves the original adata.X->counts behaviour exactly. Adds internal helpers .h5ad_layer_path / .h5ad_load_layer in h5ad_util.R, reusing the existing h5ad2Matrix reader. Skips loading adata.X only when no slot needs it (fixes a silent zero-data slot in the counts-from-layer + data='X' case present in the prototype). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hi, and first of all, thanks for the very useful function! I use schard often to shuttle data between anndata and Seurat, but porting layers is awkward: it currently means loading the file twice and splicing the assays together manually, e.g.:

    library(schard)
    library(Seurat)

    f <- "adata.h5ad"
    seu <- schard::h5ad2seurat(f)
    # adata.X (log1p here) lands in the counts slot by default
    log1p_mat <- GetAssayData(seu, assay = "RNA", layer = "counts")
    # read the raw counts layer in a second pass over the file
    raw_mat <- schard::h5ad2Matrix(f, "layers/counts")
    dimnames(raw_mat) <- dimnames(log1p_mat)
    seu <- SetAssayData(seu, assay = "RNA", layer = "counts", new.data = raw_mat)
    seu <- SetAssayData(seu, assay = "RNA", layer = "data", new.data = log1p_mat)

This PR adds a `layers` argument to the function to make this much friendlier.

h5ad2seurat() gains an optional `layers=` named character vector mapping Seurat slots (counts/data/scale.data) to h5ad matrices (a /layers/<name> or the literal 'X'). NULL (default) preserves the original adata.X->counts behaviour exactly. Adds internal helpers .h5ad_layer_path / .h5ad_load_layer in h5ad_util.R, reusing the existing h5ad2Matrix reader.

Skips loading adata.X only when no slot needs it (fixes a silent zero-data slot in the counts-from-layer + data='X' case present in the prototype).
Hopefully it's useful!
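For reviewers, here is a rough sketch of the routing rule described above. It is illustrative only: `layer_path_sketch` and `needs_X_sketch` are hypothetical names, not the `.h5ad_layer_path` / `.h5ad_load_layer` helpers added in this PR, and the sketch assumes layers live under `layers/<name>` in the file while `X` sits at the top level.

```r
# Hypothetical sketch, not the code in this PR.
# Map each `layers` value to a path inside the h5ad file:
# the literal "X" stays "X", anything else is looked up under layers/.
layer_path_sketch <- function(values) {
  ifelse(values == "X", "X", paste0("layers/", values))
}

# adata.X has to be read when `layers` is NULL (default adata.X -> counts)
# or when at least one slot is explicitly mapped to "X".
needs_X_sketch <- function(layers) {
  is.null(layers) || any(layers == "X")
}

layers <- c(counts = "counts", data = "X")
paths <- layer_path_sketch(layers)   # named vector: one file path per Seurat slot
print(paths)

needs_X_sketch(layers)                                        # TRUE
needs_X_sketch(c(counts = "counts", scale.data = "scaled"))   # FALSE
needs_X_sketch(NULL)                                          # TRUE
```

The point of the second function is the bug mentioned above: deciding whether to skip adata.X from the counts mapping alone would leave the data slot empty when `data = "X"` is requested.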
Usage:
layers is a named character vector:
• Names must be Seurat slots: counts, data, and/or scale.data.
• Values are either a layer name found in adata.layers, or the literal "X" to refer to adata.X.
    # raw counts from a layer, log-normalised X into the data slot
    h5ad2seurat("adata.h5ad", layers = c(counts = "counts", data = "X"))

    # put a pre-scaled layer straight into scale.data
    h5ad2seurat("adata.h5ad", layers = c(counts = "counts", scale.data = "scaled"))

    # unchanged, original behaviour
    h5ad2seurat("adata.h5ad")  # adata.X -> counts

Invalid input is rejected early with a clear message:

    h5ad2seurat("adata.h5ad", layers = c(foo = "X"))
    #> Error: layers must be a named character vector with names in
    #>   {counts, data, scale.data}

    h5ad2seurat("adata.h5ad", layers = c(counts = "nope"))
    #> Error: layer(s) not found in adata.h5ad: nope (available: counts)

New argument is the last positional parameter and defaults to NULL.
With layers = NULL, the output was verified to be byte-identical to the released behaviour.
Verified with devtools::load_all() against a synthetic h5ad (built with anndata, carrying a distinct counts layer and a log1p X) and a real Stereo-seq file:
• ✅ Routing c(counts='counts', data='X'): counts slot holds raw integers, data slot holds the log1p X, and the two are distinct.
• ✅ Default (layers=NULL): adata.X lands in counts, and the full object (dims, assay, reductions, counts sum, metadata columns, dimnames) is identical to stock schard 1.0.0 on a real file.
• ✅ Validation guards fire for bad slot names and missing layers.
• ✅ Package installs cleanly into a fresh library; installed copy exposes the new argument.
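The validation guards mentioned above could be approximated along these lines. This is a hedged base-R sketch, not the code in this PR: `validate_layers_sketch` and its `available` argument (the layer names found in the file) are hypothetical, and the messages only mirror the ones shown under Usage.

```r
# Hypothetical sketch, not the code in this PR.
validate_layers_sketch <- function(layers, available) {
  valid_slots <- c("counts", "data", "scale.data")

  # NULL means "keep the default adata.X -> counts behaviour": nothing to check.
  if (is.null(layers)) return(invisible(NULL))

  # Guard 1: must be a character vector whose names are all valid Seurat slots.
  nm <- names(layers)
  if (!is.character(layers) || is.null(nm) || !all(nm %in% valid_slots)) {
    stop("layers must be a named character vector with names in ",
         "{counts, data, scale.data}", call. = FALSE)
  }

  # Guard 2: every value other than the literal "X" must exist as a layer.
  missing <- setdiff(layers[layers != "X"], available)
  if (length(missing) > 0) {
    stop("layer(s) not found: ", paste(missing, collapse = ", "),
         " (available: ", paste(available, collapse = ", "), ")",
         call. = FALSE)
  }

  invisible(layers)
}

available <- c("counts")

# A valid mapping passes silently.
validate_layers_sketch(c(counts = "counts", data = "X"), available)

# Invalid mappings are caught before any matrix is read.
bad_slot  <- tryCatch(validate_layers_sketch(c(foo = "X"), available),
                      error = conditionMessage)
bad_layer <- tryCatch(validate_layers_sketch(c(counts = "nope"), available),
                      error = conditionMessage)
print(bad_slot)
print(bad_layer)
```

Running both checks before touching the file contents keeps a typo in a slot or layer name from costing a full read of a large h5ad.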