Refactor samples.tsv schema: explicit input_type column and output_bids_dataset support #74

Draft
Copilot wants to merge 3 commits into dev-v0.2.0 from copilot/refactor-samples-tsv-for-multiple-bids


Copilot AI commented Apr 7, 2026

Workflow selection was implicitly derived from acq substrings (e.g., {acq,[a-zA-Z0-9]*blaze[a-zA-Z0-9]*}), making it fragile and unintuitive. A single root output path also prevented multi-dataset workflows.

Schema changes

samples.tsv gains two new columns:

| Column | Values | Purpose |
| --- | --- | --- |
| input_type | blaze, prestitched, or imaris | Explicit workflow selection per sample |
| output_bids_dataset | path (optional) | Override output root per-run |

subject  sample  acq   input_type   output_bids_dataset  stain_0  sample_path
mouse1   brain   4x    blaze        /data/bids_out       Lectin   /data/raw
mouse2   brain   4x    prestitched                       DAPI     /data/raw2
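
As a rough illustration of how the new input_type column might be checked when the table is loaded, the sketch below parses tab-separated text with the standard library and rejects unknown values. The names `load_samples` and `VALID_INPUT_TYPES` are hypothetical and not taken from this PR; the actual workflow may read the table differently (for example with pandas).

```python
import csv
import io

# Allowed values, taken from the schema description in this PR.
VALID_INPUT_TYPES = {"blaze", "prestitched", "imaris"}


def load_samples(tsv_text):
    """Parse samples.tsv content and validate input_type (hypothetical helper)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        input_type = row.get("input_type")
        if input_type not in VALID_INPUT_TYPES:
            raise ValueError(
                f"samples.tsv line {line_no}: unknown input_type {input_type!r}"
            )
    return rows


# The two example rows from the PR description, joined with real tabs.
EXAMPLE_TSV = "\n".join(
    "\t".join(fields)
    for fields in [
        ("subject", "sample", "acq", "input_type",
         "output_bids_dataset", "stain_0", "sample_path"),
        ("mouse1", "brain", "4x", "blaze", "/data/bids_out", "Lectin", "/data/raw"),
        ("mouse2", "brain", "4x", "prestitched", "", "DAPI", "/data/raw2"),
    ]
)

samples = load_samples(EXAMPLE_TSV)
```

An empty output_bids_dataset cell comes back as an empty string here, which keeps the optional column easy to test for.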

Routing refactor

  • Replaced all inline {acq,[a-zA-Z0-9]*blaze[a-zA-Z0-9]*} wildcard patterns with wildcard_constraints blocks driven by the samples table
  • Added get_acq_constraint(input_type_str) — generates a per-type regex from actual acq values in samples; returns a never-matching regex ((?!x)x) when no samples use that type
  • get_output_ome_zarr() no longer takes an acq_type argument; each calling rule declares its own constraint
rule zarr_to_ome_zarr:
    wildcard_constraints:
        acq=get_acq_constraint("blaze"),
    output:
        **get_output_ome_zarr(),
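
A minimal sketch of what a helper like get_acq_constraint could look like is given below. It is not the PR's implementation: the PR's function takes only `input_type_str` and reads the workflow's samples table, whereas this version takes the rows as an explicit second argument (an assumption made so the example is self-contained), and the sample rows use made-up acq values.

```python
import re

# Regex that can never match: the lookahead forbids "x", then "x" is required.
NEVER_MATCH = r"(?!x)x"


def get_acq_constraint(input_type_str, samples):
    """Build an alternation of the acq values used by one input type.

    `samples` is a list of dicts with "acq" and "input_type" keys; passing it
    explicitly is an assumption of this sketch.
    """
    acqs = sorted(
        {row["acq"] for row in samples if row["input_type"] == input_type_str}
    )
    if not acqs:
        # No sample uses this type, so the rule should match nothing.
        return NEVER_MATCH
    # Escape each value so characters like "." are matched literally.
    return "|".join(re.escape(acq) for acq in acqs)


# Hypothetical rows for illustration only.
example_samples = [
    {"acq": "4x", "input_type": "blaze"},
    {"acq": "10x", "input_type": "blaze"},
    {"acq": "stitched", "input_type": "prestitched"},
]

blaze_regex = get_acq_constraint("blaze", example_samples)
imaris_regex = get_acq_constraint("imaris", example_samples)
```

Returning a never-matching pattern instead of an empty string matters because an empty alternation would match the empty string, while `(?!x)x` fails on every input.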

output_bids_dataset support

  • If the column is present and all rows share one non-empty value, it overrides the global root config before rules load
  • Multiple distinct values raise a clear ValueError with a message pointing to future per-sample routing support
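
The override rule described above can be sketched roughly as follows. The function name `resolve_output_root` and the exact error text are hypothetical, and ignoring empty cells (rather than treating them as a conflicting value) is an assumption of this sketch, not something the PR states.

```python
def resolve_output_root(samples, default_root):
    """Pick the output root from the samples rows, or fall back to the default.

    `samples` is a list of dicts; rows may lack the output_bids_dataset key
    entirely, which models the column being absent.
    """
    values = {(row.get("output_bids_dataset") or "").strip() for row in samples}
    values.discard("")  # assumption: empty cells do not count as a value
    if not values:
        return default_root
    if len(values) > 1:
        raise ValueError(
            "samples.tsv lists multiple output_bids_dataset values "
            f"({sorted(values)}); per-sample output routing is not supported yet"
        )
    return values.pop()


rows = [
    {"subject": "mouse1", "output_bids_dataset": "/data/bids_out"},
    {"subject": "mouse2", "output_bids_dataset": ""},
]
root = resolve_output_root(rows, default_root="bids")
```

With the rows above, `root` is `/data/bids_out`; with no column or only empty cells, the configured default is returned unchanged.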

CLI

  • Adds --input-type {blaze,prestitched,imaris} (default: blaze) so the single-sample run.py path also sets the column explicitly
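
For reference, an option with these choices and this default could be declared with argparse as in the sketch below. The `build_parser` function, the description, and the help text are hypothetical; how run.py actually builds its parser is not shown in this PR description.

```python
import argparse

INPUT_TYPES = ("blaze", "prestitched", "imaris")


def build_parser():
    """Build a parser with only the --input-type option (illustrative)."""
    parser = argparse.ArgumentParser(description="single-sample run (sketch)")
    parser.add_argument(
        "--input-type",
        choices=INPUT_TYPES,
        default="blaze",
        help="workflow to use for this sample (default: %(default)s)",
    )
    return parser


default_args = build_parser().parse_args([])
explicit_args = build_parser().parse_args(["--input-type", "prestitched"])
```

argparse stores the value under `input_type`, so the single-sample path can write it straight into the input_type column.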

Bug fix

imaris_to_ome_zarr was referencing prestitched_to_metadata output; corrected to imaris_to_metadata.

Copilot AI and others added 2 commits April 7, 2026 16:54
… replace acq substring matching with explicit wildcard_constraints routing

Agent-Logs-Url: https://github.com/khanlab/SPIMprep/sessions/91f24ab5-f565-4a53-8e6f-33d6d4b26771

Co-authored-by: akhanf <11492701+akhanf@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor to support multiple BIDS datasets and improve workflow selection" to "Refactor samples.tsv schema: explicit input_type column and output_bids_dataset support" Apr 7, 2026
Copilot AI requested a review from akhanf April 7, 2026 16:57
@akhanf akhanf changed the base branch from main to dev-v0.2.0 April 7, 2026 17:09

Development

Successfully merging this pull request may close these issues.

Refactor to support multiple BIDS datasets and improve workflow selection