Skip to content

Add --archive option to ./mfc.sh run for interactive runs#1376

Draft
drcole17 wants to merge 4 commits intoMFlowCode:masterfrom
drcole17:claude/archive-run-option
Draft

Add --archive option to ./mfc.sh run for interactive runs#1376
drcole17 wants to merge 4 commits intoMFlowCode:masterfrom
drcole17:claude/archive-run-option

Conversation

@drcole17
Copy link
Copy Markdown

Description

Adds an --archive PATH option (short -A) to ./mfc.sh run that, after an interactive run completes, collects the case inputs and produced artifacts into a timestamped archive alongside a manifest.yaml (invocation, git tagline, build lock, targets, contents). A companion --archive-format flag selects the container: dir (default), tar, or tar.zst.

Motivation: make it easy to snapshot a run (case.py, generated *.inp files, <name>.sh/<name>.out, equations.dat, time_data.dat, D/, p_all/, restart_data/, etc.) for later reference or sharing without manually copying files.

Scope is currently limited to interactive runs; for batch submissions the flag is ignored with a yellow warning, since outputs are produced asynchronously after qsub/sbatch returns. Would like to implement for batch runs as well.

What it adds

  • toolchain/mfc/cli/commands.py — two new run flags with help text, examples, and shell-completion entries.
  • toolchain/mfc/run/archive.py — new module: source collection, manifest build, dir/tar writers, and tar --zstd subprocess path.
  • toolchain/mfc/run/run.py — wires the archive call after __execute_job_script when the queue system is interactive.
  • toolchain/mfc/run/test_archive.py — unit tests (registered in toolchain/bootstrap/lint.sh).

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Something else

Scope

  • This PR comprises a set of related changes with a common goal

How Has This Been Tested?

Smoke-tested with examples/1D_sodshocktube/case.py across all three formats:

  • ./mfc.sh run examples/1D_sodshocktube/case.py --archive /tmp/mfc-archives — produced a MFC-<timestamp>/ directory containing case.py, simulation.inp, pre_process.inp, equations.dat, time_data.dat, D/, p_all/, restart_data/, and manifest.yaml.
  • --archive-format tar — produced a valid .tar readable by tarfile/tar -tf.
  • --archive-format tar.zst — produced a valid .tar.zst readable by tar --zstd -tf.

Unit tests (toolchain/mfc/run/test_archive.py, 7 tests) cover:

  • __collect_sources picks up case file, namelist, and artifact files/dirs; skips missing artifacts.
  • Round-trip for each of dir, tar, tar.zst formats (the zst test skips if tar --zstd is unavailable).
  • No-op when --archive is unset.
  • MFCException is raised when the destination already exists.

Tests registered in toolchain/bootstrap/lint.sh and run as part of ./mfc.sh lint.

Checklist

  • I have added tests that prove my fix is effective or that my feature works
  • I have added the correct license header (SPDX) to any new source files
  • I have checked that this PR does not introduce a breaking change
  • I have documented my changes to prevent confusion (docs)

GPU Changes

No GPU code paths were touched; this is a toolchain-only change.

drcole17 and others added 2 commits April 22, 2026 11:31
Bundles case source, namelists, stdout log, and output dirs (D/, p_all/, restart_data/, silo_hdf5/) into a timestamped snapshot with a manifest.yaml capturing git rev, build lock, and invocation. Supports dir, tar, and tar.zst formats.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Covers source collection, dir/tar/tar.zst round-trips, the no-op path when --archive is unset, and destination-collision errors. Registered in toolchain/bootstrap/lint.sh alongside the other module test suites.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 22, 2026

📝 Walkthrough

Walkthrough

This pull request introduces archiving functionality to the MFC toolchain's run command. It adds two new CLI arguments (-A/--archive PATH and --archive-format FMT) to enable archiving case inputs and outputs in dir, tar, or tar.zst formats. A new archive module collects artifacts (case files, inputs, scripts, outputs) and generates a manifest.yaml metadata file, then creates the archive based on the selected format. The run function conditionally invokes archiving for interactive queue systems only, with a warning for batch systems. A new test suite validates archive creation across formats, manifest generation, and edge cases, and the lint configuration includes the test in its execution pipeline.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add --archive option to ./mfc.sh run for interactive runs' clearly and specifically describes the main feature added: a new --archive option for interactive runs.
Description check ✅ Passed The PR description comprehensively covers all required template sections: motivation, type of change, testing approach with specific examples, and a thorough checklist with most items addressed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
toolchain/mfc/run/test_archive.py (1)

90-110: Unusual contextmanager-as-method pattern; consider simplifying.

from contextlib import contextmanager as _cm inside the class body and then using it as a decorator on an instance method works, but is hard to read and unidiomatic. A plain helper function (or @contextlib.contextmanager at module scope with self-less args) would be clearer. Non-blocking.

toolchain/mfc/run/archive.py (2)

112-153: Minor: redundant arcname computation and missing tar archive hardening.

  • Line 132: rel = arcname(src)[len(arcroot) + 1:] reconstructs what os.path.relpath(src, dirpath) already produces; calling arcname just to strip its prefix is confusing. Factor out a single _rel_for(src) helper.
  • Line 150-153: the uncompressed tar branch relies entirely on arcname(src) being safe. Combined with the output_summary issue above, a bad src turns into a tar member with .. segments. Consider asserting not arcname(src).startswith("..") before tf.add, or reusing the staging-dir approach for both branches for symmetry.

Non-blocking once the output_summary issue is fixed, but worth hardening defensively.


156-201: LGTM overall, two small nits.

  • Line 169: suffix = {...}[archive_format] will KeyError if called with an unexpected format; argparse choices protects callers today, but a defensive suffix = {...}.get(archive_format) with an explicit MFCException would give a friendlier error for programmatic callers.
  • Lines 180-201: cons.indent() is called before the with tempfile.NamedTemporaryFile(...). If that construction raises, cons.unindent() in the finally is never reached (the try doesn't cover it) — indent state leaks. Minor, but moving cons.indent() inside the try (with a matching unindent in finally) makes it symmetric.

Otherwise the orchestration (abspath/expanduser on archive_root, pre-existence check, no-op on empty sources, manifest-on-disk then copy/tar) reads cleanly.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 286685a4-ed7b-4b4b-81dc-c5f3bb688fb5

📥 Commits

Reviewing files that changed from the base of the PR and between a8a1cc5 and 6b07eab.

📒 Files selected for processing (5)
  • toolchain/bootstrap/lint.sh
  • toolchain/mfc/cli/commands.py
  • toolchain/mfc/run/archive.py
  • toolchain/mfc/run/run.py
  • toolchain/mfc/run/test_archive.py

Comment thread toolchain/mfc/run/archive.py Outdated
- Skip output_summary when it lives outside case dir (prevents '..' arcnames
  escaping the archive root in copy/tar writers)
- Raise MFCException on unknown archive_format instead of KeyError
- Move cons.indent() inside the try so unindent() in finally runs only when
  paired
- Extract rel_for() helper in __write_tar; add defensive '..' checks in both
  tar branches
- Hoist _run_archive context manager out of the class body in tests
- Add test covering the output_summary escape fix

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@sbryngelson
Copy link
Copy Markdown
Member

Nice feature. Seems like a good idea! I do wonder if the -A option with conflict with the -A flag used in sbatch to specify an account for submission (check the toolchain via grep).

@drcole17
Copy link
Copy Markdown
Author

@sbryngelson Sure, I will push a more appropriate flag choice. Thanks!

@sbryngelson
Copy link
Copy Markdown
Member

-z for zip?

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 64.75%. Comparing base (a8a1cc5) to head (18fbcbc).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1376   +/-   ##
=======================================
  Coverage   64.75%   64.75%           
=======================================
  Files          71       71           
  Lines       18711    18711           
  Branches     1549     1549           
=======================================
  Hits        12117    12117           
  Misses       5638     5638           
  Partials      956      956           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sbryngelson sbryngelson marked this pull request as draft April 23, 2026 03:51
Address @sbryngelson's review comment: -A would be confusing next to
sbatch's -A (account) convention even though MFC's --account is -a.
Just use the long --archive form; single-letter flags are scarce real
estate and this isn't something users type often enough to earn one.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

2 participants