You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The merge gate (runner/merge_gate.py) today refuses a PR for a closed source issue (apply_issue_closed_gate, #65) and for a weak mutation oracle (apply_mutation_survivor_gate, #381). It does not check whether the worker's diff stayed inside the filesystem sandbox it was leased.
We lease each worker a CapabilityPolicy whose filesystem.write_roots scopes Write/Edit to its worktree (worker_worktree.py:_fs_allow_entries). But that lease is enforced only at the Claude settings layer — and Bash(*) is granted whenever any write root exists, because Bash cannot be path-scoped in settings. So a worker can bash-write a file outside its worktree, and on an otherwise-green run that escape sails through the merge gate into mergeable state.
Concrete example: worker for issue #X is leased write_roots=("/tmp/forge-x/wt-loop-512",). Its agent runs echo pwned >> /home/u/forge-loop/src/forge_loop/runner/tick.py. CI is green, critic approves, the merge gate never inspects the diff's paths — auto-merge fires. A Bash escape just advanced into durable cognition.
Acceptance criteria
A pure function exists that, given a worker's collected changed paths and the leased write_roots, returns the non-empty tuple of paths that fall outside every write root (empty tuple for a clean diff). Path comparison is normalized (abs paths, trailing-slash insensitive), mirroring worktree_sweep's normalization.
A merge-gate enforcer step, wired into runner/merge_gate.py following the existing apply_*_gate shape, feeds the worker's changed paths into that function against the saga's leased CapabilityPolicy.
When the violation tuple is non-empty: the gate FAILS (branch never advances to merged — disable_pr_auto_merge, status flipped merged→open like the other gates), the worktree is moved to quarantine by reusing existing primitives (worker_worktree.quarantine_if_blocking and/or tasks/store.mark_quarantined), a violation-naming PR comment is posted, and a distinct merge_refused_write_root_escape (name TBD) event is emitted with the offending paths.
When the violation tuple is empty (diff entirely within write_roots), the gate is a pass-through: current merge behavior is byte-for-byte unchanged.
The gate is injectable/testable: changed-path collection goes through a seam (a "git runner" callable or equivalent) so a test can drive it with a fake — no real git, no real subprocess.
No new quarantine machinery is introduced; the change reuses what already exists.
Test matrix
Unit (pure violation function), in tests/:
in-bounds path → empty tuple (pass).
out-of-bounds path → tuple contains exactly that path.
normalization: a path equal-but-for-trailing-slash / relative-vs-abs to a write root is in-bounds (no false positive).
empty write_roots (no lease) → assert the chosen fail-safe stance (recommend: any changed path is a violation, i.e. closed-by-default), matching the gate's "conservative on uncertainty" convention.
Integration (the gate), in tests/test_runner_merge_gate.py:
Happy path: fake git runner reports a diff entirely under a write_roots=(worktree,) policy → gate passes, auto-merge NOT disabled, status stays merged, no quarantine call, no event.
Adversarial / sad path: fake git runner reports a path outside the worktree → gate refuses (auto-merge disabled, status merged→open), quarantine_* primitive invoked on the worktree, merge_refused_write_root_escape event emitted carrying the offending path, PR comment posted.
Best-effort side effects: a raising pr_comment / disable_pr_auto_merge does NOT prevent the event + status flip from firing (match apply_mutation_survivor_gate's contextlib.suppress pattern).
Out of scope
Do NOT build a new quarantine subsystem, registry, or state — reuse quarantine_if_blocking / mark_quarantined only.
Do NOT attempt to prevent the Bash write at runtime (no seccomp, no FUSE, no live FS interception) — this is a post-hoc gate, not a runtime jail.
Do NOT change the Claude settings rendering (render_worker_settings, _fs_allow_entries) or the Bash(*) grant decision.
src/forge_loop/runner/merge_gate.py — add the new gate function modeled on apply_mutation_survivor_gate (Protocol slice for the gh client, check_* per-outcome + apply_* over the list, event + status flip).
src/forge_loop/sandbox/policy.py — FilesystemScope.write_roots lives here; strong candidate home for the pure write_root_violations(changed_paths, write_roots) helper.
src/forge_loop/runner/tick.py — _run_merge_gate (line ~1072) is the production wiring site; thread the worktree + leased policy in alongside the existing gate calls.
src/forge_loop/worker.py — WorkerOutcome (line ~52) does NOT currently carry the worktree path or leased policy; (investigate) how the gate obtains the saga's leased CapabilityPolicy and worktree for a given outcome (thread one field through, or look it up from the saga store by issue/task id).
Changed-path collection (worker_changed_paths-style helper): (investigate) whether a git-diff path collector already exists; if not, add one behind an injectable seam so the test can supply a fake git runner.
tests/test_runner_merge_gate.py — add the integration tests next to the existing gate tests.
Worker note
AC is wide — it spans merge_gate.py, policy.py, the quarantine primitives in two modules, the tick.py wiring, and a new diff-collection seam, with an open (investigate) on how the gate reaches the leased policy. You are at high risk of running out of turns before pushing. Apply COMMIT DISCIPLINE (wip-commit every ~20 turns / ~5 file-edits) aggressively. Land the pure write_root_violations function + its unit tests FIRST (self-contained), commit, then build the gate and wiring. Run the EXIT CHECKLIST even if the wiring feels incomplete.
Original report
Part of epic: "Enforce leased filesystem write-roots against the worker's actual diff". An enforcer step wired into runner/merge_gate.py feeds worker_changed_paths into write_root_violations against the saga's leased CapabilityPolicy; non-empty violations FAIL the gate (branch never merges) and quarantine the worktree by reusing the existing primitive in tasks/store.py + worker_worktree.py — no new machinery. A clean diff leaves merge behavior unchanged. Test drives the gate with a fake git runner + a policy whose write_roots is the worktree, asserting block+quarantine out-of-bounds and pass-through in-bounds. ~140 net LOC incl. tests.
Customer story: A maintainer deciding whether an autonomous patch is semantically trustworthy needs the merge gate to reject and quarantine a patch that wrote outside the worker's leased sandbox, so a Bash escape cannot advance into mergeable state on a green run.
Problem
The merge gate (
runner/merge_gate.py) today refuses a PR for a closed source issue (apply_issue_closed_gate, #65) and for a weak mutation oracle (apply_mutation_survivor_gate, #381). It does not check whether the worker's diff stayed inside the filesystem sandbox it was leased.We lease each worker a
CapabilityPolicywhosefilesystem.write_rootsscopes Write/Edit to its worktree (worker_worktree.py:_fs_allow_entries). But that lease is enforced only at the Claude settings layer — andBash(*)is granted whenever any write root exists, because Bash cannot be path-scoped in settings. So a worker canbash-write a file outside its worktree, and on an otherwise-green run that escape sails through the merge gate into mergeable state.Concrete example: worker for issue #X is leased
write_roots=("/tmp/forge-x/wt-loop-512",). Its agent runsecho pwned >> /home/u/forge-loop/src/forge_loop/runner/tick.py. CI is green, critic approves, the merge gate never inspects the diff's paths — auto-merge fires. A Bash escape just advanced into durable cognition.Acceptance criteria
write_roots, returns the non-empty tuple of paths that fall outside every write root (empty tuple for a clean diff). Path comparison is normalized (abs paths, trailing-slash insensitive), mirroringworktree_sweep's normalization.runner/merge_gate.pyfollowing the existingapply_*_gateshape, feeds the worker's changed paths into that function against the saga's leasedCapabilityPolicy.disable_pr_auto_merge, status flippedmerged→openlike the other gates), the worktree is moved to quarantine by reusing existing primitives (worker_worktree.quarantine_if_blockingand/ortasks/store.mark_quarantined), a violation-naming PR comment is posted, and a distinctmerge_refused_write_root_escape(name TBD) event is emitted with the offending paths.write_roots), the gate is a pass-through: current merge behavior is byte-for-byte unchanged.Test matrix
Unit (pure violation function), in
tests/:write_roots(no lease) → assert the chosen fail-safe stance (recommend: any changed path is a violation, i.e. closed-by-default), matching the gate's "conservative on uncertainty" convention.Integration (the gate), in
tests/test_runner_merge_gate.py:write_roots=(worktree,)policy → gate passes, auto-merge NOT disabled, status staysmerged, no quarantine call, no event.merged→open),quarantine_*primitive invoked on the worktree,merge_refused_write_root_escapeevent emitted carrying the offending path, PR comment posted.pr_comment/disable_pr_auto_mergedoes NOT prevent the event + status flip from firing (matchapply_mutation_survivor_gate'scontextlib.suppresspattern).Out of scope
quarantine_if_blocking/mark_quarantinedonly.render_worker_settings,_fs_allow_entries) or theBash(*)grant decision.mutation-checkcommand over one configured high-risk module #379/Refuse frontier-advance/merge when scoped mutation survivors exceed the threshold #381 wiring; keep the new gate independently engageable.File pointers
src/forge_loop/runner/merge_gate.py— add the new gate function modeled onapply_mutation_survivor_gate(Protocol slice for the gh client,check_*per-outcome +apply_*over the list, event + status flip).src/forge_loop/sandbox/policy.py—FilesystemScope.write_rootslives here; strong candidate home for the purewrite_root_violations(changed_paths, write_roots)helper.src/forge_loop/worker_worktree.py— reusequarantine_if_blocking(wt: Path)(line ~196).src/forge_loop/tasks/store.py— reusemark_quarantined(task_id, *, reason)(line ~568);tasks/saga.pyhasTaskState.QUARANTINED.src/forge_loop/runner/tick.py—_run_merge_gate(line ~1072) is the production wiring site; thread the worktree + leased policy in alongside the existing gate calls.src/forge_loop/worker.py—WorkerOutcome(line ~52) does NOT currently carry the worktree path or leased policy; (investigate) how the gate obtains the saga's leasedCapabilityPolicyand worktree for a given outcome (thread one field through, or look it up from the saga store by issue/task id).worker_changed_paths-style helper): (investigate) whether a git-diff path collector already exists; if not, add one behind an injectable seam so the test can supply a fake git runner.tests/test_runner_merge_gate.py— add the integration tests next to the existing gate tests.Worker note
AC is wide — it spans
merge_gate.py,policy.py, the quarantine primitives in two modules, thetick.pywiring, and a new diff-collection seam, with an open(investigate)on how the gate reaches the leased policy. You are at high risk of running out of turns before pushing. Apply COMMIT DISCIPLINE (wip-commit every ~20 turns / ~5 file-edits) aggressively. Land the purewrite_root_violationsfunction + its unit tests FIRST (self-contained), commit, then build the gate and wiring. Run the EXIT CHECKLIST even if the wiring feels incomplete.Original report
Part of epic: "Enforce leased filesystem write-roots against the worker's actual diff". An enforcer step wired into
runner/merge_gate.pyfeedsworker_changed_pathsintowrite_root_violationsagainst the saga's leasedCapabilityPolicy; non-empty violations FAIL the gate (branch never merges) and quarantine the worktree by reusing the existing primitive intasks/store.py+worker_worktree.py— no new machinery. A clean diff leaves merge behavior unchanged. Test drives the gate with a fake git runner + a policy whose write_roots is the worktree, asserting block+quarantine out-of-bounds and pass-through in-bounds. ~140 net LOC incl. tests.Customer story: A maintainer deciding whether an autonomous patch is semantically trustworthy needs the merge gate to reject and quarantine a patch that wrote outside the worker's leased sandbox, so a Bash escape cannot advance into mergeable state on a green run.