feat(method): ground phase · ground-context · verify-integrity (foundation fv25→27) by pilotspacex-byte · Pull Request #6 · pilotspace/ADD

pilotspacex-byte · 2026-06-11T09:32:36Z

What this branch ships

Three ADD-method milestones, each closed with a human-gated fold into the versioned foundation (foundation-version 24 → 27):

1. `ground-phase` — a phase-0 GROUND preamble (→ fv25)

A new ground phase rides in front of the seven steps (PHASES is now 9: ground → … → done). It is AI-owned, with no new human gate (the one approval stays at the §3 contract freeze). Each task carries a ## 0 · GROUND map (real files · symbols · the anchors §3 cites); status/check surface the grounding state as a never-red measure. Additive to the frozen 7-step flow.

2. `ground-context` — Ground gathers the whole working folder, efficiently (→ fv26)

The §0 gather now spans the working folder, not just code: docs/textbase · TODOs · config/manifests · data/fixtures. 0-ground.md gained a gather-method hint: sweep the broad pass cheaply (a small-model subagent / fast index / skim), then deepen task-specifically. The hint is a recommendation only; the engine spawns nothing (tool-agnostic). This closes the fv25 zero-lived-run ceiling.

3. `verify-integrity` — prove the green was EARNED, not gamed (→ fv27)

The method's TRUST core gains its first mechanically-enforced HARD-STOP:

- Mechanical floor (tamper-tripwire): md5(red tests + §3 contract) is snapshotted at tests→build and re-checked at the verify gate; any post-red edit blocks an auto-PASS.
- Judgment ceiling (earned-green-rubric): overfit · vacuous · stubbed-away, scored by an independent adversarial refute-read (a subagent is recommended under auto; the engine never spawns it, staying tool-agnostic).
- Bounded self-heal (heal-then-escalate): a confirmed cheat (from either layer) returns to build for ≤3 monotonic honest re-builds, then HARD-STOPs to the human. A gamed green is never auto-passed and never RISK-ACCEPTED-waived; it is HARD-STOP-class, like a security finding.

Engine pin bumped to 7b05eaf9 ×3 trees.

Verification

- Full unittest suite: 839 OK; dogfood add.py check: 249/0.
- Each of the three verify-integrity exit criteria is verifier-backed; every gate was human-confirmed (or auto-resolved on complete evidence, per the autonomy ladder).
- All mirror trees (engine ×3 · guides ×3 · book ×4 · templates ×3) are byte-identical.

Foundation

16 verify-integrity competency deltas folded into CONVENTIONS.md (5 new conventions + 3 flip-cite reinforcements) and PROJECT.md (§Spec ship bullet + §Key Decisions row); foundation-version: 27.

…to (prose ≡ enforcement) Task explicit-autonomy-dial closes milestone flag-first-freeze (2/2). Replaces the binary conservative-only high-risk guard with an explicit ordered ladder manual < conservative < auto, declared per task in the TASK.md header, and aligns every surface — engine, skill, book, glossary, templates — so the prose names exactly what the engine enforces. Engine (add.py, all 3 trees byte-identical @ c0c9329c): - _AUTONOMY_LEVELS = (manual, conservative, auto) + _AUTONOMY_LINE_RE; _autonomy_level() returns the rung / None (unset) / "?" (unknown token); _autonomy_lowered() = high-risk-safe. - high-risk guard widened: cmd_gate + audit refuse risk:high without a LOWERED rung (manual OR conservative), not conservative-only (unguarded_high_risk_auto). - cmd_check: unknown_autonomy_level (red) + implicit_autonomy (live-only WARN). - cmd_status surfaces the active task's autonomy rung every session. - TASK seed defaults autonomy: auto. engine_pin re-aimed to c0c9329c. Docs/skill (synced ×3): GLOSSARY (survivor + template), 11-governance, 10-setup, appendix-c, run.md, streams.md, SKILL.md all name the 3-mode ladder; "dial" kept only as the formerly-bridge marker (vocab-linter exempt). Tests: test_explicit_autonomy_dial.py DocsAccordTest extended from 1→4 doc surfaces (GLOSSARY + appendix-c + 10-setup + 11-governance), pinning prose ≡ enforcement so it cannot silently regress. 717 green · check 259/0 · audit clean (50). Verify gate: PASS — human-confirmed (Tin, verify gate, 2026-06-10); risk:high·conservative, human-owned (not auto-resolved). §7 emits 2 ADD competency deltas (docs-accord must pin every named surface; a word-ban linter misses a stale multi-valued description). author: Tin Dang
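As a reading aid, the ladder and the widened high-risk guard described in this commit can be sketched as follows. This is a minimal illustration, not the code in add.py: the regexes and the un-prefixed function names are assumptions, and only the three rungs and the rung / None / "?" return convention come from the commit message.

```python
import re

# Ordered ladder from the commit message: manual < conservative < auto.
AUTONOMY_LEVELS = ("manual", "conservative", "auto")

# Hypothetical line pattern; the real _AUTONOMY_LINE_RE is not shown in the source.
AUTONOMY_LINE_RE = re.compile(r"^[ \t]*autonomy:[ \t]*(\S+)", re.MULTILINE)


def autonomy_level(header: str):
    """Return the declared rung, None when unset, or "?" for an unknown token."""
    match = AUTONOMY_LINE_RE.search(header)
    if match is None:
        return None
    token = match.group(1).lower()
    return token if token in AUTONOMY_LEVELS else "?"


def autonomy_lowered(header: str) -> bool:
    """True only for the high-risk-safe rungs: manual or conservative."""
    return autonomy_level(header) in ("manual", "conservative")


def unguarded_high_risk_auto(header: str) -> bool:
    """True when a task declares risk: high without a lowered rung."""
    is_high_risk = (
        re.search(r"^[ \t]*risk:[ \t]*high\b", header, re.MULTILINE) is not None
    )
    return is_high_risk and not autonomy_lowered(header)
```

Under this sketch an unset or unknown rung on a high-risk task is refused, since neither counts as lowered.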

…ersion 23 Milestone flag-first-freeze (the freeze/autonomy seam) closed at 2/2 and archived.
Fold (human-gated, fold.md): 4 open deltas → foundation-version 22→23:
- CONVENTIONS +2 bullets: verified-marker-scopes-enforcement-forward (a new guard stamps a marker on the guarded crossing and enforces only on MARKED records, never retro-redding predecessors); prose-accord-pins-every-surface + word-ban-blind-to-stale-enumeration (two faces of necessary-not-sufficient on a "prose ≡ enforcement" deliverable: DocsAccordTest pinned 1 of 4 named surfaces; a word-ban misses a stale "auto | conservative" enumeration).
- CONVENTIONS flip-cite: a lived working LABEL drifts from its canonical glossary TERM (bridge with "formerly …" or migrate, never silent-rename) onto the cross-surface-term bullet.
- PROJECT §Spec: flag-first-freeze SHIPPED bullet.
- PROJECT §Key Decisions: fold row; foundation-version 22→23; updated 2026-06-10.
- 4 deltas flipped open→folded in unflagged-freeze + explicit-autonomy-dial §7.
Close hygiene (disclosed): the MILESTONE.md was a never-authored stub (goal line only); the milestone ran entirely task-driven. Back-filled at close with real scope + tasks + 3 observable exit criteria mapped to their tasks, checked [x] per the human's close-affirmation + both verify-gate PASSes. milestone-done passed the v20 goal-gate (3/3), wrote RETRO.md; archive-milestone removed it from active state (+ pre-archive backup).
check 249/0 · audit clean (48) · no open deltas · engine unchanged (c0c9329c ×3).
author: Tin Dang

The high-risk and autonomy guards read their tokens with `\b<token>:` and took the FIRST match anywhere in the scanned header region — which includes the freeform H1 title and quoted prose, not just the declaration lines. Found at the init-auto-default freeze: a task titled "# TASK: Project seeds autonomy: auto by default at init" read as `auto` even though its declaration line said `autonomy: conservative` — the title substring won. Verified by execution: `_autonomy_level -> auto` (should be conservative), `_autonomy_lowered -> False`. The symmetric hazard is worse: a title containing "autonomy: conservative" on a task whose real declaration is `auto` + risk: high would read as LOWERED, so the `unguarded_high_risk_auto` guard would wave it through. `_RISK_HIGH_RE` shared the identical `\b` flaw. Severity, honestly: a correctness defect in a guard's reader. The title and the declaration are written by the same human in the same file at the same time — no external input, no adversary — so this is a self-inflicted footgun, NOT a security gate event. Fixed, not ceremonialized as a HARD-STOP. Fix: a token counts only at a DECLARATION position — line-start (optionally indented) OR just after the `·` slug-line separator — never a title/prose substring. The deliberately-supported inline form `… · autonomy: conservative` still reads; a title/prose `autonomy: <x>` / `risk: high` no longer does. The FROZEN grammar (`manual|conservative|auto`, line + inline forms) is UNCHANGED — the reader is made to honor it, not amended. Applied to BOTH readers so the two declaration-token readers share one collision behavior (the same fix protects the forthcoming `_project_autonomy`, which reads PROJECT.md prose). Tests (new test_autonomy_reader_anchor.py, 9): line ✓ · inline ✓ · title ✗ · prose ✗ · guard-reliability (title cannot fake lowered) · unset-when-title-only; plus risk line/inline/title. Red-first against the defect, green after the fix. 
add.py ×3 byte-identical, engine_pin re-aimed c0c9329c -> 6009233a. Full suite green (728) except the unrelated, in-progress init-auto-default red bundle. author: Tin Dang
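The defect and its fix can be reproduced in a few lines. The sketch below is an assumption-laden illustration: `NAIVE_RE` mimics the `\b<token>:` shape the message describes, and `ANCHORED_RE` is one hypothetical way to restrict matches to a declaration position (line start, optionally indented, or just after the `·` separator). Neither pattern is copied from add.py.

```python
import re

# The defective shape described above: first match anywhere, title included.
NAIVE_RE = re.compile(r"\bautonomy:\s*(\w+)")

# Hypothetical anchored shape: a token counts only at line start (optionally
# indented) or right after the "·" slug-line separator.
ANCHORED_RE = re.compile(r"(?:^[ \t]*|·[ \t]*)autonomy:[ \t]*(\w+)", re.MULTILINE)


def read_level(pattern, header):
    """Return the first captured rung for the given pattern, or None."""
    match = pattern.search(header)
    return match.group(1) if match else None


# The task header from the commit message: the title mentions "autonomy: auto",
# the real declaration line says conservative.
header = (
    "# TASK: Project seeds autonomy: auto by default at init\n"
    "autonomy: conservative\n"
)

# The deliberately supported inline form.
inline = "init-auto-default · autonomy: conservative\n"

naive_result = read_level(NAIVE_RE, header)        # the title substring wins
anchored_result = read_level(ANCHORED_RE, header)  # the declaration line wins
inline_result = read_level(ANCHORED_RE, inline)    # inline form still reads
```

With the anchored pattern, a header that mentions the token only in its title reads as unset, which is the "unset-when-title-only" case in the test list.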

…autonomy default) Open the goal-auto-ready sub-milestone and ship its first task.
North-star (recorded as direction, NOT this milestone's deliverable): challenge the spine and drive toward fewer human gates. This milestone builds the PREREQUISITE, autonomy earned by goal-clarity; whether a clarified goal may RELAX the freeze gate is deferred to its own later milestone.
init-auto-default (task 1 · autonomy: conservative · human-gated verify PASS): autonomy stops being a constant buried in the TASK.md template and becomes an EXPLICIT, project-scoped, INHERITABLE posture.
- init declares `autonomy: auto` in PROJECT.md (PROJECT.md.tmpl, ×3 trees).
- new-task INHERITS it via `_project_autonomy(root)`, a PURE read-path mirroring `_project_goal`, fail-SAFE: declared+recognized -> that rung; no line -> `auto` (v7: absent = auto); garbled/unrecognized -> `conservative` + a `garbled_project_autonomy` warning (NEVER silently `auto`). TASK.md.tmpl line 4 `autonomy: auto` -> `autonomy: {{autonomy}}`.
- status surfaces `project autonomy: <level> (default — new tasks inherit)`.
- the load-bearing proof: a NON-auto PROJECT.md default flows into a new task (test_non_auto_default_inherited), so the declared line is load-bearing, not cosmetic.
Frozen contract held under pressure: a bonus `project_autonomy` key added to `status --json` was caught by the frozen-surface guard (test_json_surface_frozen, `json_surface_unsanctioned_key`) and REVERTED; the frozen test was left intact, not edited to pass. A JSON extension would need its own ratified change-request.
Tests: test_init_auto_default.py (8): init-declares · inherit-auto · non-auto-inherits (load-bearing) · absent->auto · garbled->conservative+warn · status-surface · helper-resolves · templates-×3. Full suite 734 green. add.py ×3 byte-identical; engine_pin re-aimed 6009233a -> cad072c1. .add/PROJECT.md declares `autonomy: auto` (dogfood).
Verify gate: human PASS (Tin) at the conservative gate; auto-PASS disabled for a trust-layer touch.
4 open §7 deltas for the milestone-close fold: (ADD) declaration-token readers anchor to a declaration position; (SDD) project-level inheritable autonomy default; (SDD) deferred `init --autonomy` flag; (ADD) the build stays inside the frozen contract even for "harmless additive" changes.
Builds on 55d64d9 (the reader-anchor defect fix that this task's own title exposed).
author: Tin Dang
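The fail-safe resolution order described above (declared and recognized, absent, garbled) might look like the following sketch. It takes the PROJECT.md text directly rather than a `root` path, and the regex, the use of the `warnings` module, and the `seed_task` helper are all illustrative assumptions rather than the engine's actual code.

```python
import re
import warnings

AUTONOMY_LEVELS = ("manual", "conservative", "auto")

# Hypothetical declaration pattern, anchored at line start.
DECL_RE = re.compile(r"^[ \t]*autonomy:[ \t]*(.*)$", re.MULTILINE)


def project_autonomy(project_md_text: str) -> str:
    """Resolve the project default rung from PROJECT.md text, failing safe."""
    match = DECL_RE.search(project_md_text)
    if match is None:
        return "auto"  # no line: absent = auto
    token = match.group(1).strip().lower()
    if token in AUTONOMY_LEVELS:
        return token  # declared + recognized: that rung
    # Garbled or unrecognized: never silently "auto".
    warnings.warn("garbled_project_autonomy")
    return "conservative"


def seed_task(template: str, project_md_text: str) -> str:
    """Fill the {{autonomy}} placeholder with the inherited project default."""
    return template.replace("{{autonomy}}", project_autonomy(project_md_text))
```

The asymmetry is the point of the design: a missing line keeps the historical default, while a line that exists but cannot be read drops to the safer rung and says so.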

Add the auto-ready-goal classifier: a milestone goal is AUTO-READY iff its `## Exit criteria` has >= 1 criterion AND every one cites a verifier `(verify: <test|command|metric>)`, so a self-driving run can check its own result against the goal without human judgement. Autonomy is EARNED by goal-clarity — the central lever for a fully-auto AI loop. Engine (×3 trees, byte-identical; engine_pin re-aimed): - _exit_criteria_cited(root, mslug) -> (cited, total): PURE read over MILESTONE.md exit criteria; a NON-EMPTY (verify: …) counts — a bare (verify:) does not (the mid-text substring trap). - _goal_auto_ready(root, mslug) -> bool: total >= 1 AND cited == total. - check: WARN goal_not_auto_ready (NEVER red) for the OPEN active milestone when total >= 1 AND cited < total; a zero-criteria milestone stays silent (writing criteria is milestone-shaping's nudge, separable from citing them). - status: a goal-ready: line surfaced every session. Live-only (Must #4): the WARN excludes a done-but-not-yet-archived active milestone (status=done stays the active pointer until archive clears it) as well as closed/archived predecessors — found at the verify gate and closed test-first (test_done_active_milestone_not_flagged, red→green). Docs (prose ≡ enforcement, synced ×3): .add/GLOSSARY.md + GLOSSARY.md.tmpl define "auto-ready goal"; appendix-c-glossary.md + 11-governance.md + skill run.md name the term and the goal→autonomy link. Surface-only by design: the freeze gate, the per-task autonomy contract, and milestone_goal_unmet are UNCHANGED. Honest limit: the lint forces a citation slot but cannot prove the citation is honest ((verify: it works) passes) — citation-theater is the recorded irreducible-floor limit; resolving/running the cited verifier is the deferred upgrade. Tests: new suite test_goal_auto_ready_gate (13); full suite 747 OK. Verify gate: human PASS (conservative + risk: high; auto-PASS disabled). 
Milestone goal-auto-ready closed (2/2 tasks, 3/3 exit criteria met); 7 open deltas flagged for the next foundation fold. author: Tin Dang
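The classifier rule (at least one criterion, every criterion citing a non-empty verifier) is small enough to sketch. The version below works on the MILESTONE.md text directly and assumes criteria are `- ` bullets under a `## Exit criteria` heading; the regex and the parsing details are assumptions, not the engine's `_exit_criteria_cited`.

```python
import re

# A non-empty "(verify: ...)" counts; a bare "(verify:)" does not.
VERIFY_RE = re.compile(r"\(verify:\s*[^\s)][^)]*\)")


def exit_criteria_cited(milestone_md: str):
    """Return (cited, total) over the bullets under '## Exit criteria'."""
    cited = total = 0
    in_section = False
    for line in milestone_md.splitlines():
        if line.startswith("## "):
            # Entering or leaving the exit-criteria section.
            in_section = line.strip().lower() == "## exit criteria"
            continue
        if in_section and line.lstrip().startswith("- "):
            total += 1
            if VERIFY_RE.search(line):
                cited += 1
    return cited, total


def goal_auto_ready(milestone_md: str) -> bool:
    """AUTO-READY iff there is at least one criterion and all are cited."""
    cited, total = exit_criteria_cited(milestone_md)
    return total >= 1 and cited == total
```

As the commit notes, this only enforces a citation slot: a criterion ending in `(verify: it works)` still counts as cited here.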

…sion 24 Milestone goal-auto-ready (autonomy earned by goal-clarity) closed at 2/2 tasks, 3/3 exit criteria; init-auto-default gate PASS (auto), goal-auto-ready-gate gate PASS (human, conservative + risk:high — auto-PASS disabled). Fold (human-gated, fold.md) — 7 open deltas → foundation-version 23→24, all 7 confirmed (none rejected): - CONVENTIONS +3 bullets: - anchor-declaration-token-reader-to-a-declaration-position — a freeform title or quoted `token: value` must never read as a declaration; a title faking a lowered rung can DEFEAT a guard (init-auto-default, fixed @ 55d64d9). - live-only-guard-keys-on-terminal-STATUS — a done-but-not-yet-archived milestone stays the active pointer until `archive`, so pointer-membership alone flags a CLOSED milestone; key on status, never just the pointer. - lint-forces-a-SLOT-not-honesty — verifier-citation raises the goal-clarity floor but can't prove the citation is real (citation-theater); the irreducible floor — the human still owns honesty. - CONVENTIONS flip-cite onto frozen-guard→fix-build-not-matcher: a bonus project_autonomy key on `status --json` tripped json_surface_unsanctioned_key and was reverted (build fixed, frozen test intact). - PROJECT §Spec: goal-auto-ready SHIPPED bullet (project auto default + auto-ready-goal check; `init --autonomy` knob + the freeze-gate-relaxation SPINE decision deferred OPEN). - PROJECT §Key Decisions: fold row 23→24; the OSError-guard divergence on _exit_criteria_cited recorded as an ACCEPTED CEILING (mirrors the sibling _exit_criteria convention; not hardened in isolation). - 7 deltas flipped open→folded in init-auto-default + goal-auto-ready-gate §7. 747 tests OK; add.py check 0 fail; no open deltas remain. author: Tin Dang

Collapse the done goal-auto-ready milestone out of active state (2 tasks: init-auto-default, goal-auto-ready-gate). Files on disk untouched; the active state.json drops the milestone + its task records and clears active_milestone. A pre-archive-state.bak.json recovery snapshot is written (design-for-failure: an accidental archive stays recoverable — it captures the full milestone + member task records the archived slug-list summary drops). author: Tin Dang

… fold → fv25 Add a `ground` phase-0 preamble before specify so a task's contract, tests and build are grounded in the REAL current codebase, not assumption. The seven steps (specify→observe = §1–§7) keep their brand; ground is AI-owned and adds NO new human gate (the one approval stays at the §3 contract freeze). Three tasks (breadth-first), all done & PASS: - ground-phase-engine: insert `ground` as PHASES[0] (PHASE_OWNER=ai · guide map · new-task→ground · advance ground→specify) + the `## 0 · GROUND` TASK.md template (×3) + phases/0-ground.md guide (×3) + ~12 downstream test conformances. Engine byte-identical ×3 (pin e6b8c3da). - ground-bundle-wiring: the contract-freeze "grounded? cite anchors" checklist line + `add.py status`/`check` SURFACE the grounding state — tri-state measure (_grounded_state), human-readable + a never-red WARN on the existing `warnings` array, no new --json key (mirrors goal-auto-ready). Measure, never block. - ground-prose-align: book (02-the-flow ×4 · appendix-c ×4) + skill phase-table (SKILL.md ×3) + GLOSSARY name `ground` and render it as the §0 preamble to the seven steps, byte-synced across all trees. Retrospective consolidation (human-confirmed fold → foundation-version 24→25): 12 open deltas → 5 new CONVENTIONS bullets (ground-before-§3 · ordered-constant-index-hazard · additive-surface-byte-invisible · engine-derived-prose-guard · grandfather-retrofit-ceiling) + 1 flip-cite onto four-mirror-trees + 1 §Spec ground-phase ship bullet + the 7-step frozen-line parenthetical. Then archive-milestone + compact (3 task dirs → .add/archive/). Honest ceiling: shipped with ZERO lived runs STARTING at ground — all 3 tasks were grandfathered at specify (created before ground existed); §0 retrofitted at build so each dogfoods `grounded ✓` live. First lived run is next-milestone. Full suite 790 OK; dogfood check 0 failed. author: Tin Dang

… the working folder The ground phase now gathers more than code. `0-ground.md`'s `## Gather` gains a **Context (working folder)** bullet — docs/textbase · TODOs · config/manifests · data/fixtures (task-delta only) — and the `## 0 · GROUND` template gains one light `Context (working folder):` line between Touches and Honors. The grounding measure is untouched (the `Anchors the contract cites:` line keeps its role; add.py byte-identical to engine_pin), and §0 stays lean (one line, not a per-category block). ground-context-sources ran the FULL flow as the FIRST lived ground run — a task created AT `ground` (not retrofitted), reaching `grounded ✓` live. Dogfooded the milestone's own technique: a haiku subagent did the broad working-folder sweep (returned the ×3/×3 sync md5s + the guard list) while the main context deepened on the precise guard assertions. - §3 FROZEN @ v1 (human-approved: the new Context line over folding into Touches) - test_ground_context.py written RED first (3 feature failures) -> green by the guide+template edits only; full suite 800 -> 810 OK; dogfood check 0 failed - 0-ground.md ×3 + TASK.md.tmpl ×3 byte-identical; book prose untouched (scoped out) Opens the ground-context sub-milestone (1 of 2 tasks). Closes the "zero lived runs starting at ground" ceiling folded at fv25. author: Tin Dang

…heap, deepen task-specifically) The ground phase now has an opinion on grounding *economics*, not only *completeness*.
`0-ground.md` gains a compact gather-METHOD hint (HOW, distinct from task 1's WHAT): a "How — gather efficiently:" line closing `## Gather` (prefer a small-model subagent / fast index / skim for the BROAD sweep, then DEEPEN on what THIS task needs; never lock a shallow first pass), a leading Step 0 in the `## AI prompt`, and the intro reworded "REAL current codebase" -> "REAL current working folder" (closing task-1's §7 coherence follow-up). The hint RECOMMENDS a subagent; the engine spawns nothing (tool-agnostic), so add.py stays byte-identical to engine_pin.
Dogfooded the technique in-flight: a haiku subagent ran the broad working-folder sweep (returned the ×3/×3 sync md5s + the guard list) while the main context deepened on the guard assertions, exactly the sweep-cheap-then-deepen split this task ships.
- §3 FROZEN @ v1 (human-approved: compact "How" line + Step 0 + intro reword)
- test_ground_context.py EXTENDED red-first (GatherMethodHint: 3 RED -> green by the guide edits only); full suite 803 OK; dogfood check 259 passed, 0 failed
- 0-ground.md ×3 byte-identical (md5 ba7147e5); add.py == engine_pin (no engine action); book prose untouched (scoped out)
- verify auto-resolved PASS (autonomy: auto; prose-only, no security/concurrency/architecture residue)
Closes the ground-context sub-milestone (2 of 2 tasks). The milestone goal (ground gathers the full working-folder context efficiently + task-specifically) is met.
author: Tin Dang

Human-gated fold at ground-context close (2/2 tasks, 2/2 exit criteria). The milestone gave ground a second axis: it gathers not only WHAT (the working-folder categories) but HOW (sweep the broad pass cheaply, then deepen task-specifically). Consolidate, not append-9-bullets (lean foundation): 9 open deltas → 4 new CONVENTIONS bullets + 2 flip-cites onto fv25 (δ6 self-closed within-milestone). CONVENTIONS.md (+4 bullets): - (ADD) Ground has two axes — completeness (WHAT) + economics (HOW) [δ7] - (ADD) A capability can be ADDED as guide-prose recommendation while the engine stays tool-agnostic — the pin holds across the addition [δ8] - (ADD) Dogfooding the shipped technique in-flight validates it [δ5 + δ9] - (TDD) A prose feature is RED-greenable by token-presence guards; triage the RED split [δ2 + δ3] CONVENTIONS.md (+2 flip-cites onto fv25 bullets): - additive-byte-invisible → the TEMPLATE twin held (additive §0 line invisible to structure/token guards) [δ1] - grandfather-ceiling → CLOSED: the first lived ground run (a task created AT `ground`) reached `grounded ✓` live [δ4] PROJECT.md: - foundation-version 25 → 26 - §Spec ground-context ship bullet (SHIPPED 2026-06-11) - §Key Decisions fold row - δ6 self-closed: task 2 reworded the intro task 1 flagged (no bullet; ledger annotated) All 9 deltas flipped open→folded; `add.py deltas` → 0 open; `add.py check` 261/0; full suite 803 OK. Engine e6b8c3da ×3 unchanged (prose/template only). author: Tin Dang

Closes the ground-context sub-milestone after its fv26 fold. Moves the milestone + 2 task bundles (5 files) to .add/archive/ground-context/ — state removed from active, files preserved for recovery (reverse the moves). - archive-milestone: ground-context removed from active state (2 tasks) - compact: milestones/ground-context/ (3 files) + tasks/ground-context-sources/ + tasks/ground-gather-hint/ -> .add/archive/ground-context/ - archived rollup now 18 milestones (58 tasks); no active task - add.py check 249/0 author: Tin Dang

Verify-integrity milestone, task 1 of 3. The build→verify half of ADD trusts the green; nothing stopped a build from GAMING that green by editing the red tests or the frozen §3 contract after the red run. This adds a mechanical floor: an md5 tripwire that catches test/contract tampering without ever running a test (tool-agnostic). Engine (×3 byte-identical — canonical · dogfood · bundled; pin bumped to a6eed5e0c374694945cf4273d1a2581d): - SNAPSHOT at the tests→build advance: inside cmd_advance's existing `if nxt == "build":` block, unconditionally overwrite state[task]["tripwire"] = {contract_md5, tests:{relpath:md5}}. The test set is exactly what the resolver returns (reuses _resolved_test_files, never re-globs); the §3 text is _raw_phase_bodies(...).get(3,""). The existing flag_verified set in the same block is the co-witness. - RE-CHECK at the verify gate: inside cmd_gate, before any COMPLETING outcome, _tamper_guard re-reads + re-hashes. A weakened/deleted test or an edited frozen §3 → _die HARD-STOP. A tamper is HARD-STOP-class — NOT launderable through RISK-ACCEPTED (the guard sits before the waiver write). - TRI-STATE, co-witnessed by flag_verified: present+match → pass · present+diverged → HARD-STOP · absent+flag_verified → HARD-STOP (tripwire_missing, the self-erase bypass) · absent+unverified → skip (legacy). - FAIL-CLOSED: any md5/read error on a tracked file → treated as diverged, never a crash (design-for-failure). - STANDING MONITOR: cmd_check gains a never-red build_tampered WARN for a diverged non-done task (early signal; the gate is where it bites). - Refactor: _tests_count/_declared_tests_count delegate to new _primary_test_files/_declared_test_files/_resolved_test_files — one resolution source, paths reused by the snapshot, every prior count preserved. 
Tests: add-method/tooling/test_tamper_tripwire.py — 12 red→green (snapshot/overwrite · weakened/deleted/contract-edit block · not-launderable + clean-RISK-ACCEPTED placement guard · tri-state clean/legacy/suspicious · standing never-red WARN). Full suite 815 OK; dogfood check 254 passed, 0 failed. Known limit (human-ratified at the verify gate, risk:high/conservative): the snapshot lives in agent-writable state.json, so a determined agent that forges both the snapshot AND flag_verified still slips — the mechanical floor raises bypass cost, it does not seal. The semantic refute-read (earned-green-rubric) + the bounded ≤3-attempt self-heal (heal-then-escalate) + the human gate are the backstops. md5 here is tamper-EVIDENCE not authentication (consistent with engine_pin's file-identity idiom) — ratified as NOT a security HARD-STOP. footer: verify-integrity 1/3 — the mechanical floor only; the judgment-cheat refute-read and the bounded self-heal loop remain. author: Tin Dang
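The snapshot, re-check, tri-state, and fail-closed rules above can be sketched together. This is a simplified model under stated assumptions: the function names are hypothetical, the test files are passed in explicitly instead of coming from `_resolved_test_files`, and the result is returned as a string rather than raised through `_die`. Only the state shape `{contract_md5, tests: {relpath: md5}}` and the four tri-state outcomes come from the commit message.

```python
import hashlib
from pathlib import Path


def md5_of(data: bytes) -> str:
    """Hex md5 used as tamper evidence, not as authentication."""
    return hashlib.md5(data).hexdigest()


def snapshot(contract_text: str, test_files):
    """Taken at the tests->build advance; the caller overwrites any earlier one."""
    return {
        "contract_md5": md5_of(contract_text.encode("utf-8")),
        "tests": {str(p): md5_of(Path(p).read_bytes()) for p in test_files},
    }


def tamper_diffs(tripwire, contract_text: str):
    """Re-hash at the verify gate; a read error counts as diverged (fail-closed)."""
    diffs = []
    if md5_of(contract_text.encode("utf-8")) != tripwire["contract_md5"]:
        diffs.append("contract")
    for relpath, expected in tripwire["tests"].items():
        try:
            actual = md5_of(Path(relpath).read_bytes())
        except OSError:
            actual = None  # deleted or unreadable: treated as diverged
        if actual != expected:
            diffs.append(relpath)
    return diffs


def gate_decision(tripwire, flag_verified: bool, contract_text: str) -> str:
    """Tri-state from the commit message, co-witnessed by flag_verified."""
    if tripwire is None:
        # absent + flag_verified is the self-erase bypass; absent + unverified is legacy.
        return "HARD-STOP" if flag_verified else "skip"
    return "HARD-STOP" if tamper_diffs(tripwire, contract_text) else "pass"
```

Nothing here runs a test; the guard only compares bytes, which is what keeps it tool-agnostic.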

…ARNED its green Task 2 of 3 in milestone verify-integrity. Task 1 shipped the MECHANICAL floor (an md5 tamper tripwire that catches an edited test or frozen contract). This task adds the JUDGMENT layer for the cheats the tripwire cannot see, namely a build that makes the UNCHANGED red suite pass without earning it:
- src OVERFIT to the test fixtures (special-cased to the literal inputs)
- VACUOUS / tautological asserts (green even against an empty implementation)
- real logic STUBBED away (the function returns a constant the tests accept)
Scored by an INDEPENDENT adversarial refute-read: a reviewer (a subagent under autonomy:auto is recommended; the engine never spawns one) prompted to argue "the green was NOT earned". A confirmed earned-green failure is HARD-STOP-class: never auto-passed, never RISK-ACCEPTED. The verify-gate, whole-suite specialization of run.md's adversarial verify (single-source pointer).
Prose + template only: add.py stays byte-identical to engine_pin (the engine stays judgment-free; the resolver, not the engine, judges earned-vs-gamed). ENFORCEMENT (the auto-gate wiring + the <=3-attempt self-heal loop) is task 3 (heal-then-escalate), named as the explicit KNOWN LIMIT in the frozen contract.
Surfaces (the rubric stated identically across every copy):
- guide ×3 phases/6-verify.md: new "Part four — was the green earned?"
- book ×4 docs/08-step-6-verify.md (incl. the previously-unguarded root)
- TASK.md.tmpl §6 ×3: one additive earned-green check line
- GLOSSARY.md.tmpl ×3 + the living .add/GLOSSARY.md: two new terms (earned green · adversarial refute-read)
Guarded by a new prose-TDD suite (test_earned_green_rubric.py, 13 tests): anchor-presence per surface, _norm whitespace-collapse for "stated identically" across hard-wrapped copies, mirror-parity md5 (guide ×3 / book ×4 / template ×3 each one hash), a root<->canonical book guard (08 is not a woven chapter), a scope guard that the task-3 loop machinery has NOT leaked into the task-2 guide, and test_engine_unchanged (add.py == pin). test_xml_convention gains the new Part-four heading so the over-tagging guard stays real.
Dogfood: this task crossed tests->build under task 1's LIVE tamper tripwire and re-checked clean at the verify gate (gate PASS, exit 0, no HARD-STOP), so the tripwire validated end-to-end on a real task. The gate auto-resolved under autonomy:auto (normal-risk, complete evidence, no residue); the one human gate was the §3 contract freeze.
Red -> green: the 13 tests ran red (8 anchor-absent failures) before the prose build, green after. Full suite 815 -> 828 OK. Dogfood add.py check 259 passed, 0 failed.
author: Tin Dang
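Two of the guard techniques named above, whitespace-collapsed comparison and mirror-parity hashing, can be illustrated briefly. The helper names below are hypothetical and operate on in-memory strings; the actual suite in test_earned_green_rubric.py is not shown in this PR and may differ.

```python
import hashlib
import re


def norm(text: str) -> str:
    """Collapse every whitespace run so hard-wrapped copies compare equal."""
    return re.sub(r"\s+", " ", text).strip()


def mirrors_identical(copies) -> bool:
    """Mirror parity: every copy in one tree family hashes to a single md5."""
    digests = {hashlib.md5(c.encode("utf-8")).hexdigest() for c in copies}
    return len(digests) == 1


def stated_identically(rubric: str, surfaces) -> bool:
    """The normalized rubric appears in every normalized surface."""
    needle = norm(rubric)
    return all(needle in norm(s) for s in surfaces)
```

The two checks answer different questions: parity demands byte-identical mirrors within one family, while the normalized check tolerates line wrapping across families such as guide versus book.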

…escalate) A confirmed cheat no longer dies on first sight — it returns the task to BUILD for an honest redo, MONOTONICALLY up to a cap of 3, then records HARD-STOP to the human. This closes the milestone's third exit criterion: "a confirmed cheat self-heals for up to 3 honest re-build attempts before it HARD-STOPs; a gamed green is never auto-passed." Engine (×3 byte-identical trees + pin re-aimed a9a91cf7→7b05eaf9): - HEAL_CAP = 3. - _heal_or_escalate(root, state, slug, *, reason, source): attempts < cap -> increment + phase="build" (DIRECT, no re-snapshot) + save_state BEFORE raise SystemExit(3); attempts >= cap -> gate="HARD-STOP" + _die. The increment is durable-before-exit so a re-run never grants a free attempt. MONOTONIC — never auto-resets (an unguarded reset via the open cmd_phase would be a trivial cap bypass; the advisor blocked reset-on-recross). - _tamper_guard's `if diffs` branch rewired from immediate _die to the loop (source "tamper", mechanical/ENFORCED); tripwire_missing stays immediate _die. - `add.py heal <slug> --reason "..."` — the semantic entry (source "refute-read", honor-system): requires phase==verify, always exits non-zero (3=redo, 1=escalate). The engine names the CHANNEL but never spawns the read (tool-agnostic). Detection is two-layer: the mechanical tripwire (task 1) feeds the loop's enforced entry; the earned-green refute-read (task 2) feeds its honor-system entry. A confirmed cheat is HARD-STOP-class — never RISK-ACCEPTED-waived, like a security finding. Guides + book + glossary (mirrors all one md5): - run.md gains "The bounded self-heal loop" (home); 6-verify.md + book 08 point to it; 5-build.md + book 07 carry the honest-redo note; GLOSSARY adds the "bounded self-heal" term (cap 3 · monotonic · escalation · honor-system). Tests (839 OK; +11 new, 3 existing EVOLVED — not weakened): - test_heal_then_escalate.py (NEW, 11): mechanical loop ×6, monotonic ×1, semantic ×3, loop-documented ×1. 
- test_tamper_tripwire.py: _assert_blocked phase=="verify" -> in ("verify", "build") (a first tamper now returns-to-build); gate=="none" kept STRICT. - test_earned_green_rubric.py: test_engine_unchanged 1->3 cheat tokens + the "NOT earned" prompt (allows the frozen "refute-read" label); the absence-check flipped to a presence/separation guard now the loop landed. Coverage up. - test_min_pillar.py: added `heal` to the lifecycle census + a non-zero-exit tolerance (heal is a loop/refusal verb). Each existing-test edit is an evolution of an OBSOLETE assertion (the behavior changed), not a weakening to force a pass: the real invariant stays guarded and coverage holds-or-rises. Dogfooded task 2's own rubric on this build — an independent adversarial refute-read returned EARNED (zero hard findings; its one nit, a trivially-true assert, was strengthened before the gate). Verify gate: human-gated PASS (risk: high · autonomy: conservative). Reviewed by Tin Dang 2026-06-11 — the engine's first mechanical self-heal loop + a pin bump human-owned, like task 1. milestone: verify-integrity (3/3 tasks done · goal-ready) author: Tin Dang
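The bounded loop reduces to a counter that is persisted before the process exits and is never reset automatically. The sketch below returns exit codes instead of raising `SystemExit`, and the `heal_attempts` field name and the `save_state` callback are assumptions for illustration; `HEAL_CAP = 3` and the 3 = redo / 1 = escalate codes are taken from the commit message.

```python
HEAL_CAP = 3  # from the commit message


def heal_or_escalate(task: dict, save_state) -> int:
    """Return exit code 3 (redo at build) or 1 (escalate to the human)."""
    attempts = task.get("heal_attempts", 0)  # hypothetical field name
    if attempts < HEAL_CAP:
        task["heal_attempts"] = attempts + 1
        task["phase"] = "build"  # direct return to build, no re-snapshot
        save_state(task)  # durable before exit: a re-run gets no free attempt
        return 3
    # Cap reached: record the stop for the human; nothing here resets the counter.
    task["gate"] = "HARD-STOP"
    save_state(task)
    return 1
```

Calling it four times on the same task yields three redo codes followed by an escalation, and the counter stays at the cap afterwards.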

Milestone verify-integrity done (3/3 tasks, 3/3 exit criteria, each verifier-backed); RETRO.md written. The method's TRUST core gained its first mechanically-enforced HARD-STOP: a two-layer anti-cheat (mechanical tamper-tripwire + adversarial earned-green refute-read) plus a bounded self-heal that returns a confirmed cheat to build ≤3 times then HARD-STOPs. A gamed green is never auto-passed.
Human-gated fold of the 16 open deltas → foundation-version 26→27 (consolidate, not append-16; all 16 confirmed, none rejected):
CONVENTIONS.md, 5 new bullets:
- (ADD) build-integrity = a mechanical floor + a judgment ceiling; the floor on agent-writable state is necessary-not-sufficient, so the refute-read + human gate are the real backstop; a confirmed cheat is HARD-STOP-class.
- (ADD) a mechanical-HARD-STOP guard = snapshot-at-seam → re-check-at-gate → fail-closed; its self-heal cap is real only if it cannot be cleared without a recorded human action (monotonic; the phase verb is unguarded).
- (TDD) an engine change that invalidates an EXISTING assertion makes the test edit an EVOLUTION not a weakening iff the real invariant stays guarded, coverage holds-or-rises, and the reason is documented.
- (ADD) a security-line classification can EMERGE during build; surface it for human ratification AT the gate, never self-grant.
- (SDD) two how-we-author sharpenings: a scope guard against later-stage machinery leaking backward into earlier prose; a path-returning helper so a new feature and an existing counter share ONE resolution source.
CONVENTIONS.md, 3 flip-cite reinforcements (no new bullets):
- dogfood-at-own-gate (first normal task = cheapest E2E; method audits own builds)
- presence necessary-not-sufficient (existence on one surface ≠ agreement across two)
- prose-guide red→green (anchor-presence + mirror-parity, engine byte-pinned)
PROJECT.md: §Spec verify-integrity SHIPPED bullet (carrying the both-gate-paths validation + the live-dogfood re-anchor path); §Key Decisions fv27 row; foundation-version 26→27.
16 deltas flipped open→folded across the 3 task TASK.md files; add.py deltas reports no open deltas; check 265/0; engine 7b05eaf9 ×3 unchanged this step.
milestone: verify-integrity (closed)
author: Tin Dang

…re milestone

Heavy-archive of the done verify-integrity milestone: 6 files moved out of the active tree into .add/archive/verify-integrity/ (the MILESTONE.md + RETRO.md + the 3 task dirs). state.json updated to the archived rollup (19 milestones / 61 tasks); active tree is now clean (no active task). Recovery: reverse the moves; state needs no edit.

No engine or test change: the pin stays 7b05eaf9 ×3; dogfood check 249/0.

milestone: verify-integrity (archived)
author: Tin Dang

…aware docs twin

PR #6 was CI-red on Python 3.10/3.12 though green locally on 3.14: two test-only portability defects the newer interpreter masked. Engine untouched (pin 7b05eaf9 ×3 unchanged); no behavior change, no weakened assertion.

Family B: argparse intermixing (2 failures). test_heal_then_escalate._gate and test_tamper_tripwire._to_verify_and_gate built `gate <outcome> --owner.. --ticket.. --expires.. <slug>` (slug LAST). argparse <=3.12 cannot bind an optional (nargs="?") positional that follows value-taking flags -> "unrecognized arguments: <slug>" -> the gate never records (gate stays 'none'/exit 2). 3.13+ intermixing fixed it, masking it on 3.14. Aligned both helpers to the house order the rest of the suite uses: `gate <outcome> <slug> [--flags]` (slug right after outcome). The behavior under test (RISK-ACCEPTED records a waiver / a cheat is not launderable) is unchanged; only the incidental argv order moved.

Family A: gitignored .add/docs twin (5 errors). test_explicit_autonomy_dial.DocsAccordTest and test_goal_auto_ready_gate.DocsAccordTest read the `.add/docs` twin unconditionally. `.add/docs` is gitignored (regenerated by `add.py init`) and ABSENT on a clean CI checkout -> FileNotFoundError. Adopted the present-trees idiom already used by test_foundation_update_loop / test_flow_diagram: assert parity only against twins that exist, requiring the tracked canonical + bundle. The ×3 byte-sync guarantee is preserved (the gitignored mirror is checked locally where it exists, skipped where it can't be).

Verified GREEN on the CI interpreters: python3.10 839 OK · python3.13 839 OK (skipped=3) · python3.14 839 OK.

Follow-up (not in this commit): the engine itself rejects flags-before-slug on py<=3.12 for every optional-slug subcommand, a real robustness gap to harden behind a pin bump in a separate task. The natural order (slug after outcome) works on all versions.

author: Tin Dang
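The Family B fix can be illustrated with a small stand-in parser. This is a sketch under assumptions, not the engine's CLI: `build_gate_parser`, `gate_argv`, and the sample values are hypothetical, and only the argument shape (a required outcome, an optional `nargs="?"` slug, and value-taking flags) is taken from the commit message. The helper emits the house order, which does not rely on how a given Python version handles intermixed arguments.

```python
import argparse


def build_gate_parser():
    """A stand-in for a `gate` subcommand with an optional slug positional."""
    parser = argparse.ArgumentParser(prog="gate")
    parser.add_argument("outcome")
    parser.add_argument("slug", nargs="?", default=None)
    parser.add_argument("--owner")
    parser.add_argument("--ticket")
    parser.add_argument("--expires")
    return parser


def gate_argv(outcome, slug, **flags):
    """Build argv in the house order: outcome, slug, then the flags."""
    argv = [outcome, slug]
    for name, value in flags.items():
        argv += ["--" + name, value]
    return argv


parser = build_gate_parser()
args = parser.parse_args(
    gate_argv("RISK-ACCEPTED", "demo-task",
              owner="alice", ticket="T-1", expires="2026-07-01")
)
print(args.slug)  # prints "demo-task"
```

Placing the slug directly after the outcome lets both positionals be consumed together before any flag is seen, so the result is the same on every interpreter the commit lists.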

… + worker commits its report (#7)

* chore(method): log engine-argv-portability follow-up task

Tracks the deferred robustness half found during PR #6 review: the CLI rejects `gate <outcome> --flags <slug>` (flags before the optional slug positional) on Python <=3.12 across every optional-slug subcommand. argparse cannot bind an optional positional that follows value-taking flags (3.13+ intermixing fixed it). The test helpers were fixed to the natural order in 9d52302; the engine itself is left as-is and tracked here behind a future pin bump. Scaffold carries the repro + fix-direction in §0/§1 so it is actionable.

author: Tin Dang

* feat(method): wave-protocol-runtime — streams.md merge-time fork-base + worker commits its report

Amend the parallel-streams rubric (streams.md ×3) to close v19 wave deltas #7 and #8, so the concurrency protocol is satisfiable on a spawn-time-worktree runner and a worker durably persists its own report.

Amendment A, merge-time fork-base shift: on a runner that creates each worktree AT spawn from a pool, the pre-spawn `rev-parse HEAD` evidence cell is unsatisfiable, so the `unverified_fork_base` check SHIFTS (it never skips) to worker step-0 (sync-to-base + re-echo), verified by the orchestrator at merge-time before merge-back. The pre-spawn rule stays the DEFAULT for fresh-HEAD-worktree runners; the merge-time path is an additive ALTERNATIVE.

Amendment B, worker commits its report: the worker `<return>` contract now requires COMMITTING SUMMARY.md + deltas.md in the worktree (uncommitted files survive only by harness courtesy), so the serial-integration merge-back carries the worker's verdict.

Bundle ran §1→§7 under risk:high · autonomy:conservative with red/green TDD: 4 token-presence + ×3-parity guards in test_streams.py (2 new-behaviour tests red before the build, 2 invariant tests green throughout). Full suite 843 OK on py3.10 AND py3.14.

Tamper tripwire CLEAN (engine `_tripwire_divergence` -> []); §3 and test_streams.py byte-unchanged since the tests->build snapshot. The build touched only streams.md ×3. engine_pin HOLDS (prose-only, no add.py change).

Human verify gate: PASS (green EARNED per the refute-read). The one residue is the freeze-approved deferred-enforcement flag: these guards lock the WORDS and the MIRROR, not engine EXECUTION of the shift (the engine can't see a worktree pool). Logged as the engine-merge-base-enforcement follow-up so the disclosed gap is tracked, not forgotten.

author: Tin Dang

* docs(method): wave-protocol-runtime — land Amendment A in its other §3-named home (ledger evidence-cell)

PR #7 careful review found Amendment A stated the merge-time fork-base shift in the "Design for failure" bullet but NOT in the ledger "Evidence cells, not ticks" paragraph. That same-file second mention still read pre-spawn-only, so a reader of the ledger section alone would think a spawn-time-pool-runner spawn is impossible (the opposite of what A enables).

Add one clause to that paragraph: on a spawn-time pool runner the pre-spawn paste is unsatisfiable, so the fork-base cell holds the worker's step-0 post-sync echo (still == base) and the `unverified_fork_base` refusal shifts to merge-time before merge-back. It shifts, it never lifts.

WITHIN the frozen §3: the contract named "the evidence-cell `unverified_fork_base` note" as a valid home and the freeze flag said "either spot is conformant". No test weakened, no contract edited, engine_pin HOLDS. streams.md ×3 re-synced byte-identical (md5 82e08b0d); full suite 843 OK on py3.10; dogfood check 256/0. A post-gate honesty note records that this text changed AFTER the PASS as a within-frozen-§3 merge-review refinement.

author: Tin Dang

---------

Co-authored-by: Tin Dang <tindang.ht97@gmail.com>
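Amendment A's rule, that the fork-base check shifts to merge-time but never lifts, can be sketched as a pure function. The commit states the engine does not execute this check (it cannot see a worktree pool), so this is only an illustration of the prose rule: `ForkBaseCell`, `fork_base_refusal`, and the field names are hypothetical; only the `unverified_fork_base` label comes from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ForkBaseCell:
    """One ledger evidence cell for a worker's fork base (hypothetical shape)."""
    base: str                             # commit the wave forks from
    pre_spawn_head: Optional[str] = None  # default path: HEAD pasted before spawn
    step0_echo: Optional[str] = None      # alternative path: worker's post-sync echo


def fork_base_refusal(cell: ForkBaseCell, spawn_time_pool: bool) -> Optional[str]:
    """Return 'unverified_fork_base' unless the applicable evidence equals base.

    On a spawn-time pool runner the pre-spawn paste cannot exist, so the
    worker's step-0 echo is checked instead, before merge-back. Either way
    some evidence must be present and must match: the check shifts, it
    never skips.
    """
    evidence = cell.step0_echo if spawn_time_pool else cell.pre_spawn_head
    if evidence is None or evidence != cell.base:
        return "unverified_fork_base"
    return None
```

For example, `fork_base_refusal(ForkBaseCell(base="abc123", step0_echo="abc123"), spawn_time_pool=True)` returns `None`, while the same cell with an empty or stale echo is refused.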

TinDang97 added 18 commits June 10, 2026 18:20

pilotspacex-byte merged commit 4d48205 into main Jun 11, 2026
3 checks passed

pilotspacex-byte mentioned this pull request Jun 11, 2026

feat(method): wave-protocol-runtime — streams.md merge-time fork-base + worker commits its report #7

Merged

pilotspacex-byte merged 18 commits into main from feat/ground-phase


2 participants
