feat: independent context-isolated verifier auto-gate (Generator–Verifier split) #4
Open
ewalliss wants to merge 3 commits into
Why
Under `autonomy: auto`, the run's "adversarial verify" and "completeness-critic" steps execute in the same agent and the same context that wrote the code (`run.md`). In-context self-critique is unreliable: a model favours its own output (self-enhancement bias) and repeats its own errors, so the in-context skeptic tends to approve its own work uncritically (Zheng et al. 2023, LLM-as-judge; Huang et al. 2023, "LLMs Cannot Self-Correct Reasoning Yet"). This quietly reintroduces "one agent finds it plausible", the exact thing ADD sets out to eliminate.

What
Apply the Generator–Verifier (Generator–Critic) pattern for real, with the engine enforcing structure and the deterministic decision, and the prompts driving the independent verification:
- `skill/add/verify-critic.md`: the run spawns a fresh subagent per lens (wiring · concurrency-security · contract-conformance). Each sees only the §3 contract, the §4 tests, the diff and CONVENTIONS, never the build transcript, and must try to refute the build.
- `_validate_verdict`: a verdict is data (lens/verdict/evidence), validated before it is written, so a shallow "looks good" cannot become an auto-PASS.
- `_consensus`: any security finding or refutation → `HARD-STOP`; residue → `ESCALATE`; auto-`PASS` only with agreement across ≥3 distinct lenses.
- `add.py eval`: scores the decision logic against labeled fixtures so the gate's own judgment is measurable; full `recall` means no seeded-bad build slips through. TDD's red/green applied to the AI's judgment, not just the code.

New CLI:
`add.py verdict` (append-only per-task ledger, fail-closed) · `add.py consensus` (read-only PASS/HARD-STOP/ESCALATE) · `add.py eval`. Wired into `run.md`'s auto-gate and `phases/6-verify.md`. The engine stays LLM-free; `cmd_gate` is untouched (non-breaking).

Tests
Built test-first (`tooling/test_independent_verify.py`, 19 cases). Full suite (721) green on CI Python (3.10/3.12); all repo invariants honored: three byte-identical `add.py` copies plus re-pinned `ENGINE_MD5`, skill-tree parity across canonical/_bundled/.claude, regenerated bundle, subcommand-coverage census, wording rubric.
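
For reviewers who want the decision rule from the What section in one place, here is a minimal, illustrative sketch. It is not the code in `add.py`. The function names, the verdict vocabulary (`pass`/`refute`/`unsure`), the `security_finding` field, the 20-character evidence threshold and the fixture shape are all assumptions made for the example. "Residue" is read here as anything that is neither a clear stop nor a full pass.

```python
from typing import Iterable

# Lens names come from the PR description; everything else is assumed.
LENSES = {"wiring", "concurrency-security", "contract-conformance"}
VERDICTS = {"pass", "refute", "unsure"}  # hypothetical vocabulary
MIN_EVIDENCE_CHARS = 20                  # hypothetical threshold


def validate_verdict(v: dict) -> dict:
    """Reject malformed or evidence-free verdicts before they are recorded."""
    if v.get("lens") not in LENSES:
        raise ValueError(f"unknown lens: {v.get('lens')!r}")
    if v.get("verdict") not in VERDICTS:
        raise ValueError(f"unknown verdict: {v.get('verdict')!r}")
    evidence = v.get("evidence")
    if not isinstance(evidence, str) or len(evidence.strip()) < MIN_EVIDENCE_CHARS:
        raise ValueError("evidence must be a substantive string")
    return v


def consensus(verdicts: Iterable[dict], min_lenses: int = 3) -> str:
    """Deterministic decision: HARD-STOP, ESCALATE or PASS."""
    checked = [validate_verdict(v) for v in verdicts]
    # Any refutation or security finding stops the run outright.
    if any(v["verdict"] == "refute" or v.get("security_finding") for v in checked):
        return "HARD-STOP"
    passing_lenses = {v["lens"] for v in checked if v["verdict"] == "pass"}
    all_pass = all(v["verdict"] == "pass" for v in checked)
    # Auto-PASS needs unanimous agreement across enough distinct lenses.
    if all_pass and len(passing_lenses) >= min_lenses:
        return "PASS"
    # Everything else (too few lenses, any "unsure") goes to a human.
    return "ESCALATE"


def recall(fixtures: Iterable[dict]) -> float:
    """Fraction of seeded-bad fixtures that the gate did not auto-PASS."""
    bad = [f for f in fixtures if f["label"] == "bad"]
    if not bad:
        return 1.0
    caught = sum(1 for f in bad if consensus(f["verdicts"]) != "PASS")
    return caught / len(bad)
```

In this sketch a verdict whose evidence is just "looks good" fails validation, two passing lenses out of three produce `ESCALATE` rather than `PASS`, and a recall of 1.0 over the fixtures means no seeded-bad build was auto-passed.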