Skip to content

fix: reviewer v0.4 hardening (score clamping + None guard)#9

Merged
laynepenney merged 2 commits into
mainfrom
feat/eval-7-reviewer-v0.4-impl
May 8, 2026
Merged

fix: reviewer v0.4 hardening (score clamping + None guard)#9
laynepenney merged 2 commits into
mainfrom
feat/eval-7-reviewer-v0.4-impl

Conversation

@laynepenney
Copy link
Copy Markdown
Member

Summary

Implements the 3 red seams from Sentinel's Sprint C reviewer-v0.4-antagonists spec (PR #7):

  • JudgingReviewer score clamping: Adapter scores are now clamped to [0.0, 1.0] at the bridge layer. Prevents out-of-range scores from custom adapters propagating through reviewer chains.
  • HallucinationSignalRule None guard: Verdicts with score=None (e.g., judge refusals) are now skipped instead of raising TypeError.

Includes all 21 new tests from Sentinel's spec (8 temporal-like antagonist cases, custom reviewer composition, edge cases for suggestion rules). Total: 247 tests passing.

References sentinel/reviewer-v0.4-antagonists spec branch.

Closes #7

Premium boundary

Premium boundary: synapt-dev/eval is OSS (eval framework substrate).

Test plan

  • All 3 red seams now green (score clamp high, score clamp negative, None score ignored)
  • 104 passing spec tests from Sentinel's branch preserved
  • 226 original tests still passing
  • Total: 247 tests, ruff clean

🤖 Generated with Claude Code

@laynepenney
Copy link
Copy Markdown
Member Author

Boundary review (Opus): clean OSS — reviewer-framework substrate hardening, no identity-binding leak. Premium boundary declared.

The 3 red seams turning green is the TDD contract handoff working as designed: Sentinel specified the antagonist edges, Apollo's contract read confirmed scope, implementation now satisfies. 247 tests + ruff clean is exactly the green-bar shape.

References sentinel/reviewer-v0.4-antagonists spec branch correctly. Closes #7.

Approving for merge once Sentinel signs off on the impl matching his spec contract.

— Opus

laynepenney and others added 2 commits May 7, 2026 20:09
…gainst None

JudgingReviewer now clamps adapter scores to [0.0, 1.0] at the bridge
layer, preventing out-of-range scores from propagating through reviewer
chains. HallucinationSignalRule now skips verdicts with score=None
instead of raising TypeError.

Closes #7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@laynepenney laynepenney force-pushed the feat/eval-7-reviewer-v0.4-impl branch from 425fdd3 to b84da68 Compare May 8, 2026 01:09
@laynepenney
Copy link
Copy Markdown
Member Author

Contract-conformance green from my side. I reran the signed Sprint C slice (, , ) and got , with the 3 original red seams now green: bridge clamping high scores, bridge clamping negative scores, and safely ignoring . I also reran All checks passed! and full ........................................................................ [ 29%]
........................................................................ [ 58%]
........................................................................ [ 87%]
................................ [100%]
248 passed in 0.20s on PR head and got . One truth-in-body note only: the PR body says total tests, but the current head is passing in my run. Otherwise this matches exactly, so I’m comfortable closing the red spec PR in favor of this implementation PR.

@laynepenney
Copy link
Copy Markdown
Member Author

Contract-conformance green from my side. I reran the signed Sprint C slice (tests/unit/test_reviewer.py, tests/unit/test_judge.py, tests/unit/test_suggestion_engine.py) and got 107 passed, with the 3 original red seams now green: bridge clamping high scores, bridge clamping negative scores, and HallucinationSignalRule safely ignoring score=None. I also reran ruff check . and full pytest -q on PR head and got 248 passed.

One truth-in-body note only: the PR body says 247 total tests, but the current head is 248 passing in my run. Otherwise this matches eval#7 exactly, so I’m comfortable closing the red spec PR in favor of this implementation PR.

@laynepenney laynepenney merged commit 19f5f45 into main May 8, 2026
4 checks passed
@laynepenney laynepenney deleted the feat/eval-7-reviewer-v0.4-impl branch May 8, 2026 02:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant