handlers: StewardHandler returns SKIPPED when no merge actually happened#153
Open
Antawari wants to merge 2 commits into
Open
handlers: StewardHandler returns SKIPPED when no merge actually happened#153Antawari wants to merge 2 commits into
Antawari wants to merge 2 commits into
Conversation
Pre-fix, StewardHandler.handle returned TaskStatus.COMPLETED for every
non-error path — even when verdict was "reject", verdict was missing,
or pr_number couldn't be identified. The closer's contract is to seal
the work; reporting "completed" for cases where no merge happened
makes the pipeline lie about its actual outcomes. Downstream gates and
CLI summaries see COMPLETED → assume the PR landed → propagate the
false-positive.
## Fix
src/bonfire/handlers/steward.py:
- Compute final_status from the actual-merge precondition:
merge_actually_happened = verdict == "approve" and pr_number is not None
final_status = COMPLETED if merge_actually_happened else SKIPPED
- Reject verdict + pr_number → SKIPPED
- Empty verdict + pr_number → SKIPPED
- Approve verdict without pr_number → SKIPPED (nothing to merge)
- Approve + pr_number → COMPLETED (the only true-merge path)
- All other return-envelope behavior preserved: same metadata
(steward_verdict + steward_pr), same result string, same error path.
SKIPPED is the right semantics over FAILED — the closer evaluated its
preconditions and intentionally declined; it didn't crash.
## Tests
tests/unit/test_steward_handler_skipped_on_non_approve.py (new · Knight RED):
Five tests across two classes:
- TestStatusReflectsActualMerge (4):
* reject verdict → SKIPPED
* missing verdict → SKIPPED
* approve verdict without pr_number → SKIPPED
* approve + pr_number → COMPLETED (regression guard for the only
true-merge path, plus a sanity check that merge_pr was actually
invoked on the github_client mock)
- TestStatusPreservesEnvelopeStructure (1):
* SKIPPED path still attaches steward_verdict + steward_pr metadata
+ preserves the result string — only the status changes.
## Verification
pytest tests/unit/test_steward_handler_skipped_on_non_approve.py
5 passed (Knight RED → GREEN verified — 4 failed pre-fix, all
green post-fix; the 1 pre-fix-passing test is the COMPLETED
regression guard)
pytest tests/unit/test_steward_handler.py (existing 570-line suite)
30 passed (zero regressions — the existing reject-verdict tests
only asserted "merge_pr was not called", not the resulting
TaskStatus, so they're unchanged by this fix)
ruff check + format on changed files: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to the prior commit. ruff format was reported as needing a re-flow on steward.py after the new merge_actually_happened block; running it now closes the format check that CI would otherwise flag. No behavior change.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Pre-fix,
StewardHandler.handlereturnedTaskStatus.COMPLETEDfor every non-error path — even when verdict wasreject, verdict was missing, orpr_numbercouldn't be identified. The closer's contract is to seal the work; reporting "completed" for cases where no merge happened makes the pipeline lie about its actual outcomes. Downstream gates and CLI summaries seeCOMPLETED→ assume the PR landed → propagate the false-positive.Existing tests already verified the behavioral side of the cold path (verdict != "approve" does NOT call
merge_pron the github_client mock) — but never asserted the resultingTaskStatus. That's the gap this PR closes.Fix
src/bonfire/handlers/steward.py— three-line addition right before themodel_copy:verdict="approve"+ pr_number setverdict="reject"+ pr_number setverdict="approve"+ no pr_numberAll other return-envelope behavior preserved: same metadata (
steward_verdict+steward_pr), sameresultstring, same error path. Onlystatuschanges.SKIPPEDwas chosen overFAILEDper the ticket — the closer evaluated its preconditions and intentionally declined; it didn't crash.SKIPPEDconveys "intentionally did not run" exactly.Tests
tests/unit/test_steward_handler_skipped_on_non_approve.py(new · Knight RED) — 5 tests:statuschangesTDD verification
tests/unit/test_steward_handler.py(570-line suite) — 30/30 still green.ruff check+ruff format --checkon changed files: clean.Test plan
pytest tests/unit/test_steward_handler_skipped_on_non_approve.py— confirm 5/5 green.pytest tests/unit/test_steward_handler.py— confirm 30/30 still green.request_changesWizard verdict; confirm the closer stage now shows SKIPPED in CLI summaries rather than COMPLETED.🤖 Generated with Claude Code