Skip to content

fix(hooks): stop self-reporting benign BrokenPipeError to gateway (AI-GATEWAY-3J)#177

Open
anonpran wants to merge 1 commit into
stagingfrom
nanda/fix-hook-broken-pipe-ai-gateway-3j
Open

fix(hooks): stop self-reporting benign BrokenPipeError to gateway (AI-GATEWAY-3J)#177
anonpran wants to merge 1 commit into
stagingfrom
nanda/fix-hook-broken-pipe-ai-gateway-3j

Conversation

@anonpran

@anonpran anonpran commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes Sentry AI-GATEWAY-3JHookError:general: [Hook:claude-code] Exception in main: [Errno 32] Broken pipe (~2200 occurrences, ongoing). This is client-side-only; the gateway is correct (fail-open, always 200) and is untouched.

Root cause: the four coding-tool hooks end main() by writing their JSON response to stdout via print(..., flush=True). When the host (Claude Code / Cursor / Copilot / Codex) has already closed the read end of stdout — hook timeout, user cancel, session end, or (dominant) a foreground PreToolUse approval-poll blocking up to APPROVAL_TIMEOUT=4h — the flush raises Python BrokenPipeError. The broad except Exception then mistook this benign closed pipe for a crash and self-reported it via log_error -> report_error_to_gateway -> POST /v1/hooks/errors -> Sentry. A secondary bug: the fallback print in the except re-hit the same dead pipe and raised an unhandled second BrokenPipeError, so the hook exited noisily/non-zero instead of degrading gracefully.

Changes

  • Add an _emit(text) helper to each hook that writes to stdout but treats a closed reader pipe as a benign no-op (swallows BrokenPipeError/OSError, redirects stdout to os.devnull so the interpreter-shutdown flush can't re-raise).
  • Route every response write in main() through _emit().
  • Split main()'s catch-all so except BrokenPipeError: is intercepted first (benign — not reported) before except Exception: (genuine errors are still reported and redacted via log_error).
  • Cursor's WEB-4734 stderr redaction (redact_secrets(str(e), _cached_api_key)) is preserved and now guarded against a broken stderr pipe.
  • No new imports (os/sys already present in all four) — frozen-binary drift-guard unaffected. The four hooks remain intentionally standalone (per-file _emit duplication is deliberate for PyInstaller vendoring).

Files: claude-code/hooks/unbound.py, codex/hooks/unbound.py, copilot/hooks/unbound.py, cursor/unbound.py (+ a test_broken_pipe.py beside each).

Test plan

  • 12 new tests (3 per hook) pass via stdlib unittest:
    • (a) a BrokenPipeError on the stdout write is not forwarded to report_error_to_gateway/log_error
    • (b) a non-broken-pipe exception in main() is still reported ("Exception in main: boom", category general)
    • (c) _emit to a dead pipe does not raise and swaps stdout to devnull
  • Mutation-checked: reversing the except order fails the suite (assertions are load-bearing)
  • Real dead-pipe simulation exits 0 with no shutdown-flush traceback
  • Existing per-tool suites green (codex 36 passed, copilot 20 passed); two unrelated pre-existing failures (test_identity allow-list, binary frozen-session-start) reproduce on pristine HEAD
  • All four files py_compile/ast.parse clean

Follow-ups (out of scope here)

  • Clamp the 4h foreground poll_approval_status on the PreToolUse path — the dominant trigger of the closed pipe (own enforcement blast radius; deferred).
  • Add a CI job to execute these unittest files — static-checks.yml currently runs only py_compile + pyflakes, so this behavior isn't CI-enforced.

🤖 Generated with Claude Code


Note

Low Risk
Client-side hook I/O and error-handling only; genuine exceptions still report, with no gateway or policy behavior changes.

Overview
Stops AI-GATEWAY-3J noise by treating a closed stdout pipe as normal when the IDE hook host ends the session or times out before the hook flushes its JSON response.

Each of the four standalone unbound.py hooks (Claude Code, Codex, Copilot, Cursor) adds _emit, routes all hook responses through it instead of print(..., flush=True), and handles BrokenPipeError in main() before the generic except Exception so those cases are not sent to log_error / the gateway. _emit swallows write/flush failures and redirects stdout to os.devnull so shutdown does not raise again; Cursor also guards stderr redaction prints the same way.

Adds test_broken_pipe.py beside each hook (12 tests total) covering benign pipe behavior, real errors still reported, and dead-pipe _emit behavior.

Reviewed by Cursor Bugbot for commit 4c1c7ab. Bugbot is set up for automated code reviews on this repo. Configure here.

Greptile Summary

This PR fixes AI-GATEWAY-3J by stopping all four coding-tool hooks from self-reporting benign BrokenPipeErrors that occur when the host closes the read end of stdout before the hook can flush its JSON response. The fix introduces an _emit helper that silently redirects stdout to /dev/null on a closed pipe, and splits main()'s outer exception handler so BrokenPipeError is caught first (discarded) and genuine errors remain reported via log_error.

  • Adds _emit(text) to all four hooks (claude-code, codex, copilot, cursor) with internal broken-pipe suppression and stdout-to-devnull swap, plus an outer except BrokenPipeError guard in main() as a second layer of defence.
  • Adds 12 new unit tests (3 per hook) verifying: broken pipe not forwarded to gateway, genuine exceptions still reported, and _emit itself surviving a dead pipe without raising.
  • Cursor's stderr redaction path is now correctly guarded against a broken stderr pipe, and the intentionally standalone per-file structure is preserved for PyInstaller vendoring compatibility.

Confidence Score: 4/5

Safe to merge — the core fix is structurally sound and well-tested, with the broken-pipe path correctly isolated from genuine error reporting in all four hooks.

The _emit implementation correctly suppresses the noisy broken-pipe reports and the exception ordering change ensures real errors are still forwarded to the gateway. The two gaps are: _emit catches bare OSError (a superset of BrokenPipeError), silently swallowing non-pipe I/O errors on stdout with no logging; and the new silent path has no observability — after deploy, the team can only confirm the fix worked by watching Sentry go quiet rather than monitoring a direct signal. Neither blocks correctness of the fix itself.

All four unbound.py files share the same _emit pattern — any change to the OSError catch scope should be applied consistently across all of them.

Important Files Changed

Filename Overview
claude-code/hooks/unbound.py Adds _emit helper and splits except BrokenPipeError before except Exception; core fix is correct, minor concern that OSError catch in _emit is broader than intended and the broken-pipe path has no observability
codex/hooks/unbound.py Same _emit pattern and exception ordering as claude-code hook; same broad OSError catch and silent broken-pipe path apply
copilot/hooks/unbound.py Same _emit pattern applied; copilot previously emitted {} not {"suppressOutput": true} on broken pipe — the outer BrokenPipeError handler now silently exits instead, consistent with the other hooks
cursor/unbound.py Adds _emit + broken-pipe handler; cursor-specific stderr redaction is now correctly guarded by its own try/except; handle_deny_and_exit() and sys.exit(2) still execute after a silent _emit failure, which is acceptable since the host is gone
claude-code/hooks/test_broken_pipe.py Three targeted tests: broken pipe not reported, real exception still reported, _emit swaps stdout to devnull on dead pipe; tearDown correctly restores sys.stdout
codex/hooks/test_broken_pipe.py Mirrors claude-code tests; omits get_api_key patch correctly because codex reads via os.getenv() directly (no call to a get_api_key function before the guarded try block)
copilot/hooks/test_broken_pipe.py Identical to claude-code test suite; all three test cases present and cover the right invariants
cursor/test_broken_pipe.py Adds cursor-specific patch.object(unbound.sys, "stderr", io.StringIO()) guard in the real-exception test, correctly covering cursor's stderr redaction path

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[main called] --> B[get api key]
    B --> C[stdin read]
    C --> D{input empty?}
    D -- yes --> E[_emit suppressOutput]
    D -- no --> F[json.loads event]
    F --> G{event type}
    G --> H[process event]
    H --> I[_emit response]
    E --> Z{BrokenPipeError in _emit?}
    I --> Z
    Z -- yes --> J[stdout = devnull, return silently - no report]
    Z -- no --> K[return normally]
    H --> Y{exception in main body?}
    Y -- BrokenPipeError --> L[outer BrokenPipeError handler - stdout devnull - no report]
    Y -- other Exception --> M[outer Exception handler - log_error then gateway - _emit suppressOutput]
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[main called] --> B[get api key]
    B --> C[stdin read]
    C --> D{input empty?}
    D -- yes --> E[_emit suppressOutput]
    D -- no --> F[json.loads event]
    F --> G{event type}
    G --> H[process event]
    H --> I[_emit response]
    E --> Z{BrokenPipeError in _emit?}
    I --> Z
    Z -- yes --> J[stdout = devnull, return silently - no report]
    Z -- no --> K[return normally]
    H --> Y{exception in main body?}
    Y -- BrokenPipeError --> L[outer BrokenPipeError handler - stdout devnull - no report]
    Y -- other Exception --> M[outer Exception handler - log_error then gateway - _emit suppressOutput]
Loading

Reviews (1): Last reviewed commit: "fix(hooks): stop self-reporting benign B..." | Re-trigger Greptile

Greptile also left 2 inline comments on this PR.

…-GATEWAY-3J)

The Claude Code / Codex / Copilot / Cursor hook scripts end main() by writing
their JSON response to stdout via print(..., flush=True). When the host has
already closed the read end of stdout (hook timeout, user cancel, session end,
or a foreground PreToolUse approval-poll blocking up to 4h), the flush raises
BrokenPipeError [Errno 32]. The broad `except Exception` caught it and
self-reported it via log_error -> report_error_to_gateway -> /v1/hooks/errors
-> Sentry, producing ~2200 noise events collapsed under one fingerprint. A
secondary bug: the fallback print in the except re-hit the dead pipe and raised
an UNHANDLED second BrokenPipeError (noisy non-zero exit).

Fix (client-side only; the gateway is correct/fail-open and is untouched):
- Add an _emit() helper that writes to stdout but treats a closed reader pipe
  as a benign no-op (swallows BrokenPipeError/OSError and redirects stdout to
  os.devnull so the interpreter-shutdown flush cannot re-raise).
- Route every response write in main() through _emit().
- Split main()'s catch-all so `except BrokenPipeError` is intercepted first
  (benign, not reported) before `except Exception` (genuine errors are still
  reported and redacted via log_error). Cursor's WEB-4734 stderr redaction is
  preserved and guarded.

Adds test_broken_pipe.py to all four hooks (12 tests): a broken pipe is not
reported, real exceptions are still reported, _emit on a dead pipe does not raise.

Follow-ups (out of scope): clamp the 4h foreground poll_approval_status that is
the dominant trigger; add a CI job to execute these unittest files
(static-checks currently runs only py_compile + pyflakes).

Fixes AI-GATEWAY-3J

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@anonpran anonpran requested a review from a team June 23, 2026 11:12
Comment on lines +155 to +163
a benign no-op. The host may close the read end (timeout, cancel, session
end, blocked approval-poll) before we flush — that is not a hook error."""
try:
sys.stdout.write(text + "\n")
sys.stdout.flush()
except (BrokenPipeError, OSError):
try:
sys.stdout = open(os.devnull, "w")
except Exception:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Silent OSError swallow is broader than intended

_emit catches (BrokenPipeError, OSError), but BrokenPipeError is already a subclass of OSError, so the tuple is redundant and the bare OSError catch covers every OS-level I/O error on stdout — not just closed-pipe scenarios. A non-pipe OSError (e.g., ENOSPC on a redirected stdout, a permission error on a named pipe used for IPC) would be silently swallowed, stdout swapped to /dev/null, and no error reported anywhere. The same pattern is duplicated in codex/hooks/unbound.py, copilot/hooks/unbound.py, and cursor/unbound.py.

Comment on lines +1737 to +1743
except BrokenPipeError:
# Host closed the read end of our stdout pipe (timeout / cancel /
# session end / blocked approval-poll). Benign — do not self-report.
try:
sys.stdout = open(os.devnull, "w")
except Exception:
pass

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Broken-pipe path is completely dark — no way to confirm the fix is working

The except BrokenPipeError handler redirects stdout to /dev/null and exits silently. Nothing is written to stderr, no counter is incremented, and no gateway call is made. Once deployed, the team can only verify the fix by watching AI-GATEWAY-3J go quiet in Sentry. If broken-pipe events spike again (e.g., a new 4h foreground poll timeout trigger), or if the fix has a regression in an edge case, there is no signal to detect it. Even a single print("broken pipe suppressed", file=sys.stderr) (or a non-Sentry lightweight counter via the existing gateway metrics path) would allow the team to confirm frequency post-deploy. This same gap exists in all four hooks.

@vigneshsubbiah16 vigneshsubbiah16 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛡️ Automated Security Review (consensus)

0 findings — 0 high-confidence, 0 to triage. Reviewers: Cursor, Claude, Semgrep, Gitleaks.

✅ Security consensus: no issues found. (reviewers: Cursor, Claude, Semgrep, Gitleaks)

Previously acknowledged (not re-flagged)

  • Broad OSError catch in _emit — PR intentionally swallows BrokenPipeError/OSError on stdout and redirects to os.devnull; Greptile P2 robustness concern, not a security defect.
  • Silent broken-pipe path (no observability) — Accepted trade-off for this fix; post-deploy validation via Sentry quieting; explicit follow-up deferred out of scope.
  • Semgrep insecure-file-permissions (chmod 0o755 / $BITS) — Pre-existing code on unchanged lines; not introduced by this diff.
  • Semgrep sqlalchemy-execute-raw-query (cursor/unbound.py:510) — Pre-existing local SQLite usage; unchanged by this PR.

🤖 consensus review · reviewers: Cursor,Claude,Semgrep,Gitleaks · head 4c1c7abd · 2026-06-23T11:24Z

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants