Skip to content

fix(firewall): close redaction-leak egress paths and add canary suite (#149, #150, #151, #172, #206)#229

Open
dgenio wants to merge 1 commit into
mainfrom
claude/github-issues-triage-npnfcs
Open

fix(firewall): close redaction-leak egress paths and add canary suite (#149, #150, #151, #172, #206)#229
dgenio wants to merge 1 commit into
mainfrom
claude/github-issues-triage-npnfcs

Conversation

@dgenio

@dgenio dgenio commented Jun 14, 2026

Copy link
Copy Markdown
Owner

What changed

Closes the four context-firewall redaction-leak paths an internal audit identified, and adds the regression net that locks them. One coherent change set, all in the firewall/trace egress area.

Why

The firewall is the I-01 boundary (docs/agent-context/invariants.md): raw tool output must never reach the LLM unredacted. Redaction was only guaranteed on the first transform() summary/table path; expansion, streaming, the depth boundary, and the trace store were independent egress paths that could leak. These four issues were filed together (#206 explicitly frames #149/#150/#151 as the three audit-found leaks), so they share a code area, a bug category, and an implementation path — and are cleanest fixed and regression-tested as one unit.

How verified

All commands run in a clean venv (pip install -e ".[dev]"), mirroring make ci (fmt-check → lint → type → test → example):

  • ruff format --check src/ tests/ examples/ — 96 files already formatted
  • ruff check src/ tests/ examples/ — All checks passed
  • mypy src/ — Success: no issues found in 52 source files
  • pytest -q660 passed, 1 skipped (incl. 11 new canary tests)
  • All 11 examples/*.py (the make ci example target) — run clean

Each canary test fails on main (the leak paths) and passes here; the depth/expand/stream/trace cases were confirmed against the pre-fix behavior.

Tradeoffs / risks

  • Streaming holdback delays the trailing overlap window of a string field until the next chunk or end-of-stream — a deliberate safety-for-latency trade. Documented limit: a pattern containing internal whitespace (phone/SSN/spaced card) split exactly at the held boundary may still evade detection (noted in docs/security.md); contiguous secrets (JWT/Bearer/API-key/connection-string body) are never split across a commit boundary.
  • Depth boundary now elides nested containers rather than returning them — a behavior change (over-redaction at depth, fail-closed). Acceptable for a security boundary; no existing test relied on the old fail-open behavior.
  • Extend trace-argument redaction beyond memory.* capabilities #172 runs redact() over all trace args; benign scalars/values are returned unchanged, so audit value is preserved.

Scope notes

Limited to the recommended issue group from the triage (#149, #150, #151, #172, #206). Adjacent firewall items intentionally left as follow-ups: #207 (avoid full-payload serialization in the size hot path — same file, but a perf concern), #174 (summarizer truncation/boolean handling), #178 (operator-configurable redaction patterns), #214 (spotlighting/untrusted-output delimiting).

https://claude.ai/code/session_01Gq2ooVRbX8rxi5d3dvREN6


Generated by Claude Code

Route every Frame/trace egress through the firewall redactor so secrets and
PII cannot escape the I-01 boundary via a non-summary path:

- #149: redact() fails closed at max_depth — scrub leaf strings, elide nested
  containers instead of returning them verbatim.
- #150: HandleStore.expand() redacts projected rows (inline secrets in
  grant-permitted fields are scrubbed); expansion Frames carry warnings.
- #151: Firewall.apply_stream() holds back a per-field overlap window via a new
  StreamRedactor so a secret split across chunks is reassembled before emission.
- #172: ActionTrace.args (all capabilities) and driver error text pass through
  the redactor before persistence; memory.* payload stripping unchanged.
- #206: new tests/test_firewall_canary.py asserts planted canaries never appear
  in any egress (summary/table/raw Frames, expansion, streaming, trace
  args/errors, adapter payloads).

Docs (security.md, context_firewall.md) and CHANGELOG updated, including the
honest cross-chunk whitespace-pattern limit.

https://claude.ai/code/session_01Gq2ooVRbX8rxi5d3dvREN6
Copilot AI review requested due to automatic review settings June 14, 2026 14:21

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the context firewall’s redaction guarantees by closing multiple previously-independent egress paths (depth boundary, handle expansion, streaming chunk boundaries, and trace persistence) and adding a canary-based regression suite to enforce the global “no secret escapes” invariant.

Changes:

  • Make redact() fail closed at max_depth, eliding nested containers while still scrubbing leaf strings.
  • Add cross-chunk streaming redaction via a stateful StreamRedactor carried per top-level field in Firewall.apply_stream().
  • Redact sensitive content in handle expansion and in persisted traces (args + error text), and add a comprehensive secret-canary test suite plus doc/changelog updates.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
tests/test_firewall_canary.py Adds an end-to-end “canary never appears in any egress” regression suite covering depth, expansion, streaming, traces, and adapter payload rendering.
tests/test_firewall_boundary.py Updates the trace-args test docstring to reflect that args now pass through redaction for all capabilities.
src/weaver_kernel/kernel/_invoke.py Applies redaction to all stored trace args and scrubs driver error text before persisting failure traces.
src/weaver_kernel/handles.py Routes handle expansion previews through redact() and surfaces redaction warnings on the returned Frame.
src/weaver_kernel/firewall/transform.py Adds per-field streaming redaction state (StreamRedactor) and merges stream redaction warnings into emitted Frames.
src/weaver_kernel/firewall/redaction.py Refactors string redaction into _redact_string(), changes max-depth behavior to fail closed, and introduces StreamRedactor.
docs/security.md Documents the newly-covered egress paths and the depth/streaming redaction behaviors and limitations.
docs/context_firewall.md Describes redaction coverage across depth, expansion, streaming, and trace persistence.
CHANGELOG.md Records the security fixes and the added canary regression suite.

Comment thread src/weaver_kernel/firewall/redaction.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants