Skip to content

docs(spec): in-app feedback via github-proxy worker#404

Open
intendednull wants to merge 27 commits intomainfrom
claude/add-feedback-system-XP7YR
Open

docs(spec): in-app feedback via github-proxy worker#404
intendednull wants to merge 27 commits intomainfrom
claude/add-feedback-system-XP7YR

Conversation

@intendednull
Copy link
Copy Markdown
Owner

Adds the design for a willow-feedback worker that proxies in-app
feedback to GitHub issues, plus a settings-page UI in willow-web.
Reuses the existing worker request/response gossip pathway rather
than introducing a new ALPN for v1.

claude added 27 commits April 27, 2026 08:35
Adds the design for a willow-feedback worker that proxies in-app
feedback to GitHub issues, plus a settings-page UI in willow-web.
Reuses the existing worker request/response gossip pathway rather
than introducing a new ALPN for v1.
Round-1 parallel review (architecture / wire / security / scope) flagged
~10 critical and ~25 notable issues. Major changes:

- Make WorkerRole::handle_request async (was sync, blocked HTTP from
  inside the role); document migration impact for replay/storage.
- Add forward-compat note: v1 workers drop the whole envelope on
  unknown variants; mark the affected enums #[non_exhaustive] and
  document why PROTOCOL_VERSION is not bumped.
- Align FeedbackErrReason with WireRejectReason units (ms not secs);
  add idempotency dedup_id, structured GithubFailure { message },
  typed ClientPlatform enum, FeedbackCategory::Other { detail },
  locale, currently_rate_limited gauge.
- Drop EndpointId from public Client::submit_feedback API; resolve
  worker peer ID from client config; return parsed url::Url.
- Spec __WILLOW_FEEDBACK_PEER_ID window global wiring through
  init.js plus .dev/feedback-peer-id dev plumbing through dev.sh.
- Mandatory fenced-code-block sanitization of user body (defeats
  @mentions, autolinks, image exfil, metadata-block spoofing).
- Add global 50/hour rate limit (per-peer alone is bypassed by free
  identity rotation); 401 transitions to Unconfigured permanently;
  403 secondary-rate-limit cooldown; restart-throttle.
- Salted-hash reporter handle (rotateable salt for incident
  response); no raw peer ID in issue body; coarse-grained UA only.
- secrecy::SecretString for PAT; no Debug derive; logging never
  contains PAT/salt/title/body.
- Per-request structured logs; configured-but-unreachable UI states
  with explicit copy table and GitHub-direct fallback link.
- Fix Docker paths (docker/feedback.Dockerfile, not crates/...);
  --print-peer-id / --generate-identity flags; depends_on: relay;
  GITHUB_TOKEN via compose .env.
- Honest content-moderation trade-off; expanded follow-ups list.
Round-2 parallel review surfaced several concrete implementation gaps
that round-1's broad rewrite glossed over, plus tightening on
sanitization and abuse-protection edge cases.

Critical fixes:
- Replace hand-wavy trunk static-file serving with concrete approach:
  copy crates/web/dev_assets/feedback-peer-id.txt into dist/ via a
  data-trunk copy-dir directive in index.html.
- Add docker/web-entrypoint.sh that substitutes WILLOW_RELAY_URL and
  WILLOW_FEEDBACK_PEER_ID into the served init.js at container start.
  Replaces today's stock-nginx web Dockerfile (which had no entrypoint
  and required build-time relay URL injection).
- Tighten body-fence sanitization to the precise CommonMark close-fence
  rule: scan with regex ^[ ]{0,3}`{N,}[ \t]*$, normalize CRLF, choose
  fence length max(3, N_max + 1). Adds adversarial test cases (tilde
  fences, indented fences, info-string tricks, HTML entities).
- Restate body cap as <= 65,000 chars (not 60 KB bytes; GitHub counts
  codepoints).
- Bump reporter handle hash from 6 bytes to 8 bytes; pin BIP-39 English
  wordlist (vendored) with 4 words from 44 bits + 5-hex suffix from
  remaining 20 bits; add salt-rotation runbook.
- Restart-throttle uses O_CREAT|O_EXCL atomic check + fixed file path
  (<dir>/.feedback-last-boot inside the named volume); pair with
  compose `restart: on-failure:5` so wedged workers don't flap.
- Add --generate-salt CLI flag (referenced in salt section but missing
  from flag table).

Notable:
- Length caps on FeedbackError::BadIssueUrl (512 chars) and
  FeedbackErrReason::InvalidInput { field: 64, message: 200 } to bound
  worker-supplied error formatting.
- GitHub-direct fallback URL hardcoded to https://github.com/ with
  owner/repo regex validation at both worker startup AND web side
  (prevents javascript:/data: injection via mis-set FEEDBACK_REPO).
- User-facing transport-privacy notice always rendered below Submit:
  "Your report is visible to Willow infrastructure peers in transit
  until E2EE ships."
- Diagnostics non-mutation invariant + test: worker MUST post
  diagnostic fields byte-equal to those received; no server-side
  enrichment.
- Cross-signer cache poisoning test: two distinct signers with same
  dedup_id get distinct issue URLs.
- Logging-redaction test parameterized over every reachable code path
  (each FeedbackErrReason variant, cooldown, 401 transition,
  idempotency hit, panic path).
- on_event no-op note clarifying async back-pressure is safe for
  FeedbackRole specifically.
- Cite both state.rs:113 and heartbeat.rs:124 for trait migration
  (round-1 listed only one).
- EndpointId::Display reference for bech32 peer ID format.
- .env gitignore + PAT revocation runbook + blast-radius enumeration.
- Exact justfile aggregate edit for test-workers.
- Forward-compat: legacy clients dropping FeedbackOk responses is also
  benign (mirror case).
- Update reporter-handle examples to match the 5-hex suffix the
  bit-arithmetic spec calls for (previously showed 4-hex 3a9c which
  contradicted the 44-bit + 20-bit split).
- Remove duplicate "Reused gossip request path" trade-off paragraph
  (residue of round-2 editing); keep the better-positioned version
  that follows Forward compatibility.
- Replace utime() portability hand-wave with concrete advice:
  tempfile + rename, or filetime::set_file_mtime.
Fresh-eyes round-3 reviewer found one critical issue and a handful of
notable gaps. Addressed:

- Critical: WorkerRole::handle_request had no signer parameter, but
  the spec said the role recovers signer from unpack_wire. Trait
  signature now explicitly takes `signer: EndpointId`; actor pulls
  it from the (WireMessage, EndpointId) tuple end-to-end. Replay
  and storage roles ignore the parameter.
- Pin async-trait crate (was: "or hand-rolled Pin<Box<dyn Future>>").
- Worst-case title math was wrong: [Other:<60>] is 71-char prefix,
  not ~50, so cap goes from 250 to 280 with truncation-with-... if
  GitHub's 256 ceiling is exceeded.
- Worker HTTP timeout (20s) explicitly bounded below client's 30s
  total, with honest documentation of the rare slow-then-retry
  duplicate-issue race and a follow-up to dedup via GitHub search.
- Pin token bucket refill semantics: continuous refill (5/3600s for
  per-peer, 50/3600s global), retry_after_ms is the precise time to
  next available token.
- Spell out crates/web/dev_assets/ contents (.gitignore content,
  .gitkeep, exact index.html insertion point near the existing
  data-trunk copy-file directive).
- Verify trunk does not asset-hash copy-file outputs, so hardcoded
  /usr/share/nginx/html/init.js path in entrypoint is safe; CI
  smoke test for placeholders.
- Cache-busting via __INJECT_BUILD_TAG__ placeholder in
  <script src="/init.js?v=...">, third entrypoint substitution.
- Note willow-agent does not impl WorkerRole, so trait migration
  doesn't touch it.
- GitHub API JSON fixtures live in
  crates/feedback/tests/fixtures/github/, captured from official
  docs / one-time real curl.
Extends WorkerRoleInfo, WorkerRequest, and WorkerResponse with Feedback
variants; adds FeedbackErrReason enum; marks all three enums
#[non_exhaustive]. Adds wildcard arms to downstream match sites in
willow-replay, willow-storage, and willow-worker (test double + integration
test) that were made non-exhaustive by the attribute.

https://claude.ai/code/session_01F3RA1a1rcNxM83ZsQPnTZX
SyncBatch and Snapshot previously fell through to `_ => false`,
making any `assert_eq!` self-comparison silently return false.
Adds explicit arms for both: SyncBatch compares by event content
hashes (the canonical identity of an Event); Snapshot compares
by its documented canonical SHA-256 hash plus post-snapshot event
hashes. Adds regression tests for both variants.

https://claude.ai/code/session_01F3RA1a1rcNxM83ZsQPnTZX
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants