fix(node): gate the ref-updates feeds on read visibility (#112, #114)#143
fix(node): gate the ref-updates feeds on read visibility (#112, #114)#143beardthelion wants to merge 10 commits into
Conversation
,#114) Add ref_update_row_visible: given the deduped local repo set + their visibility rules, decide whether a received_ref_updates row (keyed by its peer-supplied slug) is visible to the caller. Matches slug->repo by name plus prefix-tolerant owner key, and drops any row matching a local repo the caller cannot read at root; rows matching no local repo (remote/ gossip-only) and malformed slugs pass through. Fail-closed by construction (no default-keep). Pure/I-O-free so visibility.rs stays exhaustively unit-tested. Shared gate for the GraphQL (#112) and REST (#114) ref-updates feeds; call sites land in the following commits. Includes a DB-layer test pinning that a private canonical row beats a public mirror row in list_all_repos_deduped (else the feed gate would over-serve).
The GraphQL ref_updates resolver mapped every received_ref_updates row straight out with no visibility gate, so an anonymous caller could read a private repo's ref metadata (refName, SHAs, pusherDid, nodeDid) by name. Thread the caller DID and retain rows through the shared ref_update_row_visible filter (deduped local set + rules, same '/' decision the repos resolver uses); applies to both the repo:Some and repo:None branches. Remote/gossip-only rows still pass. Removes the now-live filter's dead_code allow. Tested via the GraphQL schema: anon leak reproduced before the gate (RED) and closed after (GREEN); owner still sees their private row; full-DID and truncated-key slug aliases fail closed; remote rows kept.
GET /api/v1/events/ref-updates returned every received_ref_updates row across all repos, including private ones, to any anonymous caller (the REST analog of #112). Thread the caller DID via optional_signature and retain rows through the shared ref_update_row_visible filter, deriving count from the filtered set. Keeps the crate::error::Result return type; the deduped/rules loads fail closed (500) on a DB error rather than serving unfiltered. Remote/gossip-only rows still pass. Tested via the crate's HTTP-API harness: anon leak reproduced before the gate (RED) and closed after (GREEN); owner still sees their private row; mixed feed yields only public rows with a filtered count; full-DID and truncated-key slug aliases fail closed; remote rows kept.
…unce (#112,#114) Code review surfaced a third reader of the same ref metadata the query and REST feeds gate: the GraphQL subscription. /graphql/ws is mounted after the optional_signature layer (no caller to gate against), and the push handler broadcast every ref update to it unconditionally — outside the if-announce block that already gates the sibling gossip and Arweave sends. So an anonymous websocket subscriber received live ref metadata (repo, refs, SHAs, pusher/node DID) for private repos as they were pushed — the subscription analog of the #112/#114 leak. Move the ref_update_tx broadcast inside the existing if-announce guard so only publicly-readable pushes reach the unauthenticated subscription, matching gossip and Arweave. Pin the gate decision with a replication_withheld_set test (announce false for private, true for public). Also harden the feed filter's owner-key normalization: normalize the slug owner to the last ':'-segment (same rule as record_key) instead of stripping only did:key:, so a multi-segment DID (did:web) and its full-DID slug match and fail closed rather than over-serving. Regression test added (RED under the old normalization).
…dd did:web keep test Code review flagged that the unauthenticated /graphql/ws ref_updates subscription now depends on a single-point invariant: safety rests entirely on every sender to ref_update_tx being announce-gated, since the resolver has no caller to gate against. Document that on the resolver so a future sender can't silently reopen the leak. Add the symmetric keep-side test for the did:web owner-key normalization (public multi-segment-DID repo row is kept for anon), so a regression that over-drops legitimate did:web rows is caught alongside the existing drop-side test.
📝 WalkthroughWalkthroughAdds fail-closed visibility gating for ref-update feeds in REST and GraphQL, pages feed reads through new DB helpers, and gates ref-update subscription broadcasts on ChangesRef-update visibility and announce gating
Estimated code review effort: 4 (Complex) | ~60 minutes Possibly related issues
Possibly related PRs
Suggested labels: Suggested reviewers: 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
jatmn
left a comment
There was a problem hiding this comment.
I found issues that need to be addressed before this is ready.
Findings
-
[P1] Preserve full non-
did:keyowners when matching ref-update rows
crates/gitlawb-node/src/visibility.rs:171
ref_update_row_visiblereduces both the row owner and every repo owner to the last:segment before doing the prefix match. That contradicts the existing owner-key rules indid_matchesandDEDUP_CTE, which strip only baredid:key:owners and intentionally keep non-key DID methods whole sodid:web:alice/widgetand another method/host ending inaliceremain different owners. With the current normalization, a private local repo such asdid:web:host:alice/widgetmakes an unrelated remote row likedid:gitlawb:alice/widgetordid:web:other:alice/widgetlook like a local match and get dropped for anonymous callers. Please reuse the same did:key-aware owner normalization as the repo dedupe/owner matching code, and add a regression proving same-name non-key DID owners with the same trailing segment do not collide. -
[P2] Apply the visibility filter before limiting the ref-update feeds
crates/gitlawb-node/src/api/events.rs:24
The REST feed loads onlylimitrows and then drops private rows in memory, and the GraphQL resolver does the same throughlist_ref_updates_filtered(repo.as_deref(), limit)before retaining visible rows. If the newest rows are private, an anonymous request such aslimit=5can return zero or fewer-than-five events even when older public ref updates exist, so the activity feed no longer returns the latest visible updates. Please either push the visibility predicate before the SQL limit or continue paging/over-fetching until the requested number of visible rows has been collected, and add a small-limit mixed public/private regression for both REST and GraphQL.
#112,#114) The comment claimed parity with did_matches/DEDUP_CTE, but the last-segment normalization deliberately diverges: the emitter (publish_ref_update) broadcasts the short, method-stripped slug, so a repo's own rows arrive keyed by the trailing segment and the gate must match on it to stay fail-closed. Reusing the did:key-aware keep-whole rule would fail open on a did:web repo's short slug. Add feed_private_didweb_short_slug_dropped_for_anon to pin that load-bearing canonical case; relabel the full-DID test as the non-canonical hand-forged form.
…114) The REST global feed and the GraphQL ref_updates resolver loaded `limit` rows and dropped private ones in memory, so an anonymous request could get fewer than `limit` events (or zero) when the newest rows named private local repos, even though older public ref updates existed. Add a shared collect_visible_ref_updates helper that applies the fail-closed gate before the limit: it over-fetches in bounded pages (128 rows, capped at max(limit, 2048) scanned) until `limit` visible rows are collected or the table is exhausted. Both feeds route through it so the one gate cannot drift. Back it with list_ref_updates_page (stable timestamp,id ordering for offset paging) and drop the now-unused list_ref_updates / list_ref_updates_filtered. The page size is a parameter on an inner fn so a regression can exercise multi-page paging. Regressions: newest-private, older-public, small limit still returns the latest visible rows on both surfaces (RED before, GREEN after); and a page=2 test pins offset paging across page boundaries (no skip or duplicate).
|
Thanks for the review. Both findings addressed in the two commits above. P1 (last-segment owner normalization). The over-drop you describe is real, but the suggested remedy would reopen the leak this PR closes, so I kept the normalization and fixed the comment that misdescribed it (that comment is what your finding correctly caught). The constraint is the emit side. Given that, reusing the did:key-aware keep-whole normalization fails open on the canonical case: a private So the divergence from Commit corrects the normalization comment (it wrongly claimed P2 (filter before the limit). Fixed. Both feeds now run the gate before the limit through a shared |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/gitlawb-node/src/db/mod.rs (1)
2120-2120: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winUse the same deterministic tie-break here.
Line 2120 still orders repo-scoped rows only by timestamp, so equal-timestamp rows around
LIMITcan be nondeterministic while the new paged helper is stable.Proposed fix
- FROM received_ref_updates WHERE repo = $1 ORDER BY timestamp DESC LIMIT $2", + FROM received_ref_updates WHERE repo = $1 ORDER BY timestamp DESC, id DESC LIMIT $2",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` at line 2120, The repo-scoped query in the received_ref_updates path still orders only by timestamp, which can make LIMIT results unstable when timestamps tie. Update the ORDER BY used by the repo filter query to include the same deterministic tie-break as the new paged helper, matching the existing stable ordering in the related received_ref_updates pagination logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/events.rs`:
- Around line 47-60: The visibility check in events handling is using
list_all_repos_deduped(), which excludes quarantined local mirrors and can cause
ref updates from those repos to be treated as remote rows. Update the logic
around the ref-update scan in the events path to use a local-match universe that
includes quarantined rows, or add a DB helper that returns quarantined repos as
unreadable visibility records, and ensure ref_update_row_visible() fails closed
for those slugs instead of allowing anonymous access.
- Around line 40-55: The shared collector in the events API is using the
caller’s raw limit, so GraphQL can bypass the intended bounded behavior. Update
the helper around the visible initialization and max_scan logic to clamp the
effective limit internally to the same upper bound used by REST before
allocating the Vec and calculating the scan cap. Make the fix in the shared
events collector function so both REST and GraphQL paths inherit the bound
automatically.
---
Outside diff comments:
In `@crates/gitlawb-node/src/db/mod.rs`:
- Line 2120: The repo-scoped query in the received_ref_updates path still orders
only by timestamp, which can make LIMIT results unstable when timestamps tie.
Update the ORDER BY used by the repo filter query to include the same
deterministic tie-break as the new paged helper, matching the existing stable
ordering in the related received_ref_updates pagination logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9e171b12-cbcc-4b73-96de-0dff323a389e
📒 Files selected for processing (4)
crates/gitlawb-node/src/api/events.rscrates/gitlawb-node/src/db/mod.rscrates/gitlawb-node/src/graphql/query.rscrates/gitlawb-node/src/visibility.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/gitlawb-node/src/graphql/query.rs
- crates/gitlawb-node/src/visibility.rs
… gate comment
The comment credited publish_ref_update with building the wire slug, but that
p2p method only broadcasts a pre-built RefUpdateEvent; git_receive_pack assembles
the {last-segment}/{name} slug and hands it over. Fix the attribution.
…114) Resolve two CodeRabbit findings on the ref-update feeds: - Clamp the effective limit inside the shared collector (max 200). REST already capped, but the GraphQL resolver passed its caller-provided limit uncapped, letting a large request return unbounded rows and scan unbounded DB rows. Both surfaces are now bounded identically. - Fold quarantined mirrors into the gate's match universe as unreadable. list_all_repos_deduped excludes quarantined rows, so a ref-update naming a quarantined local mirror matched nothing, was misclassified as remote, and was served to anon, bypassing the withheld-from-every-listing-surface contract. Add list_quarantined_repos and drop matching rows fail-closed. Regressions: oversized-limit clamp (201 rows -> 200) and quarantined-mirror withheld-from-anon, both RED before, GREEN after.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
crates/gitlawb-node/src/db/mod.rs (2)
2133-2136: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winUse the same deterministic tie-break as the paged feed.
list_ref_updates_pageorders bytimestamp DESC, id DESC, but this repo-scoped query limits ontimestampalone. Addid DESCso tied timestamps don’t make the per-repo feed nondeterministic.Proposed fix
- FROM received_ref_updates WHERE repo = $1 ORDER BY timestamp DESC LIMIT $2", + FROM received_ref_updates WHERE repo = $1 ORDER BY timestamp DESC, id DESC LIMIT $2",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 2133 - 2136, The repo-scoped ref updates query in list_ref_updates should use the same deterministic ordering as list_ref_updates_page, since ordering only by timestamp can return tied rows nondeterministically. Update the SQL in the received_ref_updates query inside list_ref_updates to order by timestamp DESC, id DESC so rows with equal timestamps are consistently sorted.
2150-2176: 🩺 Stability & Availability | 🔵 Trivial | 🏗️ Heavy liftConsider keyset pagination for the live ref-update feed.
The shared collector walks this API with repeated
OFFSETqueries; inserts between page reads can shift offsets and cause skipped or duplicated rows in one response. A(timestamp, id)keyset cursor or a transaction snapshot would make the page boundary stable under concurrent writes.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/gitlawb-node/src/db/mod.rs` around lines 2150 - 2176, The pagination in list_ref_updates_page uses OFFSET-based queries, which can drift under concurrent inserts and cause skipped or duplicated ref updates. Update the list_ref_updates_page query path to use a stable keyset cursor based on the existing timestamp and id ordering, or otherwise ensure the page is read from a consistent snapshot. Keep the ordering by timestamp DESC, id DESC and thread the cursor through the ReceivedRefUpdate listing logic so repeated fetches from the shared collector remain stable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/gitlawb-node/src/db/mod.rs`:
- Around line 2133-2136: The repo-scoped ref updates query in list_ref_updates
should use the same deterministic ordering as list_ref_updates_page, since
ordering only by timestamp can return tied rows nondeterministically. Update the
SQL in the received_ref_updates query inside list_ref_updates to order by
timestamp DESC, id DESC so rows with equal timestamps are consistently sorted.
- Around line 2150-2176: The pagination in list_ref_updates_page uses
OFFSET-based queries, which can drift under concurrent inserts and cause skipped
or duplicated ref updates. Update the list_ref_updates_page query path to use a
stable keyset cursor based on the existing timestamp and id ordering, or
otherwise ensure the page is read from a consistent snapshot. Keep the ordering
by timestamp DESC, id DESC and thread the cursor through the ReceivedRefUpdate
listing logic so repeated fetches from the shared collector remain stable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b5f230eb-5731-41c4-8b1d-208633b29c14
📒 Files selected for processing (4)
crates/gitlawb-node/src/api/events.rscrates/gitlawb-node/src/db/mod.rscrates/gitlawb-node/src/graphql/query.rscrates/gitlawb-node/src/visibility.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- crates/gitlawb-node/src/visibility.rs
- crates/gitlawb-node/src/api/events.rs
- crates/gitlawb-node/src/graphql/query.rs
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the ref-update surfaces and found one issue that still needs to be addressed before this is ready.
Findings
- [P1] Gate the repo events feed on the caller's repo visibility
crates/gitlawb-node/src/api/events.rs:138
The new shared visibility collector is used for/api/v1/events/ref-updatesand GraphQLrefUpdates, but the adjacent repo-scoped REST feed still bypasses it./api/v1/repos/{owner}/{repo}/eventsis mounted underoptional_signature, yetlist_repo_eventsnever reads the caller DID or callsauthorize_repo_read; it resolves the repo withget_repo, then returns local cert events andreceived_ref_updatesrows directly throughlist_ref_certificates/list_repo_ref_updates. An anonymous caller who can guess a private repo's owner/name can still read its ref metadata from this endpoint, which leaves the same ref-history leak class open on a fourth surface. Please gate this handler with the same repo root visibility decision and cover the anonymous-private, owner-private, and quarantined-mirror cases for/api/v1/repos/{owner}/{repo}/eventsas well.
…#112, #114) GET /api/v1/repos/{owner}/{repo}/events (list_repo_events) served a repo's ref certificates and received gossip ref-updates with no caller identity and no visibility check, so an anonymous caller who guessed a private repo's owner/name read its ref metadata. It was the last repo-scoped reader in the #112/#114 ref-metadata leak class. Gate it in two layers, using the caller DID the optional_signature layer already supplies: - The cert half is gated by a repo-root authorize_repo_read (keyed by the unique repo record id): a denied, quarantined, or not-hosted repo returns an opaque 404 (RepoNotFound), never a 403. - The gossip half is filtered per row through the shared collect_visible_ref_updates. received_ref_updates rows are keyed by the lossy, non-unique wire slug {last-segment}/{name}, so two owners (did:web:a:alice, did:web:b:alice) collide on `alice/name` and a plain slug query would serve a colliding private repo's rows to anyone allowed to read this one; the per-row gate drops them fail-closed. Both halves now propagate DB errors as 500 instead of swallowing them into an empty 200. Removes the now-unused list_repo_ref_updates. Deliberate behavior change: a repo this node does not host returns 404 (it holds no visibility record for it and fails closed; remote gossip is read via the global /api/v1/events/ref-updates feed). Drops list_repo_events from the egress-guard KNOWN_UNGATED allowlist; the source-scrape staleness assert enforces the removal. Tests cover anon-private, owner-private (both datasets), anon/authed-public, quarantined, released-from-quarantine, non-owner, not-hosted, did:web (owner and anon), the slug collision, oversized-limit clamping, and DB-error-fails-closed on both the gate and the cert fetch.
|
Done. While gating the gossip half I found it also leaked across a slug collision. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/gitlawb-node/src/api/events.rs`:
- Around line 144-148: The `fetch_events` limit handling only caps the upper
bound, so a negative `limit` can still become a huge `usize` at `truncate(limit
as usize)`. Update the `limit` parsing/clamping in `events.rs` to reject or
clamp negative values before the cast, and keep the existing
`MAX_VISIBLE_REF_UPDATES` cap intact. Add a regression test around
`fetch_events` using `?limit=-1` with at least one local certificate event to
confirm the response is still limited.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a66ef7c8-b57f-4eb5-81bb-dd17cc4acc33
📒 Files selected for processing (3)
crates/gitlawb-node/src/api/events.rscrates/gitlawb-node/src/api/mod.rscrates/gitlawb-node/src/db/mod.rs
💤 Files with no reviewable changes (2)
- crates/gitlawb-node/src/api/mod.rs
- crates/gitlawb-node/src/db/mod.rs
| let limit = params | ||
| .get("limit") | ||
| .and_then(|v| v.parse::<i64>().ok()) | ||
| .unwrap_or(50) | ||
| .min(200); | ||
| .min(MAX_VISIBLE_REF_UPDATES); |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Clamp negative limits before casting to usize.
Line 148 only caps the upper bound. A request like ?limit=-1 reaches the final truncate(limit as usize) as usize::MAX, so an authorized caller can bypass the response limit for local certificate events.
Proposed fix
let limit = params
.get("limit")
.and_then(|v| v.parse::<i64>().ok())
.unwrap_or(50)
- .min(MAX_VISIBLE_REF_UPDATES);
+ .clamp(0, MAX_VISIBLE_REF_UPDATES);Please add a regression with ?limit=-1 and at least one local certificate.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let limit = params | |
| .get("limit") | |
| .and_then(|v| v.parse::<i64>().ok()) | |
| .unwrap_or(50) | |
| .min(200); | |
| .min(MAX_VISIBLE_REF_UPDATES); | |
| let limit = params | |
| .get("limit") | |
| .and_then(|v| v.parse::<i64>().ok()) | |
| .unwrap_or(50) | |
| .clamp(0, MAX_VISIBLE_REF_UPDATES); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/gitlawb-node/src/api/events.rs` around lines 144 - 148, The
`fetch_events` limit handling only caps the upper bound, so a negative `limit`
can still become a huge `usize` at `truncate(limit as usize)`. Update the
`limit` parsing/clamping in `events.rs` to reject or clamp negative values
before the cast, and keep the existing `MAX_VISIBLE_REF_UPDATES` cap intact. Add
a regression test around `fetch_events` using `?limit=-1` with at least one
local certificate event to confirm the response is still limited.
jatmn
left a comment
There was a problem hiding this comment.
Thanks for the update. I rechecked the ref-update surfaces and found one issue that still needs to be addressed before this is ready.
Findings
- [P2] Clamp negative event limits before casting
crates/gitlawb-node/src/api/events.rs:144
CodeRabbit's current review item is still valid here:list_repo_eventsonly applies.min(MAX_VISIBLE_REF_UPDATES), so?limit=-1stays negative until the finalall_events.truncate(limit as usize). That cast turns-1intousize::MAX, which bypasses the intended response cap for local ref-certificate events on/api/v1/repos/{owner}/{repo}/events. Please complete the review request by clamping the lower bound before the cast, and add the negative-limit regression with at least one local certificate event so this cannot come back.
- Drop idx_ref_updates_owner from Migration v10; no query reads owner_did yet and CREATE INDEX (non-CONCURRENT) inside a transaction takes a write-blocking lock on populated nodes for zero benefit. Defer the index to Gitlawb#143 alongside the query that uses it. - Expose owner_did on repo-scoped events feed: add the field to the gossip_events json! block in list_repo_events so GET /api/v1/repos/{owner}/{repo}/events is consistent with GET /api/v1/events/ref-updates. - Carry owner_did through GraphQL: add owner_did: Option<String> to RefUpdateBroadcast (state.rs) and RefUpdateType (types.rs), populate it from the DB row in query.rs, from the broadcast in subscription.rs, and from the trusted local record.owner_did in the broadcast send in repos.rs. - Add migration_v10_existing_node_upgrade test: seeds a v1..v9 state, strips the column and version row, re-runs migrate(), and asserts the column and v10 row appear. This is the regression guard: it fails if the DDL is accidentally moved back into v1. Refs Gitlawb#144
Three surfaces served
received_ref_updatesrows with no visibility check, so an anonymous caller who could guess or enumerate a private repo's name could read its ref metadata:refName,oldSha,newSha,pusherDid,nodeDid,timestamp, and the repo slug. Same withholding-bypass class as #97 (repo listing) and #110 (blob content), on the ref-history surface.ref_updatesquery mapped every row straight out; the siblingreposresolver was already gated, this one was not.GET /api/v1/events/ref-updatesreturned rows across all repos, no auth, no repo argument.ref_updatessubscription over/graphql/wswas a third reader of the same data. The push handler broadcast every ref update to it unconditionally, outside theif announceblock that already gates the sibling gossip and Arweave sends, so a private-repo push streamed live to any anonymous websocket subscriber. Fixed here too since leaving it open would only relocate the leak.Approach
One canonical, fail-closed decision reused everywhere, rather than a bespoke check per surface.
ref_update_row_visible(pure, invisibility.rs) takes the deduped local repo set plus their visibility rules and decides each row against the samelistable_at_rootgate thereposresolver uses. A row is dropped when its slug matches a local repo the caller cannot read; rows that match no local repo (remote, gossip-only) pass through under the existing origin announce-gate assumption. The classify has no default-keep: the only keep paths are all-matches-readable and no-local-match, so an unreadable match always drops.Two details that matter for correctness:
did:keymethod) and can only ever over-drop a genuinely remote row, never over-serve a private local one.list_all_repos_deduped, whose CTE picks the canonical row over any mirror row, so a private canonical repo is never masked by a public mirror row. A DB error during the gate load propagates (500) rather than serving unfiltered.The two query feeds filter to an empty result on deny (a non-existent repo returns empty too, so it discloses nothing); the live subscription is gated write-side, since
/graphql/wscarries no caller identity to gate against.Tests
Each gate has load-bearing regression coverage, confirmed failing before the fix and passing after: anonymous leak reproduced then closed, owner still sees their own private rows, and the alias / truncated-key / mirror-coexistence / multi-segment-DID cases all fail closed. A DB-layer test pins that a private canonical row beats a public mirror in dedup, and a
replication_withheld_settest pins theannounceboolean the subscription broadcast now gates on. Full workspace suite, clippy-D warnings, and fmt all clean.Closes #112
Closes #114
Summary by CodeRabbit