Replace Typesense with Postgres FTS behind BUZZ_SEARCH_BACKEND flag #1259

Open
tlongwell-block wants to merge 10 commits into main from eva/pg-fts-integration

@tlongwell-block

Collaborator

What

Replace Typesense with Postgres full-text search for NIP-50, behind a staged
BUZZ_SEARCH_BACKEND flag (typesense | postgres | disabled, default postgres).
Typesense remains selectable for rollback; disabled fails closed.

How search is implemented

  • Index: an expression GIN index idx_events_content_fts ON events USING GIN (to_tsvector('simple', content)) (migration 0004_search_fts.sql).
    Chosen over a generated content_tsv STORED column for a smaller diff and a
    clean back-out (drop one index).
  • Backend: buzz-search gains a postgres module that renders the
    identical to_tsvector('simple', content) query the index serves, with
    ts_rank_cd (cover-density) relevance ordering. SearchService::disabled()
    is a no-op that returns empty for every query.
  • Wiring: BUZZ_SEARCH_BACKEND is parsed in config.rs (defaults to
    postgres), threaded through the relay handlers and the Helm chart.

Two non-negotiable gates

Gate #1 — no visibility widening. Search never returns an event the caller
couldn't otherwise read. The backend only returns candidate IDs; handle_search_req
independently re-authorizes every hit before emission — filters_match,
accessible-channel check, reader_authorized_for_event, and author-only-kind
check (req.rs:455-471). This post-filter is downstream of and independent from
the backend, so backend choice cannot widen visibility by construction.
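
The post-filter idea can be pictured with a minimal sketch. Everything named here (`Event`, `Caller`, `reauthorize_hits`, the `author_only_kinds` parameter) is a hypothetical stand-in and not the code in req.rs; only the shape is taken from the description above: the backend hands back candidate IDs, and the relay re-checks each one on its own.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified event record (not the relay's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub id: String,
    pub channel_id: String,
    pub author: String,
    pub kind: u32,
}

/// Hypothetical caller context: who is asking and which channels they may read.
pub struct Caller {
    pub pubkey: String,
    pub accessible_channels: HashSet<String>,
}

/// Re-authorize every candidate ID returned by a search backend.
/// The backend is never trusted to enforce visibility.
pub fn reauthorize_hits(
    candidate_ids: &[String],
    store: &HashMap<String, Event>,
    caller: &Caller,
    author_only_kinds: &HashSet<u32>,
) -> Vec<Event> {
    candidate_ids
        .iter()
        // Refetch the canonical event by id; unknown ids are dropped.
        .filter_map(|id| store.get(id))
        // Accessible-channel check.
        .filter(|ev| caller.accessible_channels.contains(&ev.channel_id))
        // Author-only kinds are visible to their author alone.
        .filter(|ev| !author_only_kinds.contains(&ev.kind) || ev.author == caller.pubkey)
        .cloned()
        .collect()
}
```

Because the filter runs after the backend and takes only IDs from it, swapping the backend in this sketch changes which candidates arrive, not which ones the caller is allowed to see.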

Gate #2: disabled fails closed. With BUZZ_SEARCH_BACKEND=disabled,
every NIP-50 query returns empty regardless of how well content would match.
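
A rough sketch of what "fails closed" means at the service boundary. The enum and the `lookup` closure are illustrative assumptions (the real SearchService surface is not shown in this PR text); only the three backend names and the `disabled()` constructor name come from the description.

```rust
/// Hypothetical stand-in for the search service.
pub enum SearchService {
    Typesense,
    Postgres,
    Disabled,
}

impl SearchService {
    /// No-op service: every query yields no candidates.
    pub fn disabled() -> Self {
        SearchService::Disabled
    }

    /// Returns candidate event IDs. `lookup` is a placeholder for a real
    /// backend call and is never invoked when the service is disabled.
    pub fn search(&self, query: &str, lookup: impl Fn(&str) -> Vec<String>) -> Vec<String> {
        match self {
            SearchService::Disabled => Vec::new(),
            SearchService::Typesense | SearchService::Postgres => lookup(query),
        }
    }
}
```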

Testing

Full e2e matrix, green across all three backends (see TESTING.md
"Search Backend Test Matrix"):

Backend    | e2e search suite
typesense  | 9/9
postgres   | 9/9 (identical parity)
disabled   | 6/6 incl. test_nip50_search_disabled_fails_closed

Highlights:

  • test_nip50_search_cross_author_isolation (gate #1): an outsider gets 0
    hits searching a private channel they aren't a member of (the author, a
    member, still finds their own message, which is the non-vacuous control).
  • test_nip17_gift_wrap_not_searchable (gate #1, backend-agnostic): kind:1059
    gift wraps never surface via search; a kind:9 control does.
  • test_nip50_search_disabled_fails_closed (gate #2): a would-match query
    returns empty under disabled.
  • test_nip50_search_relevance_order — proximity-based ts_rank_cd ordering
    (a more-relevant older message ranks above a less-relevant newer one).

Unit suites: buzz-search 30/30, buzz-db 70/70 (incl. both migration tests
exercising the expression-index swap on fresh + baselined schemas).

Rollback

Set BUZZ_SEARCH_BACKEND=typesense (no schema change needed) or
BUZZ_SEARCH_BACKEND=disabled. To remove the Postgres artifacts entirely:
DROP INDEX idx_events_content_fts.

npub17jjz49l9jjmhhk7cac63j8yt9z555n9cw8vk7v5jz4vzw4ppld5qgj57cc and others added 10 commits June 24, 2026 19:27
Adds a Postgres full-text search backend as an alternative to Typesense for
NIP-50 search, gated behind BUZZ_SEARCH_BACKEND=typesense|postgres|disabled
(default typesense — no behavior change for existing deployments).

The replacement is structural: NIP-50 search is the only Typesense call site,
and the read path already refetches canonical events from Postgres by id, so
Typesense was just a lookup index in front of the DB that owns the data. A
generated stored tsvector column + GIN index gives the same shape with zero
write-path code change.

Changes
- migrations/0004_search_fts.sql: events.content_tsv GENERATED ALWAYS AS
  to_tsvector('simple', content) STORED, GIN index, cascades to partitions.
- crates/buzz-search: SearchBackend enum (Typesense | Postgres | Disabled),
  SearchService::with_postgres / ::disabled, postgres.rs backend impl,
  backend-neutral SearchQuery (structured kinds/authors/channel_ids/since/until;
  each backend renders its own filter).
- crates/buzz-relay/src/config.rs: BUZZ_SEARCH_BACKEND env wired with strict
  parsing (unknown value → ConfigError::InvalidValue, no silent fallback) +
  3 unit tests.
- crates/buzz-relay/src/main.rs: dispatch on backend; Postgres → with_postgres
  using db.pool(); Disabled → no-op; Typesense → existing path. ensure_collection
  only runs for the Typesense backend.
- crates/buzz-relay/src/{handlers/req.rs, api/bridge.rs}: swap to the new
  SearchService surface. Caller code shrinks — filter parts were already
  structured.
- crates/buzz-db/src/lib.rs: Db::pool() accessor for the PG backend.
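
For illustration, strict parsing with no silent fallback might look like the sketch below. The field layout of `ConfigError::InvalidValue`, the `backend_from_env_value` helper, and exact-match (case-sensitive, untrimmed) comparison are assumptions; only the three accepted values and the error variant name come from the commit message.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SearchBackend {
    Typesense,
    Postgres,
    Disabled,
}

/// Assumed shape; the real ConfigError may carry different fields.
#[derive(Debug, PartialEq, Eq)]
pub enum ConfigError {
    InvalidValue { key: &'static str, value: String },
}

impl FromStr for SearchBackend {
    type Err = ConfigError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "typesense" => Ok(SearchBackend::Typesense),
            "postgres" => Ok(SearchBackend::Postgres),
            "disabled" => Ok(SearchBackend::Disabled),
            // Unknown values are an error, never a silent default.
            other => Err(ConfigError::InvalidValue {
                key: "BUZZ_SEARCH_BACKEND",
                value: other.to_string(),
            }),
        }
    }
}

/// Hypothetical helper: the default applies only when the variable is unset.
pub fn backend_from_env_value(
    raw: Option<&str>,
    default: SearchBackend,
) -> Result<SearchBackend, ConfigError> {
    match raw {
        None => Ok(default),
        Some(v) => v.parse(),
    }
}
```

Taking the default as a parameter keeps the sketch neutral on which backend is the default, since that changed over the life of this branch.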

Validation (against parent 2e426b2, PG17 side-deployed):
- buzz-search lib: 29/29 pass.
- buzz-relay config tests: 11/11 (incl. 3 new).
- NIP-50 e2e on Typesense backend: 5/5 pass (regression baseline).
- NIP-50 e2e on Postgres backend: 5/5 pass — including
  test_nip50_search_relevance_order, confirming ts_rank_cd ranks correctly
  for the NIP-50 query shape and the 'simple' tokenizer config is acceptable.
- Wider e2e_nostr_interop sweep on Postgres: 19/23. The 4 failures reproduce
  identically on Typesense backend on this branch — pre-existing test-fixture
  coupling to a hard-coded 'events' collection name, not a regression.

This is additive: Typesense remains default; nothing in the existing path is
removed. Operators flip BUZZ_SEARCH_BACKEND per release to A/B/rollback.

Signed-off-by: Tyler <109685178+tlongwell-block@users.noreply.github.com>
Co-authored-by: Sami <f4a42a97e594b77bdbd8ee35191c8b28a94a4cb871d96f32921558275421fb68@sprout-oss.stage.blox.sqprod.co>
SearchQuery::new now requires non-empty channel_ids, returning
SearchError::EmptyChannelScope otherwise. Fields are pub(crate) so
struct-literal construction outside the crate is impossible; optional
facets use #[must_use] builder methods (with_kinds/authors/since/
until/page/per_page).

Closes the type-system gap on Eva's gate-1 "no visibility widening"
invariant: the access boundary is now enforced at construction, not
just at the call sites. Both call sites (req.rs, bridge.rs) wrap
SearchQuery::new in a match — req.rs logs + breaks pagination on the
Err path, bridge.rs continues the filter loop. Upstream guards
(build_search_channel_scope_filter + the per-filter h_tag validity
check) keep the Err path unreachable in normal operation, but if a
future refactor lets an empty scope through, behavior is "no results"
not "widened search".
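
A minimal sketch of the construction-time guard, assuming simplified field types (`String` channel ids, `u32` kinds, `i64` timestamps) and showing only two of the builder methods. The names `SearchQuery::new`, `SearchError::EmptyChannelScope`, and `with_kinds` follow the commit message; everything else is illustrative.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum SearchError {
    EmptyChannelScope,
}

/// Fields are pub(crate), so code outside the crate cannot build a
/// SearchQuery with a struct literal and must go through `new`.
#[derive(Debug)]
pub struct SearchQuery {
    pub(crate) text: String,
    pub(crate) channel_ids: Vec<String>,
    pub(crate) kinds: Option<Vec<u32>>,
    pub(crate) since: Option<i64>,
}

impl SearchQuery {
    /// Rejects an empty channel scope, so an unscoped query cannot exist.
    pub fn new(text: impl Into<String>, channel_ids: Vec<String>) -> Result<Self, SearchError> {
        if channel_ids.is_empty() {
            return Err(SearchError::EmptyChannelScope);
        }
        Ok(Self {
            text: text.into(),
            channel_ids,
            kinds: None,
            since: None,
        })
    }

    /// Optional facet: restrict to the given kinds.
    #[must_use]
    pub fn with_kinds(mut self, kinds: Vec<u32>) -> Self {
        self.kinds = Some(kinds);
        self
    }

    /// Optional facet: lower bound on created_at (assumed to be unix seconds).
    #[must_use]
    pub fn with_since(mut self, since: i64) -> Self {
        self.since = Some(since);
        self
    }
}
```

A caller would then `match` on `SearchQuery::new(...)` and treat the `Err` arm as "no results", which is the behavior both call sites are described as having.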

Also adds the missing `info!("Search backend: typesense", ...)` log
line for symmetry with the postgres/disabled branches — small
operational polish, no behavior change.

Tests: buzz-search 30/30 (+1 rejection test), buzz-relay lib 337/337,
NIP-50 e2e 5/5 on both Postgres and Typesense backends (4 NIP-50 +
test_ws_search_isolation_other_user_cannot_find_reminder).

Co-authored-by: Tyler <109685178+tlongwell-block@users.noreply.github.com>
Signed-off-by: Tyler <109685178+tlongwell-block@users.noreply.github.com>
Implements Eva's blocker-2 fix: Postgres backend now orders search
results by relevance, and the e2e test that claims to verify this
actually discriminates rank from recency.

postgres.rs
- SELECT list now includes `ts_rank_cd(content_tsv,
  plainto_tsquery('simple', $q)) AS rank` when the query has
  searchable text. The same `$q` parameter slot is reused in WHERE.
- ORDER BY rank DESC, created_at DESC when has_text; empty/"*"
  queries skip the rank column and fall back to created_at DESC
  (no needless tsquery cost).
- SearchHit.score is populated from the rank column (f32 widened
  to f64). Empty/"*" queries leave score at 0.0.
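
The two query shapes described above can be sketched as plain SQL rendering. This is a simplified illustration, not the contents of postgres.rs: it omits the channel scope, other filters, and pagination, and the helper names and the trimming of the query text are assumptions. It uses the `content_tsv` column as it existed at this commit.

```rust
/// True when the query carries text worth ranking (not empty, not "*").
/// Trimming whitespace first is an assumption of this sketch.
pub fn has_searchable_text(q: &str) -> bool {
    let t = q.trim();
    !t.is_empty() && t != "*"
}

/// Render the SELECT for a search. `$1` is the query-text parameter and is
/// referenced twice: once for the rank column and once in WHERE.
pub fn render_search_sql(q: &str) -> String {
    if has_searchable_text(q) {
        "SELECT id, ts_rank_cd(content_tsv, plainto_tsquery('simple', $1)) AS rank \
         FROM events \
         WHERE content_tsv @@ plainto_tsquery('simple', $1) \
         ORDER BY rank DESC, created_at DESC"
            .to_string()
    } else {
        // No text: skip the rank column and the tsquery entirely.
        "SELECT id FROM events ORDER BY created_at DESC".to_string()
    }
}

/// ts_rank_cd returns a 4-byte float; widen it for the hit score.
/// Queries without a rank column leave the score at 0.0.
pub fn score_from_rank(rank: Option<f32>) -> f64 {
    rank.map(f64::from).unwrap_or(0.0)
}
```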

e2e_nostr_interop::test_nip50_search_relevance_order
- Redesigned to discriminate rank from recency. Previous fixture
  used "alpha bravo charlie" with msg3="alpha bravo" — plainto_tsquery
  ANDs all terms, so msg3 never matched the WHERE clause and the
  test passed trivially with one candidate (Eva caught this).
- New fixture: query "{prefix} alpha bravo"; msg1 (oldest) has
  terms adjacent (high rank); msg2 (middle) doesn't match at all;
  msg3 (newest) has terms separated by filler (lower rank).
- Asserts both msg1 and msg3 are present, then asserts events[0].id
  == id1 with no `||content.contains(...)` escape hatch.
- Discriminator is term proximity, not term frequency: Typesense's
  default _text_match does NOT reward repeated query terms (verified
  empirically — identical tm scores for "alpha bravo" vs
  "alpha alpha bravo bravo"), but BOTH backends reward adjacency.
  Proximity is the property both backends agree on.
- New `send_rest_message_at` helper pins created_at via
  `custom_created_at`. Without explicit timestamps, three back-to-back
  sends share one wall-clock second; PG falls to heap-scan order and
  masquerades as rank ordering. Spreading by 30s each makes the
  recency-only ordering deterministically put msg3 first, so a
  passing test really means rank wins.
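
Why the redesigned fixture discriminates can be shown with a toy ordering comparison. The rank values below are invented for illustration (real ts_rank_cd scores will differ), and `Hit` and both ordering functions are hypothetical; only the fixture layout (msg1 oldest with adjacent terms, msg3 newest with separated terms, 30s spacing, msg2 not matching) is taken from the description.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub id: &'static str,
    pub rank: f32,
    pub created_at: i64,
}

/// Mirrors ORDER BY rank DESC, created_at DESC.
pub fn order_by_rank_then_recency(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_by(|a, b| {
        b.rank
            .total_cmp(&a.rank)
            .then(b.created_at.cmp(&a.created_at))
    });
    hits
}

/// Mirrors ORDER BY created_at DESC (the ordering the test must rule out).
pub fn order_by_recency(mut hits: Vec<Hit>) -> Vec<Hit> {
    hits.sort_by(|a, b| b.created_at.cmp(&a.created_at));
    hits
}

/// Candidates that match the query. msg2 is absent because it does not
/// match at all. Rank numbers are made up; timestamps are 60s apart
/// because msg2 sits between them at +30s.
pub fn fixture() -> Vec<Hit> {
    vec![
        Hit { id: "msg1", rank: 0.10, created_at: 1_000 }, // oldest, terms adjacent
        Hit { id: "msg3", rank: 0.05, created_at: 1_060 }, // newest, terms separated
    ]
}
```

Under recency-only ordering msg3 comes first, and under rank-then-recency msg1 does, so asserting `events[0].id == id1` can only pass when rank is actually applied.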

Validation
- buzz-search lib: 30/30. buzz-relay lib: 337/337.
- NIP-50 e2e on Postgres: 4/4 (incl. relevance_order) + isolation 1/1.
- NIP-50 e2e on Typesense: 4/4 + isolation 1/1.
- Proof of discrimination: with postgres.rs reverted to
  `ORDER BY created_at DESC`, the new test FAILS on PG (msg3 first,
  as predicted). Restored ts_rank_cd ordering after.

Pre-existing failure not introduced by this commit:
test_nip17_gift_wrap_not_searchable fails on both backends — it queries
Typesense directly at events-spike-{backend}; on the PG backend that
collection is never written to (structurally expected), and the
Typesense-backend failure is the same fixture coupling Eva already
acknowledged in the prior turn. No regression vs e5869dd/4b7c8d12.

Co-authored-by: Tyler <109685178+tlongwell-block@users.noreply.github.com>
Signed-off-by: Tyler <109685178+tlongwell-block@users.noreply.github.com>
Replace the GENERATED ALWAYS ... STORED tsvector column + column-backed
GIN index with a single expression index on to_tsvector('simple', content).

Why: the expression index is maintained by Postgres on every INSERT/UPDATE
exactly like a column index (no write-path work), but avoids the stored
column's ALTER TABLE row rewrite / ACCESS EXCLUSIVE backfill — so the
migration build is online-safe on a fresh/small relay and the same-named
index (idx_events_content_fts) is pre-buildable out of band on a large
populated relay (CREATE INDEX CONCURRENTLY + ATTACH per the live-relay
runbook). IF NOT EXISTS makes the migration idempotent against that path.

The query path renders the identical to_tsvector('simple', content)
expression so the planner uses the index. Rank SQL (ts_rank_cd) is
unchanged in behavior.
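
One way to keep the index expression and the query expression textually identical is to render both from a single constant. Whether the real code does this is not stated here; `FTS_EXPR`, `index_ddl`, and `match_predicate` are hypothetical names for a sketch of the idea.

```rust
/// The one expression both the index and the query must agree on.
pub const FTS_EXPR: &str = "to_tsvector('simple', content)";

/// DDL for the expression index, in the shape described for the migration.
pub fn index_ddl() -> String {
    format!(
        "CREATE INDEX IF NOT EXISTS idx_events_content_fts ON events USING GIN ({})",
        FTS_EXPR
    )
}

/// WHERE-clause predicate that uses the same expression, so the planner can
/// match it against the expression index.
pub fn match_predicate() -> String {
    format!("{} @@ plainto_tsquery('simple', $1)", FTS_EXPR)
}
```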

- migrations/0004_search_fts.sql: single CREATE INDEX IF NOT EXISTS expr index
- schema/schema.sql: drop generated column; expr index for fresh installs
- crates/buzz-search/src/postgres.rs: query refs -> to_tsvector(...) expr
- crates/buzz-db/src/migration.rs: assertions match new shape
- crates/buzz-search/src/lib.rs, crates/buzz-relay/src/main.rs: doc wording

Tests: cargo test -p buzz-db -p buzz-search (incl. ignored PG tests):
118 passed, 0 failed.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
…author search isolation

test_nip17_gift_wrap_not_searchable previously queried Typesense directly to
prove kind:1059 was never indexed — meaningless on the Postgres backend, where
every event lives in the events table and there is no separate index to skip.
Rewrite it to issue a NIP-50 search and assert the relay never surfaces the
gift wrap (kind:9 control IS returned, kind:1059 is NOT). That guarantee comes
from the relay's auth gates + filters_match post-filter, which are identical
across all three backends, so the test now guards typesense/postgres/disabled.

Add test_nip50_search_cross_author_isolation: an outsider who never joined an
author's channel gets zero hits when searching that channel's exact #h + token,
while the author finds their own message — proving the channel-scope clamp in
handle_search_req (gate #1, no visibility widening) holds. Also relay-side and
backend-independent.

Compiles clean (cargo test -p buzz-test-client --test e2e_nostr_interop
--no-run); the full matrix run lands after Max's backend-flag wiring is rebased
in.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Co-authored-by: npub1mprnacetjua2xx3p5eddmhxyk6wv929ymm5py8kd2xfxurxahspqqlgyta <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: npub1mprnacetjua2xx3p5eddmhxyk6wv929ymm5py8kd2xfxurxahspqqlgyta <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co>
(cherry picked from commit 8636898)
Co-authored-by: Max <d8473ee32b973aa31a21a65adddcc4b69cc2a8a4dee8121ecd51926e0cddbc02@sprout-oss.stage.blox.sqprod.co>
Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
The cross-author isolation test was vacuously failing on BOTH backends:
it created an `open` channel, which is searchable by anyone by design
(get_accessible_channel_ids unions member channels with ALL open
channels), so the outsider legitimately found the author's message.

Switch the test to create_private_test_channel. In a private channel the
creator is bootstrapped as a member (so the author still finds their own
post — the non-vacuous control), while the outsider is not a member and
gets zero hits. This makes the test a true visibility-widening guard,
backend-independent by construction.
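
The visibility rule behind this fix can be sketched as a set union. The types and the `accessible_channel_ids` function below are simplified stand-ins for get_accessible_channel_ids, written only to show why an open channel made the original test fail while a private one does not.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Visibility {
    Open,
    Private,
}

/// Hypothetical, simplified channel record.
pub struct Channel {
    pub id: &'static str,
    pub visibility: Visibility,
    pub members: Vec<&'static str>,
}

/// Channels a user may search: every open channel, plus any channel the
/// user is a member of.
pub fn accessible_channel_ids(user: &str, channels: &[Channel]) -> HashSet<&'static str> {
    channels
        .iter()
        .filter(|c| c.visibility == Visibility::Open || c.members.iter().any(|m| *m == user))
        .map(|c| c.id)
        .collect()
}
```

With the author bootstrapped as the only member of both channels, an outsider's accessible set contains the open channel but not the private one, which is the distinction the rewritten test relies on.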

Adds create_private_test_channel / create_channel_with_visibility helpers;
create_test_channel now delegates to the open variant (no behavior change
for existing callers).

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Adds test_nip50_search_disabled_fails_closed: posts a message that the
postgres/typesense backends provably return, then asserts the relay
delivers EOSE with zero events. Proves BUZZ_SEARCH_BACKEND=disabled fails
closed — NIP-50 search leaks nothing regardless of how well content would
otherwise match.

Introduces a test_backend() helper reading BUZZ_TEST_BACKEND; the
disabled test early-returns (skips) unless the relay-under-test is the
disabled backend, so the full suite stays green against all three
backends.
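
A sketch of how such a helper and the early-return skip might fit together. The return type, the fallback to Postgres when BUZZ_TEST_BACKEND is unset or unrecognized, and the `parse_test_backend` / `disabled_test_should_run` helpers are all assumptions; only the `test_backend()` name and the environment variable come from the commit message.

```rust
use std::env;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TestBackend {
    Typesense,
    Postgres,
    Disabled,
}

/// Pure parsing step, split out so it can be checked without touching the
/// process environment. The Postgres fallback is an assumption.
pub fn parse_test_backend(raw: Option<&str>) -> TestBackend {
    match raw {
        Some("typesense") => TestBackend::Typesense,
        Some("disabled") => TestBackend::Disabled,
        _ => TestBackend::Postgres,
    }
}

/// Reads BUZZ_TEST_BACKEND to learn which backend the relay under test uses.
pub fn test_backend() -> TestBackend {
    parse_test_backend(env::var("BUZZ_TEST_BACKEND").ok().as_deref())
}

/// The fail-closed test only makes sense against the disabled backend;
/// on any other backend it returns early and counts as a skip.
pub fn disabled_test_should_run(backend: TestBackend) -> bool {
    backend == TestBackend::Disabled
}
```

Inside the test body this would read roughly as `if !disabled_test_should_run(test_backend()) { return; }` before posting the would-match message.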

Matrix verified green: typesense 9/9, postgres 9/9 (identical parity),
disabled 6/6 incl. the fail-closed assertion.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
Adds a "Search Backend Test Matrix" section to TESTING.md covering the
BUZZ_SEARCH_BACKEND flag, the two gates (no visibility widening; disabled
fails closed), the per-backend test table with skip rationale, and how to
run the suite with BUZZ_TEST_BACKEND. Adds BUZZ_SEARCH_BACKEND to the
config reference.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>
CI's Rust Lint and Windows Rust jobs run `cargo clippy --all-targets --
-D warnings`; the manual `min(MAX_PER_PAGE).max(1)` clamp pattern in the
Postgres backend tripped clippy::manual_clamp and failed both. Replace
with `.clamp(1, MAX_PER_PAGE)` — identical result (1 <= 250).
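
The equivalence is easy to check in isolation. The `u32` type and the function names are assumptions for this sketch; the 250 bound is taken from the "(1 <= 250)" note above.

```rust
pub const MAX_PER_PAGE: u32 = 250;

/// The pattern clippy::manual_clamp flags.
pub fn per_page_manual(per_page: u32) -> u32 {
    per_page.min(MAX_PER_PAGE).max(1)
}

/// The replacement. `clamp` panics if min > max, which cannot happen here
/// because 1 <= MAX_PER_PAGE.
pub fn per_page_clamped(per_page: u32) -> u32 {
    per_page.clamp(1, MAX_PER_PAGE)
}
```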

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>