
Venice edge functions: embeddings milestone + shared-config track#197

Draft
sysread wants to merge 14 commits into main from claude/supabase-embeddings-edge-function-edsvW

sysread (Owner) commented May 26, 2026

Lands the first milestone of the Venice edge functions project: moving embedding backfill from the browser Web Worker to a pg_cron-scheduled edge function, plus the shared-config infrastructure every later endpoint reuses.

Summary

Embedding generation is the natural first mover - it's already background work (no UI latency), the database side is already claim-RPC structured, and there are no streaming or file-upload complications. This milestone also establishes the shared-config pattern: a project-global Venice API key stored server-side in a singleton app_config table, seeded once by the project owner via mise run config-set, and read by both the edge function and the browser.

Key changes

  • app_config table + RLS in supabase/schema.sql: singleton table (one row via id boolean primary key) holding the shared Venice API key. Authenticated users may SELECT; writes happen only through the service role (via mise run config-set or later the edge function).

  • mise run config-set target (scripts/config-set.mjs): prompts for or reads the Venice API key from VENICE_API_KEY env, then upserts it into app_config via the Management API. No dashboard clicking, no per-user key entry.

  • Deno tooling in mise (.mise.toml): pins Deno 2.8.0 so deno test and deno lint/deno fmt work offline for the edge function island.

  • Planning documents (docs/dev/in-progress/venice-edge-functions/):

    • README.md: overall project shape, architecture decisions (one fat venice function with internal routing, shared config in a table not env secrets, app stays Node/Vite while functions are a Deno island), and the five-phase migration pattern.
    • embeddings.md: fleshed-out first milestone with scope boundary (backfill only, not query-time), current state, target state, the shared-config track in detail, step-by-step implementation plan, surface-area inventory, testing strategy, and definition of done.
    • chat-completions.md, billing-usage.md, text-parser.md: skeleton sub-plans to be informed by embeddings lessons.

Notable implementation details

  • Shared-config migration as a static check: keep both app.config (local encrypted) and app.serverConfig (server-side) as distinct in-memory values. Migrate each veniceApiKey consumer one at a time. When the last one is migrated, delete the field from the local AppConfig type - svelte-check/tsc then enumerates every remaining reader as a compile error. This turns "did we get them all" into a static enumeration, not a runtime hunt.

  • Deno island isolation: supabase/functions/ is excluded from the app tsconfig and carries its own deno.json. deno lint/deno fmt cover it; eslint/knip stay scoped to src/. No migration of the app to Deno - the cost/benefit is lopsided.

  • Deferred code sharing: src/lib/venice.ts (which already takes a fetchImpl) is tempting to share with the function, but Deno's import resolution differs from Vite's. Duplicate the minimal wire-shape into _shared/ for now; revisit in the consolidation phase.

  • Learning loop: the final step of the embeddings milestone is to fold lessons back into the three sibling sub-plans (chat-completions, billing-usage, text-parser), so the next endpoint starts from earned knowledge rather than best guesses.
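
The static-check tactic in the first bullet above can be illustrated with a minimal TypeScript sketch. The interface shapes and both reader functions are hypothetical, not the app's real types. The `@ts-expect-error` directive stands in for the compile error that svelte-check/tsc would report once the field is deleted from the local type.

```typescript
// Hypothetical shapes; the real AppConfig and ServerConfig carry more fields.
interface AppConfig {
  theme: string;
  // veniceApiKey: string;  <- deleted once the last consumer has migrated
}

interface ServerConfig {
  veniceApiKey: string;
}

// A straggler that still reads the local key. With the field gone, tsc
// reports "Property 'veniceApiKey' does not exist on type 'AppConfig'".
// The directive acknowledges that error so this sketch still compiles.
function legacyReader(config: AppConfig): string | undefined {
  // @ts-expect-error veniceApiKey was removed from AppConfig
  return config.veniceApiKey;
}

// A migrated consumer reads the shared key and type-checks cleanly.
function migratedReader(server: ServerConfig): string {
  return server.veniceApiKey;
}
```

In the real migration there would be no directive: each unmigrated reader would simply fail the gate, which is what makes the enumeration static.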

This is a planning + infrastructure commit. The actual edge function code, the pg_cron scheduler, and the browser worker deletion follow in subsequent commits once the shared-config table and seeder are proven.

https://claude.ai/code/session_01PxPaatuMm1kfxPcqh59LTx

sysread force-pushed the claude/supabase-embeddings-edge-function-edsvW branch 4 times, most recently from 6e81a88 to 3dfb63d on May 28, 2026 16:59
claude added 10 commits May 28, 2026 17:50
The Supabase embeddings work will live in edge functions, which run on
Deno - a different runtime from the Node/Vite app. Pinning Deno here
(exact version, matching the node/pnpm convention) provisions it on a
fresh `mise install` so `deno test` is available for offline unit tests
of function logic without a manual install.

The Supabase CLI bundles its own edge runtime for `functions serve`, so
this pin is specifically for the `deno test` / `deno check` path, not
for serving.
This work - moving our Venice.ai API calls behind a Supabase edge
function with a project-global key and scheduled background generation -
is multi-milestone and will span many sessions, so it needs a durable
home that survives the ephemeral cloud containers.

docs/dev/in-progress/venice-edge-functions/ holds the overall plan plus
one sub-plan per Venice endpoint. The README captures the decisions
already settled (fat function with internal routing, a singleton
app_config table rather than env secrets or Vault, app stays Node while
functions are a Deno island, Deno pinned in mise) so they are not
re-litigated each session, and the five-phase strangler-fig shape.

Embeddings is the first milestone and is fleshed out: the shared-config
track, the consumer-migration-as-static-check tactic, the concrete step
sequence, the re-verified surface area, and the testing strategy. The
chat-completions, billing-usage, and text-parser sub-plans are
skeletons; the final step of the embeddings milestone is to fold its
lessons back into them, which is why text-parser (the fifth Venice
endpoint, easy to overlook) is captured now rather than discovered late.
First implementation chunk of the Venice edge-functions embeddings
milestone (steps 1-2): stand up the project-global config store and the
tooling to populate it. No client wiring yet, so nothing reads the table
at runtime - that is the next chunk.

app_config is a singleton table (one boolean primary key constrained to
true, so it holds at most one row) holding the Venice API key shared by
every member of the Supabase project. Its RLS deliberately diverges from
the per-user sibling tables: any authenticated member may SELECT, and
there is no write policy at all - writes go through the service role
(the Management API today, the edge function later), which bypasses RLS.
The divergence is commented inline so a future reader does not read the
missing write policy as an oversight.

mise run config-set (scripts/config-set.mjs) upserts the row via the
existing runSql Management-API helper, resolving project ref and access
token the same way sync.mjs and user-edit.mjs do, and honoring
SUPABASE_PROJECT_REF / SUPABASE_ACCESS_TOKEN for automation. The key is
collected from --key, VENICE_API_KEY, or a masked prompt; single quotes
are escaped for the SQL literal and control characters are rejected.

The schema is not applied here - it lands on the linked project via the
sync-on-deploy workflow when this merges, or via mise run sync to test
ahead of merge.
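
A minimal sketch of the value handling described in this commit (single quotes escaped, control characters rejected). The helper names and the exact upsert statement are assumptions for illustration, not the contents of scripts/config-set.mjs. The statement shape is inferred from the singleton boolean primary key.

```typescript
// Hypothetical helper: turn a secret into a single-quoted SQL string literal.
function toSqlLiteral(value: string): string {
  // Reject ASCII control characters (0x00-0x1F and 0x7F) outright.
  if (/[\u0000-\u001f\u007f]/.test(value)) {
    throw new Error("value contains control characters");
  }
  // Standard SQL escaping: double every single quote.
  return `'${value.replace(/'/g, "''")}'`;
}

// Assumed statement shape: one row keyed by id = true, upserted in place.
function buildUpsertSql(key: string): string {
  const literal = toSqlLiteral(key);
  return (
    `insert into app_config (id, venice_api_key) values (true, ${literal}) ` +
    `on conflict (id) do update set venice_api_key = excluded.venice_api_key;`
  );
}
```

Quote doubling is sufficient only when the server treats backslashes literally, which is the Postgres default (`standard_conforming_strings = on`); this sketch assumes that setting.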
Replaces the standalone `mise run config-set` task with an interactive
config editor inside `mise run supabase-init`, so config management lives
in one re-runnable wizard rather than a separate command the user has to
remember.

gum (charmbracelet/gum) is pinned in .mise.toml via the aqua backend,
alongside the gh and supabase CLIs, and provisioned by `mise install`. A
small wrapper (scripts/lib/gum.mjs) shells out to it: stdin and stderr are
inherited so gum can draw its TUI and read the keyboard, while stdout is
piped to capture the selection; a non-zero exit (esc/ctrl-c) maps to null.

The new step runs after schema apply, since app_config must exist first.
When app_config is empty it walks the owner through each field; once
values exist it shows a gum menu of the fields - each with its set/unset
state and a one-line description - and opens the right input for the one
picked, looping until Done. The editable fields are data-driven in
CONFIG_FIELDS (one entry today: venice_api_key), so adding a shared
setting later is a single entry, not new branching. Column names come
from that list, never user input, so they are safe to interpolate;
values are single-quote escaped and control characters rejected.

scripts/config-set.mjs and its task are deleted (folded in, no orphan);
the schema comment and the planning docs now point at the wizard step.
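
The data-driven field list might look roughly like the sketch below. The descriptor shape, the label text, and the helper names are hypothetical; the source only states that CONFIG_FIELDS has one entry (venice_api_key), that the menu shows set/unset state plus a one-line description, and that column names come from the list rather than from user input.

```typescript
// Hypothetical descriptor; the real CONFIG_FIELDS entries may differ.
interface ConfigField {
  column: string;      // trusted column name, never user input
  label: string;
  description: string;
  secret: boolean;     // whether the input prompt should be masked
}

const CONFIG_FIELDS: readonly ConfigField[] = [
  {
    column: "venice_api_key",
    label: "Venice API key",
    description: "Shared key for all Venice calls",
    secret: true,
  },
];

type ConfigRow = Record<string, string | null | undefined>;

// One menu line per field, showing set/unset state and the description.
function menuItems(row: ConfigRow): string[] {
  const items = CONFIG_FIELDS.map((f) => {
    const state = row[f.column] ? "set" : "unset";
    return `${f.label} [${state}] - ${f.description}`;
  });
  return [...items, "Done"];
}

// Only columns present in CONFIG_FIELDS may be interpolated into SQL.
function fieldFor(column: string): ConfigField {
  const field = CONFIG_FIELDS.find((f) => f.column === column);
  if (!field) throw new Error(`unknown config column: ${column}`);
  return field;
}
```

Adding a shared setting under this shape is one more array entry; the menu and the column allow-list both pick it up without new branching.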
The supervisor meta-worker (src/lib/agents/supervisor/) imports each
sub-agent's runOneCycle and owns a single nap policy for the whole
supervised batch via its own napForResult. The per-agent napForResult in
auto_title, summary, topics, memory_topics, and recipe_topics is a
leftover from before that consolidation - nothing imports it (no worker,
no test), and each file's local NapConfig interface existed only as that
dead function's parameter type, so both go.

Also repairs the auto_title and summary file-headers, which still claimed
that a `./worker.ts` maps results to sleeps via napForResult; those workers
no longer exist - the supervisor drives these units and owns the timing.

knip now reports a clean tree. Unrelated to the Venice edge-functions
work on this branch; folded in as drive-by cleanup.
Completes the previous napForResult cleanup. Like the five supervised
loops cleaned up earlier, reflection and attachment_expiry have no worker
of their own - the supervisor drives their runOneCycle and owns the nap
policy - so their napForResult and NapConfig are dead in production. The
difference is these two were kept alive by their own test files rather
than by any caller, so knip stayed quiet while the tests exercised code
nothing ships.

Removes napForResult + NapConfig from both loops and the describe blocks
(plus the now-unused imports) that only existed to test them. Also fixes
the two file-headers, which described a `./worker.ts` thin-wrapper entry
point that does not exist; the supervisor is the entry point now.

CycleResult stays - runOneCycle returns it and the remaining tests assert
on it. Gate green, knip clean.
Milestone-1 step 3 of the Venice edge-functions work: wire the client to
read the project-global app_config row, kept distinct from the local
encrypted config so consumers can be migrated to the shared key one at a
time (and the local field deleted last as a static completeness check).

SupabaseService.getAppConfig reads the single app_config row via the
authenticated client (RLS allows any member to SELECT) and returns a
ServerConfig, or null when the row has not been seeded. state.svelte.ts
gains app.serverConfig, fetched in loadSettingsThenStartWorkers *before*
startBackgroundWorkers - this resolves the sequencing gotcha: the local
config is available synchronously at unlock, but serverConfig is an async
post-auth fetch, so a worker migrated to read it would otherwise race an
unresolved value. The fetch is best-effort (null on failure, consumers
fall back to the local key) so a degraded Supabase does not gate boot.
lock() clears serverConfig so an unlock as a different account re-fetches
rather than inheriting the previous project's key.

Nothing reads app.serverConfig yet - step 4 migrates the embeddings
manager as the first consumer and the vet point.
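
The sequencing described in this commit can be sketched as follows. The dependency-injection shape (`BootDeps`) and the `AppState` interface are assumptions made so the sketch runs standalone; the real code lives in state.svelte.ts and SupabaseService.

```typescript
interface ServerConfig {
  veniceApiKey: string;
}

// Hypothetical app-state shape; only the field discussed here is modeled.
interface AppState {
  serverConfig: ServerConfig | null;
}

// Hypothetical injected dependencies, so the ordering can be tested offline.
interface BootDeps {
  getAppConfig: () => Promise<ServerConfig | null>;
  startBackgroundWorkers: (serverConfig: ServerConfig | null) => void;
  warn: (message: string) => void;
}

async function loadSettingsThenStartWorkers(app: AppState, deps: BootDeps): Promise<void> {
  try {
    // Await the fetch so no worker ever observes an unresolved value.
    app.serverConfig = await deps.getAppConfig();
  } catch (err) {
    // Best-effort: a degraded Supabase must not gate boot.
    deps.warn(`app_config fetch failed: ${String(err)}`);
    app.serverConfig = null;
  }
  deps.startBackgroundWorkers(app.serverConfig);
}

// Clearing on lock forces the next unlock to re-fetch for its own project.
function lock(app: AppState): void {
  app.serverConfig = null;
}
```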
Milestone-1 step 4 and the vet point: the embeddings manager is the first
consumer pointed at the project-global shared key (app.serverConfig),
falling back to the local config key when serverConfig is null (unseeded
project or a failed fetch). The other eight veniceApiKey consumers stay on
the local key and migrate in later milestones.

The manager gains its own EmbeddingsStartOpts (BaseWorkerManager is generic
over its opts), so only this worker takes the extra serverConfig field -
the shared base and the other managers are untouched. Because lazyManager
preserves the start opts type, the state.svelte.ts call site is required by
the compiler to pass serverConfig, so the wiring can't silently regress.

Keeping the local key as a fallback is what makes the parallel phase safe:
nothing breaks if app_config isn't seeded yet. Once every consumer is
migrated, the local veniceApiKey field gets deleted and svelte-check
enumerates any stragglers - see docs/dev/in-progress/venice-edge-functions/.
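
A small sketch of the key resolution this step describes. `EmbeddingsStartOpts` is named in the commit, but its field names here, `LocalConfig`, and `resolveVeniceKey` are hypothetical.

```typescript
interface ServerConfig {
  veniceApiKey: string;
}

interface LocalConfig {
  veniceApiKey?: string;
}

// Hypothetical field names; the commit only says the opts gain a serverConfig field.
interface EmbeddingsStartOpts {
  localConfig: LocalConfig;
  // Required rather than optional, so a call site that omits it fails to compile.
  serverConfig: ServerConfig | null;
}

// Prefer the shared key; fall back to the local key when serverConfig is
// null (unseeded project or a failed fetch). Returns null if neither exists.
function resolveVeniceKey(opts: EmbeddingsStartOpts): string | null {
  return opts.serverConfig?.veniceApiKey ?? opts.localConfig.veniceApiKey ?? null;
}
```

Making `serverConfig` a required, nullable field (instead of an optional one) is what lets the compiler enforce the wiring while still allowing the unseeded case.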
First edge-function milestone-1 step: stand up the single `venice` Deno
function (Supabase "fat function" - one deployed function, internal
routing by trailing path segment; /embed today, /complete /usage
/text-parser later) plus its Deno toolchain, without disturbing the app
gate.

- supabase/functions/_shared/venice.ts: the embed wire-shape, a deliberate
  duplicate of src/lib/venice.ts's embed half (Node app vs Deno island -
  sharing is a consolidation-phase call). Pure and fetch-injectable so the
  request-shaping / response-parsing / error-mapping is unit-tested offline.
- supabase/functions/venice/index.ts: thin handler - CORS, route /embed,
  read the shared key from app_config via the service role, call Venice,
  map 429 -> 429 and other failures -> 502. deno check passes.
- supabase/functions/tests/venice-embed.test.ts: 3 offline deno tests (fake
  fetch, no network/Supabase). Named .test.ts so deno's dir scan finds it;
  it sits under supabase/functions/ so vitest's tests/** glob never does.
- deno.json import map (npm:@supabase/supabase-js, jsr:@std/assert).

Toolchain boundary - the Deno island stays out of the app gate:
- eslint.config.js ignores supabase/functions/** (Deno globals / URL imports
  would false-flag under the Node config).
- tsconfig already scopes svelte-check to src/ + tests/, and vitest globs
  tests/**, so neither touches the function.
- mise tasks functions-test / functions-serve / functions-deploy drive the
  Deno/CLI side. dev-start already auto-serves the function once index.ts
  exists.

Not yet wired to the worker - that's step 6 (route runOneCycle's embed
through this function). Embeddings still call Venice directly from the
browser until then.

Notes: also adds scripts/dev-local.mjs to knip.json's entry list - main's
dev-stack added the script but not the knip entry, so knip flagged it as an
unused file. Drive-by to keep the tree knip-clean.
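
The "fat function" routing and the upstream status mapping from this commit can be sketched as two pure helpers. The function names are hypothetical, and what the real handler returns for an unknown route is not stated in the source, so this sketch only reports `null` for it.

```typescript
// Routes served today. /complete, /usage and /text-parser are planned but
// not live, so they are deliberately absent from this set.
const LIVE_ROUTES: ReadonlySet<string> = new Set(["embed"]);

// Pick the route from the trailing path segment, e.g.
// "/functions/v1/venice/embed" -> "embed". Returns null when unrecognised.
function routeOf(pathname: string): string | null {
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const last = segments[segments.length - 1];
  return last !== undefined && LIVE_ROUTES.has(last) ? last : null;
}

// Map a Venice response status to the status the function returns:
// success passes through as 200, 429 stays 429, any other failure is 502.
function statusForUpstream(upstreamStatus: number): number {
  if (upstreamStatus >= 200 && upstreamStatus < 300) return 200;
  return upstreamStatus === 429 ? 429 : 502;
}
```

Keeping both helpers free of I/O matches the commit's stated goal of unit-testing request shaping and error mapping offline.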
The embeddings loop no longer holds a VeniceClient. CycleContext now takes
an injected `Embedder` callback (input -> vector), so the loop is agnostic
about where the vector comes from - the separation lets the worker own the
function-vs-direct decision and keeps the loop's unit tests at a clean
seam.

The worker builds that Embedder to call the venice function via
`client.functions.invoke('venice/embed', ...)`, which authenticates with
the client's live (rotated) session token - the token both the local
gateway and hosted prod verify, so no environment-specific auth handling
is needed. On any function-path failure (not deployed yet, app_config
unseeded, network) it logs a warn and falls back to a direct browser->Venice
call, so embedding generation keeps flowing during rollout and failures are
visible rather than silently stalling the queue. The fallback retires once
cron owns generation (step 7).

Tests move to the new seam: the loop test injects a fake `embed` returning
a vector (or throwing VeniceError) instead of faking a VeniceClient.
verify_jwt stays on; the function still reads the shared key server-side
and does no per-user work.
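
A minimal sketch of the injected-Embedder seam with the rollout fallback. The `Embedder` type follows the commit's description (input -> vector); the `withFallback` name and its parameters are assumptions, and the two underlying embedders are left abstract rather than calling `client.functions.invoke` or Venice directly.

```typescript
// The seam the loop depends on: text in, vector out.
type Embedder = (input: string) => Promise<number[]>;

// Hypothetical combinator: try the edge-function path first; on any failure
// log a warning and fall back to the direct browser -> Venice call.
function withFallback(
  viaFunction: Embedder,
  direct: Embedder,
  warn: (message: string) => void,
): Embedder {
  return async (input) => {
    try {
      return await viaFunction(input);
    } catch (err) {
      warn(`venice function embed failed, falling back to direct call: ${String(err)}`);
      return direct(input);
    }
  };
}
```

Because the loop only sees an `Embedder`, its tests can inject a fake that returns a vector or throws, without knowing which path produced the result.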
sysread force-pushed the claude/supabase-embeddings-edge-function-edsvW branch from 3dfb63d to 9b5ae04 on May 28, 2026 17:52
sysread added 4 commits May 28, 2026 15:44
…ep 7)

Milestone-1 step 7: backfill now runs on a pg_cron schedule behind the
venice function instead of a browser Web Worker. The plan said the cron
function would drain through the existing claim/save RPCs, but those were
security-invoker and scoped to auth.uid() - cron has no user session, so
they matched zero rows. All ten claim/save RPCs are converted in place to
security-definer global sweeps (no auth.uid() filter, runs as the owner).
The EXECUTE grant is the new security boundary: revoked from public,
granted to service_role - a definer function with no user filter would
otherwise let any signed-in member claim and read another member's rows.
Safe to convert in place because deleting the browser worker leaves the
cron function as their only caller.

New /backfill route on the venice function, distinct from /embed:
service-role-only (bearer must equal the injected SUPABASE_SERVICE_ROLE_KEY),
runs the claim -> embed -> pad -> save loop across all five sources, bounded
per invocation (50 rows / 25s). Orchestration (_shared/backfill.ts) is
I/O-free with injected deps so it unit-tests offline; the five text builders
are ported to _shared/embed-input.ts in TS, not SQL, so truncation stays
byte-identical to historical rows (JS slice counts UTF-16 units, SQL left()
counts characters - they diverge on an emoji at the boundary).
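
The truncation divergence can be shown with a short sketch. The function names and the limit of 3 are made up for illustration; `truncateByCodePoint` approximates what a character-counting SQL `left()` would keep.

```typescript
// JS slice counts UTF-16 code units, so it can cut an astral character
// (such as an emoji) in half and leave a lone surrogate behind.
function truncateUtf16(text: string, max: number): string {
  return text.slice(0, max);
}

// Counting whole code points instead keeps the emoji intact.
function truncateByCodePoint(text: string, max: number): string {
  return Array.from(text).slice(0, max).join("");
}

// "ab" + U+1F600 + "c": four characters, but five UTF-16 code units.
const sample = "ab\u{1F600}c";

const byUnits = truncateUtf16(sample, 3);        // "ab" + lone high surrogate
const byChars = truncateByCodePoint(sample, 3);  // "ab" + the whole emoji
```

The two results differ in both content and length, so a row truncated under one rule would not byte-match the same input truncated under the other. That is the argument for porting the builders in TS rather than rewriting them in SQL.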

schema.sql gains a guarded pg_cron/pg_net block (gated on
pg_available_extensions so the local stack, which ships neither, still
applies cleanly) and nak_trigger_embed_backfill(), which reads the project
URL + legacy-JWT service-role key from Vault and POSTs to /backfill every 5
minutes. The legacy JWT is required - the gateway rejects the modern opaque
sb_secret_ key as a non-JWT bearer. mise run supabase-init seeds the two
Vault secrets.

Deletes the browser embeddings worker, manager, loop, sources, and types,
the ten now-orphaned SupabaseService claim/save methods, the four browser
embeddings vitest files (truncation coverage ported to deno test), and the
state.svelte.ts start/stop wiring. Keeps embeddings/lease.ts - the agent
worker fleet imports LeaseCoordinator from it. MAX_MEMORY_DATA_CHARS moves
to memories.ts (its real owner now). app.serverConfig stays fetched (the
shared-key spine) but has no browser consumer until the query-time callers
migrate in a later milestone.

Gate green (1750 tests, svelte-check clean, build); 19 deno tests; knip
clean. The /backfill route is verified end-to-end against the local stack.
The pg_cron schedule is hosted-only (the local stack can't run cron), so it
applies on the next deploy and needs supabase-init to seed Vault.
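
The per-invocation bound on /backfill (50 rows / 25s, with injected deps) might be sketched like this. The interface names and row shape are hypothetical, and the sketch omits the pad step and the five-source iteration that the real _shared/backfill.ts performs.

```typescript
// Hypothetical injected dependencies; nothing here performs real I/O.
interface BackfillDeps {
  claim: () => Promise<{ id: string; text: string } | null>; // null = queue empty
  embed: (text: string) => Promise<number[]>;
  save: (id: string, vector: number[]) => Promise<void>;
  now: () => number; // milliseconds, injectable so tests control the clock
}

interface Limits {
  maxRows: number;
  maxMs: number;
}

// Drain the queue until it is empty or either bound is reached.
// Returns the number of rows embedded in this invocation.
async function runBackfill(
  deps: BackfillDeps,
  limits: Limits = { maxRows: 50, maxMs: 25_000 },
): Promise<number> {
  const startedAt = deps.now();
  let processed = 0;
  while (processed < limits.maxRows && deps.now() - startedAt < limits.maxMs) {
    const row = await deps.claim();
    if (row === null) break;
    const vector = await deps.embed(row.text);
    await deps.save(row.id, vector);
    processed += 1;
  }
  return processed;
}
```

The time check runs before each claim, so a row already in flight is finished rather than abandoned; the invocation can therefore overrun `maxMs` by at most one row's worth of work.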
The hosted pg_cron job that drives embedding backfill has no local
equivalent: the dev-start Supabase stack ships neither pg_cron nor pg_net,
so the schedule in schema.sql is guarded to no-op locally. With the browser
worker gone, nothing drains the embedding queue in local dev.

scripts/dev-backfill-cron.mjs reproduces what the cron job does - every N
seconds it POSTs to the local /backfill route with the legacy service-role
key, the same call pg_net makes in prod. It reads the stack endpoints from
`supabase status` and refuses any non-loopback API target, so a shell
carrying prod creds can't aim it at the hosted project. Loud DEV-only header
and filename so it can't be mistaken for production wiring.
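
The loopback guard could be as small as the sketch below. The function name and the exact host list are assumptions; the script itself may accept a different set of loopback forms.

```typescript
// Hypothetical guard: accept only API targets whose host is a loopback name
// or address, so prod credentials in the shell cannot be aimed at a hosted URL.
function isLoopbackTarget(apiUrl: string): boolean {
  let host: string;
  try {
    host = new URL(apiUrl).hostname;
  } catch (_err) {
    return false; // unparseable targets are refused
  }
  // WHATWG URL keeps the brackets on IPv6 hostnames; accept both spellings.
  return host === "localhost" || host === "127.0.0.1" || host === "[::1]" || host === "::1";
}
```

Checking the parsed hostname rather than a string prefix avoids being fooled by URLs such as `http://localhost.example.com`.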

dev-start runs it as a third supervised child (same spawn + killChild
lifecycle as Vite and functions-serve), on by default - set NAK_DEV_BACKFILL=0
to disable. The cost is bounded: an idle tick is free (a local claim query,
no Venice call), and a tick only spends Venice when there are actually
unembedded rows - the same work hosted cron would do. Interval via
NAK_BACKFILL_INTERVAL (default 60s; prod cron is 5 min). The standalone
`mise run dev-backfill-cron` task remains for driving backfill against an
already-running stack.

Listed in knip.json's entry array like the other standalone scripts.
…p 8)

The learning step of the embeddings milestone: each sibling sub-plan
(chat-completions, billing-usage, text-parser) gets a tailored "Lessons
from the embeddings milestone" section so the next endpoint starts from
earned knowledge instead of guesses.

The headline lesson is a scope reducer: embeddings' hardest work - the
security-definer global-sweep RPC conversion, the EXECUTE-grant lockdown,
and the pg_cron/pg_net/Vault stack - existed only because backfill is a
background, user-less job. All three remaining endpoints are user-triggered,
so they copy /embed's per-user-JWT model (verify_jwt on, shared key read
from app_config via the service role) and need none of that machinery. The
per-endpoint sections then note what IS genuinely new: streaming + abort +
venice_parameters for chat (and the sharpened "share src/lib/venice.ts now?"
call), the paging-loop/UsageRow deno-test target plus the real cost of
"cache it" for usage, and multipart bodies + the unmeasured payload-size
limit for text-parser.

Target-state design for each stays deferred to when that endpoint is the
active milestone (as embeddings was) rather than designed speculatively now.
Marks the step-8 box in the embeddings definition of done.
The section described the loop as an embeddings-specific, future-tense
step ("implementing embeddings will surface...") and miscounted "four
sibling sub-plans" (there are three). Now embeddings has shipped, state
the recurring pattern instead: sub-plans stay thin until their endpoint
is the active milestone, and each milestone closes by folding its
learnings into the remaining plans - no waterfalling the design ahead.
Keeps the embeddings instance as the worked example.

2 participants