Venice edge functions: embeddings milestone + shared-config track#197
Draft
sysread wants to merge 14 commits into
The Supabase embeddings work will live in edge functions, which run on Deno - a different runtime from the Node/Vite app. Pinning Deno here (exact version, matching the node/pnpm convention) provisions it on a fresh `mise install` so `deno test` is available for offline unit tests of function logic without a manual install. The Supabase CLI bundles its own edge runtime for `functions serve`, so this pin is specifically for the `deno test` / `deno check` path, not for serving.
This work - moving our Venice.ai API calls behind a Supabase edge function with a project-global key and scheduled background generation - is multi-milestone and will span many sessions, so it needs a durable home that survives the ephemeral cloud containers. docs/dev/in-progress/venice-edge-functions/ holds the overall plan plus one sub-plan per Venice endpoint. The README captures the decisions already settled (fat function with internal routing, a singleton app_config table rather than env secrets or Vault, app stays Node while functions are a Deno island, Deno pinned in mise) so they are not re-litigated each session, and the five-phase strangler-fig shape. Embeddings is the first milestone and is fleshed out: the shared-config track, the consumer-migration-as-static-check tactic, the concrete step sequence, the re-verified surface area, and the testing strategy. The chat-completions, billing-usage, and text-parser sub-plans are skeletons; the final step of the embeddings milestone is to fold its lessons back into them, which is why text-parser (the fifth Venice endpoint, easy to overlook) is captured now rather than discovered late.
First implementation chunk of the Venice edge-functions embeddings milestone (steps 1-2): stand up the project-global config store and the tooling to populate it. No client wiring yet, so nothing reads the table at runtime - that is the next chunk. app_config is a singleton table (one boolean primary key constrained to true, so it holds at most one row) holding the Venice API key shared by every member of the Supabase project. Its RLS deliberately diverges from the per-user sibling tables: any authenticated member may SELECT, and there is no write policy at all - writes go through the service role (the Management API today, the edge function later), which bypasses RLS. The divergence is commented inline so a future reader does not read the missing write policy as an oversight. mise run config-set (scripts/config-set.mjs) upserts the row via the existing runSql Management-API helper, resolving project ref and access token the same way sync.mjs and user-edit.mjs do, and honoring SUPABASE_PROJECT_REF / SUPABASE_ACCESS_TOKEN for automation. The key is collected from --key, VENICE_API_KEY, or a masked prompt; single quotes are escaped for the SQL literal and control characters are rejected. The schema is not applied here - it lands on the linked project via the sync-on-deploy workflow when this merges, or via mise run sync to test ahead of merge.
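A minimal sketch of the value handling described above (single quotes escaped, control characters rejected) before the key is interpolated into the upsert. The helper names and the exact SQL text are assumptions for illustration, not the contents of scripts/config-set.mjs; the `id` and `venice_api_key` column names are taken from this PR's descriptions. Doubling quotes is only sufficient for a standard Postgres string literal, which is assumed here.

```typescript
// Hypothetical helper: turns a raw value into a SQL string literal.
function toSqlLiteral(value: string): string {
  // Reject ASCII control characters (U+0000-U+001F and U+007F) outright
  // rather than trying to escape them.
  if (/[\u0000-\u001f\u007f]/.test(value)) {
    throw new Error("value contains control characters");
  }
  // Inside a standard SQL string literal a single quote is escaped by doubling it.
  return `'${value.replace(/'/g, "''")}'`;
}

// Hypothetical upsert for the singleton row (boolean primary key fixed to true).
function buildUpsert(veniceApiKey: string): string {
  const literal = toSqlLiteral(veniceApiKey);
  return (
    "insert into app_config (id, venice_api_key) " +
    `values (true, ${literal}) ` +
    "on conflict (id) do update set venice_api_key = excluded.venice_api_key;"
  );
}
```

Because the primary key can only ever be `true`, the `on conflict` branch is what makes re-running the command an update rather than an error.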
Replaces the standalone `mise run config-set` task with an interactive config editor inside `mise run supabase-init`, so config management lives in one re-runnable wizard rather than a separate command the user has to remember. gum (charmbracelet/gum) is pinned in .mise.toml via the aqua backend, alongside the gh and supabase CLIs, and provisioned by `mise install`. A small wrapper (scripts/lib/gum.mjs) shells out to it: stdin and stderr are inherited so gum can draw its TUI and read the keyboard, while stdout is piped to capture the selection; a non-zero exit (esc/ctrl-c) maps to null. The new step runs after schema apply, since app_config must exist first. When app_config is empty it walks the owner through each field; once values exist it shows a gum menu of the fields - each with its set/unset state and a one-line description - and opens the right input for the one picked, looping until Done. The editable fields are data-driven in CONFIG_FIELDS (one entry today: venice_api_key), so adding a shared setting later is a single entry, not new branching. Column names come from that list, never user input, so they are safe to interpolate; values are single-quote escaped and control characters rejected. scripts/config-set.mjs and its task are deleted (folded in, no orphan); the schema comment and the planning docs now point at the wizard step.
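The data-driven shape of CONFIG_FIELDS can be sketched roughly as follows. The interface fields, the description text, and `menuLabels` are hypothetical; only the `CONFIG_FIELDS` name and the `venice_api_key` entry come from the commit message. The point is that the menu is derived from the list, so a new shared setting is one more entry.

```typescript
// Hypothetical field descriptor; the real shape in the wizard may differ.
interface ConfigField {
  column: string;      // app_config column; taken from this list, never from user input
  description: string; // one-line help text shown in the menu
}

const CONFIG_FIELDS: ConfigField[] = [
  { column: "venice_api_key", description: "Venice API key shared by the project" },
];

// Build the menu entries: each field with its set/unset state, then "Done".
function menuLabels(
  fields: ConfigField[],
  row: Record<string, string | null>,
): string[] {
  const labels = fields.map((field) => {
    const state = row[field.column] ? "set" : "unset";
    return `${field.column} [${state}] - ${field.description}`;
  });
  return [...labels, "Done"];
}
```

In this sketch the labels would be handed to the gum wrapper as the choices; a null result (esc/ctrl-c) or "Done" would end the loop.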
The supervisor meta-worker (src/lib/agents/supervisor/) imports each sub-agent's runOneCycle and owns a single nap policy for the whole supervised batch via its own napForResult. The per-agent napForResult in auto_title, summary, topics, memory_topics, and recipe_topics is a leftover from before that consolidation - nothing imports it (no worker, no test), and each file's local NapConfig interface existed only as that dead function's parameter type, so both go. Also repairs the auto_title and summary file-headers, which still claimed a `./worker.ts` maps results to sleeps via napForResult; those workers no longer exist - the supervisor drives these units and owns the timing. knip now reports a clean tree. Unrelated to the Venice edge-functions work on this branch; folded in as drive-by cleanup.
Completes the previous napForResult cleanup. Like the five supervised loops cleaned up earlier, reflection and attachment_expiry have no worker of their own - the supervisor drives their runOneCycle and owns the nap policy - so their napForResult and NapConfig are dead in production. The difference is these two were kept alive by their own test files rather than by any caller, so knip stayed quiet while the tests exercised code nothing ships. Removes napForResult + NapConfig from both loops and the describe blocks (plus the now-unused imports) that only existed to test them. Also fixes the two file-headers, which described a `./worker.ts` thin-wrapper entry point that does not exist; the supervisor is the entry point now. CycleResult stays - runOneCycle returns it and the remaining tests assert on it. Gate green, knip clean.
Milestone-1 step 3 of the Venice edge-functions work: wire the client to read the project-global app_config row, kept distinct from the local encrypted config so consumers can be migrated to the shared key one at a time (and the local field deleted last as a static completeness check). SupabaseService.getAppConfig reads the single app_config row via the authenticated client (RLS allows any member to SELECT) and returns a ServerConfig, or null when the row has not been seeded. state.svelte.ts gains app.serverConfig, fetched in loadSettingsThenStartWorkers *before* startBackgroundWorkers - this resolves the sequencing gotcha: the local config is available synchronously at unlock, but serverConfig is an async post-auth fetch, so a worker migrated to read it would otherwise race an unresolved value. The fetch is best-effort (null on failure, consumers fall back to the local key) so a degraded Supabase does not gate boot. lock() clears serverConfig so an unlock as a different account re-fetches rather than inheriting the previous project's key. Nothing reads app.serverConfig yet - step 4 migrates the embeddings manager as the first consumer and the vet point.
Milestone-1 step 4 and the vet point: the embeddings manager is the first consumer pointed at the project-global shared key (app.serverConfig), falling back to the local config key when serverConfig is null (unseeded project or a failed fetch). The other eight veniceApiKey consumers stay on the local key and migrate in later milestones. The manager gains its own EmbeddingsStartOpts (BaseWorkerManager is generic over its opts), so only this worker takes the extra serverConfig field - the shared base and the other managers are untouched. Because lazyManager preserves the start opts type, the state.svelte.ts call site is required by the compiler to pass serverConfig, so the wiring can't silently regress. Keeping the local key as a fallback is what makes the parallel phase safe: nothing breaks if app_config isn't seeded yet. Once every consumer is migrated, the local veniceApiKey field gets deleted and svelte-check enumerates any stragglers - see docs/dev/in-progress/venice-edge-functions/.
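A rough sketch of the fallback and the compiler-enforced wiring described above. All type and field names other than `EmbeddingsStartOpts`, `serverConfig`, and `veniceApiKey` are assumptions; the real ServerConfig and local config shapes are not shown in this PR text.

```typescript
// Assumed shapes, for illustration only.
interface ServerConfig { veniceApiKey: string | null }
interface LocalConfig { veniceApiKey: string }

// Hypothetical base opts shared by the other managers.
interface BaseStartOpts { config: LocalConfig }

// serverConfig is a required property (its value may be null), so a call site
// that forgets to pass it fails to type-check.
interface EmbeddingsStartOpts extends BaseStartOpts {
  serverConfig: ServerConfig | null;
}

// Prefer the project-global key; fall back to the local key when the row is
// unseeded or the fetch failed (serverConfig === null).
function pickVeniceKey(opts: EmbeddingsStartOpts): string {
  return opts.serverConfig?.veniceApiKey || opts.config.veniceApiKey;
}
```

Making the property required but nullable is what separates "the caller forgot to wire it" (a compile error) from "the project is not seeded yet" (a runtime fallback).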
First edge-function milestone-1 step: stand up the single `venice` Deno function (Supabase "fat function" - one deployed function, internal routing by trailing path segment; /embed today, /complete /usage /text-parser later) plus its Deno toolchain, without disturbing the app gate.

- supabase/functions/_shared/venice.ts: the embed wire-shape, a deliberate duplicate of src/lib/venice.ts's embed half (Node app vs Deno island - sharing is a consolidation-phase call). Pure and fetch-injectable so the request-shaping / response-parsing / error-mapping is unit-tested offline.
- supabase/functions/venice/index.ts: thin handler - CORS, route /embed, read the shared key from app_config via the service role, call Venice, map 429 -> 429 and other failures -> 502. deno check passes.
- supabase/functions/tests/venice-embed.test.ts: 3 offline deno tests (fake fetch, no network/Supabase). Named .test.ts so deno's dir scan finds it; it sits under supabase/functions/ so vitest's tests/** glob never does.
- deno.json import map (npm:@supabase/supabase-js, jsr:@std/assert).

Toolchain boundary - the Deno island stays out of the app gate:

- eslint.config.js ignores supabase/functions/** (Deno globals / URL imports would false-flag under the Node config).
- tsconfig already scopes svelte-check to src/ + tests/, and vitest globs tests/**, so neither touches the function.
- mise tasks functions-test / functions-serve / functions-deploy drive the Deno/CLI side. dev-start already auto-serves the function once index.ts exists.

Not yet wired to the worker - that's step 6 (route runOneCycle's embed through this function). Embeddings still call Venice directly from the browser until then.

Notes: also adds scripts/dev-local.mjs to knip.json's entry list - main's dev-stack added the script but not the knip entry, so knip flagged it as an unused file. Drive-by to keep the tree knip-clean.
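The routing and error mapping the handler performs can be sketched as two pure functions. The names `routeOf` and `statusForUpstream` are hypothetical, as is the example URL used with them; the actual handler in supabase/functions/venice/index.ts is not reproduced here.

```typescript
// Only /embed exists at this step; other segments are treated as unknown.
type Route = "embed" | "unknown";

// Route by the trailing path segment, so one deployed function can serve
// several endpoints ("fat function").
function routeOf(requestUrl: string): Route {
  const segments = new URL(requestUrl).pathname.split("/").filter(Boolean);
  const last = segments[segments.length - 1];
  return last === "embed" ? "embed" : "unknown";
}

// Map the upstream Venice status to the status the function returns:
// success passes through as 200, rate limiting stays 429, anything else is 502.
function statusForUpstream(upstreamStatus: number): number {
  if (upstreamStatus >= 200 && upstreamStatus < 300) return 200;
  return upstreamStatus === 429 ? 429 : 502;
}
```

Keeping 429 distinct lets a caller back off on rate limiting, while every other upstream failure is reported as a gateway error rather than leaking Venice's status codes.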
The embeddings loop no longer holds a VeniceClient. CycleContext now takes
an injected `Embedder` callback (input -> vector), so the loop is agnostic
about where the vector comes from - the separation lets the worker own the
function-vs-direct decision and keeps the loop's unit tests at a clean
seam.
The worker builds that Embedder to call the venice function via
`client.functions.invoke('venice/embed', ...)`, which authenticates with
the client's live (rotated) session token - the token both the local
gateway and hosted prod verify, so no environment-specific auth handling
is needed. On any function-path failure (not deployed yet, app_config
unseeded, network) it logs a warn and falls back to a direct browser->Venice
call, so embedding generation keeps flowing during rollout and failures are
visible rather than silently stalling the queue. The fallback retires once
cron owns generation (step 7).
Tests move to the new seam: the loop test injects a fake `embed` returning
a vector (or throwing VeniceError) instead of faking a VeniceClient.
verify_jwt stays on; the function still reads the shared key server-side
and does no per-user work.
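A minimal sketch of the seam, assuming an `Embedder` is simply an async function from input text to a vector. `withFallback` and its parameter names are hypothetical; in the real worker the primary path would wrap `client.functions.invoke('venice/embed', ...)` and the fallback would wrap the direct browser->Venice call.

```typescript
// The loop only sees this callback type; where the vector comes from is the
// worker's decision.
type Embedder = (input: string) => Promise<number[]>;

// Try the function path first; on any failure log a warning and fall back,
// so generation keeps flowing and the failure stays visible.
function withFallback(
  primary: Embedder,
  fallback: Embedder,
  warn: (message: string) => void = console.warn,
): Embedder {
  return async (input) => {
    try {
      return await primary(input);
    } catch (err) {
      warn(`venice function embed failed, falling back to direct call: ${String(err)}`);
      return fallback(input);
    }
  };
}
```

A loop test can then inject a fake `Embedder` that returns a fixed vector or throws, without constructing any client.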
…ep 7) Milestone-1 step 7: backfill now runs on a pg_cron schedule behind the venice function instead of a browser Web Worker.

The plan said the cron function would drain through the existing claim/save RPCs, but those were security-invoker and scoped to auth.uid() - cron has no user session, so they matched zero rows. All ten claim/save RPCs are converted in place to security-definer global sweeps (no auth.uid() filter, runs as the owner). The EXECUTE grant is the new security boundary: revoked from public, granted to service_role - a definer function with no user filter would otherwise let any signed-in member claim and read another member's rows. Safe to convert in place because deleting the browser worker leaves the cron function as their only caller.

New /backfill route on the venice function, distinct from /embed: service-role-only (bearer must equal the injected SUPABASE_SERVICE_ROLE_KEY), runs the claim -> embed -> pad -> save loop across all five sources, bounded per invocation (50 rows / 25s). Orchestration (_shared/backfill.ts) is I/O-free with injected deps so it unit-tests offline; the five text builders are ported to _shared/embed-input.ts in TS, not SQL, so truncation stays byte-identical to historical rows (JS slice counts UTF-16 units, SQL left() counts characters - they diverge on an emoji at the boundary).

schema.sql gains a guarded pg_cron/pg_net block (gated on pg_available_extensions so the local stack, which ships neither, still applies cleanly) and nak_trigger_embed_backfill(), which reads the project URL + legacy-JWT service-role key from Vault and POSTs to /backfill every 5 minutes. The legacy JWT is required - the gateway rejects the modern opaque sb_secret_ key as a non-JWT bearer. mise run supabase-init seeds the two Vault secrets.
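The truncation divergence mentioned above is easy to reproduce. The sketch below contrasts JS `slice` (UTF-16 code units) with a code-point-based truncation that stands in for what a character-counting SQL `left()` would return; the function names and the sample string are illustrative only.

```typescript
// Truncate by UTF-16 code units, as String.prototype.slice does.
function truncateUtf16(text: string, max: number): string {
  return text.slice(0, max);
}

// Truncate by Unicode code points, approximating a character-counting left().
function truncateCodePoints(text: string, max: number): string {
  return Array.from(text).slice(0, max).join("");
}

// "ab" + U+1F600 (written as its surrogate pair) + "c": the emoji straddles
// the cut when the limit is 3.
const sample = "ab\uD83D\uDE00c";

const byUnits = truncateUtf16(sample, 3);       // "ab" plus a lone high surrogate
const byPoints = truncateCodePoints(sample, 3); // "ab" plus the whole emoji
```

The two results differ in both length and content, so an embed input rebuilt in SQL would not match the text that produced the historical rows' vectors.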
Deletes the browser embeddings worker, manager, loop, sources, and types, the ten now-orphaned SupabaseService claim/save methods, the four browser embeddings vitest files (truncation coverage ported to deno test), and the state.svelte.ts start/stop wiring. Keeps embeddings/lease.ts - the agent worker fleet imports LeaseCoordinator from it. MAX_MEMORY_DATA_CHARS moves to memories.ts (its real owner now). app.serverConfig stays fetched (the shared-key spine) but has no browser consumer until the query-time callers migrate in a later milestone. Gate green (1750 tests, svelte-check clean, build); 19 deno tests; knip clean. The /backfill route is verified end-to-end against the local stack. The pg_cron schedule is hosted-only (the local stack can't run cron), so it applies on the next deploy and needs supabase-init to seed Vault.
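The per-invocation bound on /backfill (50 rows / 25s) lends itself to the same injected-dependency style as the rest of the orchestration. The following is a sketch under assumptions: `makeBudget` and its shape are hypothetical and are not taken from _shared/backfill.ts; the clock is injected so the check can be unit-tested without waiting.

```typescript
interface Budget {
  maxRows: number; // stop after this many rows have been processed
  maxMs: number;   // stop after this much wall-clock time
}

// Hypothetical budget tracker; `now` is injectable for offline tests.
function makeBudget(budget: Budget, now: () => number = Date.now) {
  const startedAt = now();
  let rows = 0;
  return {
    // Record rows processed by one claim -> embed -> save batch.
    record(count: number): void {
      rows += count;
    },
    // True once either the row cap or the time cap has been reached.
    exhausted(): boolean {
      return rows >= budget.maxRows || now() - startedAt >= budget.maxMs;
    },
  };
}
```

A loop over the five sources would check `exhausted()` before each claim, so an invocation returns early and leaves the remainder for the next scheduled tick.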
The hosted pg_cron job that drives embedding backfill has no local equivalent: the dev-start Supabase stack ships neither pg_cron nor pg_net, so the schedule in schema.sql is guarded to no-op locally. With the browser worker gone, nothing drains the embedding queue in local dev. scripts/dev-backfill-cron.mjs reproduces what the cron job does - every N seconds it POSTs to the local /backfill route with the legacy service-role key, the same call pg_net makes in prod. It reads the stack endpoints from `supabase status` and refuses any non-loopback API target, so a shell carrying prod creds can't aim it at the hosted project. Loud DEV-only header and filename so it can't be mistaken for production wiring. dev-start runs it as a third supervised child (same spawn + killChild lifecycle as Vite and functions-serve), on by default - set NAK_DEV_BACKFILL=0 to disable. The cost is bounded: an idle tick is free (a local claim query, no Venice call), and a tick only spends Venice when there are actually unembedded rows - the same work hosted cron would do. Interval via NAK_BACKFILL_INTERVAL (default 60s; prod cron is 5 min). The standalone `mise run dev-backfill-cron` task remains for driving backfill against an already-running stack. Listed in knip.json's entry array like the other standalone scripts.
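The loopback guard could look something like the sketch below. `isLoopbackUrl` is a hypothetical name, and the accepted hosts (localhost, 127.0.0.0/8, ::1) are an assumption about what "non-loopback" means here; the actual check in scripts/dev-backfill-cron.mjs may be stricter or looser.

```typescript
// Returns true only for targets that resolve syntactically to a loopback host.
function isLoopbackUrl(target: string): boolean {
  let host: string;
  try {
    host = new URL(target).hostname;
  } catch {
    return false; // unparsable targets are refused rather than guessed at
  }
  // WHATWG URL reports IPv6 hostnames with brackets; accept both spellings.
  if (host === "localhost" || host === "[::1]" || host === "::1") return true;
  // Any 127.x.x.x literal; the anchors stop hosts like 127.0.0.1.example.com.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}
```

Checking the parsed hostname rather than the raw string is what keeps a hosted project URL from slipping through on a substring match.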
…p 8) The learning step of the embeddings milestone: each sibling sub-plan (chat-completions, billing-usage, text-parser) gets a tailored "Lessons from the embeddings milestone" section so the next endpoint starts from earned knowledge instead of guesses. The headline lesson is a scope reducer: embeddings' hardest work - the security-definer global-sweep RPC conversion, the EXECUTE-grant lockdown, and the pg_cron/pg_net/Vault stack - existed only because backfill is a background, user-less job. All three remaining endpoints are user-triggered, so they copy /embed's per-user-JWT model (verify_jwt on, shared key read from app_config via the service role) and need none of that machinery. The per-endpoint sections then note what IS genuinely new: streaming + abort + venice_parameters for chat (and the sharpened "share src/lib/venice.ts now?" call), the paging-loop/UsageRow deno-test target plus the real cost of "cache it" for usage, and multipart bodies + the unmeasured payload-size limit for text-parser. Target-state design for each stays deferred to when that endpoint is the active milestone (as embeddings was) rather than designed speculatively now. Marks the step-8 box in the embeddings definition of done.
The section described the loop as an embeddings-specific, future-tense
step ("implementing embeddings will surface...") and miscounted "four
sibling sub-plans" (there are three). Now embeddings has shipped, state
the recurring pattern instead: sub-plans stay thin until their endpoint
is the active milestone, and each milestone closes by folding its
learnings into the remaining plans - no waterfalling the design ahead.
Keeps the embeddings instance as the worked example.
Lands the first milestone of the Venice edge functions project: moving embedding backfill from the browser Web Worker to a `pg_cron`-scheduled edge function, plus the shared-config infrastructure every later endpoint reuses.

Summary

Embedding generation is the natural first mover - it's already background work (no UI latency), the database side is already claim-RPC structured, and there are no streaming or file-upload complications. This milestone also establishes the shared-config pattern: a project-global Venice API key stored server-side in a singleton `app_config` table, seeded once by the project owner via `mise run config-set`, and read by both the edge function and the browser.

Key changes

- `app_config` table + RLS in `supabase/schema.sql`: singleton table (one row via `id boolean primary key`) holding the shared Venice API key. Authenticated users may `SELECT`; writes happen only through the service role (via `mise run config-set` or later the edge function).
- `mise run config-set` target (`scripts/config-set.mjs`): prompts for or reads the Venice API key from `VENICE_API_KEY` env, then upserts it into `app_config` via the Management API. No dashboard clicking, no per-user key entry.
- Deno tooling in mise (`.mise.toml`): pins Deno 2.8.0 so `deno test` and `deno lint` / `deno fmt` work offline for the edge function island.
- Planning documents (`docs/dev/in-progress/venice-edge-functions/`):
  - `README.md`: overall project shape, architecture decisions (one fat `venice` function with internal routing, shared config in a table not env secrets, app stays Node/Vite while functions are a Deno island), and the five-phase migration pattern.
  - `embeddings.md`: fleshed-out first milestone with scope boundary (backfill only, not query-time), current state, target state, the shared-config track in detail, step-by-step implementation plan, surface-area inventory, testing strategy, and definition of done.
  - `chat-completions.md`, `billing-usage.md`, `text-parser.md`: skeleton sub-plans to be informed by embeddings lessons.

Notable implementation details

- Shared-config migration as a static check: keep both `app.config` (local encrypted) and `app.serverConfig` (server-side) as distinct in-memory values. Migrate each `veniceApiKey` consumer one at a time. When the last one is migrated, delete the field from the local `AppConfig` type - `svelte-check` / `tsc` then enumerates every remaining reader as a compile error. This turns "did we get them all" into a static enumeration, not a runtime hunt.
- Deno island isolation: `supabase/functions/` is excluded from the app tsconfig and carries its own `deno.json`. `deno lint` / `deno fmt` cover it; eslint/knip stay scoped to `src/`. No migration of the app to Deno - the cost/benefit is lopsided.
- Deferred code sharing: `src/lib/venice.ts` (which already takes a `fetchImpl`) is tempting to share with the function, but Deno's import resolution differs from Vite's. Duplicate the minimal wire-shape into `_shared/` for now; revisit in the consolidation phase.
- Learning loop: the final step of the embeddings milestone is to fold lessons back into the three sibling sub-plans, so the next endpoint starts from earned knowledge rather than best guesses.

This is a planning + infrastructure commit. The actual edge function code, the `pg_cron` scheduler, and the browser worker deletion follow in subsequent commits once the shared-config table and seeder are proven.

https://claude.ai/code/session_01PxPaatuMm1kfxPcqh59LTx
pg_cronscheduler, and the browser worker deletion follow in subsequent commits once the shared-config table and seeder are proven.https://claude.ai/code/session_01PxPaatuMm1kfxPcqh59LTx