Ships the final roadmap wave: opt-in `ivf: { nlist, nprobe? }` on
TurboQuantIndex, flowing through IdMapIndex and Collection.
- core/kmeans: seeded k-means++ + Lloyd (spherical for cosine/dot, L2 for
euclidean), empty-cluster repair, deterministic per (data, seed)
- core/search: searchSlots probed-subset scan sharing prepareScan validation
with searchFlat — nprobe = nlist reproduces the flat scan bit-for-bit
- index/coarse: CoarseQuantizer with posting lists in lockstep with
swap-remove (O(1) membership patching; full remove parity)
- training mirrors calibration: fit-and-freeze from the first ≥ nlist batch
- io/serialize: format VERSION 2 with hardened ivf section (BAD_IVF)
- bench:ivf: 11.4x QPS at the flat scan's recall (nprobe = nlist/16),
22.8x at nprobe = 1 (20k x 768-d clustered, 4-bit)
- docs: roadmap closed out, guide/api-reference/architecture/serialization
updated; ADR D-016; version 0.0.3
408 tests, coverage 98.9/95.8/100/99.7 (gate 90).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
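The seeded k-means++ initialization named in the commit message can be sketched roughly as below. This is a minimal illustration and not the code in core/kmeans: `mulberry32` is a swapped-in, widely used 32-bit generator (the PR's own RNG is not shown on this page), and `sqDist` and `kmeansPlusPlusInit` are hypothetical names. It only demonstrates why a fixed (data, seed) pair gives the same starting centroids on every run.

```typescript
// Small seeded PRNG (mulberry32). Stands in for the library's RNG, which is not shown here.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function sqDist(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    s += d * d;
  }
  return s;
}

// Hypothetical helper: returns the indices of k points picked by D^2 sampling.
function kmeansPlusPlusInit(points: number[][], k: number, seed: number): number[] {
  if (k < 1 || k > points.length) throw new RangeError("k must be in [1, points.length]");
  const rng = mulberry32(seed);
  const chosen: number[] = [Math.floor(rng() * points.length)];
  // minD2[i] is the squared distance from point i to its nearest chosen point.
  const minD2: number[] = points.map((p) => sqDist(p, points[chosen[0]]));
  while (chosen.length < k) {
    let total = 0;
    for (let i = 0; i < minD2.length; i++) total += minD2[i];
    let next = -1;
    if (total > 0) {
      // Sample index i with probability minD2[i] / total.
      const r = rng() * total;
      let cum = 0;
      for (let i = 0; i < minD2.length; i++) {
        cum += minD2[i];
        if (minD2[i] > 0 && cum > r) {
          next = i;
          break;
        }
      }
    }
    if (next < 0) {
      // Degenerate data (every point coincides with a chosen one): take the first unchosen index.
      for (let i = 0; i < points.length && next < 0; i++) {
        if (chosen.indexOf(i) < 0) next = i;
      }
    }
    chosen.push(next);
    for (let i = 0; i < points.length; i++) {
      const d = sqDist(points[i], points[next]);
      if (d < minD2[i]) minD2[i] = d;
    }
  }
  return chosen;
}
```

Calling `kmeansPlusPlusInit(points, k, seed)` twice with the same arguments returns the same index list, because the only source of randomness is the seeded generator.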
…alse positive
CodeQL (js/implicit-operand-conversion) failed to resolve the module-level bigint const and flagged the XOR operand as possibly undefined. Inlining the literal also lets us drop the explicit 64-bit mask: createRng masks bigint seeds itself and the separator fits in the mask, so XOR-then-mask equals mask-then-XOR, and the derived seed is bit-identical.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
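The mask-order claim in this commit message follows from AND distributing over XOR: `(s ^ c) & m` equals `(s & m) ^ (c & m)`, and `c & m` is just `c` when the constant fits in the mask. A minimal sketch of that identity, where `MASK64`, `SEPARATOR` (an arbitrary 64-bit constant, not the library's value), and both helper names are assumptions made for illustration:

```typescript
// Hypothetical names and constant; the library's real separator is not shown on this page.
const MASK64 = (1n << 64n) - 1n;
const SEPARATOR = 0x9e3779b97f4a7c15n; // any constant that fits in 64 bits

// XOR first, then mask down to 64 bits.
function deriveXorThenMask(seed: bigint): bigint {
  return (seed ^ SEPARATOR) & MASK64;
}

// Mask first (as an RNG constructor might do internally), then XOR.
function deriveMaskThenXor(seed: bigint): bigint {
  return (seed & MASK64) ^ SEPARATOR;
}
```

Both functions agree for every bigint seed, including negative seeds and seeds wider than 64 bits, which is the condition under which dropping the explicit mask leaves the derived seed unchanged.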
Pull request overview
This PR adds an opt-in IVF (inverted-file) coarse quantizer to scale search to larger corpora by partitioning vectors into k-means cells and scanning only the probed cells’ slots, while preserving parity across TurboQuantIndex, IdMapIndex, and Collection. It also bumps the serialization format to v2 to persist IVF state (plus docs/benchmarks) and releases quantvec@0.0.3.
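As a rough picture of what scanning only the probed cells means, the sketch below ranks centroids against the query, keeps the best `nprobe` cells, and scores only the slots in their posting lists. It works on raw number arrays with dot-product scoring and a canonical (score, slot) ordering, which are simplifying assumptions; the PR's real path scans quantized codes. `Hit`, `dot`, `topK`, `searchFlatSketch`, and `searchIvfSketch` are hypothetical names.

```typescript
type Hit = { slot: number; score: number };

function dot(a: number[], b: number[]): number {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// Rank by score descending; break ties by ascending slot so the order is canonical.
function topK(hits: Hit[], k: number): Hit[] {
  return hits
    .slice()
    .sort((x, y) => y.score - x.score || x.slot - y.slot)
    .slice(0, k);
}

// Baseline: score every slot.
function searchFlatSketch(vectors: number[][], query: number[], k: number): Hit[] {
  const hits: Hit[] = [];
  for (let slot = 0; slot < vectors.length; slot++) {
    hits.push({ slot, score: dot(vectors[slot], query) });
  }
  return topK(hits, k);
}

// IVF: score only the slots listed under the nprobe best-matching centroids.
function searchIvfSketch(
  vectors: number[][],
  centroids: number[][],
  postings: number[][],
  query: number[],
  k: number,
  nprobe: number
): Hit[] {
  // Reuse Hit for cell ranking: here `slot` temporarily holds the cell index.
  const cells = topK(
    centroids.map((c, cell) => ({ slot: cell, score: dot(c, query) })),
    nprobe
  );
  const hits: Hit[] = [];
  for (let i = 0; i < cells.length; i++) {
    const list = postings[cells[i].slot];
    for (let j = 0; j < list.length; j++) {
      const slot = list[j];
      hits.push({ slot, score: dot(vectors[slot], query) });
    }
  }
  return topK(hits, k);
}
```

With `nprobe` equal to the number of cells, every slot is visited and the sketch returns the same hits as `searchFlatSketch`; smaller values trade recall for fewer scored vectors.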
Changes:
- Introduces seeded k-means++/Lloyd training (`core/kmeans`) and an IVF coarse quantizer with posting-list bookkeeping (`index/coarse`), integrated into `TurboQuantIndex` search/add/remove/serialize flows.
- Adds `searchSlots` (subset scan) sharing the same validation path as `searchFlat`, enabling probed-slot scanning with flat-scan parity when `nprobe = nlist`.
- Bumps serialization to format v2 with an always-present IVF section flag + IVF state; updates tests, docs, benchmarks, and package version/scripts.
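The posting-list bookkeeping mentioned above has to follow a storage layer that fills a removed slot with the last slot. One way to keep that O(1) is a pair of slot-to-list and slot-to-position arrays, sketched below. `PostingSketch` and its members are hypothetical, and this version makes no attempt to preserve any ordering inside a list, so it should be read as an illustration of the patching idea only.

```typescript
class PostingSketch {
  readonly lists: number[][] = [];
  readonly listForSlot: number[] = []; // slot -> cell index
  readonly posForSlot: number[] = []; // slot -> position inside that cell's list

  constructor(nlist: number) {
    for (let i = 0; i < nlist; i++) this.lists.push([]);
  }

  get size(): number {
    return this.listForSlot.length;
  }

  // Append the next dense slot to the given cell and return the slot number.
  add(list: number): number {
    const slot = this.listForSlot.length;
    this.listForSlot.push(list);
    this.posForSlot.push(this.lists[list].length);
    this.lists[list].push(slot);
    return slot;
  }

  // Remove `slot`, mirroring storage that moves the last slot into the hole.
  remove(slot: number): void {
    const last = this.listForSlot.length - 1;
    if (slot < 0 || slot > last) throw new RangeError("slot out of range");
    // Step 1: unlink `slot` from its list by swapping in that list's tail entry.
    const list = this.lists[this.listForSlot[slot]];
    const pos = this.posForSlot[slot];
    const tail = list.pop() as number;
    if (tail !== slot) {
      list[pos] = tail;
      this.posForSlot[tail] = pos;
    }
    // Step 2: the vector stored at `last` now lives at `slot`; relabel its entry.
    if (slot !== last) {
      const movedList = this.listForSlot[last];
      const movedPos = this.posForSlot[last];
      this.lists[movedList][movedPos] = slot;
      this.listForSlot[slot] = movedList;
      this.posForSlot[slot] = movedPos;
    }
    this.listForSlot.pop();
    this.posForSlot.pop();
  }
}
```

The invariant worth fuzzing is that `lists[listForSlot[s]][posForSlot[s]] === s` holds for every live slot `s` after any sequence of adds and removes.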
Reviewed changes
Copilot reviewed 28 out of 29 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/io/serialize.ts | Serialization format v2: adds IVF section (flag + nlist/nprobe + centroids + listForSlot) and BAD_IVF validation. |
| src/io/serialize.test.ts | Tests for v2 IVF round-trip and crafted-buffer corruption branches; adjusts offsets for new IVF flag byte. |
| src/index/turboquant-index.ts | Adds IVF option types/validation, coarse-quantizer training/freezing, IVF search path (probe + searchSlots), payload/bytes round-trip. |
| src/index/turboquant-index.test.ts | End-to-end IVF correctness/parity tests: training semantics, nprobe=nlist oracle, recall sanity, remove parity, mask behavior, ser/de. |
| src/index/id-map-index.ts | Plumbs IVF training + per-query nprobe through the id-mapped wrapper; exposes ivfActive. |
| src/index/id-map-index.test.ts | Tests IVF passthrough: training via addWithIds, remove parity, filter composition, ser/de preserves IVF + ids. |
| src/index/coarse.ts | New CoarseQuantizer: trains centroids, maintains postings with O(1) swap-remove bookkeeping, probes nearest cells. |
| src/index/coarse.test.ts | Determinism/invariants/fuzz tests for coarse quantizer; probe behavior; fromState round-trip. |
| src/index.ts | Exports IvfOptions type from the public entrypoint. |
| src/ergonomic/types.ts | Adds ivf config and per-search nprobe to the ergonomic Collection API surface. |
| src/ergonomic/collection.ts | Passes IVF config into IdMapIndex; forwards per-query nprobe into search options. |
| src/ergonomic/collection.test.ts | Tests Collection IVF config passthrough and composition with filters + deletes. |
| src/core/search.ts | Refactors shared validation/query prep into prepareScan; adds searchSlots and INVALID_SLOT. |
| src/core/search.test.ts | Adds oracle tests: searchSlots(allSlots) ≡ searchFlat, subset-only behavior, mask handling, validation parity. |
| src/core/kmeans.ts | New deterministic seeded k-means++ + Lloyd implementation with spherical mode and empty-cluster repair. |
| src/core/kmeans.test.ts | Tests validation, determinism, clustering behavior, spherical norms, empty-cluster repair, duplicates. |
| README.md | Updates scope/quickstart/roadmap to document IVF as shipped and provide usage example. |
| package.json | Bumps version to 0.0.3; adds bench:ivf script. |
| package-lock.json | Lockfile version bump to 0.0.3. |
| docs/worklog/DECISIONS.md | Adds ADR D-016 documenting IVF design, training/freeze semantics, v2 serialization decision. |
| docs/serialization.md | Updates spec to v2 body layout including calibration + IVF sections and validation codes. |
| docs/roadmap.md | Moves IVF to shipped; planned section now empty. |
| docs/guide.md | Adds IVF guide section and updates error-code tables for new IVF/search errors. |
| docs/benchmarks.md | Adds IVF benchmark results and interpretation (recall/QPS tradeoff). |
| docs/architecture.md | Updates module map and scope statement to include IVF coarse quantizer. |
| docs/api-reference.md | Documents ivf options, ivfActive, and per-query nprobe in public APIs. |
| benchmarks/results/ivf-d768.json | Commits IVF benchmark results JSON for the documented run. |
| benchmarks/ivf.ts | New deterministic IVF benchmark harness and results writer. |
| .agents/plans/2026-06-10-ivf-wave.md | Implementation plan artifact for the IVF wave. |
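The serialization rows above mention hardened validation against crafted buffers. The general pattern is to check a declared length against the bytes actually available before allocating anything, sketched here under a made-up layout (a little-endian u32 `nlist` followed by `nlist * dim` float32 values). The layout, `readCentroidsSketch`, and the error messages are assumptions for illustration and do not describe the real v2 format.

```typescript
// Hypothetical layout, NOT the library's real v2 format:
//   u32 nlist (little-endian), then nlist * dim float32 centroid values.
function readCentroidsSketch(buf: ArrayBuffer, offset: number, dim: number): Float32Array {
  if (dim < 1) throw new Error("BAD_IVF: dim must be positive");
  if (offset < 0 || offset + 4 > buf.byteLength) {
    throw new Error("BAD_IVF: truncated nlist field");
  }
  const view = new DataView(buf);
  const nlist = view.getUint32(offset, true);
  const count = nlist * dim;
  const end = offset + 4 + count * 4;
  // Reject before allocating: a crafted nlist must not drive a huge allocation.
  if (nlist === 0 || end > buf.byteLength) {
    throw new Error("BAD_IVF: centroid block exceeds buffer");
  }
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    const v = view.getFloat32(offset + 4 + i * 4, true);
    if (!isFinite(v)) throw new Error("BAD_IVF: non-finite centroid value");
    out[i] = v;
  }
  return out;
}
```

A buffer whose header claims billions of centroids fails the bounds check immediately, so the reader never attempts the allocation.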
… fitting
Addresses the two Copilot review findings on #2:
- Extract the query validation (length / finiteness / non-zero) from the shared scan preamble into an exported validateQuery, and run it in the IVF search branch BEFORE centroid probing: a malformed query now throws the same typed SearchError as the flat path without ever reaching the probe arithmetic.
- TurboQuantIndex.add() now validates the batch atomically (validateVectorBatch, as IdMapIndex/Collection already do) whenever a calibration/IVF training decision is pending: a non-finite or zero row can no longer poison the about-to-be-frozen calibration fit or k-means centroids, and a failed first batch leaves the index completely unchanged (nothing appended, decision not frozen; a later clean batch still trains).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
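The atomic-validation behavior described in this commit amounts to validate first, mutate second. A minimal sketch under stated assumptions: `IndexSketch`, `validateBatchSketch`, and the three-valued `state` field are hypothetical stand-ins, the training step is reduced to a state flip, and treating an empty batch as leaving the decision pending is a choice of this sketch, not something the PR states.

```typescript
type IvfState = "pending" | "ivf" | "flat";

// Throws on the first bad row and never mutates anything.
function validateBatchSketch(batch: number[][], dim: number): void {
  for (let r = 0; r < batch.length; r++) {
    const row = batch[r];
    if (row.length !== dim) throw new Error("row " + r + ": expected length " + dim);
    let normSq = 0;
    for (let i = 0; i < dim; i++) {
      if (!isFinite(row[i])) throw new Error("row " + r + ": non-finite value");
      normSq += row[i] * row[i];
    }
    if (normSq === 0) throw new Error("row " + r + ": zero vector");
  }
}

class IndexSketch {
  readonly rows: number[][] = [];
  readonly dim: number;
  readonly nlist: number;
  state: IvfState = "pending";

  constructor(dim: number, nlist: number) {
    this.dim = dim;
    this.nlist = nlist;
  }

  add(batch: number[][]): void {
    // Validate the whole batch up front, so a bad row cannot leave a partial
    // append or a frozen training decision behind.
    validateBatchSketch(batch, this.dim);
    if (this.state === "pending" && batch.length > 0) {
      // The first accepted batch decides: train IVF if it has at least nlist rows, else stay flat.
      this.state = batch.length >= this.nlist ? "ivf" : "flat";
    }
    for (let r = 0; r < batch.length; r++) this.rows.push(batch[r].slice());
  }
}
```

A batch containing a `NaN` row throws before any state changes, so a later clean batch of at least `nlist` rows still triggers training.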
Both review findings addressed in d6a64ee:
411 tests green, coverage gate unchanged.
…e by construction)
The probed scan visits slots in posting-list order and the top-k heap keeps tied candidates by visit order, so the "nprobe = nlist == flat scan" oracle held by tie-dynamics luck rather than by construction (cross-cell exact-score ties could in principle diverge at the k boundary). Special-case nprobe === nlist to searchFlat: canonical slot order, identical validation, and it skips a pointless full centroid probe. Adds a duplicate-vector boundary-tie regression test guarding the routing.
Addresses the third Copilot finding on #2.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Third finding addressed in the latest commit:
One honest note on severity: identical vectors always assign to the same cell (deterministic nearest-centroid), and within a cell posting order preserves ascending slots, so the realistic duplicate-vector case did not actually diverge (verified by running the new boundary-tie test against the pre-fix code; it passed there too). The divergence requires exact f32 score ties across different cells, which is essentially impossible to construct through the rotation+quantization pipeline.
Still worth fixing exactly as suggested: the guarantee should not depend on that reasoning. The new duplicate-vector regression test guards the routing.
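To make the tie concern concrete, the smallest case is k = 1 with two candidates of equal score: a selector that keeps the first candidate it sees returns whichever slot was visited first. The sketch below contrasts that with a canonical (score, then lowest slot) rule. It only illustrates why visit order matters; the fix in the PR is the routing of `nprobe === nlist` to `searchFlat`, not a change of comparator. `Hit`, `bestByVisitOrder`, and `bestCanonical` are hypothetical names.

```typescript
type Hit = { slot: number; score: number };

// Visit-order tie-breaking: a later candidate wins only if strictly better.
function bestByVisitOrder(hits: Hit[]): Hit {
  if (hits.length === 0) throw new RangeError("hits must be non-empty");
  let best = hits[0];
  for (let i = 1; i < hits.length; i++) {
    if (hits[i].score > best.score) best = hits[i];
  }
  return best;
}

// Canonical tie-breaking: equal scores resolve to the lowest slot,
// so the result does not depend on the order of `hits`.
function bestCanonical(hits: Hit[]): Hit {
  if (hits.length === 0) throw new RangeError("hits must be non-empty");
  let best = hits[0];
  for (let i = 1; i < hits.length; i++) {
    const h = hits[i];
    if (h.score > best.score || (h.score === best.score && h.slot < best.slot)) best = h;
  }
  return best;
}
```

Feeding the same two tied hits in slot order and in reversed order gives different answers from `bestByVisitOrder` and the same answer from `bestCanonical`.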
Summary
Ships the final roadmap wave: an opt-in IVF coarse quantizer (`ivf: { nlist, nprobe? }`) on `TurboQuantIndex`, with full parity through `IdMapIndex` and `Collection` (config + per-query `nprobe` + full remove support + serialization), and bumps the version to 0.0.3 (merging this PR auto-publishes).
What's inside
- `core/kmeans`: seeded k-means++ init + Lloyd iterations (Lloyd 1982; Arthur & Vassilvitskii 2007), spherical mode for cosine/dot, plain L2 for euclidean, deterministic empty-cluster repair. Bit-reproducible per (data, seed).
- `core/search.searchSlots`: probed-subset scan sharing one `prepareScan` validation path with `searchFlat`, so both throw identical typed errors; `nprobe = nlist` reproduces the flat scan bit-for-bit (the IVF analog of the WASM ≡ scalar oracle, asserted in tests).
- `index/coarse.CoarseQuantizer`: centroids + posting lists kept in lockstep with the index's swap-remove storage via slot→list/slot→position arrays (O(1) removal patching; invariant-fuzzed).
- Training mirrors calibration: fit-and-freeze from the first batch of ≥ `nlist` vectors; a smaller first batch freezes the index flat forever. `ivfActive` getter on all layers.
- Serialization format v2: `nlist`/`nprobe`/centroids/`listForSlot` (postings rebuilt on load); hardened `BAD_IVF` validation, bounds-checked before any allocation; v2-only readers per D-010 (ADR D-016 added).
- `bench:ivf`: 20k × 768-d clustered, 4-bit; 11.4× QPS at the flat scan's recall (`nprobe = nlist/16`), 22.8× at `nprobe = 1`; results JSON committed.
Verification
🤖 Generated with Claude Code