Filter safety#250

Open
shaleenji wants to merge 28 commits into master from filter_safety

@shaleenji
Collaborator

Pull Request

Summary

Splits the work from filter_pass into the portion that is byte-compatible with filter indexes built by master and ships it as filter_safety. The portion that changes the on-disk bucket layout, the numeric sortable-key domain, and the upsert semantics — and therefore requires a reindex — is deliberately deferred to a follow-up PR ("Part 2"). The split is documented in docs/filter_bucket_format_followup.md.

Net effect on an existing deployment: drop in, restart, queries continue to return the same answers, plus the new $gt / $gte / $lt / $lte operators, filter input validation, defensive bitmap deserialization, and a batch of perf and code-hygiene work. No rebuild required.

Headline items in this PR:

  • New numeric query operators: $gt, $gte, $lt, $lte.
  • Filter parameter validation (server side) + reject : inside filter keys / values.
  • Safe roaring-bitmap deserialization (readSafe + internal_validate + payload-size check). Bitmap byte format unchanged — valid master-built bitmaps still parse.
  • Perf: zero-copy reads from MDBX, meta fetched only for filter-matching vectors, batched / addMany numeric inserts, bounded MDBX write transactions.
  • Refactors: OperationResult return type plumbed through filter call sites; unified add_filters_from_json.
  • Build: macOS-friendly clang detection via xcrun.
  • Tests: new ndd_request_validation_test covering the validation work. 42 pass, 7 skip (3 Part-2 regression alarms guarded by GTEST_SKIP, 4 benchmarks gated on ENDEE_BENCH_DB), 0 fail.
  • Docs: filter.md expanded; new docs/filter_bucket_format_followup.md enumerates exactly what Part 2 must do to remove the Part-1 carry-forwards (Bucket::count field, is_number_integer branch in sortable_from_json, etc.).
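
As a rough illustration of the comparison semantics the new range operators imply, the sketch below parses an operator token and evaluates it against a single numeric value. `RangeOp`, `parse_range_op`, and `matches` are hypothetical names chosen for this example, not identifiers from the filter code, and the real implementation works on sortable keys and bitmaps rather than on plain doubles.

```cpp
#include <optional>
#include <string_view>

// Hypothetical names for illustration only; not the PR's actual API.
enum class RangeOp { Gt, Gte, Lt, Lte };

// Map an operator token from a filter document to a RangeOp.
// Returns std::nullopt for anything that is not a range operator.
inline std::optional<RangeOp> parse_range_op(std::string_view token) {
    if (token == "$gt")  return RangeOp::Gt;
    if (token == "$gte") return RangeOp::Gte;
    if (token == "$lt")  return RangeOp::Lt;
    if (token == "$lte") return RangeOp::Lte;
    return std::nullopt;
}

// True when `value <op> bound` holds.
inline bool matches(RangeOp op, double value, double bound) {
    switch (op) {
        case RangeOp::Gt:  return value >  bound;
        case RangeOp::Gte: return value >= bound;
        case RangeOp::Lt:  return value <  bound;
        case RangeOp::Lte: return value <= bound;
    }
    return false;  // not reached for valid enum values
}
```

With these definitions, `matches(RangeOp::Gte, 5.0, 5.0)` holds while `matches(RangeOp::Gt, 5.0, 5.0)` does not, which is the inclusive/exclusive distinction between `$gte` and `$gt`.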

Type of Change

  • Bug fix (safe bitmap deserialization hardens against truncated / garbage payloads)
  • New feature ($gt / $gte / $lt / $lte operators, filter parameter validation)
  • Breaking change
  • Documentation update (filter.md + new follow-up doc)
  • Refactor / cleanup (OperationResult plumbing, unified add_filters_from_json, batched inserts)
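
The payload-size check mentioned under "Bug fix" can be pictured with a generic bounds-checked reader. The sketch below assumes a made-up record layout (a 4-byte host-endian length prefix followed by the payload) purely to show the idea. It is not the filter index format, and `read_length_prefixed` is a hypothetical helper rather than a function from this PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// ASSUMPTION: a made-up layout (4-byte length prefix + payload) used only
// to illustrate bounds checking. This is NOT the on-disk filter format.
inline std::optional<std::vector<std::uint8_t>>
read_length_prefixed(const std::uint8_t* data, std::size_t size) {
    std::uint32_t declared = 0;
    if (data == nullptr || size < sizeof(declared)) {
        return std::nullopt;  // too short to hold even the length prefix
    }
    // memcpy avoids unaligned reads; the prefix is read in host byte order.
    std::memcpy(&declared, data, sizeof(declared));
    const std::size_t available = size - sizeof(declared);
    if (declared > available) {
        return std::nullopt;  // truncated: header claims more bytes than exist
    }
    const std::uint8_t* begin = data + sizeof(declared);
    return std::vector<std::uint8_t>(begin, begin + declared);
}
```

The point of the pattern is that a truncated or garbage buffer yields an empty optional instead of an out-of-bounds read. The PR itself relies on `readSafe` plus `internal_validate` for the bitmap contents.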

Related Issue

Closes # N/A

Checklist

  • Code compiles and tests pass — 42 pass, 7 skip, 0 fail in ndd_filter_test + ndd_request_validation_test. Skips are intentional (3 Part-2 regression alarms with explanatory GTEST_SKIP messages, 4 benchmarks gated on ENDEE_BENCH_DB).
  • New tests added where applicable — tests/request_validation_test.cpp covers the new validation path.
  • Documentation updated if needed — docs/filter.md expanded; docs/filter_bucket_format_followup.md lists everything deferred to Part 2 and the Part-1 carry-forwards Part 2 must remove.
  • No unintended breaking changes. Part-1 invariants verified at branch tip: Bucket::is_empty() still checks only ids.empty(); Bucket::serialize / deserialize still write/read the count field; Filter::sortable_from_json still branches on is_number_integer() → int_to_sortable; store_vectors_batch does NOT take is_new_to_db. Indexes built on master remain readable byte-for-byte.
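
For readers unfamiliar with sortable keys: one common way to make signed integers compare correctly as unsigned keys is to flip the sign bit. The sketch below shows that technique under the assumption of 64-bit values. The `_sketch` functions are hypothetical, and the real `int_to_sortable` may use a different width or scheme.

```cpp
#include <cstdint>

// ASSUMPTION: a sign-bit-flip encoding shown for illustration; the PR's
// actual int_to_sortable may differ in width or scheme.
constexpr std::uint64_t int_to_sortable_sketch(std::int64_t v) {
    // Flipping the top bit maps INT64_MIN..INT64_MAX onto 0..UINT64_MAX
    // while preserving order under unsigned comparison.
    return static_cast<std::uint64_t>(v) ^ (std::uint64_t{1} << 63);
}

// Inverse mapping: flip the top bit back and reinterpret as signed.
constexpr std::int64_t sortable_to_int_sketch(std::uint64_t s) {
    return static_cast<std::int64_t>(s ^ (std::uint64_t{1} << 63));
}

// Order is preserved across the sign boundary, and the mapping round-trips.
static_assert(int_to_sortable_sketch(-1) < int_to_sortable_sketch(0));
static_assert(int_to_sortable_sketch(0) < int_to_sortable_sketch(1));
static_assert(sortable_to_int_sketch(int_to_sortable_sketch(-42)) == -42);
```

Any change to such an encoding changes the key stored for a given value, which is consistent with the summary's statement that altering the numeric sortable-key domain requires a reindex.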

Cherry-pick of 02acc13 limited to Part-1-compatible tests. Adds
request_validation_test.cpp covering filter parameter validation from
3e33557 and wires it into tests/CMakeLists.txt.
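
A minimal sketch of the ':' rejection rule that the validation work enforces on filter keys and values. `is_valid_filter_token` is a hypothetical helper for illustration; the server-side validation in this PR may apply further rules not shown here.

```cpp
#include <string_view>

// Hypothetical helper; the actual server-side validation may enforce
// additional rules beyond the ':' check shown here.
inline bool is_valid_filter_token(std::string_view token) {
    // A filter key or value is rejected if it contains ':' anywhere.
    return token.find(':') == std::string_view::npos;
}

// Convenience wrapper applying the same rule to a key/value pair.
inline bool is_valid_filter_pair(std::string_view key, std::string_view value) {
    return is_valid_filter_token(key) && is_valid_filter_token(value);
}
```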

The remaining contents of 02acc13 (vector_storage_test.cpp,
numeric_index_stress_test.cpp, tests/repo_filter.py, and the new
TEST_F additions in filter_test.cpp) exercise Part-2 behavior
(bitmap-only bucket state, unified float numeric encoding, upsert
cleanup, deleteFilter meta sync) and are deferred to Part 2.

Records the four filter_pass commits skipped from the Part-1 split
(546430d, b0e8425, e9cca02, 4cb445d), the hpp->cpp refactor (7743296)
deferred to be bundled with the bucket layout change, and the Part-2
test files split out of 02acc13. Documents the Part-1 carry-forwards
(Bucket count field, sortable_from_json int branch) that exist to keep
filter_safety byte-compatible with master-built indexes and that
Part 2 should remove.

Three Hypothesis tests from a46d0b8 (safe filter bitmap
deserialization) assert behavior that only exists after Part 2:

  - Hypothesis2.SaturationCreatesBitmapOnlyEntries — expects
    Bucket::add to route delta-0 inserts past MAX_SIZE into the
    summary bitmap (546430d).
  - Hypothesis4.DeserializeRejectsLegacyCountFormat — expects the
    count-less deserializer to reject the legacy on-disk shape
    (546430d).
  - Hypothesis4.ReadSummaryBitmapRejectsLegacyCountFormat — expects
    read_summary_bitmap to reject the same shape via an alignment
    check; Part 1 intentionally removed that check because the count
    field is still part of the layout.

Each test now calls GTEST_SKIP() with a message pointing at
docs/filter_part2_followups.md. Part 2 must remove these skips when
the underlying fixes land.
@github-actions

github-actions Bot commented May 15, 2026

VectorDB Benchmark - Ready To Run

CI Passed ([lint + unit tests](https://github.com/endee-io/endee/actions/runs/25908084844)) - benchmark options unlocked.

Post one of the commands below. Only members with write access can trigger runs.


Available Modes

| Mode | Command | What runs |
| --- | --- | --- |
| Dense | /correctness_benchmarking dense | HNSW insert throughput · query P50/P95/P99 · recall@10 · concurrent QPS |
| Hybrid | /correctness_benchmarking hybrid | Dense + sparse BM25 fusion · same suite + fusion latency overhead |

Infrastructure

| Server | Role | Instance |
| --- | --- | --- |
| Endee Server | Endee VectorDB, code from this branch | t2.large |
| Benchmark Server | Benchmark runner | t3a.large |

Both servers start on demand and are always terminated after the run — pass or fail.


How Correctness Benchmarking Works

1. Post /correctness_benchmarking <mode>
2. Endee Server Create  →  this branch's code deployed  →  Endee starts in chosen mode
3. Benchmark Server Create  →  benchmark suite transferred
4. Benchmark Server runs correctness benchmarking against Endee Server
5. Results posted back here  →  pass/fail + full metrics table
6. Both servers terminated   →  always, even on failure

After a new push, CI must pass again before this menu reappears.

Move the implementations of CategoryIndex, NumericIndex, Bucket, and
Filter from their respective headers into new translation units. The
headers now expose only types, declarations, and the tiny inline
accessors (sortable_from_float family, Bucket::get_value /
is_full / is_empty). Behavior is unchanged; this is a build-time
refactor.

Define NDD_FILTER_SOURCES once in the root CMakeLists.txt and pull it
into both NDD_CORE_SOURCES (for the main binary) and the
ndd_filter_test target so the implementations are linked in both
places.

Add #include <thread> to settings.hpp. It uses
std::thread::hardware_concurrency() but was relying on a transitive
include from the old filter.hpp; the trimmed filter.hpp no longer pulls
in <thread>, so the test build broke without this fix.
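
The include fix follows the usual rule that a header should include what it uses directly. A small sketch of the pattern, with `default_worker_count` as a hypothetical helper (the actual contents of settings.hpp are not shown in this PR text):

```cpp
#include <thread>  // included directly; do not rely on transitive includes

// Hypothetical helper mirroring the kind of use described for settings.hpp.
inline unsigned default_worker_count() {
    // hardware_concurrency() may return 0 when the value is not computable,
    // so fall back to a single worker in that case.
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}
```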

Verified: ndd_filter_test (42 pass, 7 skip, 0 fail) and
ndd_request_validation_test (6 pass, 0 fail) match the pre-split
results; ndd-avx2 builds clean.
@shaleenji
Collaborator Author

================================================================================
Commit : 15192bd
Short : 15192bd
Subject: Fix FP16 NEON build on AArch64 CPUs without FP16FML support (#168)
When : 2026-05-15 09:38:04 +0000

LABEL FILTER RESULTS
FilterBoost(%) LabelPct Concurrency Test QPS P99(s) P95(s) Recall
0 0.5 16 test_1 1385.2552 0.0085 0.0078 0.9778
0 0.5 16 test_2 1441.1741 0.0083 0.0075 0.9778
0 0.5 16 test_3 1411.8426 0.008 0.0075 0.9778
0 0.2 16 test_1 728.8524 0.0143 0.0132 0.978
0 0.2 16 test_2 738.2412 0.0145 0.0133 0.978
0 0.2 16 test_3 729.5199 0.0146 0.0133 0.978
0 0.1 16 test_1 459.5179 0.021 0.0198 0.9793
0 0.1 16 test_2 464.8781 0.021 0.0198 0.9793
0 0.1 16 test_3 459.6361 0.0205 0.0193 0.9793
0 0.05 16 test_1 269.6389 0.0345 0.0317 0.9785
0 0.05 16 test_2 273.076 0.033 0.0315 0.9785
0 0.05 16 test_3 271.1142 0.0346 0.0318 0.9785
0 0.02 16 test_1 160.6592 0.0545 0.0506 0.9762
0 0.02 16 test_2 159.4784 0.0551 0.0512 0.9762
0 0.02 16 test_3 161.0747 0.0525 0.0498 0.9762
0 0.01 16 test_1 119.5435 0.0725 0.069 0.9659
0 0.01 16 test_2 117.2639 0.0729 0.0693 0.9659
0 0.01 16 test_3 118.2296 0.0713 0.0681 0.9659
0 0.002 16 test_1 2263.0854 0.005 0.0045 0.9999
0 0.002 16 test_2 2282.6694 0.005 0.0044 0.9999
0 0.002 16 test_3 2221.3103 0.0059 0.0048 0.9999
0 0.001 16 test_1 2908.372 0.0029 0.0029 0.9998
0 0.001 16 test_2 2918.447 0.0037 0.0031 0.9998
0 0.001 16 test_3 2913.0584 0.0032 0.003 0.9998

INT FILTER RESULTS
FilterBoost(%) IntFilterRate Concurrency Test QPS P99(s) P95(s) Recall
0 0.99 16 test_1 118.8615 0.0694 0.0659 0.9652
0 0.99 16 test_2 120.7722 0.0706 0.0671 0.9652
0 0.99 16 test_3 124.4392 0.0712 0.0677 0.9652
0 0.80 16 test_1 538.3185 0.019 0.0178 0.9793
0 0.80 16 test_2 534.2095 0.0201 0.0178 0.9793
0 0.80 16 test_3 524.4287 0.0203 0.0179 0.9793
0 0.50 16 test_1 703.7076 0.0145 0.0134 0.9783
0 0.50 16 test_2 702.1918 0.0151 0.0138 0.9783
0 0.50 16 test_3 715.0694 0.0173 0.015 0.9783
0 0.01 16 test_1 742.828 0.0158 0.0133 0.974
0 0.01 16 test_2 752.7353 0.0155 0.0136 0.974
0 0.01 16 test_3 750.9379 0.0142 0.0133 0.974

@shaleenji
Collaborator Author

================================================================================
Commit : d1a5522
Short : d1a5522
Subject: filter: split headers into hpp + cpp
When : 2026-05-15 10:39:01 +0000

LABEL FILTER RESULTS
FilterBoost(%) LabelPct Concurrency Test QPS P99(s) P95(s) Recall
0 0.5 16 test_1 1382.0941 0.0084 0.0075 0.9787
0 0.5 16 test_2 1383.0399 0.0081 0.0073 0.9787
0 0.5 16 test_3 1430.8326 0.0083 0.0076 0.9787
0 0.2 16 test_1 750.5759 0.0142 0.013 0.9773
0 0.2 16 test_2 747.8254 0.0147 0.0135 0.9773
0 0.2 16 test_3 741.3709 0.0137 0.0126 0.9773
0 0.1 16 test_1 454.8632 0.0203 0.0193 0.9801
0 0.1 16 test_2 463.8765 0.0206 0.0191 0.9801
0 0.1 16 test_3 461.1711 0.0204 0.0189 0.9801
0 0.05 16 test_1 276.1424 0.0327 0.0308 0.9782
0 0.05 16 test_2 269.995 0.0336 0.0314 0.9782
0 0.05 16 test_3 271.265 0.0326 0.0305 0.9782
0 0.02 16 test_1 162.0203 0.0526 0.0499 0.9739
0 0.02 16 test_2 159.6628 0.0534 0.0498 0.9739
0 0.02 16 test_3 163.1439 0.0552 0.0508 0.9739
0 0.01 16 test_1 117.433 0.0707 0.0677 0.9645
0 0.01 16 test_2 118.8786 0.0708 0.0674 0.9645
0 0.01 16 test_3 118.2547 0.0731 0.07 0.9645
0 0.002 16 test_1 2915.8864 0.0045 0.0036 0.9999
0 0.002 16 test_2 2928.9737 0.0038 0.0034 0.9999
0 0.002 16 test_3 2934.104 0.0033 0.0032 0.9999
0 0.001 16 test_1 2864.5193 0.0029 0.0027 0.9998
0 0.001 16 test_2 2907.3291 0.0027 0.0024 0.9998
0 0.001 16 test_3 2909.2607 0.0025 0.0023 0.9998

INT FILTER RESULTS
FilterBoost(%) IntFilterRate Concurrency Test QPS P99(s) P95(s) Recall
0 0.99 16 test_1 118.0655 0.0703 0.067 0.9651
0 0.99 16 test_2 118.9901 0.0707 0.0668 0.9651
0 0.99 16 test_3 120.5183 0.0707 0.0678 0.9651
0 0.80 16 test_1 572.2425 0.0176 0.0161 0.9802
0 0.80 16 test_2 578.8009 0.0174 0.0163 0.9802
0 0.80 16 test_3 582.8413 0.0171 0.0157 0.9802
0 0.50 16 test_1 798.4126 0.0137 0.0121 0.9784
0 0.50 16 test_2 783.5072 0.0151 0.0126 0.9784
0 0.50 16 test_3 811.5474 0.0128 0.012 0.9784
0 0.01 16 test_1 832.6413 0.0132 0.0118 0.9746
0 0.01 16 test_2 828.8141 0.0132 0.0118 0.9746
0 0.01 16 test_3 828.8333 0.0126 0.0118 0.9746
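
One way to compare the two result sets is to average the three runs per configuration and compute the relative change. The helpers below are a sketch of that arithmetic only. `mean` and `percent_change` are hypothetical names, and no statistical significance is implied by three runs.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of samples.
inline double mean(const std::vector<double>& xs) {
    assert(!xs.empty());
    return std::accumulate(xs.begin(), xs.end(), 0.0) /
           static_cast<double>(xs.size());
}

// Relative change from `before` to `after`, in percent.
inline double percent_change(double before, double after) {
    return (after - before) / before * 100.0;
}
```

For example, the IntFilterRate 0.50 rows can be compared by passing the three QPS values from 15192bd (703.7076, 702.1918, 715.0694) and the three from d1a5522 (798.4126, 783.5072, 811.5474) to `mean`, then feeding both means to `percent_change`.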

@shaleenji
Collaborator Author

Server: 8 CPU, 32 GB RAM
Client: 4 CPU, 16 GB RAM (client concurrency for vectordbbench: 16)

[Two screenshots attached: 2026-05-15 at 16:28:21 and 16:28:29]
