test(writer): mutation-cover VortexWriter global-dict + factory paths (74% → 89%) #99
Merged
… (74% -> 89%)

Add VortexWriter to the writer pitest scope and close the bulk of its gaps, chiefly the global-dictionary feature, which was barely tested end to end.

- VortexWriterDictDecisionTest unit-tests the pure dict-decision helpers directly (made package-private): isDictCandidate / isUtf8DictCandidate (cardinality + 50%-ratio gates, empty, single-value, F16/F32 exclusion), codePTypeForSize (U8/U16/U32 width boundaries), primitiveArrayLen, readPrimitiveElement. These choices only affect encoding, not the decoded values, so a round-trip cannot pin them.
- GlobalDictPrimitiveTest round-trips low-cardinality I32/I64/F64 columns through the global dict build (the integer/float counterpart to the existing GlobalDictUtf8Test), plus high-cardinality and single-value fallbacks and the globalDict-disabled opt-out.
- VortexWriterTest covers the create(WriteRegistry) factory overload.

Finding: mutation coverage surfaced a real write/read incompatibility. The writer's global dict admits I8/I16 columns, but the reader's lazy dict decode only supports I32/I64/F64 and throws "unsupported ptype for lazy dict: I16" on read-back. Pinned by lowCardinality_i16_globalDict_readerRejects; the fix belongs to the reader's dict decode.

VortexWriter 74% -> 89% killed (strength 92%). Remaining survivors are structural (segment alignment math, layout/flatbuffer build), impractical (U32 codes need 65k+ distinct), or unreachable (F32 dict case, the I8 reader gap above).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
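To illustrate the kind of width boundary that a test of codePTypeForSize would pin, here is a minimal sketch. It is not the actual VortexWriter code: the class name, the CodePType enum, and the exact cut-offs (256 and 65,536 dictionary entries) are assumptions, chosen to be consistent with the note that U32 codes need 65k+ distinct values.

```java
/** Hypothetical sketch of a code-width chooser; not the real VortexWriter helper. */
final class CodeWidthSketch {

    /** Unsigned integer types that could hold dictionary codes (assumed names). */
    enum CodePType { U8, U16, U32 }

    private CodeWidthSketch() {}

    /**
     * Picks the narrowest unsigned type whose value range can index every
     * entry of a dictionary with {@code dictSize} distinct values.
     * The boundaries below are an assumption made for illustration.
     */
    static CodePType codePTypeForSize(int dictSize) {
        if (dictSize < 0) {
            throw new IllegalArgumentException("dictSize must be >= 0: " + dictSize);
        }
        if (dictSize <= (1 << 8)) {
            return CodePType.U8;   // codes 0..255 cover up to 256 entries
        }
        if (dictSize <= (1 << 16)) {
            return CodePType.U16;  // codes 0..65535 cover up to 65,536 entries
        }
        return CodePType.U32;      // anything larger needs 32-bit codes
    }
}
```

A decoded column looks the same whichever code width was chosen, which is why such a helper has to be asserted on directly at sizes like 256, 257, 65,536 and 65,537 rather than through a round-trip.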
dfa1 added a commit that referenced this pull request Jun 20, 2026
…rrow-int dict

Mutation coverage (PR #99) surfaced that the writer's global dict admitted I8/U8/I16/U16 columns, but the reader's lazy dict decode only supports I32/I64/F32/F64. A low-cardinality I16 column wrote a vortex.dict the reader rejected with "unsupported ptype for lazy dict: I16", i.e. an unreadable file.

Cross-checked against the Rust reference: the JNI writer does NOT dict-encode a low-cardinality I16 column, and the Java reader reads its output back exactly (new RustWritesJavaReadsIntegrationTest#jniWriter_javaReader_lowCardinalityI16). So the Java reader is already Rust-conformant; the Java writer was the outlier.

Fix: exclude I8/U8/I16/U16 from isDictCandidate (alongside F16/F32). Narrow-int dict gives no real benefit (a U8/U16 code is no smaller than the value), matches Rust, and only ever produced files the reader couldn't read, so excluding it is not a regression. A low-card I16 column now encodes via the cascade and round-trips (GlobalDictPrimitiveTest#lowCardinality_i16_notDicted_roundTrips).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
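The shape of the fix can be sketched as a candidate gate that rejects excluded ptypes before it looks at cardinality. This is a hedged illustration, not the real isDictCandidate: the signature, the PType enum, the maxCardinality parameter, the reading of the 50% ratio as "distinct values are at most half the rows", and the rejection of empty and single-value columns are all assumptions made for the example.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of a global-dict candidate gate; not the real VortexWriter code. */
final class DictCandidateSketch {

    /** Primitive types named in the commit message (assumed enum, not the library's). */
    enum PType { I8, U8, I16, U16, I32, I64, F16, F32, F64 }

    /** Types never dict-encoded: narrow ints (the fix) alongside F16/F32. */
    static final Set<PType> EXCLUDED = EnumSet.of(
            PType.I8, PType.U8, PType.I16, PType.U16, PType.F16, PType.F32);

    private DictCandidateSketch() {}

    /**
     * Returns true when a column looks worth dictionary-encoding.
     * maxCardinality is a hypothetical cap on distinct values.
     */
    static boolean isDictCandidate(PType ptype, long rowCount,
                                   long distinctCount, long maxCardinality) {
        if (EXCLUDED.contains(ptype)) {
            return false;  // narrow ints and small floats go through the normal cascade
        }
        if (rowCount == 0 || distinctCount <= 1) {
            return false;  // assumed: empty and single-value columns fall back
        }
        if (distinctCount > maxCardinality) {
            return false;  // cardinality gate
        }
        return distinctCount * 2 <= rowCount;  // assumed form of the 50%-ratio gate
    }
}
```

Under these assumptions, `isDictCandidate(PType.I16, 1000, 4, 1024)` is rejected by the exclusion set while the same call with `PType.I32` passes, which is the behavioural difference the narrow-int fix describes.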
Adds VortexWriter to the writer pitest scope and closes the bulk of its gaps, chiefly the global-dictionary feature, which was barely tested end to end (74% → 89% killed, strength 92%).

What

- VortexWriterDictDecisionTest unit-tests the pure dict-decision helpers directly (made package-private): isDictCandidate / isUtf8DictCandidate (cardinality + 50%-ratio gates, empty, single-value, F16/F32 exclusion), codePTypeForSize (U8/U16/U32 width boundaries), primitiveArrayLen, readPrimitiveElement. These choices only affect encoding, not the decoded values, so a round-trip can't pin them.
- GlobalDictPrimitiveTest round-trips low-cardinality I32/I64/F64 columns through the global dict build (the integer/float counterpart to the existing GlobalDictUtf8Test), plus high-cardinality and single-value fallbacks and the globalDict-disabled opt-out.
- VortexWriterTest covers the create(WriteRegistry) factory overload.

Finding (real bug)

Mutation coverage surfaced a write/read incompatibility: the writer's global dict admits I8/I16 columns, but the reader's lazy dict decode only supports I32/I64/F64, so read-back throws "unsupported ptype for lazy dict: I16". Pinned by lowCardinality_i16_globalDict_readerRejects. The fix belongs to the reader's dict decode; flagging it here.

Remaining survivors (29)
Honest accounting — not chased in this PR:
Result
VortexWriter 74% → 89%; the global-dict decision layer is now fully and directly pinned.
Verify
🤖 Generated with Claude Code