test(writer): mutation-cover VortexWriter global-dict + factory paths (74% → 89%)#99

Merged: dfa1 merged 1 commit into main from pitest-vortexwriter on Jun 20, 2026
dfa1 commented on Jun 20, 2026

Owner

Adds VortexWriter to the writer pitest scope and closes the bulk of its gaps — chiefly the global-dictionary feature, which was barely tested end to end (74% → 89% killed, strength 92%).

What

  • VortexWriterDictDecisionTest — unit-tests the pure dict-decision helpers directly (made package-private): isDictCandidate / isUtf8DictCandidate (cardinality + 50%-ratio gates, empty, single-value, F16/F32 exclusion), codePTypeForSize (U8/U16/U32 width boundaries), primitiveArrayLen, readPrimitiveElement. These choices only affect encoding, not the decoded values, so a round-trip can't pin them.
  • GlobalDictPrimitiveTest — round-trips low-cardinality I32/I64/F64 columns through the global dict build (the integer/float counterpart to the existing GlobalDictUtf8Test), plus high-cardinality and single-value fallbacks and the globalDict-disabled opt-out.
  • VortexWriterTest — covers the create(WriteRegistry) factory overload.
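
The decision helpers listed above are small pure functions, which is why they can be pinned directly at their boundaries. The following is a minimal sketch of what such helpers might look like, not the project's actual code: the class name, the enum members, the inclusive form of the 50% gate, and the exact 256 / 65,536 width cut-offs are all assumptions made for illustration.

```java
import java.util.Set;

/** Hypothetical sketch of dict-decision helpers; names and thresholds are assumptions. */
final class DictDecisionSketch {

    /** Assumed subset of primitive types, for illustration only. */
    enum PType { I32, I64, F16, F32, F64 }

    /** Code widths named in the PR text (U8/U16/U32). */
    enum CodeWidth { U8, U16, U32 }

    // The F16/F32 exclusion is stated in the PR; how it is implemented is assumed.
    private static final Set<PType> EXCLUDED = Set.of(PType.F16, PType.F32);

    /** Smallest unsigned code type able to index a dictionary of the given size. */
    static CodeWidth codeWidthForSize(int dictSize) {
        if (dictSize <= (1 << 8)) return CodeWidth.U8;    // codes 0..255
        if (dictSize <= (1 << 16)) return CodeWidth.U16;  // codes 0..65535
        return CodeWidth.U32;
    }

    /** Assumed gate: reject excluded types, empty and single-value columns, and high cardinality. */
    static boolean isDictCandidate(PType type, long distinct, long rowCount) {
        if (EXCLUDED.contains(type)) return false;
        if (rowCount == 0 || distinct <= 1) return false; // empty / single-value fall back
        return distinct * 2 <= rowCount;                  // assumed form of the 50%-ratio gate
    }

    public static void main(String[] args) {
        System.out.println(codeWidthForSize(256));
        System.out.println(codeWidthForSize(257));
        System.out.println(isDictCandidate(PType.I32, 10, 1_000));
    }
}
```

Boundary assertions on either side of each threshold (256 vs. 257 entries, exactly half vs. just over half distinct) are the kind of check that kills conditional-boundary mutants, which a round-trip cannot do because both encodings decode to the same values.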

Finding (real bug)

Mutation coverage surfaced a write/read incompatibility: the writer's global dict admits I8/I16 columns, but the reader's lazy dict decode only supports I32/I64/F64 — read-back throws "unsupported ptype for lazy dict: I16". Pinned by lowCardinality_i16_globalDict_readerRejects. The fix belongs to the reader's dict decode; flagging it here.

Remaining survivors (29)

Honest accounting — not chased in this PR:

  • Structural (~14): segment 64-byte alignment math, layout / Array-FlatBuffer build, dict-flush internals — killable only with file-structure/size assertions.
  • Impractical (~2): the U32 code path needs a 65k+-distinct dictionary.
  • Unreachable/equivalent (~3): the F32 case in the dict build (F32 is excluded from dict), the I8 build path (reader gap above).
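
To illustrate why the structural survivors need file-structure assertions: padding a segment to a 64-byte boundary changes offsets and file size but never the decoded values, so a mutated alignment expression still round-trips. The sketch below shows the standard round-up-to-a-power-of-two formula and the kind of offset check that would catch such a mutant. The class and method names are hypothetical and are not taken from VortexWriter.

```java
/** Hypothetical sketch of 64-byte segment alignment; not the writer's actual code. */
final class AlignmentSketch {

    /** Alignment named in the PR text. */
    static final long ALIGNMENT = 64;

    /** Rounds offset up to the next multiple of ALIGNMENT (a power of two). */
    static long alignUp(long offset) {
        return (offset + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    }

    /** Number of padding bytes needed before a segment starting at offset. */
    static long padding(long offset) {
        return alignUp(offset) - offset;
    }

    public static void main(String[] args) {
        // A structural assertion checks offsets, not decoded values.
        long[] rawOffsets = {0, 1, 63, 64, 65, 100};
        for (long raw : rawOffsets) {
            long aligned = alignUp(raw);
            if (aligned % ALIGNMENT != 0 || aligned < raw || aligned - raw >= ALIGNMENT) {
                throw new AssertionError("bad alignment for offset " + raw);
            }
        }
        System.out.println("all offsets aligned");
    }
}
```

A test in this style would read the written file's segment table and assert that every segment offset is a multiple of 64 and that the padding never reaches a full 64 bytes.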

Result

VortexWriter 74% → 89%; the global-dict decision layer is now fully and directly pinned.

Verify

./mvnw -pl writer -am -P pitest verify

🤖 Generated with Claude Code

… (74% -> 89%)

Add VortexWriter to the writer pitest scope and close the bulk of its gaps —
chiefly the global-dictionary feature, which was barely tested end to end.

- VortexWriterDictDecisionTest unit-tests the pure dict-decision helpers
  directly (made package-private): isDictCandidate / isUtf8DictCandidate
  (cardinality + 50%-ratio gates, empty, single-value, F16/F32 exclusion),
  codePTypeForSize (U8/U16/U32 width boundaries), primitiveArrayLen,
  readPrimitiveElement. These choices only affect encoding, not the decoded
  values, so a round-trip cannot pin them.
- GlobalDictPrimitiveTest round-trips low-cardinality I32/I64/F64 columns
  through the global dict build (the integer/float counterpart to the existing
  GlobalDictUtf8Test), plus high-cardinality and single-value fallbacks and the
  globalDict-disabled opt-out.
- VortexWriterTest covers the create(WriteRegistry) factory overload.

Finding: mutation coverage surfaced a real write/read incompatibility — the
writer's global dict admits I8/I16 columns, but the reader's lazy dict decode
only supports I32/I64/F64 and throws "unsupported ptype for lazy dict: I16" on
read-back. Pinned by lowCardinality_i16_globalDict_readerRejects; the fix
belongs to the reader's dict decode.

VortexWriter 74% -> 89% killed (strength 92%). Remaining survivors are
structural (segment alignment math, layout/flatbuffer build), impractical
(U32 codes need 65k+ distinct), or unreachable (F32 dict case, the I8 reader
gap above).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

dfa1 merged commit f646ce1 into main on Jun 20, 2026
3 of 6 checks passed
dfa1 deleted the pitest-vortexwriter branch on June 20, 2026 17:11
dfa1 added a commit that referenced this pull request Jun 20, 2026
…rrow-int dict

Mutation coverage (PR #99) surfaced that the writer's global dict admitted
I8/U8/I16/U16 columns, but the reader's lazy dict decode only supports
I32/I64/F32/F64 — a low-cardinality I16 column wrote a vortex.dict the reader
rejected with "unsupported ptype for lazy dict: I16", i.e. an unreadable file.

Cross-checked against the Rust reference: the JNI writer does NOT dict-encode a
low-cardinality I16 column, and the Java reader reads its output back exactly
(new RustWritesJavaReadsIntegrationTest#jniWriter_javaReader_lowCardinalityI16).
So the Java reader is already Rust-conformant; the Java writer was the outlier.

Fix: exclude I8/U8/I16/U16 from isDictCandidate (alongside F16/F32). Narrow-int
dict gives no real benefit (a U8/U16 code is no smaller than the value), matches
Rust, and only ever produced files the reader couldn't read — so excluding it is
not a regression. A low-card I16 column now encodes via the cascade and
round-trips (GlobalDictPrimitiveTest#lowCardinality_i16_notDicted_roundTrips).
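
The type gate described in this fix can be sketched as a simple exclusion set. This is an illustration under stated assumptions, not the committed change: the class name, the method name, and the enum (which lists only the ptypes named in this thread) are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical sketch of the post-fix type gate; not the committed code. */
final class NarrowIntGateSketch {

    /** Only the ptypes named in this thread; the real enum may have more members. */
    enum PType { I8, U8, I16, U16, I32, I64, F16, F32, F64 }

    // Per the commit message: narrow ints join F16/F32 in the exclusion list.
    static final Set<PType> DICT_EXCLUDED = EnumSet.of(
            PType.I8, PType.U8, PType.I16, PType.U16, PType.F16, PType.F32);

    /** True when the type may be considered for the global dict at all. */
    static boolean typeAllowsDict(PType type) {
        return !DICT_EXCLUDED.contains(type);
    }

    public static void main(String[] args) {
        for (PType type : PType.values()) {
            System.out.println(type + " -> " + typeAllowsDict(type));
        }
    }
}
```

With a gate of this shape, a low-cardinality I16 column never reaches the dict build, so the writer can no longer produce a vortex.dict that the reader's lazy decode rejects.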

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>