How vortex-java is tested, layer by layer, and why each layer exists. Counts are a
snapshot of the current main (test executions, i.e. after @ParameterizedTest
expansion) and move with the code; treat them as orders of magnitude, not contracts.
The reader memory-maps and parses untrusted binary input, and the writer must produce files a different implementation (the Rust reference) can read back exactly. Two properties dominate the strategy:
- Correctness is defined by interop, not by us. There is no separate written spec; the Rust implementation is ground truth. Anything that crosses the wire format is pinned by a Java↔Rust round-trip.
- Malformed input must fail safely. Every bad file throws VortexException, never a raw IndexOutOfBoundsException, OutOfMemoryError, or FlatBuffer/Protobuf runtime exception. The bounds/parse paths that enforce this are the most heavily verified code in the project.
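The fail-safe property can be illustrated with a minimal guard. This is a sketch under stated assumptions, not the project's code: `SketchVortexException`, `BoundsSketch`, `checkRange`, and `readIntLE` are hypothetical names, and a `ByteBuffer` stands in for a memory-mapped segment so the example runs on any recent JDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical stand-in for the project's VortexException. */
final class SketchVortexException extends RuntimeException {
    SketchVortexException(String message) {
        super(message);
    }
}

/** Hypothetical bounds guard; the real IoBounds API may look different. */
final class BoundsSketch {
    private BoundsSketch() {}

    /** Rejects any range of `length` bytes at `offset` that does not fit in `size` bytes. */
    static void checkRange(long offset, long length, long size) {
        // Compare against size - length instead of computing offset + length,
        // so a huge untrusted offset cannot overflow past the check.
        if (size < 0 || offset < 0 || length < 0 || offset > size - length) {
            throw new SketchVortexException(
                "range out of bounds: offset=" + offset + " length=" + length + " size=" + size);
        }
    }

    /** Reads a little-endian int, failing with the domain exception on a bad offset. */
    static int readIntLE(ByteBuffer buf, long offset) {
        checkRange(offset, Integer.BYTES, buf.limit());
        // The cast is safe: checkRange proved 0 <= offset <= limit - 4.
        return buf.duplicate().order(ByteOrder.LITTLE_ENDIAN).getInt((int) offset);
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.wrap(new byte[] {1, 0, 0, 0});
        System.out.println(readIntLE(file, 0));
        try {
            readIntLE(file, 2); // only 2 bytes remain, 4 are needed
        } catch (SketchVortexException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The guard runs before the read, so the caller sees the domain exception and never the buffer's own `IndexOutOfBoundsException`.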
The layers below go from fast-and-narrow to slow-and-end-to-end. Most defects should be caught at the lowest layer that can see them.
| Layer | Runner | ~Executions | Scope |
|---|---|---|---|
| Unit | surefire | ~2,690 | One class/behaviour, in-memory, no I/O |
| Property-based | surefire | (subset of unit) | Seeded-random sweeps over encode/decode |
| Integration | failsafe | ~271 | Java↔Rust interop + real files + CLI end-to-end |
| Mutation | PIT (opt-in) | — | Adequacy of tests for bounds/parse classes |
| Benchmarks | JMH (./bench) | — | Performance, not correctness |
Per-module unit counts: core 256, proto-gen 9, reader 780, writer 1,419, cli 154, inspector 34, parquet 24, jdbc 9, csv 7. ~174 test classes total.
The base of the pyramid. JUnit 5 + Mockito (BDDMockito) + AssertJ. Rules: fast, no file
I/O, no network, no sleep; mock or use in-memory MemorySegments. Each test follows the
// Given / // When / // Then layout, and the instance under test is named sut.
What they cover, by module:
- core (256): DType/PType modelling, IoBounds guards, PTypeIO little-endian segment reads/writes, proto record encode/decode.
- reader (780): every EncodingDecoder and Array subtype, the file-structure parsers (Footer, Trailer, PostscriptParser, Layout), and the lazy/chunked/dict array families. Largest suite because decode has the most branches.
- writer (1,419): every EncodingEncoder, the CascadingCompressor selection logic, WriteRegistry, and the extension encoders. Largest module overall.
- cli / inspector / parquet / jdbc / csv: command parsing, TUI rendering, importers.
Each encoding aims to cover the happy path, negative cases (invalid input → VortexException),
and corners (empty, zero, max, boundaries).
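The three kinds of case (happy path, negative, corners) can be sketched in the Given/When/Then shape. To stay runnable without dependencies the example uses plain Java checks rather than JUnit 5 and AssertJ; `ToyForCodec`, `ToyVortexException`, and `ToyForCodecChecks` are invented for illustration and are not project classes.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the project's VortexException. */
final class ToyVortexException extends RuntimeException {
    ToyVortexException(String message) {
        super(message);
    }
}

/** Toy frame-of-reference codec invented for this sketch: [min, v0 - min, v1 - min, ...]. */
final class ToyForCodec {
    long[] encode(long[] values) {
        long min = values.length == 0 ? 0L : Arrays.stream(values).min().getAsLong();
        long[] out = new long[values.length + 1];
        out[0] = min;
        for (int i = 0; i < values.length; i++) {
            out[i + 1] = values[i] - min; // may wrap; decode wraps back, so it stays lossless
        }
        return out;
    }

    long[] decode(long[] encoded) {
        if (encoded.length == 0) {
            throw new ToyVortexException("missing reference value");
        }
        long[] out = new long[encoded.length - 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = encoded[i + 1] + encoded[0];
        }
        return out;
    }
}

final class ToyForCodecChecks {
    static void roundTripsTypicalValues() {
        // Given
        ToyForCodec sut = new ToyForCodec();
        long[] values = {10, 12, 11};
        // When
        long[] result = sut.decode(sut.encode(values));
        // Then
        if (!Arrays.equals(values, result)) throw new AssertionError("happy path");
    }

    static void rejectsMissingReference() {
        // Given
        ToyForCodec sut = new ToyForCodec();
        // When
        RuntimeException thrown = null;
        try {
            sut.decode(new long[0]);
        } catch (RuntimeException e) {
            thrown = e;
        }
        // Then: the domain exception, not a raw index error
        if (!(thrown instanceof ToyVortexException)) throw new AssertionError("negative case");
    }

    static void roundTripsCorners() {
        // Given
        ToyForCodec sut = new ToyForCodec();
        long[][] corners = {{}, {0}, {Long.MIN_VALUE, Long.MAX_VALUE}};
        for (long[] values : corners) {
            // When
            long[] result = sut.decode(sut.encode(values));
            // Then
            if (!Arrays.equals(values, result)) throw new AssertionError("corner case");
        }
    }

    static void runAll() {
        roundTripsTypicalValues();
        rejectsMissingReference();
        roundTripsCorners();
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all checks passed");
    }
}
```

In the project's own suites each method would presumably be a JUnit 5 test and the hand-rolled checks would be AssertJ assertions.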
A subset of the unit layer that replaces hand-picked cases with seeded-random
generators, because example-based tests miss corners. The generators target the
distributions that exercise distinct code paths (constant runs, low cardinality, tight
clusters, monotone, full-range, sparse outliers) and assert lossless round-trips
(decode(encode(x)) == x, bit-exact for floats via ±0-collapse canonicalisation).
Seeds are fixed so any failure reproduces. Current property suites:
- RoundTripPropertyTest: Delta, FrameOfReference, ZigZag, AlpRd (i32/i64/f32/f64).
- BitpackedEncodingEncoderTest: bit-width sweep across all widths.
- CascadingCompressorTest.RoundTripProperty: the full encoder-selection + nesting pipeline, every codec at cascade depth 0–3.
- PcoEncodingEncoderTest / PcoEncodingDecoderTest: Pco mode pickers (delta, IntMult), bin optimiser, and ANS/patch paths over mixed distributions.
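A seeded round-trip property can be sketched as follows. This is an illustration, not the project's generators: `RoundTripPropertySketch` and its methods are hypothetical, the ZigZag transform is the textbook one, and `sameFloatBits` is one possible reading of "±0-collapse canonicalisation" (map -0.0 onto +0.0, then compare raw bits).

```java
import java.util.Arrays;
import java.util.Random;

final class RoundTripPropertySketch {
    private RoundTripPropertySketch() {}

    /** Textbook ZigZag: interleaves negative and non-negative values. */
    static long[] zigZagEncode(long[] values) {
        long[] out = new long[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = (values[i] << 1) ^ (values[i] >> 63);
        }
        return out;
    }

    static long[] zigZagDecode(long[] encoded) {
        long[] out = new long[encoded.length];
        for (int i = 0; i < encoded.length; i++) {
            out[i] = (encoded[i] >>> 1) ^ -(encoded[i] & 1L);
        }
        return out;
    }

    /** One generator per distribution shape, all driven by the same seeded Random. */
    static long[] generate(int shape, Random rng, int n) {
        long[] v = new long[n];
        long constant = rng.nextLong();
        long running = rng.nextInt(1000);
        for (int i = 0; i < n; i++) {
            switch (shape) {
                case 0: v[i] = constant; break;                               // constant run
                case 1: v[i] = rng.nextInt(4) - 2; break;                     // low cardinality
                case 2: running += rng.nextInt(16); v[i] = running; break;    // monotone
                case 3: v[i] = rng.nextLong(); break;                         // full range
                default: v[i] = rng.nextInt(100) == 0 ? rng.nextLong() : 7L;  // sparse outliers
            }
        }
        return v;
    }

    /** Asserts decode(encode(x)) == x for every shape and length; returns the case count. */
    static int checkRoundTrips(long seed) {
        Random rng = new Random(seed); // fixed seed, so a failure reproduces exactly
        int cases = 0;
        for (int shape = 0; shape < 5; shape++) {
            for (int n : new int[] {0, 1, 2, 63, 64, 1024}) {
                long[] input = generate(shape, rng, n);
                long[] output = zigZagDecode(zigZagEncode(input));
                if (!Arrays.equals(input, output)) {
                    throw new AssertionError("seed=" + seed + " shape=" + shape + " n=" + n);
                }
                cases++;
            }
        }
        return cases;
    }

    /** Bit-exact float comparison after collapsing -0.0f onto +0.0f (assumed meaning). */
    static boolean sameFloatBits(float a, float b) {
        float ca = a == 0.0f ? 0.0f : a; // -0.0f == 0.0f is true, so both zeros become +0.0f
        float cb = b == 0.0f ? 0.0f : b;
        return Float.floatToRawIntBits(ca) == Float.floatToRawIntBits(cb);
    }

    public static void main(String[] args) {
        System.out.println("cases checked: " + checkRoundTrips(42L));
    }
}
```

Putting the seed, shape, and length in the failure message means a failing case can be replayed without rerunning the whole sweep.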
The ground truth layer: failsafe (*IntegrationTest), not surefire. These cross the
JNI boundary to the Rust reference and read/write real files.
- RustWritesJavaReadsIntegrationTest: Rust writes, Java reads; verifies our decoders against the canonical writer.
- JavaWritesRustReadsIntegrationTest (212 cases): Java writes, Rust reads; verifies our encoders produce spec-correct files. Per-encoding round-trips are generated from the seeded RandomArrays source.
- RustJavaReaderComparisonIntegrationTest: both read the same file; values must match.
- Variant…, Parquet…, Taxi…Oracle…: variant interop, Parquet import, and a real-world dataset (NYC taxi / ClickBench-shaped) oracle comparison.
- CliIT, VortexInspector…: the built CLI/inspector exercised end-to-end.
There is one integration round-trip per encoding and per file-format boundary — this is where a wire-format regression surfaces.
Opt-in, bound to verify, scoped via <targetClasses> to the security-critical
bounds/parse classes — not the whole codebase. It measures whether the tests actually
catch faults, not just execute lines.
./mvnw -pl core -P pitest verify # IoBounds, PTypeIO
./mvnw -pl reader -am -P pitest verify -DskipITs # Footer, Trailer, PostscriptParser,
# SegmentSpec, Layout, FlatSegmentDecoder
./mvnw -pl writer -am -P pitest verify -DskipITs # ChunkImpl, WriteRegistry
Reports land in <module>/target/pit-reports/. Read a surviving mutant as a
simplify-first signal: an equivalent mutant often marks a clause that can never change
the outcome (dead code) — delete it rather than writing an unkillable test. Only add a test
when the mutated bound is a genuine independent edge. These classes currently sit at
99–100% kill rate.
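The simplify-first reading of a survivor can be illustrated with a hypothetical bit-width guard; `BitWidthSketch` is not a project class, and `IllegalArgumentException` is used only to keep the sketch dependency-free. In `withDeadClamp`, the clamp after the guard can never fire, so changing its `>` to `>=` yields an equivalent mutant that no test can kill; `simplified` removes the clause and the mutant with it.

```java
final class BitWidthSketch {
    private BitWidthSketch() {}

    /** Version with a dead clause: the guard already rules out bits > 64. */
    static int withDeadClamp(int bits) {
        if (bits < 0 || bits > 64) {
            throw new IllegalArgumentException("bit width out of range: " + bits);
        }
        // Dead clamp. The mutant `bits >= 64 ? 64 : bits` returns the same value
        // for every reachable input, so it survives regardless of the tests.
        return bits > 64 ? 64 : bits;
    }

    /** Simplified version: same behaviour, no unkillable mutant. */
    static int simplified(int bits) {
        // The guard's own boundary IS a genuine edge: a test at bits == 64
        // kills the `bits >= 64` mutant of this condition.
        if (bits < 0 || bits > 64) {
            throw new IllegalArgumentException("bit width out of range: " + bits);
        }
        return bits;
    }

    public static void main(String[] args) {
        for (int bits = 0; bits <= 64; bits++) {
            if (withDeadClamp(bits) != simplified(bits)) {
                throw new AssertionError("versions disagree at " + bits);
            }
        }
        System.out.println("versions agree on every valid width");
    }
}
```

The boundary tests worth keeping are the ones at 0 and 64 (accepted) and at -1 and 65 (rejected), since those pin the guard itself.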
JMH benchmarks under performance/ measure throughput against the Rust reference
(RustVsJavaReadBenchmark, …WriteBenchmark, …FilterBenchmark, ParquetVsVortexReadBenchmark).
They are performance signal, not correctness — never gated in CI, always run with an
explicit ClassName.methodName filter.
Coverage (JaCoCo, aggregated across surefire + failsafe) is ~81% and is reported to
SonarCloud daily. Generated fbs/ and proto/ sources and the performance/ benchmark module
are excluded — they have no hand-written behaviour worth covering. The quality gate requires
zero bugs and zero vulnerabilities; the build itself fails on any javac warning
(-Xlint:all -Werror), any Checkstyle violation, and any Javadoc warning.