Summary
vortex.zstd segments compressed with a trained dictionary cannot be decoded. ZstdEncodingDecoder fails fast when dictionary_size != 0:
io.github.dfa1.vortex.core.VortexException:
vortex.zstd: dictionary-compressed Zstd segments are not supported (pure-Java decoder)
This is hit by the upstream compatibility fixture zstd.vortex (v0.75.0).
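The guard described above can be sketched roughly as follows. This is only an illustration: `ZstdDictionaryGuard`, `requireNoDictionary`, and the nested exception type are hypothetical stand-ins (the real decoder throws `io.github.dfa1.vortex.core.VortexException`), and the sketch assumes the dictionary length is available as a `long` before any frame is decoded.

```java
// Hypothetical sketch of a fail-fast guard; not the project's actual code.
final class ZstdDictionaryGuard {

    /** Stand-in for io.github.dfa1.vortex.core.VortexException (assumption). */
    static final class UnsupportedDictionaryException extends RuntimeException {
        UnsupportedDictionaryException(String message) {
            super(message);
        }
    }

    private ZstdDictionaryGuard() {
    }

    /**
     * Rejects a segment before any frame is decoded if it declares a dictionary.
     *
     * @param dictionarySize the dictionary_size field of the segment metadata
     */
    static void requireNoDictionary(long dictionarySize) {
        if (dictionarySize != 0) {
            throw new UnsupportedDictionaryException(
                "vortex.zstd: dictionary-compressed Zstd segments are not supported (pure-Java decoder)");
        }
    }
}
```

Checking the size up front keeps the failure explicit and avoids a later, harder-to-diagnose corruption error from a decoder that has no dictionary loaded.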
Why
The decode backend (io.airlift:aircompressor-v3) has no Zstd dictionary support:
- ZstdFrameDecompressor exposes no API to preload a dictionary's entropy tables or seed the back-reference window; reset() only clears state.
- Rust uses zstd::bulk::Decompressor::with_dictionary(dict) (native libzstd).
The published zstd.vortex fixture genuinely requires it:
- dict buffer begins with 0xEC30A437 → trained dictionary (preset Huffman + 3 FSE tables + content), Dictionary_ID 0x22a28c3d.
- frames carry the Dictionary_ID flag referencing that same ID, so preset entropy tables and window seeding are mandatory.
Buffer layout (per Rust encodings/zstd/src/array.rs): buffer[0] = dictionary, buffer[1..] = frames.
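The two checks above (dictionary magic and matching Dictionary_ID) can be reproduced without any decompressor. The sketch below is an illustration only: `ZstdDictionaryProbe` and its method names are hypothetical, and it assumes the field layout of the Zstandard format specification (RFC 8878): little-endian magic numbers, a 4-byte Dictionary_ID after the dictionary magic, and a frame header whose descriptor byte encodes the Dictionary_ID width in its two low bits.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.ByteOrder;
import java.util.OptionalLong;

// Hypothetical helper; inspects headers only and performs no decompression.
final class ZstdDictionaryProbe {
    static final int DICTIONARY_MAGIC = 0xEC30A437;
    static final int FRAME_MAGIC = 0xFD2FB528;

    private static final ValueLayout.OfInt LE_INT =
        ValueLayout.JAVA_INT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);
    private static final ValueLayout.OfShort LE_SHORT =
        ValueLayout.JAVA_SHORT_UNALIGNED.withOrder(ByteOrder.LITTLE_ENDIAN);

    private ZstdDictionaryProbe() {
    }

    /** Returns the Dictionary_ID if the buffer is a trained dictionary, else empty (raw content). */
    static OptionalLong trainedDictionaryId(MemorySegment dict) {
        if (dict.byteSize() < 8 || dict.get(LE_INT, 0) != DICTIONARY_MAGIC) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Integer.toUnsignedLong(dict.get(LE_INT, 4)));
    }

    /** Returns the Dictionary_ID declared in a frame header, or empty if the frame declares none. */
    static OptionalLong frameDictionaryId(MemorySegment frame) {
        if (frame.byteSize() < 5 || frame.get(LE_INT, 0) != FRAME_MAGIC) {
            throw new IllegalArgumentException("not a Zstd frame");
        }
        int descriptor = Byte.toUnsignedInt(frame.get(ValueLayout.JAVA_BYTE, 4));
        // Window_Descriptor (1 byte) is present unless Single_Segment_flag (bit 5) is set.
        boolean singleSegment = (descriptor & 0x20) != 0;
        long offset = singleSegment ? 5 : 6;
        // Dictionary_ID_flag (bits 1-0) selects a field width of 0, 1, 2 or 4 bytes.
        return switch (descriptor & 0x03) {
            case 0 -> OptionalLong.empty();
            case 1 -> OptionalLong.of(Byte.toUnsignedLong(frame.get(ValueLayout.JAVA_BYTE, offset)));
            case 2 -> OptionalLong.of(Short.toUnsignedLong(frame.get(LE_SHORT, offset)));
            default -> OptionalLong.of(Integer.toUnsignedLong(frame.get(LE_INT, offset)));
        };
    }
}
```

Under these assumptions, calling `trainedDictionaryId` on buffer[0] and `frameDictionaryId` on each of buffer[1..] should yield the same ID for the fixture. A probe like this could also make the guard more precise later, for example by distinguishing trained dictionaries from raw-content ones.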
Options considered
- Pure-Java dict decoder from scratch on MemorySegment (~1500–2500 LOC: dict parse, Huffman incl. Treeless, 3 FSE tables incl. Repeat mode, sequence execution with window-prefix matches). Verifiable against one fixture only, so correctness risk is high.
- Vendor/extend aircompressor's frame decompressor. Rejected: its core uses sun.misc.Unsafe (~67 refs), violating the project's no-Unsafe / FFM-only rule.
- zstd-jni (native). Rejected: violates the no-JNI rule.
Current state
Fail-fast guard retained; covered by VortexHttpReaderIT#scan_zstdVortex_rejectsDictionaryCompression, which asserts the clear error. zstd.vortex is excluded from the encoding smoke test. Re-enable once a pure-Java dictionary path exists (or aircompressor ships dict support).
References
- Rust decode: encodings/zstd/src/array.rs (decompress, line ~888: Decompressor::with_dictionary)
- Buffer layout: same file, deserialize line ~204