Summary
Two avoidable costs in ByteStorage:
- Eager copy: data.to_vec() copies the borrowed &[u8] to an owned Vec even though LZ4 and xxh3 both accept &[u8] and the Vec is never retained. This is a full-payload memcpy before compression starts (up to 512 MB).
- GIL held throughout: store()/retrieve() never call py.allow_threads, so LZ4-compressing/hashing a large payload blocks all Python threads for the full duration.
Evidence
- cachekit-core/src/byte_storage.rs:184: StorageEnvelope::new takes Vec<u8> by value
- rust/src/python_bindings.rs:40: no py.allow_threads around the pure-Rust core
- Also (secondary): rmp_serde::to_vec(&envelope) at byte_storage.rs:193 re-copies compressed_data into the envelope
Impact
Directly feeds the large-object write-path peak RSS (9.4x write motivation) and serializes large cache writes against all other Python threads.
Fix
Take &[u8] instead of Vec<u8> (this removes the to_vec), and wrap the compress/hash core in py.allow_threads. Note: the py crate pulls cachekit-core from crates.io (rust/Cargo.toml:23), so confirm the published version matches before shipping.
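A minimal, std-only sketch of the borrow-instead-of-copy half of the fix. It is not the cachekit-core code: toy_compress and toy_checksum are hypothetical stand-ins for the LZ4 and xxh3 calls, and store_with_copy / store_borrowed are illustrative names. The only point is that a core which reads the payload through &[u8] produces the same result without the intermediate Vec.

```rust
// Hypothetical stand-in for LZ4: run-length encoding into (count, byte) pairs.
// Like the real compressor, it only needs a borrowed slice.
fn toy_compress(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < input.len() {
        let byte = input[i];
        let mut run = 1usize;
        while i + run < input.len() && input[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

// Hypothetical stand-in for xxh3: 64-bit FNV-1a over a borrowed slice.
fn toy_checksum(input: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for &b in input {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Shape described above: copy the whole payload before doing any work.
fn store_with_copy(data: &[u8]) -> (Vec<u8>, u64) {
    let owned: Vec<u8> = data.to_vec(); // full-payload memcpy, dropped on return
    (toy_compress(&owned), toy_checksum(&owned))
}

/// Proposed shape: pass the borrowed slice straight through.
fn store_borrowed(data: &[u8]) -> (Vec<u8>, u64) {
    (toy_compress(data), toy_checksum(data))
}

fn main() {
    let payload = vec![7u8; 1024];
    // Both paths must agree; only the extra allocation differs.
    assert_eq!(store_with_copy(&payload), store_borrowed(&payload));
    let (compressed, checksum) = store_borrowed(&payload);
    println!("{} bytes in, {} bytes out, checksum {:#x}", payload.len(), compressed.len(), checksum);
}
```

Because the borrowed version holds no Python-owned state beyond the input slice, the same function body is also the natural unit to run inside py.allow_threads on the binding side.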