Problem
When using Magic Context with the Antigravity proxy (Claude models via Gemini endpoint), the prompt cache is busted on every new assistant turn. The Antigravity provider hashes [tools]+[system]+[messages up to cache breakpoint] for prompt caching — any byte change before a breakpoint invalidates all downstream cache entries.
Observed behavior (from Magic Context logs):
stripStructuralNoise: 63 → 65 → 67 (incrementing by 2 per turn)
stripReasoningFromMergedAssistants: 16 → 17 → 18 (incrementing by 1 per turn)
Cache results:
Request 1: input=118,894 cache.read=0 (0% — BUST after pending ops applied)
Request 2: input=19,345 cache.read=99,655 (100% — stable, no structural change)
Request 3: input=20,692 cache.read=99,655 (100% — stable, no structural change)
The first request after each new assistant response always busts because the strip counts changed the message array structure.
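The hashing scheme can be sketched in a few lines. This is an illustrative model, not the provider's actual code — the key point is that any byte change to a message before a breakpoint produces a different key:

```typescript
import { createHash } from "node:crypto";

// Illustrative sketch of the cache-key scheme described above: the provider
// hashes [tools] + [system] + [messages up to the cache breakpoint].
const hashKey = (v: unknown) =>
  createHash("sha256").update(JSON.stringify(v)).digest("hex");

const tools = [{ name: "read_file" }];
const system = "You are a helpful assistant.";
const turn1 = { role: "user", parts: [{ type: "text", text: "hi" }] };

const stableKey = hashKey([tools, system, [turn1]]);

// Deleting a part before the breakpoint (what .filter() does) busts the key:
const filtered = { ...turn1, parts: [] };
const bustedKey = hashKey([tools, system, [filtered]]);
console.log(stableKey === bustedKey); // false
```

Replacing the part with an empty sentinel of the same shape would also change bytes once, but the shape then stays fixed on every subsequent pass.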
Root Cause
Four functions in strip-content.ts and strip-structural-noise.ts use .filter() and .splice() to delete elements from message arrays, which shifts indices and changes array lengths:
1. stripStructuralNoise (strip-structural-noise.ts:31)
const keptParts = message.parts.filter((part) => !isStructuralNoisePart(part));
// Deletes parts, changes array length
2. stripReasoningFromMergedAssistants (strip-content.ts:516)
message.parts.splice(i, 1); // Deletes reasoning parts, shifts indices
3. stripClearedReasoning (strip-content.ts:319-343)
const kept = message.parts.filter((part) => { ... });
// Deletes cleared reasoning, changes array length
4. stripDroppedPlaceholderMessages (strip-content.ts:190)
messages.splice(i, 1); // Deletes entire messages, shifts all subsequent indices
Each new assistant turn introduces new reasoning/structural parts that weren't present on the previous pass. The strip functions remove them, producing a different array structure than the previous pass — this changes the serialized message content that feeds into the Antigravity cache hash.
Impact
- ~42-70% cache hit rate ceiling on long sessions instead of potential ~85-95%
- Every new assistant turn forces a full uncached request (118K+ tokens) before cache stabilizes
- Cost and latency increase proportional to conversation length
Proposed Solution: Sentinel Replacement
Replace .filter() deletion with .map() sentinel replacement that preserves array indices:
// BEFORE (shifts indices, busts cache):
const keptParts = message.parts.filter((part) => !isStructuralNoisePart(part));
// AFTER (preserves indices, cache-stable):
const mappedParts = message.parts.map((part) =>
isStructuralNoisePart(part) ? { type: "text", text: "" } : part
);
For message-level deletion (stripDroppedPlaceholderMessages), replace the message content with a minimal empty-text part instead of splicing the message out:
// BEFORE:
messages.splice(i, 1);
// AFTER:
msg.parts.length = 0;
msg.parts.push({ type: "text", text: "" });
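Wrapped as a helper, the message-level blanking looks like this (helper name and part shape are illustrative, not from the actual codebase):

```typescript
type Part = { type: string; text: string };
type Message = { role: string; parts: Part[] };

// Blank a dropped message in place instead of splicing it out, so the
// messages array keeps its length and all later indices stay put.
function blankMessage(msg: Message): void {
  msg.parts.length = 0;
  msg.parts.push({ type: "text", text: "" });
}

const messages: Message[] = [
  { role: "user", parts: [{ type: "text", text: "keep" }] },
  { role: "assistant", parts: [{ type: "text", text: "drop me" }] },
];

blankMessage(messages[1]);
console.log(messages.length); // 2 — no index shift for subsequent messages
```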
Why this works
The sentinel approach was battle-tested in the opencode-antigravity-auth plugin, where the same .filter()→.map() conversion across 15+ sites eliminated alternating BUST/HIT patterns and raised cache hit rates from ~42% to ~70%+ (ceiling determined by content changes, not structural instability).
Key properties:
- Array length stays constant between passes → hash prefix doesn't change
- Empty sentinels are lightweight — minimal token cost vs. full cache miss
- Idempotent — re-running the strip on the same array produces identical output
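The length-preservation and idempotence properties can be checked directly. A minimal sketch, using a stand-in predicate for isStructuralNoisePart (an assumption for this example):

```typescript
type Part = { type: string; text: string };

// Stand-in for isStructuralNoisePart — the real predicate lives in
// strip-structural-noise.ts.
const isNoise = (p: Part) => p.type === "structural-noise";

// Sentinel replacement: same length, same indices, every pass.
const strip = (parts: Part[]): Part[] =>
  parts.map((p) => (isNoise(p) ? { type: "text", text: "" } : p));

const parts: Part[] = [
  { type: "text", text: "hello" },
  { type: "structural-noise", text: "<boundary>" },
];

const once = strip(parts);
const twice = strip(once);
console.log(once.length === parts.length); // true — length preserved
console.log(JSON.stringify(once) === JSON.stringify(twice)); // true — idempotent
```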
stripReasoningFromMergedAssistants special case
This function uses backward .splice() to remove interleaved reasoning parts. The sentinel approach would replace stripped reasoning with { type: "text", text: "" } instead of splicing:
// BEFORE:
message.parts.splice(i, 1);
stripped++;
// AFTER:
message.parts[i] = { type: "text", text: "" };
stripped++;
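A side benefit: with in-place assignment there is no index shifting, so the loop no longer needs to run backward (the original iterated in reverse only to keep splice indices valid). A sketch, with part shapes assumed for illustration:

```typescript
type Part = { type: string; text?: string };

// Forward iteration is safe now — assignment never shifts indices.
function stripReasoning(parts: Part[]): number {
  let stripped = 0;
  for (let i = 0; i < parts.length; i++) {
    if (parts[i].type === "reasoning") {
      parts[i] = { type: "text", text: "" };
      stripped++;
    }
  }
  return stripped;
}

const parts: Part[] = [
  { type: "reasoning", text: "thinking..." },
  { type: "text", text: "answer" },
  { type: "reasoning", text: "more thinking" },
];
console.log(stripReasoning(parts)); // 2
console.log(parts.length); // 3 — unchanged
```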
cache_control inheritance (important for Antigravity)
If any stripped part carries a cache_control field, the sentinel should inherit it to preserve prompt cache breakpoint anchoring:
const sentinel: Record<string, unknown> = { type: "text", text: "" };
if (isRecord(part) && part.cache_control) {
sentinel.cache_control = part.cache_control;
}
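Putting the two pieces together, a single sentinel factory could cover both replacement and breakpoint preservation (factory name is hypothetical):

```typescript
type Part = Record<string, unknown>;

const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null;

// Build an empty-text sentinel, carrying over cache_control when present
// so the prompt cache breakpoint stays anchored at the same part index.
function makeSentinel(part: unknown): Part {
  const sentinel: Part = { type: "text", text: "" };
  if (isRecord(part) && part.cache_control) {
    sentinel.cache_control = part.cache_control;
  }
  return sentinel;
}

const anchored = makeSentinel({
  type: "reasoning",
  text: "...",
  cache_control: { type: "ephemeral" },
});
console.log("cache_control" in anchored); // true — breakpoint survives

const plain = makeSentinel({ type: "reasoning", text: "..." });
console.log("cache_control" in plain); // false — nothing spurious added
```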
Environment
- OpenCode with opencode-antigravity-auth plugin (v1.6.0)
- Antigravity proxy (Claude models via Gemini endpoint)
- Magic Context plugin (latest)
- Long sessions (2500+ messages, 162K+ tokens)
Evidence
The opencode-antigravity-auth plugin already fixed all cache-busting vectors on its side (Phase 3 fetch interception). The remaining cache instability originates from Phase 1 message transforms in Magic Context changing message structure before applyCaching() hashes the content.
Sequential debug payloads confirmed:
- System instruction hash: stable ✅
- Tools hash: stable ✅
- Generation config hash: stable ✅
- Message content: changes due to strip count increments ❌ ← this issue
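The per-section comparison above can be reproduced with a small diagnostic that hashes each payload section independently, localizing instability to the messages array (payload shape is illustrative):

```typescript
import { createHash } from "node:crypto";

const hash = (v: unknown) =>
  createHash("sha256").update(JSON.stringify(v)).digest("hex").slice(0, 12);

// Hash each payload section separately to localize cache instability.
function sectionHashes(payload: {
  system: unknown;
  tools: unknown;
  generationConfig: unknown;
  messages: unknown;
}): Record<string, string> {
  return {
    system: hash(payload.system),
    tools: hash(payload.tools),
    generationConfig: hash(payload.generationConfig),
    messages: hash(payload.messages),
  };
}

// Simulated consecutive requests: only the messages section differs.
const req1 = sectionHashes({ system: "s", tools: [], generationConfig: {}, messages: [["a", "b"]] });
const req2 = sectionHashes({ system: "s", tools: [], generationConfig: {}, messages: [["a"]] });

for (const k of Object.keys(req1)) {
  console.log(k, req1[k] === req2[k] ? "stable" : "CHANGED");
}
```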