Cache instability: stripping functions use .filter()/.splice() causing Antigravity prompt cache busts on every new turn #35

@expiren

Description

Problem

When using Magic Context with the Antigravity proxy (Claude models via Gemini endpoint), the prompt cache is busted on every new assistant turn. The Antigravity provider hashes [tools]+[system]+[messages up to cache breakpoint] for prompt caching — any byte change before a breakpoint invalidates all downstream cache entries.
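This prefix-hashing behavior can be sketched as follows. A minimal illustration only: `hashPrefix` and the message shapes are hypothetical stand-ins, not the Antigravity implementation.

```typescript
import { createHash } from "node:crypto";

type Part = { type: string; text: string };
type Message = { role: string; parts: Part[] };

// Hash the serialized message prefix the way a prompt cache would:
// any byte change before a breakpoint invalidates everything after it.
function hashPrefix(messages: Message[]): string {
  return createHash("sha256").update(JSON.stringify(messages)).digest("hex");
}

const turn1: Message[] = [
  { role: "user", parts: [{ type: "text", text: "hello" }] },
];

// Filtering a part out changes the serialized bytes -> different hash,
// so every downstream cache entry is invalidated.
const filtered: Message[] = [{ role: "user", parts: [] }];

console.log(hashPrefix(turn1) === hashPrefix(turn1));    // true: deterministic
console.log(hashPrefix(turn1) === hashPrefix(filtered)); // false: prefix changed
```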

Observed behavior (from magic context logs):

stripStructuralNoise:                63 → 65 → 67  (incrementing by 2 per turn)
stripReasoningFromMergedAssistants:  16 → 17 → 18  (incrementing by 1 per turn)

Cache results:

Request 1: input=118,894  cache.read=0       (0%  — BUST after pending ops applied)
Request 2: input=19,345   cache.read=99,655  (100% — stable, no structural change)
Request 3: input=20,692   cache.read=99,655  (100% — stable, no structural change)

The first request after each new assistant response always busts the cache because the strip functions changed the message array structure, as reflected in the incrementing strip counts above.

Root Cause

Four functions in strip-content.ts and strip-structural-noise.ts use .filter() and .splice() to delete elements from message arrays, which shifts indices and changes array lengths:

1. stripStructuralNoise (strip-structural-noise.ts:31)

const keptParts = message.parts.filter((part) => !isStructuralNoisePart(part));
// Deletes parts, changes array length

2. stripReasoningFromMergedAssistants (strip-content.ts:516)

message.parts.splice(i, 1);  // Deletes reasoning parts, shifts indices

3. stripClearedReasoning (strip-content.ts:319-343)

const kept = message.parts.filter((part) => { ... });
// Deletes cleared reasoning, changes array length

4. stripDroppedPlaceholderMessages (strip-content.ts:190)

messages.splice(i, 1);  // Deletes entire messages, shifts all subsequent indices

Each new assistant turn introduces new reasoning/structural parts that weren't present on the previous pass. The strip functions remove them, producing a different array structure than the previous pass — this changes the serialized message content that feeds into the Antigravity cache hash.

Impact

  • ~42-70% cache hit rate ceiling on long sessions instead of potential ~85-95%
  • Every new assistant turn forces a full uncached request (118K+ tokens) before cache stabilizes
  • Cost and latency increase proportional to conversation length

Proposed Solution: Sentinel Replacement

Replace .filter() deletion with .map() sentinel replacement that preserves array indices:

// BEFORE (shifts indices, busts cache):
const keptParts = message.parts.filter((part) => !isStructuralNoisePart(part));

// AFTER (preserves indices, cache-stable):
const mappedParts = message.parts.map((part) => 
    isStructuralNoisePart(part) ? { type: "text", text: "" } : part
);

For message-level deletion (stripDroppedPlaceholderMessages), replace the message content with a minimal empty-text part instead of splicing the message out:

// BEFORE:
messages.splice(i, 1);

// AFTER:
msg.parts.length = 0;
msg.parts.push({ type: "text", text: "" });

Why this works

The sentinel approach was battle-tested in the opencode-antigravity-auth plugin, where the same .filter() → .map() conversion across 15+ sites eliminated alternating BUST/HIT patterns and raised cache hit rates from ~42% to ~70%+ (with the remaining ceiling determined by content changes, not structural instability).

Key properties:

  • Array length stays constant between passes → hash prefix doesn't change
  • Empty sentinels are lightweight — minimal token cost vs. full cache miss
  • Idempotent — re-running the strip on the same array produces identical output
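These properties can be demonstrated with a minimal sketch. The isStructuralNoisePart predicate below is a hypothetical stand-in, not the Magic Context implementation:

```typescript
type Part = { type: string; text: string };

// Hypothetical predicate standing in for the real structural-noise check.
const isStructuralNoisePart = (p: Part) => p.type === "noise";

// Sentinel replacement: same array length in, same array length out.
function stripWithSentinels(parts: Part[]): Part[] {
  return parts.map((p) =>
    isStructuralNoisePart(p) ? { type: "text", text: "" } : p,
  );
}

const parts: Part[] = [
  { type: "text", text: "hello" },
  { type: "noise", text: "<turn-boundary>" },
];

const once = stripWithSentinels(parts);
const twice = stripWithSentinels(once);

console.log(once.length === parts.length);                   // true: length preserved
console.log(JSON.stringify(once) === JSON.stringify(twice)); // true: idempotent
```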

stripReasoningFromMergedAssistants special case

This function iterates backward over message.parts and calls .splice() to remove interleaved reasoning parts. The sentinel approach would replace each stripped reasoning part with { type: "text", text: "" } instead of splicing it out:

// BEFORE:
message.parts.splice(i, 1);
stripped++;

// AFTER:  
message.parts[i] = { type: "text", text: "" };
stripped++;

cache_control inheritance (important for Antigravity)

If any stripped part carries a cache_control field, the sentinel should inherit it to preserve prompt cache breakpoint anchoring:

const sentinel: Record<string, unknown> = { type: "text", text: "" };
if (isRecord(part) && part.cache_control) {
    sentinel.cache_control = part.cache_control;
}
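The snippet above assumes an isRecord type guard. A typical implementation, together with a sentinel factory that performs the inheritance (names here are illustrative, not the actual Magic Context API), might look like:

```typescript
// Narrow an unknown value to a plain object record.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Build a sentinel that inherits the stripped part's cache breakpoint, if any.
function makeSentinel(part: unknown): Record<string, unknown> {
  const sentinel: Record<string, unknown> = { type: "text", text: "" };
  if (isRecord(part) && part.cache_control) {
    sentinel.cache_control = part.cache_control;
  }
  return sentinel;
}

const strippedPart = {
  type: "reasoning",
  text: "internal chain of thought",
  cache_control: { type: "ephemeral" },
};
const sentinel = makeSentinel(strippedPart);
// sentinel keeps the breakpoint anchor:
// { type: "text", text: "", cache_control: { type: "ephemeral" } }
```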

Environment

  • OpenCode with opencode-antigravity-auth plugin (v1.6.0)
  • Antigravity proxy (Claude models via Gemini endpoint)
  • Magic Context plugin (latest)
  • Long sessions (2500+ messages, 162K+ tokens)

Evidence

The opencode-antigravity-auth plugin already fixed all cache-busting vectors on its side (Phase 3 fetch interception). The remaining cache instability originates from Phase 1 message transforms in Magic Context changing message structure before applyCaching() hashes the content.

Sequential debug payloads confirmed:

  • System instruction hash: stable
  • Tools hash: stable
  • Generation config hash: stable
  • Message content: changes due to strip count increments ❌ ← this issue
