fix(translate): hoist leading system prefix to top-level, keep non-leading inline#120
Merged
Merged
Conversation
…ading inline 169e592 changed the CC→Messages and Responses→Messages translators to emit all system/developer messages inline as `role: "system"`. Upstream Anthropic endpoints reject position-0 inline system — the most common Chat Completions pattern — so the initial system prompt must live in the top-level `system` field. Hoist the leading contiguous run of system/developer messages to `MessagesPayload.system`. Non-leading system/developer messages stay inline as `MessagesSystemMessage`, preserving their chronological position. Upstreams that reject inline system in non-leading positions can enable the existing `demote-interleaved-system-to-user` interceptor.
… image-in-system content
Tightens the leading system hoist introduced in this PR so the translator's
output stays faithful to the caller's content shape, and turns the
pre-existing silent drop of image parts in system content into an explicit
rejection.
Boundary preservation. The previous hoist concatenated every text part in a
leading system message with empty-string join, then concatenated multiple
leading system messages with `\n\n`, and wrapped the whole result in one
`MessagesTextBlock`. That collapsed two layers of caller-visible structure
(ContentPart array within a message; multiple messages within the prefix)
into a single string — e.g. `[{text:'A'}, {text:'B'}]` became `'AB'`,
glueing text fragments with no separator. Each ContentPart text and each
leading message now contributes its own `MessagesTextBlock` to
`MessagesPayload.system`. The same change applies to the Responses path,
where `payload.instructions` and leading input system messages also stay as
distinct blocks instead of being `\n\n`-joined into one. Downstream prompt
cache and `applyLastSystemCacheBreakpoint` (new helper in
`shared/via-messages/cache-breakpoints.ts`) place the breakpoint on the
last block, mirroring how tool and message breakpoints are applied.
Image-in-system rejection. Anthropic's Messages API only permits text in
the system field — both top-level `MessagesPayload.system` and inline
`MessagesSystemMessage.content`. The Chat Completions and Responses
translators previously filtered non-text parts silently, leaving callers
to wonder why their system-attached image was missing on the wire. They
now throw on `image_url` (Chat Completions) and `input_image` (Responses)
parts in system / developer messages. Upstream-probed wording for
reference: both Bedrock and Vertex return
`system.1.type: Input should be 'text'` for a top-level system image, and
Bedrock additionally returns
`messages.N: role 'system' supports text and tool_change blocks only`
for an inline system image; the translator's error message names the
caller-visible shape rather than the Anthropic field path.
Comment restoration. The original comment explaining `case 'system':
case 'developer':` mapping was deleted by the hoist change. We add a new
comment explicitly noting that this branch only runs for non-leading
system / developer messages (leading prefix is hoisted before
`buildMessagesInput` runs), that developer normalizes to `role:'system'`
because it is the same intent layer on the source wire, and that the
gateway's `demote-interleaved-system-to-user` interceptor flag is the
safety net for upstreams that reject inline `role:'system'` (Vertex
unconditionally; Bedrock when placement rules are violated).
`extractSystemText` in the Responses translator (used the loose
`'text' in block` shape match) is replaced by a single
`responsesSystemBlocks` helper that uses the same `input_text` /
`output_text` whitelist as the inline `translateSystemMessage` and shares
the image-rejection check, so both hoist and inline paths now agree on
which content blocks contribute and on what triggers the throw.
…/non-empty system mix Round 1 of review-and-cleanup. Adds two test gaps for invariants the previous commit documented in code but did not exercise. - `applyLastSystemCacheBreakpoint` now has direct unit coverage (no-op on undefined / empty input; marks only the last block when multiple are present), matching the local convention in cache-breakpoints_test.ts where each helper has its own test. - Chat Completions hoist: a leading empty system followed by a leading non-empty system / developer message — the empty must consume its prefix slot without contributing a block, the non-empty must still be hoisted as part of the same contiguous prefix.
Cleanup round 1. Removes the function-level prose above applyLastSystemCacheBreakpoint that restated the function name and duplicated the file-level "system block when system text is non-empty" placement note (the sibling applyLastToolCacheBreakpoint has no analogous comment), and condenses the empty-content explanation in the hoist loops on both translator pairs to just the one non-obvious clause (that an empty-content leading message still extends the contiguous prefix).
Round 2 cleanup. The dual `case 'system': case 'developer':` labels falling through to a single body already convey that developer normalizes to role:'system' here, so the dedicated comment line restating that fact reads as narration of the case label. Drop it from both translator pairs; the surrounding decision-recording about non-leading inline placement and the demote-interleaved-system safety net stays.
Round 2 cleanup. Drops three more lines of mechanical narration the audit flagged on the second pass: - The hoist comment's "An empty-content leading message still extends the contiguous prefix" sentence narrates the unconditional `prefixEnd++` directly below it. The behavior is visible in the loop and locked in by an explicit unit test, so the comment carries no additional information. Removed on both translator pairs. - The inline-branch comment on both translators re-described the hoist contract (which is already authoritative at the hoist call site) before getting to the Bedrock/Vertex divergence and interceptor-flag pointer that is the only load-bearing content. Compressed to one line. - The responses-via-messages top-level comment ended with "The cache breakpoint lands on the last block via applyLastSystemCacheBreakpoint", restating the next line of code. Removed.
Round 3 cleanup. The hoist comment in chat-completions-via-messages named the demote-interleaved-system-to-user interceptor flag for the non-leading inline case, which the inline-branch comment already covers with full Bedrock/Vertex divergence context — drop the duplicate reference and let each comment own its unique content. The neighboring test name for the leading-empty + leading-non-empty case described the algorithm's internal mechanism (prefix-slot consumption) rather than the observable contract; shorten to the contract-focused form that matches the surrounding test naming style in the file.
…ssagesTextBlock[]
Round 4 review recommendation, applied directly in-PR.
Both translator pairs' system-content helpers (convertSystemContent on
the Chat Completions side, responsesSystemBlocks on the Responses side)
previously had a dual return shape — string passed through verbatim;
content arrays became MessagesTextBlock[] — which forced the hoist call
sites to fork on `typeof converted === 'string'` and re-wrap a non-empty
string as a single text block. Collapse both helpers to a single
MessagesTextBlock[] return (string content "Hello" → [{type:'text',
text:'Hello'}]; empty content → []) so the hoist becomes a one-line
`systemBlocks.push(...convertSystemContent(message.content))` and the
inline path emits the blocks directly (or '' when empty, preserving the
empty-content wire shape).
Wire side-effect: inline non-leading MessagesSystemMessage with a
plain-string source-content now emits `{role:'system',
content:[{type:'text', text:...}]}` instead of `{role:'system',
content:'...'}`. Both shapes are equivalent under
`MessagesSystemMessage.content: string | MessagesTextBlock[]` and any
Anthropic upstream accepts both equally; the per-translator inline
tests are updated to assert the block-array form.
…t round trip
Round 4 review recommendation, applied directly in-PR.
The Messages → Chat Completions translator previously collapsed both
inline `MessagesSystemMessage` content and top-level
`MessagesPayload.system` block arrays into a single Chat Completions
system message with `\n\n`-joined string content — the exact boundary
loss the forward direction fixes in this PR, just on the reverse trip.
A CC→Messages→CC round trip would silently merge per-block segments
that callers placed deliberately and that the prompt cache cares about.
Switch both code paths to emit Chat Completions `ContentPart[]` text
parts when the source is a block array, one part per block. The
string-source branch still emits string content (no shape change for
single-string sources). Also tighten the empty-system guard to skip the
system message entirely when an empty array is passed (previously
`system ?` was truthy on `[]` and emitted `{role:'system', content:''}`,
which is wire-noise downstream models may reject).
The existing "joins in-array system text blocks with double newline" test
enshrined the boundary-loss behaviour; it is replaced by two tests
asserting per-part preservation for both the inline and the top-level
multi-block cases.
…tent blocks Round 4 cleanup audit, applied directly in-PR since responsesSystemBlocks is a helper introduced by this PR. Today ResponsesInputContent is `input_text | input_image | output_text`; the helper already handles all three (image throws; text variants accumulate). A future variant added to the union would have been silently dropped by the prior `if` chain. Replace the silent fall-through with an explicit throw mirroring the `unexpectedResponsesInputItem` exhaustiveness guard already used by `translateResponsesInput`, so any new variant forces an explicit opt-in.
…stem-array skip Post-application review polish. The new exhaustiveness throw in responsesSystemBlocks narrows `block` to ResponsesInputText by the time the .push runs, so the `as ResponsesInputText` cast on the text-extract line is dead. Remove it. Add a unit test asserting the empty-array system-skip behavior that came with the reverse-direction boundary-preservation refactor, so the behavior change is locked in.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
role: "system". Upstream Anthropic endpoints reject position-0 inline system — the most common Chat Completions pattern ([{role:"system",...}, {role:"user",...}]) — causing 400 errors.claude-opus-4-8,claude-opus-4-7, andclaude-opus-4-6via GitHub Copilot upstream: all three reject position-0 inline system.claude-opus-4-8conditionally accepts non-leading inline system (after a user message), whileclaude-opus-4-7andclaude-opus-4-6reject it in all positions.MessagesPayload.system. Non-leading system/developer messages stay inline asMessagesSystemMessage, preserving chronological position. Upstreams that reject non-leading inline system can enable the existingdemote-interleaved-system-to-userinterceptor.Test plan
pnpm --filter @floway-dev/translate run test— 410 tests passpnpm run typecheck— cleanpnpm run test— 3493 tests pass, zero regressionssystemfield and return successfully