Skip to content

fix(translate): hoist leading system prefix to top-level, keep non-leading inline#120

Merged
Menci merged 11 commits into
Menci:mainfrom
yyyr-p:fix/translate-system-hoist
Jun 28, 2026
Merged

fix(translate): hoist leading system prefix to top-level, keep non-leading inline#120
Menci merged 11 commits into
Menci:mainfrom
yyyr-p:fix/translate-system-hoist

Conversation

@yyyr-p

@yyyr-p yyyr-p commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

  • 169e592 changed the Chat Completions→Messages and Responses→Messages translators to emit all system/developer messages inline as role: "system". Upstream Anthropic endpoints reject position-0 inline system — the most common Chat Completions pattern ([{role:"system",...}, {role:"user",...}]) — causing 400 errors.
  • Tested against claude-opus-4-8, claude-opus-4-7, and claude-opus-4-6 via GitHub Copilot upstream: all three reject position-0 inline system. claude-opus-4-8 conditionally accepts non-leading inline system (after a user message), while claude-opus-4-7 and claude-opus-4-6 reject it in all positions.
  • Hoist the leading contiguous run of system/developer messages to MessagesPayload.system. Non-leading system/developer messages stay inline as MessagesSystemMessage, preserving chronological position. Upstreams that reject non-leading inline system can enable the existing demote-interleaved-system-to-user interceptor.

Test plan

  • pnpm --filter @floway-dev/translate run test — 410 tests pass
  • pnpm run typecheck — clean
  • pnpm run test — 3493 tests pass, zero regressions
  • Manual: verified via local Floway dev server that position-0 system messages reach upstream in the top-level system field and return successfully

yyyr-p and others added 11 commits June 27, 2026 04:01
…ading inline

169e592 changed the CC→Messages and Responses→Messages translators to
emit all system/developer messages inline as `role: "system"`. Upstream
Anthropic endpoints reject position-0 inline system — the most common
Chat Completions pattern — so the initial system prompt must live in the
top-level `system` field.

Hoist the leading contiguous run of system/developer messages to
`MessagesPayload.system`. Non-leading system/developer messages stay
inline as `MessagesSystemMessage`, preserving their chronological
position. Upstreams that reject inline system in non-leading positions
can enable the existing `demote-interleaved-system-to-user` interceptor.
… image-in-system content

Tightens the leading system hoist introduced in this PR so the translator's
output stays faithful to the caller's content shape, and turns the
pre-existing silent drop of image parts in system content into an explicit
rejection.

Boundary preservation. The previous hoist concatenated every text part in a
leading system message with empty-string join, then concatenated multiple
leading system messages with `\n\n`, and wrapped the whole result in one
`MessagesTextBlock`. That collapsed two layers of caller-visible structure
(ContentPart array within a message; multiple messages within the prefix)
into a single string — e.g. `[{text:'A'}, {text:'B'}]` became `'AB'`,
glueing text fragments with no separator. Each ContentPart text and each
leading message now contributes its own `MessagesTextBlock` to
`MessagesPayload.system`. The same change applies to the Responses path,
where `payload.instructions` and leading input system messages also stay as
distinct blocks instead of being `\n\n`-joined into one. Downstream prompt
cache and `applyLastSystemCacheBreakpoint` (new helper in
`shared/via-messages/cache-breakpoints.ts`) place the breakpoint on the
last block, mirroring how tool and message breakpoints are applied.

Image-in-system rejection. Anthropic's Messages API only permits text in
the system field — both top-level `MessagesPayload.system` and inline
`MessagesSystemMessage.content`. The Chat Completions and Responses
translators previously filtered non-text parts silently, leaving callers
to wonder why their system-attached image was missing on the wire. They
now throw on `image_url` (Chat Completions) and `input_image` (Responses)
parts in system / developer messages. Upstream-probed wording for
reference: both Bedrock and Vertex return
`system.1.type: Input should be 'text'` for a top-level system image, and
Bedrock additionally returns
`messages.N: role 'system' supports text and tool_change blocks only`
for an inline system image; the translator's error message names the
caller-visible shape rather than the Anthropic field path.

Comment restoration. The original comment explaining `case 'system':
case 'developer':` mapping was deleted by the hoist change. We add a new
comment explicitly noting that this branch only runs for non-leading
system / developer messages (leading prefix is hoisted before
`buildMessagesInput` runs), that developer normalizes to `role:'system'`
because it is the same intent layer on the source wire, and that the
gateway's `demote-interleaved-system-to-user` interceptor flag is the
safety net for upstreams that reject inline `role:'system'` (Vertex
unconditionally; Bedrock when placement rules are violated).

`extractSystemText` in the Responses translator (used the loose
`'text' in block` shape match) is replaced by a single
`responsesSystemBlocks` helper that uses the same `input_text` /
`output_text` whitelist as the inline `translateSystemMessage` and shares
the image-rejection check, so both hoist and inline paths now agree on
which content blocks contribute and on what triggers the throw.
…/non-empty system mix

Round 1 of review-and-cleanup. Adds two test gaps for invariants the
previous commit documented in code but did not exercise.

- `applyLastSystemCacheBreakpoint` now has direct unit coverage (no-op on
  undefined / empty input; marks only the last block when multiple are
  present), matching the local convention in cache-breakpoints_test.ts
  where each helper has its own test.
- Chat Completions hoist: a leading empty system followed by a leading
  non-empty system / developer message — the empty must consume its
  prefix slot without contributing a block, the non-empty must still be
  hoisted as part of the same contiguous prefix.
Cleanup round 1. Removes the function-level prose above
applyLastSystemCacheBreakpoint that restated the function name and
duplicated the file-level "system block when system text is non-empty"
placement note (the sibling applyLastToolCacheBreakpoint has no
analogous comment), and condenses the empty-content explanation in the
hoist loops on both translator pairs to just the one non-obvious clause
(that an empty-content leading message still extends the contiguous
prefix).
Round 2 cleanup. The dual `case 'system': case 'developer':` labels
falling through to a single body already convey that developer
normalizes to role:'system' here, so the dedicated comment line
restating that fact reads as narration of the case label. Drop it from
both translator pairs; the surrounding decision-recording about
non-leading inline placement and the demote-interleaved-system safety
net stays.
Round 2 cleanup. Drops three more lines of mechanical narration the
audit flagged on the second pass:

- The hoist comment's "An empty-content leading message still extends
  the contiguous prefix" sentence narrates the unconditional
  `prefixEnd++` directly below it. The behavior is visible in the loop
  and locked in by an explicit unit test, so the comment carries no
  additional information. Removed on both translator pairs.
- The inline-branch comment on both translators re-described the hoist
  contract (which is already authoritative at the hoist call site)
  before getting to the Bedrock/Vertex divergence and interceptor-flag
  pointer that is the only load-bearing content. Compressed to one
  line.
- The responses-via-messages top-level comment ended with "The cache
  breakpoint lands on the last block via applyLastSystemCacheBreakpoint",
  restating the next line of code. Removed.
Round 3 cleanup. The hoist comment in chat-completions-via-messages
named the demote-interleaved-system-to-user interceptor flag for the
non-leading inline case, which the inline-branch comment already covers
with full Bedrock/Vertex divergence context — drop the duplicate
reference and let each comment own its unique content. The neighboring
test name for the leading-empty + leading-non-empty case described the
algorithm's internal mechanism (prefix-slot consumption) rather than the
observable contract; shorten to the contract-focused form that matches
the surrounding test naming style in the file.
…ssagesTextBlock[]

Round 4 review recommendation, applied directly in-PR.

Both translator pairs' system-content helpers (convertSystemContent on
the Chat Completions side, responsesSystemBlocks on the Responses side)
previously had a dual return shape — string passed through verbatim;
content arrays became MessagesTextBlock[] — which forced the hoist call
sites to fork on `typeof converted === 'string'` and re-wrap a non-empty
string as a single text block. Collapse both helpers to a single
MessagesTextBlock[] return (string content "Hello" → [{type:'text',
text:'Hello'}]; empty content → []) so the hoist becomes a one-line
`systemBlocks.push(...convertSystemContent(message.content))` and the
inline path emits the blocks directly (or '' when empty, preserving the
empty-content wire shape).

Wire side-effect: inline non-leading MessagesSystemMessage with a
plain-string source-content now emits `{role:'system',
content:[{type:'text', text:...}]}` instead of `{role:'system',
content:'...'}`. Both shapes are equivalent under
`MessagesSystemMessage.content: string | MessagesTextBlock[]` and any
Anthropic upstream accepts both equally; the per-translator inline
tests are updated to assert the block-array form.
…t round trip

Round 4 review recommendation, applied directly in-PR.

The Messages → Chat Completions translator previously collapsed both
inline `MessagesSystemMessage` content and top-level
`MessagesPayload.system` block arrays into a single Chat Completions
system message with `\n\n`-joined string content — the exact boundary
loss the forward direction fixes in this PR, just on the reverse trip.
A CC→Messages→CC round trip would silently merge per-block segments
that callers placed deliberately and that the prompt cache cares about.

Switch both code paths to emit Chat Completions `ContentPart[]` text
parts when the source is a block array, one part per block. The
string-source branch still emits string content (no shape change for
single-string sources). Also tighten the empty-system guard to skip the
system message entirely when an empty array is passed (previously
`system ?` was truthy on `[]` and emitted `{role:'system', content:''}`,
which is wire-noise downstream models may reject).

The existing "joins in-array system text blocks with double newline" test
enshrined the boundary-loss behaviour; it is replaced by two tests
asserting per-part preservation for both the inline and the top-level
multi-block cases.
…tent blocks

Round 4 cleanup audit, applied directly in-PR since responsesSystemBlocks
is a helper introduced by this PR.

Today ResponsesInputContent is `input_text | input_image | output_text`;
the helper already handles all three (image throws; text variants
accumulate). A future variant added to the union would have been
silently dropped by the prior `if` chain. Replace the silent fall-through
with an explicit throw mirroring the `unexpectedResponsesInputItem`
exhaustiveness guard already used by `translateResponsesInput`, so any
new variant forces an explicit opt-in.
…stem-array skip

Post-application review polish.

The new exhaustiveness throw in responsesSystemBlocks narrows `block` to
ResponsesInputText by the time the .push runs, so the `as
ResponsesInputText` cast on the text-extract line is dead. Remove it.

Add a unit test asserting the empty-array system-skip behavior that
came with the reverse-direction boundary-preservation refactor, so the
behavior change is locked in.
@Menci Menci merged commit 6b43bcd into Menci:main Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants