codec: expand AnnotatedLlmRequest/Response extraction for OpenAI + Anthropic + hybrid payloads #76
afourniernv wants to merge 14 commits
DO NOT MERGE UNTIL LIVE TESTS ARE COMPLETED
Live/integration tests against running providers have NOT been performed yet. Please do not merge this PR until live validation is completed and explicitly signed off.
Summary
This PR expands normalized codec extraction around `AnnotatedLlmRequest` and `AnnotatedLlmResponse` for:

- OpenAI Chat Completions (`/v1/chat/completions`)
- OpenAI Responses (`/v1/responses`)
- Anthropic Messages (`/v1/messages`)

The goal is to extract more meaningful normalized state while preserving unmodeled provider-specific fields losslessly via `extra`.

Additive Request IR State (`AnnotatedLlmRequest`)

Added normalized optional fields (additive, backward-compatible at the payload level):

- `store: Option<bool>`
- `previous_response_id: Option<String>`
- `truncation: Option<Json>`
- `reasoning: Option<Json>`
- `include: Option<Json>`
- `user: Option<String>`
- `metadata: Option<Json>`
- `service_tier: Option<String>`
- `parallel_tool_calls: Option<bool>`
- `max_output_tokens: Option<u64>`
- `max_tool_calls: Option<u64>`
- `top_logprobs: Option<u64>`
- `stream: Option<bool>`

Multimodal expansion in request content parts:
- `ContentPart::ImageUrl { image_url: OpenAiImageUrl }`
- `OpenAiImageUrl { url, detail }`

Additive Response IR State (`ApiSpecificResponse`)

OpenAI Responses variant expanded with:

- `previous_response_id`
- `store`
- `service_tier`
- `truncation`
- `reasoning`
- `input_tokens_details`
- `output_tokens_details`

Anthropic Messages variant expanded with:

- `service_tier`
- `container`
- `content_blocks`

OpenAI Responses request-side hardening
- Hardened handling of `input` arrays.
- Unparsed input items are preserved in `extra` (`_openai_responses_unparsed_input_items`) for round-trip safety.
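The lossless-`extra` pattern above can be sketched as follows. This is a simplified, hypothetical illustration (flat string key/value payloads instead of JSON, invented names `decode`/`encode`/`AnnotatedLlmRequestSketch`), not the crate's actual types: known fields are lifted into typed IR state, and everything unmodeled is stashed verbatim in `extra` so a later encode reproduces the original payload.

```rust
use std::collections::BTreeMap;

// Hypothetical, simplified stand-in for the request IR: a few typed
// fields plus a catch-all `extra` map for unmodeled provider fields.
#[derive(Debug, PartialEq)]
struct AnnotatedLlmRequestSketch {
    store: Option<bool>,
    previous_response_id: Option<String>,
    extra: BTreeMap<String, String>,
}

// Decode: remove the fields we model; whatever is left stays in `extra`.
fn decode(fields: BTreeMap<String, String>) -> AnnotatedLlmRequestSketch {
    let mut extra = fields;
    let store = extra.remove("store").map(|v| v == "true");
    let previous_response_id = extra.remove("previous_response_id");
    AnnotatedLlmRequestSketch { store, previous_response_id, extra }
}

// Encode: start from `extra` and layer the typed fields back on top,
// so unknown fields round-trip unchanged.
fn encode(req: &AnnotatedLlmRequestSketch) -> BTreeMap<String, String> {
    let mut out = req.extra.clone();
    if let Some(s) = req.store {
        out.insert("store".into(), s.to_string());
    }
    if let Some(id) = &req.previous_response_id {
        out.insert("previous_response_id".into(), id.clone());
    }
    out
}

fn main() {
    let mut payload = BTreeMap::new();
    payload.insert("store".to_string(), "true".to_string());
    payload.insert("previous_response_id".to_string(), "resp_123".to_string());
    payload.insert("some_unmodeled_field".to_string(), "kept".to_string());

    let decoded = decode(payload.clone());
    assert_eq!(decoded.store, Some(true));
    // The unmodeled field survives in `extra`...
    assert_eq!(
        decoded.extra.get("some_unmodeled_field").map(String::as_str),
        Some("kept")
    );
    // ...and the full payload round-trips through encode.
    assert_eq!(encode(&decoded), payload);
    println!("round-trip ok");
}
```

The same shape explains the `_openai_responses_unparsed_input_items` key: input items the decoder cannot model are parked under a reserved `extra` key instead of being dropped.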
- `tool_choice.type == "none"` parity in decode/encode.
- Unmodeled fields preserved in `extra`.

Hybrid payload coverage added
Fixture/test coverage for mixed-provider payload patterns.
Consumer blast-radius updates
Because the request IR added new fields and a new `ContentPart` variant, downstream consumers were updated:

- `crates/adaptive`: `ContentPart` match handling + request initializer updates
- `crates/ffi` tests: request initializer updates
- `crates/wasm` tests: request initializer updates
- `crates/python`: constructor/coverage request initializer updates

Scope note
This PR intentionally avoids a larger architectural shift (e.g., a provider sidecar wrapper or a unified request abstraction rewrite). It keeps the current `AnnotatedLlmRequest`/`AnnotatedLlmResponse` IR approach and expands extraction additively.

Validation performed
- `uv run pre-commit run --all-files` (passed)
- `cargo test -p nemo-flow-adaptive` (passed)
- `cargo test -p nemo-flow-ffi` (passed)
- `cargo test -p nemo-flow-wasm` (passed)
- `cargo test -p nemo-flow-python` (passed)
- `cargo test -p nemo-flow codec::` (passed)

Validation pending (required before merge)

- Live/integration tests against running providers.
Commit stack highlights
- Downstream consumer updates (`adaptive`, `ffi`, `wasm`, `python`)
- Formatting (`rustfmt`)
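To illustrate why the new `ContentPart` variant rippled into every consumer crate, here is a minimal sketch of the shape described above. The `Text` variant and the `describe` helper are hypothetical additions for illustration; only `ContentPart::ImageUrl` and `OpenAiImageUrl { url, detail }` come from this PR. Adding a variant to a Rust enum forces every exhaustive `match` over it (as in `crates/adaptive`) to grow a new arm, which is exactly the blast radius noted above.

```rust
// Hypothetical simplified mirror of the IR types touched by this PR.
#[derive(Debug, PartialEq)]
struct OpenAiImageUrl {
    url: String,
    detail: Option<String>, // e.g. "low" / "high"; None means provider default
}

#[derive(Debug, PartialEq)]
enum ContentPart {
    // Pre-existing variant (name assumed for illustration).
    Text { text: String },
    // New variant added by this PR.
    ImageUrl { image_url: OpenAiImageUrl },
}

// A downstream consumer matching exhaustively: adding ImageUrl to the
// enum makes code like this fail to compile until a new arm is added.
fn describe(part: &ContentPart) -> String {
    match part {
        ContentPart::Text { text } => format!("text({} chars)", text.len()),
        ContentPart::ImageUrl { image_url } => format!(
            "image({}, detail={})",
            image_url.url,
            image_url.detail.as_deref().unwrap_or("auto")
        ),
    }
}

fn main() {
    let t = ContentPart::Text { text: "hi".into() };
    assert_eq!(describe(&t), "text(2 chars)");

    let img = ContentPart::ImageUrl {
        image_url: OpenAiImageUrl {
            url: "https://example.com/cat.png".into(),
            detail: Some("high".into()),
        },
    };
    assert_eq!(describe(&img), "image(https://example.com/cat.png, detail=high)");
}
```

The same forced-update effect applies to the request initializers in the `ffi`, `wasm`, and `python` crates, since constructing the IR now involves the additional optional fields.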