Skip to content

fix: restore chat completion multi-choice support#672

Open
nabinchha wants to merge 2 commits into
mainfrom
nmulepati/fix-620-chat-completion-choices-n
Open

fix: restore chat completion multi-choice support#672
nabinchha wants to merge 2 commits into
mainfrom
nmulepati/fix-620-chat-completion-choices-n

Conversation

@nabinchha
Copy link
Copy Markdown
Contributor

@nabinchha nabinchha commented May 18, 2026

📋 Summary

Restores chat completion multi-choice compatibility by reintroducing the n request field and preserving all returned choices in the canonical response. Existing single-choice callers can continue using response.message, while callers that request multiple completions can use response.choices.

🔗 Related Issue

Fixes #620

🔄 Changes

  • Add ChatCompletionRequest.n and export a new ChatCompletionChoice response type.
  • Preserve every returned chat completion choice in ChatCompletionResponse.choices, while keeping response.message as the first choice for existing callers.
  • Parse all OpenAI-compatible chat completion choices instead of discarding everything after index 0.
  • Forward n through ModelFacade.completion() and OpenAI-compatible transport bodies, while excluding it from Anthropic request forwarding.
  • Strip n from generate() and agenerate() before delegating to completion because those APIs return one parsed result.
  • Add tests for request forwarding, multi-choice parsing, async parsing, compatibility access, Anthropic n exclusion, and generate(..., n=...) stripping.

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

  • types.py — updates the canonical chat completion response contract while preserving response.message.
  • facade.pycompletion(..., n=...) exposes multiple choices, while generate(..., n=...) strips n because it returns a single parsed result.

🧪 Testing

  • make test passes (not run; focused model suite was run instead)
  • PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q passes (532 passed)
  • uv run --group dev ruff check ... passes for touched files
  • Unit tests added/updated
  • E2E tests added/updated (N/A — no E2E surface)

✅ Checklist

  • Follows commit message conventions
  • Commits are signed off (DCO)
  • Architecture docs updated (N/A — no architecture docs needed)

@nabinchha nabinchha requested a review from a team as a code owner May 18, 2026 16:27
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR restores multi-choice support for chat completions by introducing ChatCompletionRequest.n, a new ChatCompletionChoice type, and full preservation of all returned choices in ChatCompletionResponse.choices. Existing single-choice callers are unaffected: response.message continues to return the first choice's message via backward-compatible __post_init__ logic.

  • types.py: Adds ChatCompletionChoice, the n request field, choices list + messages property on ChatCompletionResponse, and a __post_init__ guard that seeds choices from message when none are supplied.
  • parsing.py: Refactors both sync and async parsers to iterate all choices; the new helpers are cleanly symmetric between the sync and async paths.
  • facade.py / adapters/anthropic.py: n is forwarded through completion() and OpenAI-compatible transport, excluded from Anthropic payloads, and stripped in generate()/agenerate() before those single-result methods delegate to completion().

Confidence Score: 5/5

Safe to merge. Changes are additive and backward-compatible: existing callers using response.message are unaffected, new callers can opt into response.choices.

The change is purely additive. n is correctly threaded through the OpenAI-compatible path, excluded from Anthropic, and stripped before generate()/agenerate(). The __post_init__ guard on ChatCompletionResponse ensures single-message construction keeps producing a consistent choices[0]. All changed paths are covered by targeted unit tests that passed at 532/532.

No files require special attention.

Important Files Changed

Filename Overview
packages/data-designer-engine/src/data_designer/engine/models/clients/types.py Adds ChatCompletionChoice, n field on ChatCompletionRequest, choices list + messages property + __post_init__ on ChatCompletionResponse; backward-compat preserved via __post_init__ seeding choices from message when none are supplied.
packages/data-designer-engine/src/data_designer/engine/models/clients/parsing.py Refactors both sync and async parsers to iterate over all choices; introduces clean symmetric helpers for normalizing, parsing, and extracting choice fields.
packages/data-designer-engine/src/data_designer/engine/models/facade.py Adds n to _COMPLETION_REQUEST_FIELDS so completion() forwards it; strips n in generate() and agenerate() since those methods only consume a single parsed result.
packages/data-designer-engine/src/data_designer/engine/models/clients/adapters/anthropic.py Adds n to the Anthropic exclusion list so it is never forwarded to Anthropic's API.
packages/data-designer-engine/src/data_designer/engine/models/clients/init.py Exports the new ChatCompletionChoice type from the package's public surface.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ModelFacade
    participant ModelClient
    participant Parser as parse_chat_completion_response

    Note over Caller,Parser: completion() path — n is forwarded
    Caller->>ModelFacade: "completion(messages, n=4)"
    ModelFacade->>ModelFacade: consolidate_kwargs → n kept
    ModelFacade->>ModelFacade: "_build_chat_completion_request → ChatCompletionRequest(n=4)"
    ModelFacade->>ModelClient: client.completion(request)
    ModelClient->>Parser: parse_chat_completion_response(raw)
    Parser->>Parser: normalize_choice_list → [c0, c1, c2, c3]
    Parser->>Parser: parse_chat_completion_choice × 4
    Parser-->>ModelClient: "ChatCompletionResponse(message=c0.message, choices=[c0..c3])"
    ModelClient-->>ModelFacade: response
    ModelFacade-->>Caller: response.message (first) or response.choices[i]

    Note over Caller,Parser: generate() path — n is stripped
    Caller->>ModelFacade: "generate(prompt, n=4)"
    ModelFacade->>ModelFacade: completion_kwargs.pop(n) → n removed
    ModelFacade->>ModelFacade: completion(messages) [no n]
    ModelFacade-->>Caller: (parsed_output, messages)

    Note over Caller,Parser: Anthropic path — n excluded from body
    Caller->>ModelFacade: "completion(messages, n=4)"
    ModelFacade->>ModelClient: "AnthropicClient.completion(request with n=4)"
    ModelClient->>ModelClient: "TransportKwargs.from_request(request, exclude={n})"
    ModelClient-->>ModelFacade: response (n never sent to Anthropic API)
Loading

Reviews (3): Last reviewed commit: "strip n from generate requests" | Re-trigger Greptile

@nabinchha nabinchha deployed to agentic-ci May 18, 2026 16:31 — with GitHub Actions Active
@github-actions
Copy link
Copy Markdown
Contributor

Review: PR #672 — fix: restore chat completion multi-choice support

Summary

The PR restores chat-completion multi-choice support (issue #620) by:

  • Adding n to ChatCompletionRequest and forwarding it through ModelFacade.completion() and OpenAI-compatible transport bodies; explicitly excluded from Anthropic forwarding.
  • Adding a new ChatCompletionChoice dataclass and a choices: list[ChatCompletionChoice] field on ChatCompletionResponse. response.message is preserved as the first choice for backward compatibility via __post_init__.
  • Refactoring parse_chat_completion_response / aparse_chat_completion_response to iterate every returned choice (instead of discarding everything after index 0), with shared helpers parse_chat_completion_choice, parse_assistant_message, parse_choice_index, parse_choice_finish_reason, and normalize_choice_list.
  • Adding tests for request n forwarding, multi-choice parsing, single-choice compatibility, and generate(..., n=...) forwarding.

The change is well-scoped and backward-compatible. Existing single-choice callers continue to work via response.message; new callers can iterate response.choices.

Findings

Correctness

  • generate(..., n=...) is a footgun. The n keyword is now on the _COMPLETION_REQUEST_FIELDS allowlist (facade.py:83), so callers can pass it to generate() — and the test test_generate_forwards_n_to_completion exercises that. But generate() only ever consumes completion_response.message.content (facade.py:372, facade.py:475); the remaining n-1 choices are silently dropped. Callers pay for output tokens for no benefit. Options worth considering:
    • Reject n>1 in generate() with a clear error, or
    • Document explicitly in the docstring that n is forwarded but only the first choice is parsed (the PR description notes this, but the public API doesn't), or
    • Pop n out of kwargs in generate() before delegating, since generate semantics don't currently accommodate it.
  • Anthropic silently drops n. n is added to AnthropicClient._TRANSPORT_EXCLUDE (correct — Anthropic Messages API doesn't accept it), so a caller who passes n=4 to an Anthropic-backed model gets a single response with no warning. Consider logging at debug or warning level when n>1 is supplied to a provider that doesn't honor it; otherwise this is an easy-to-miss surprise for users switching providers.
  • generated_images semantics changed for multi-choice. Previously extract_usage received len(images) from the first choice only; now it receives sum(len(c.message.images) for c in choices) (parsing.py:41, parsing.py:55). Identical for n=1 (the only currently exercised path) but worth confirming this aggregation is what billing/usage downstream expects when multi-choice image generation is added.
  • No validation of n. A caller can pass n=0 or a negative integer; the dataclass accepts it and forwards. Fine if the upstream provider is the source of truth, but a small __post_init__ check on ChatCompletionRequest would fail fast.

API design

  • ChatCompletionResponse.message and choices[0].message can drift. If both are passed at construction (as some external callers might), __post_init__ only backfills choices when empty — it does not assert consistency. The internal parsers always keep them in sync, but a stricter invariant (assert/normalize) would prevent subtle bugs later. Lightweight: in __post_init__, after backfill, assert self.message is self.choices[0].message.
  • messages property name collides conceptually with the request messages field. ChatCompletionResponse.messages returns a list[AssistantMessage], while ChatCompletionRequest.messages is the conversation history. assistant_messages or just steering callers toward choices would reduce ambiguity. Minor.

Tests

  • Coverage of the sync path is solid: request forwarding (test_chat_completion_request_n_is_forwarded_into_body), multi-choice parsing (test_parse_chat_completion_response_preserves_all_choices), backward-compat single-message (test_chat_completion_response_exposes_choices_for_single_message), and facade forwarding (test_completion_forwards_n_to_request, test_generate_forwards_n_to_completion).
  • No async multi-choice parsing test. aparse_chat_completion_response and aparse_chat_completion_choice mirror the sync logic but are uncovered. A 1-line async equivalent of test_parse_chat_completion_response_preserves_all_choices would close that gap.
  • No test asserting n is excluded from Anthropic payloads. Adding n to _TRANSPORT_EXCLUDE is a correctness-critical change (Anthropic would 400 on it); a test that builds an AnthropicClient payload from a request with n=4 and asserts "n" not in payload would lock that in.
  • No test for normalize_choice_list edge cases. Specifically, the "single non-list choice" branch (return [raw_choices]) is untested. Worth a unit test if any provider actually returns that shape — otherwise consider deleting the branch and just handling None + list.

Style / conventions

  • Follows project conventions: from __future__ import annotations, modern type syntax (int | None), absolute imports, no comments on obvious code. Good.
  • parse_chat_completion_choice and aparse_chat_completion_choice differ only in sync vs async image extraction — acceptable duplication given the project's existing sync/async split pattern (extract_images_from_chat_message vs aextract_images_from_chat_message).

Performance / security

  • No performance impact — multi-choice parsing iterates choices once. No new I/O.
  • No security implications.

Structural Impact

No pre-computed structural impact analysis was available (/tmp/structural-impact-672.md not present). Manual assessment: the change is contained to data_designer.engine.models.clients and data_designer.engine.models.facade. Import direction (interface → engine → config) is preserved; no new cross-package edges. The ChatCompletionResponse shape change is additive (new field with default factory + backfilling __post_init__), so external consumers reading response.message are unaffected. Risk: low.

Verdict

Approve with minor follow-ups. The core change is clean, backward-compatible, and well-tested for the sync OpenAI-compatible path. The two items I'd most like to see addressed before merge:

  1. Decide on the generate(..., n=...) story — either reject, document, or strip n in generate(). Tests currently encode the silent-drop behavior, which makes it harder to change later.
  2. Add a test asserting n is excluded from the Anthropic payload, since that's the only thing preventing a 400 when an Anthropic-backed model is used.

Nice-to-have: async multi-choice parsing test, soft validation of n>=1, and a clarifying note on ChatCompletionResponse.messages vs choices.

@nabinchha
Copy link
Copy Markdown
Contributor Author

Addressed the feedback in b2e22f7f:

  • generate() and agenerate() now strip n before delegating to completion, since those APIs only expose one parsed result.
  • Replaced the generate(..., n=...) forwarding test with sync/async tests that assert n is dropped.
  • Added async multi-choice parsing coverage.
  • Added Anthropic coverage confirming n is excluded from the payload.

Validation:

  • PYTHONPATH=packages/data-designer-config/src:packages/data-designer-engine/src:packages/data-designer/src uv run --group dev pytest packages/data-designer-engine/tests/engine/models -q532 passed
  • uv run --group dev ruff check ... on touched files → passed

nabinchha added 2 commits May 18, 2026 12:19
Restore the chat completion n request field and preserve all returned choices in the canonical response while keeping response.message as the first choice.

Add coverage for request forwarding, compatibility access, multi-choice parsing, and generate forwarding.

Fixes #620

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
Prevent generate and agenerate from forwarding multi-choice requests that they cannot expose, while keeping completion() multi-choice support intact.

Add coverage for async parsing and Anthropic n exclusion.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
@nabinchha nabinchha force-pushed the nmulepati/fix-620-chat-completion-choices-n branch from b2e22f7 to a0e0fc0 Compare May 18, 2026 18:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ChatCompletionResponse lost .choices and ChatCompletionRequest lost n field

1 participant