Skip to content

Fix AI Gateway model regressions#14

Merged
jhaynie merged 1 commit into
mainfrom
fix-aigateway-model-regressions
May 8, 2026
Merged

Fix AI Gateway model regressions#14
jhaynie merged 1 commit into
mainfrom
fix-aigateway-model-regressions

Conversation

@jhaynie
Copy link
Copy Markdown
Member

@jhaynie jhaynie commented May 8, 2026

Summary

  • strip DeepSeek provider prefixes before forwarding OpenAI-compatible requests
  • pass through upstream non-2xx responses without parsing them as successful completions
  • parse OpenAI-compatible structured array message content for providers such as Mistral
  • add focused unit and live integration coverage for the AI Gateway regression models

Validation

  • go test ./...
  • LLMPROXY_LIVE_AIGATEWAY_REGRESSION=1 GATEWAY_DEEPSEEK_API_KEY="$(gluon secret get GATEWAY_DEEPSEEK_API_KEY --value)" GATEWAY_MISTRAL_API_KEY="$(gluon secret get GATEWAY_MISTRAL_API_KEY --value)" GATEWAY_COHERE_API_KEY="$(gluon secret get GATEWAY_COHERE_API_KEY --value)" go test -tags integration -run TestLiveAIGatewayRegressionModels -count=1 -v

Summary by CodeRabbit

  • New Features

    • Added Deepseek model provider support.
    • Enhanced OpenAI-compatible response handling to accept string or array-based content parts.
  • Bug Fixes

    • Upstream HTTP error responses are now forwarded as-is, preserving original error bodies and avoiding extra parsing.
  • Tests

    • Added live integration regression tests and expanded unit tests for error-paths and provider-prefix handling.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df9102e8-9e66-4e67-bb1d-a572099b1dc1

📥 Commits

Reviewing files that changed from the base of the PR and between 1d7ecb7 and 536df15.

📒 Files selected for processing (6)
  • aigateway_regression_live_test.go
  • autorouter.go
  • autorouter_test.go
  • providers/openai_compatible/extractor.go
  • providers/openai_compatible/extractor_test.go
  • proxy.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • providers/openai_compatible/extractor_test.go
  • proxy.go
  • aigateway_regression_live_test.go
  • providers/openai_compatible/extractor.go
📜 Recent review details
🧰 Additional context used
🪛 golangci-lint (2.12.1)
autorouter.go

[error] 228-228: Error return value of resp.Body.Close is not checked

(errcheck)

🔇 Additional comments (4)
autorouter.go (2)

585-585: Nice catch adding deepseek to known provider prefixes.

This aligns model normalization with the new DeepSeek routing behavior.


226-233: ⚡ Quick win

Repository access required for verification.

Unable to verify the errcheck violation claim at line 228 of autorouter.go due to repository access failure. Verification requires inspection of: (1) actual code content at the specified lines, (2) project lint configuration to confirm errcheck enforcement, and (3) error handling patterns used in the codebase to ensure consistency.

autorouter_test.go (2)

230-283: Strong regression coverage for upstream error pass-through.

These tests clearly lock in the intended behavior: return upstream error bodies as-is and skip extractor parsing on non-2xx responses.

Also applies to: 348-458


285-346: DeepSeek prefix handling is well covered.

The dedicated forwarding test plus table-case extension make the prefix-strip behavior explicit and stable.

Also applies to: 603-603


📝 Walkthrough

Walkthrough

The PR short-circuits upstream HTTP error responses in router and proxy, adds OpenAI-compatible response content normalization, includes deepseek provider prefix stripping, and adds/updates unit and gated live integration tests covering these behaviors.

Changes

Error Handling & Provider Compatibility

Layer / File(s) Summary
OpenAI Content Normalization
providers/openai_compatible/extractor.go, providers/openai_compatible/extractor_test.go
ResponseMessage.UnmarshalJSON normalizes incoming content from either a string or typed-part array into a single concatenated string; imports added for string handling.
Error Response Short-circuiting
autorouter.go, proxy.go
AutoRouter.roundTrip and Proxy.roundTrip now detect upstream status >= 400, read the raw response body, and return immediately without invoking the provider extractor, returning empty metadata and closing resp.Body when present.
DeepSeek Provider Prefix Support
autorouter.go, autorouter_test.go
knownProviderPrefixes extended to include deepseek; tests added to validate stripping of deepseek/ from model identifiers before forwarding.
Unit Test Coverage
autorouter_test.go
Tests verify upstream error-body passthrough without extractor invocation for AutoRouter and Proxy, Deepseek prefix stripping, Cohere empty-error handling, and many tests updated to create requests with contexts.
Live Integration Testing
aigateway_regression_live_test.go
Integration-tagged test gated by LLMPROXY_LIVE_AIGATEWAY_REGRESSION=1 iterates provider scenarios with API-key gating, routes requests through AutoRouter, and validates 2xx responses; includes helper to truncate response bodies for failure messages.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
autorouter_test.go (1)

263-263: 💤 Low value

Static analysis suggests using httptest.NewRequestWithContext.

The noctx linter flags httptest.NewRequest calls. While httptest.NewRequest uses context.Background() internally, using NewRequestWithContext explicitly makes context handling clear.

Note: This pattern is used throughout the existing tests in this file, so this is a low-priority cleanup that could be addressed project-wide.

Example fix (if adopting project-wide)
-	req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewReader([]byte(`{"model":"gpt-4"}`)))
+	req := httptest.NewRequestWithContext(context.Background(), "POST", "/v1/chat/completions", bytes.NewReader([]byte(`{"model":"gpt-4"}`)))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@autorouter_test.go` at line 263, Replace the call to httptest.NewRequest in
the test (the variable req in autorouter_test.go) with
httptest.NewRequestWithContext and pass an explicit context (e.g.,
context.Background()) so the request has an explicit context; update the test to
import context if missing and follow the same replacement for other
httptest.NewRequest calls in this file to satisfy the noctx linter.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@autorouter.go`:
- Around line 226-232: The error-handling branch in autorouter.go reads
resp.Body but never closes it; update the block that handles resp.StatusCode >=
http.StatusBadRequest to ensure resp.Body.Close() is called in all returns (both
when io.ReadAll returns data and when it returns readErr) — e.g., call
resp.Body.Close() immediately after io.ReadAll (and also before returning on
readErr) so the response body is always closed (same fix pattern as used in
proxy.go).

In `@proxy.go`:
- Around line 129-135: The error branch reads resp.Body into rawBody but never
closes the original response body, leaking the connection; in the block that
handles resp.StatusCode >= http.StatusBadRequest (the code that reads via
io.ReadAll(resp.Body) and returns rawBody and ResponseMetadata{}), ensure you
call resp.Body.Close() after successfully reading (or use defer immediately
after checking resp.Body != nil) and before returning the wrapped NopCloser so
the original body is released back to the pool; update the error-return path
around resp and rawBody to close resp.Body reliably.

---

Nitpick comments:
In `@autorouter_test.go`:
- Line 263: Replace the call to httptest.NewRequest in the test (the variable
req in autorouter_test.go) with httptest.NewRequestWithContext and pass an
explicit context (e.g., context.Background()) so the request has an explicit
context; update the test to import context if missing and follow the same
replacement for other httptest.NewRequest calls in this file to satisfy the
noctx linter.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a227cfba-0bc9-4b15-8770-49892adbd4f7

📥 Commits

Reviewing files that changed from the base of the PR and between 235565c and 1d7ecb7.

📒 Files selected for processing (6)
  • aigateway_regression_live_test.go
  • autorouter.go
  • autorouter_test.go
  • providers/openai_compatible/extractor.go
  • providers/openai_compatible/extractor_test.go
  • proxy.go
📜 Review details
🧰 Additional context used
🪛 golangci-lint (2.12.1)
autorouter_test.go

[error] 263-263: net/http/httptest.NewRequest must not be called. use net/http/httptest.NewRequestWithContext

(noctx)

🔇 Additional comments (9)
providers/openai_compatible/extractor.go (1)

139-180: LGTM! Clean implementation of multi-format content handling.

The UnmarshalJSON correctly normalizes OpenAI-compatible message content from both string and array-of-parts formats. The fallback to raw string on line 170 provides graceful degradation for unexpected formats.

providers/openai_compatible/extractor_test.go (1)

137-182: LGTM! Good test coverage for the new array content parsing.

The test validates the key scenario (concatenating multiple text parts from an array) and verifies raw body preservation.

autorouter.go (1)

582-582: LGTM! DeepSeek prefix added correctly.

The addition of "deepseek": true to knownProviderPrefixes enables proper stripping of deepseek/ prefixes before forwarding requests.

aigateway_regression_live_test.go (1)

1-102: LGTM! Well-structured live integration test.

The test is properly guarded with environment variables and build tags. The table-driven approach with per-provider skip conditions provides good flexibility for selective testing.

autorouter_test.go (5)

230-283: LGTM! Comprehensive test for upstream error pass-through.

The test correctly verifies that:

  1. The extractor is not called for error responses
  2. The upstream error body is preserved
  3. Metadata remains empty
  4. No error is returned to the caller

285-346: LGTM! Good test for DeepSeek prefix stripping.

The test verifies that deepseek/deepseek-v4-pro is correctly stripped to deepseek-v4-pro before being forwarded upstream.


348-413: LGTM! Good edge case coverage for empty error bodies.

The test confirms that an empty upstream error body (Cohere scenario) doesn't trigger extractor errors.


415-458: LGTM! Proxy error pass-through test matches AutoRouter behavior.

Good coverage ensuring both Proxy and AutoRouter handle upstream errors consistently.


603-603: LGTM! Test case added for deepseek prefix stripping.

Comment thread autorouter.go
Comment thread proxy.go
@jhaynie jhaynie force-pushed the fix-aigateway-model-regressions branch from 1d7ecb7 to 536df15 Compare May 8, 2026 11:11
@jhaynie jhaynie merged commit 6a5084d into main May 8, 2026
2 checks passed
@jhaynie jhaynie deleted the fix-aigateway-model-regressions branch May 8, 2026 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant