[codex] Add cost-aware task model router#205
Conversation
📝 WalkthroughWalkthroughAdds a cost-aware task complexity router and YAML configuration that scores subtasks (risk, file counts, keywords, security bonus) into high/medium/low tiers and selects provider/model/max_tokens; integrates routing into session creation and includes tests and config-loading/validation logic. Changes
Sequence DiagramsequenceDiagram
participant Factory as Agent Factory
participant Router as TaskComplexityRouter
participant Config as Routing Config Loader
participant Models as Model Pricing/Registry
Factory->>Router: route(subtask, agent_type)
Router->>Config: load_model_routing_config()
Config-->>Router: routing config (thresholds, mappings)
Router->>Models: validate provider/model & pricing
Router->>Router: compute complexity score (base, risk, files, keywords, security)
Router->>Router: clamp score → map to tier (low/medium/high)
Router->>Router: select provider/model and estimate cost
Router-->>Factory: TaskRoute(provider, model, complexity, score, cost, reasoning)
Factory->>Factory: apply route to ProviderConfig (clone & mutate)
Factory-->>Factory: return configured session
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/config/model_routing.yaml`:
- Around line 19-25: Remove the duplicate "migrate" entry from the
architecture_keywords list in model_routing.yaml so each keyword is unique;
locate the architecture_keywords array (symbol: architecture_keywords) and
delete the redundant "migrate" string occurrence to avoid skewing scoring and
make tuning clearer.
In `@apps/backend/core/providers/factory.py`:
- Around line 41-42: The code uses str(getattr(route, "provider")) and
str(getattr(route, "model")), which triggers Ruff B009; replace these constant
getattr calls with direct attribute access (e.g., use str(route.provider) and
str(route.model) or simply route.provider and route.model if they are already
strings) to silence the lint error and preserve the same behavior for the
TaskRoute-like object.
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 9-20: The import block at the top of the module is unsorted and
triggers Ruff I001; reorder and format imports into the standard groups (future,
standard library, third-party, local application) and sort each group
alphabetically, ensuring single blank lines between groups; specifically
reorganize the imports that include "from __future__ import annotations", stdlib
names (copy, logging, os, pathlib.Path, typing.Any), third-party names (yaml),
and local providers (from core.providers.config import AIEngineProvider and from
core.providers.cost_calculator import MODEL_PRICING) to comply with isort/Ruff
rules (or run isort/ruff --fix) so the linter passes.
- Around line 177-180: The YAML load can raise yaml.YAMLError but callers expect
ValueError per the function docstring; wrap the yaml.safe_load(config_file) call
in a try/except that catches yaml.YAMLError (and optionally MarkedYAMLError) and
re-raise a ValueError with a clear message including the path to the config file
so the contract is honored; keep the rest of the logic (assigning to loaded and
the isinstance check) unchanged and reference the existing variables
`config_file`, `loaded`, and `path` when creating the error message.
In `@apps/backend/core/providers/task_router.py`:
- Around line 155-162: The substring checks on description cause false
positives; update the any(...) checks that compare description against
architecture_keywords and trivial_keywords to use boundary-aware matching (e.g.,
regex word boundaries or tokenization) instead of simple "in" substring checks
so terms like "platform" won't match "format"; specifically replace the
predicates in the blocks referencing description, architecture_keywords,
trivial_keywords and the scoring updates using architecture_keyword_score and
trivial_keyword_score to use re.search(r'\b' + re.escape(keyword) + r'\b',
description) (or equivalent token-based lookup) before adjusting score.
- Around line 206-208: The current count uses len(files_to_modify) +
len(files_to_create) which double-counts paths appearing in both lists; update
the logic in the function handling subtask (referencing files_to_modify and
files_to_create) to build a deduplicated set of file paths from both lists
(e.g., combine them and take set()) and return the length of that set so shared
paths are counted only once; ensure you still handle missing keys/default empty
lists as before.
In `@tests/test_task_router.py`:
- Around line 23-124: The tests rely on default MODEL_ROUTER_* behavior but
don’t clear environment overrides, so add an autouse pytest fixture (e.g.,
isolate_model_router_env) that uses monkeypatch to remove any environment keys
starting with "MODEL_ROUTER_" before each test runs; implement it in the test
module and ensure it runs automatically (autouse=True) so TaskComplexityRouter
instantiations in tests use the intended defaults and the route/score assertions
remain deterministic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: b1124ae9-4f27-4861-abdd-082f3f7f6508
📒 Files selected for processing (6)
apps/backend/config/model_routing.yamlapps/backend/context/prioritizer.pyapps/backend/core/providers/factory.pyapps/backend/core/providers/task_router.pyapps/backend/core/providers/task_router_config.pytests/test_task_router.py
| architecture_keywords: | ||
| - "architect" | ||
| - "design" | ||
| - "refactor" | ||
| - "migrate" | ||
| - "migrate" | ||
| - "rewrite" |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove duplicate keyword entry in architecture_keywords.
Line 24 repeats "migrate", which is redundant and makes score tuning harder to reason about.
Suggested cleanup
architecture_keywords:
- "architect"
- "design"
- "refactor"
- "migrate"
- - "migrate"
- "rewrite"
- "restructure"
- "overhaul"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend/config/model_routing.yaml` around lines 19 - 25, Remove the
duplicate "migrate" entry from the architecture_keywords list in
model_routing.yaml so each keyword is unique; locate the architecture_keywords
array (symbol: architecture_keywords) and delete the redundant "migrate" string
occurrence to avoid skewing scoring and make tuning clearer.
7081309 to
a5e641e
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 176-183: The loader currently opens the file with path.open(...)
and may let OSError (e.g., permission denied) bubble out, violating the
documented config-load contract; modify the block around path.open and
yaml.safe_load in the function that reads the model routing config so that any
OSError (or other IO-related exceptions from opening/reading the file) is caught
and re-raised as a ValueError with a clear message like "Cannot read model
routing config: {path}" while preserving the original exception as the __cause__
(raise ... from exc), keeping the existing yaml.YAMLError handling and the check
that loaded is a dict.
In `@apps/backend/core/providers/task_router.py`:
- Around line 66-68: In route(), don't assume subtask is dict-like: validate
that the incoming subtask is a mapping (e.g., isinstance(subtask, dict) or has
required attributes) before passing to detect_work_type and
self.risk_analyzer.analyze_subtask_risks; if it's malformed, replace it with an
empty dict (safe_subtask = {}), optionally log a warning/error referencing the
original value, and then call detect_work_type(safe_subtask) and
self.risk_analyzer.analyze_subtask_risks(safe_subtask) to avoid AttributeError
from downstream analyzers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ac303e77-55f7-4e7a-b2f1-6d603f78d225
📒 Files selected for processing (6)
apps/backend/config/model_routing.yamlapps/backend/context/prioritizer.pyapps/backend/core/providers/factory.pyapps/backend/core/providers/task_router.pyapps/backend/core/providers/task_router_config.pytests/test_task_router.py
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/core/providers/task_router_config.py`:
- Around line 130-155: The _validate_config() routine currently checks provider
and model independently and assumes routing and complexity_thresholds are
mappings; update _validate_config to first assert that config.get("routing") and
config.get("complexity_thresholds") are dicts (raise ValueError if not), then
for each complexity level ("high","medium","low") assert routing[level] is a
dict and contains both "provider" and "model", normalize provider aliases (e.g.,
map "claude" -> "anthropic") and lowercase them, verify the provider is in
AIEngineProvider (use AIEngineProvider values set), verify model exists in
MODEL_PRICING, and finally ensure the MODEL_PRICING entry for that model maps to
the same normalized provider (so invalid pairs like provider: google + model:
gpt-4o fail at load time); also validate complexity_thresholds entries are
numeric and that high > medium > low.
In `@tests/test_task_router.py`:
- Around line 287-370: Add a test that verifies passing an explicit model to
create_agent_session bypasses routing: call factory.create_agent_session with
subtask={"..."} and model="gpt-4o-mini", patch ProviderConfig.from_env, patch
factory.create_engine_provider to return a DummyProvider, and patch
TaskComplexityRouter (or assert its constructor/methods are not called); then
assert create_engine_provider was called with a ProviderConfig whose
openai_model (or model field) equals the explicit model and that the returned
session.config.model equals that model, and assert TaskComplexityRouter was not
invoked (use symbols create_agent_session, ProviderConfig.from_env,
create_engine_provider, TaskComplexityRouter to locate code).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 97d5949a-a288-4c54-ab11-a897443a129e
📒 Files selected for processing (3)
apps/backend/core/providers/task_router.pyapps/backend/core/providers/task_router_config.pytests/test_task_router.py
|



Base Branch
developbranch (required for all feature/fix PRs)main(hotfix only - maintainers)Description
Adds runtime, cost-aware model routing for agent subtasks. The router analyzes work type, predicted risks, file count, and task wording to select a low/medium/high complexity route, then maps that route to a configured provider/model.
Related Issue
Closes #
Type of Change
Area
Commit Message Format
Follow conventional commits:
<type>: <subject>Types: feat, fix, docs, style, refactor, test, chore
Example:
feat: add user authentication systemChecklist
developbranchencoding="utf-8"for text filesPlatform Testing Checklist
CRITICAL: This project supports Windows, macOS, and Linux. Platform-specific bugs are a common source of breakage.
platform/module instead of directprocess.platformchecksPath/ existing config resolution)If you only have access to one OS: CI now tests on all platforms. Ensure all checks pass before submitting.
CI/Testing Requirements
Screenshots
N/A - backend-only change.
Feature Toggle
use_feature_nameConfigured through
apps/backend/config/model_routing.yamlandMODEL_ROUTER_{LEVEL}_PROVIDER/MODEL_ROUTER_{LEVEL}_MODELenvironment overrides.Breaking Changes
Breaking: No
Details:
create_agent_session()remains backward compatible. The newsubtaskparameter is optional, and explicitmodeloverrides still bypass routing.Validation
.venv/bin/python -m ruff --version-ruff 0.14.10.venv/bin/python -m ruff check apps/backend/ tests/test_task_router.py- passed.venv/bin/python -m ruff format apps/backend/ tests/test_task_router.py --check --diff- passed.venv/bin/python -m pytest tests/test_provider_factory.py tests/test_task_router.py- 72 passed.venv/bin/python -m py_compile apps/backend/core/providers/task_router.py apps/backend/core/providers/task_router_config.py apps/backend/core/providers/factory.py apps/backend/context/prioritizer.py tests/test_task_router.py- passedSummary by CodeRabbit
Release Notes
New Features
Tests