Slot config UX Phase 3: non-manual chat templates (model-level + per-slot override) #802

Open
thinmintdev wants to merge 12 commits into main from afk/slot-config-phase3-templates

@thinmintdev

Contributor

Phase 3 (final) of the slot-config UX redesign (spec/plan in docs/superpowers/). Builds on Phase 1 (#796) + Phase 2 (#800).

What changes — no more hand-editing TOML for chat templates

  • Template library + catalog. Bundled Jinja templates (chatml, llama3) ship in the package and seed <model-store>/chat-templates/ on startup (absent-only; seeding is skipped silently on a read-only store). GET /api/chat-templates lists them plus auto; POST accepts a pasted custom template whose id is validated.
  • Model-level default. The model recipe editor gains a Chat template field writing defaults.chat_template.
  • Per-slot override. The edit drawer's Model group gets a Template row showing the model's template read-only with [Override] → a picker writing the slot's chat_template (slot wins). Fires the non-blocking restart.
  • Applied by the container. resolve_chat_template(slot_cfg, model_info) picks the effective id (slot > model > none, which means auto); the container provider emits --chat-template-file <store>/chat-templates/<id>.jinja. The store is mounted read-only at the identical path inside the container, so no path translation is needed. auto/none → no flag (the GGUF-embedded template is used).
  • Surfaced chat_template in the slot-list payload so the override seeds from disk.
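
The precedence described above (slot override first, then the model default, with auto or an unset value meaning "no flag") could look roughly like the sketch below. It is an illustration under assumptions, not the code in this PR: the dict shapes of slot_cfg and model_info are guessed, and treating an explicit slot value of "auto" as an override that beats the model default is one possible reading of "slot wins". Only the function name and the defaults.chat_template key come from the description.

```python
from typing import Any, Optional


def resolve_chat_template(
    slot_cfg: dict[str, Any], model_info: dict[str, Any]
) -> Optional[str]:
    """Return the effective template id, or None for the GGUF-embedded template.

    Assumed shapes (not confirmed by the PR): slot_cfg may carry a
    "chat_template" key; model_info may carry {"defaults": {"chat_template": ...}}.
    """
    slot_value = slot_cfg.get("chat_template")
    model_value = (model_info.get("defaults") or {}).get("chat_template")

    # Slot override wins; an absent or None slot value inherits the model default.
    chosen = slot_value if slot_value is not None else model_value

    # "auto" and None both mean: emit no flag and let llama-server use the
    # template embedded in the GGUF file.
    if chosen is None or chosen == "auto":
        return None
    return chosen
```

With this reading, a slot set to llama3 on a model defaulting to chatml resolves to llama3, and a slot with no value inherits chatml.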

⚠️ Needs CT105 validation

The wiring is fully covered by unit and e2e tests here, but this VM has no GPU or container runtime. Whether llama-server inside the container actually loads the mounted --chat-template-file must be validated on CT105 before relying on it.

Verification

  • Backend: test_chat_template_resolve + test_chat_templates (catalog, seeding, POST validation incl. path-traversal reject) + test_container_chat_template + existing container (40) + slot_view (with new chat_template assert) — all green. ruff check + format clean.
  • UI: npm run typecheck 0 · npm run build 0 · full Playwright e2e 239 passed / 6 skipped / 0 failed (new recipe-template + slot-override specs).
  • Regression caught + fixed in-PR: the recipe modal is mounted while a model is merely selected, so an ungated useChatTemplates fired its fetch on the models list, and a non-array response slipped past the || [] fallback and crashed the modal. Fixed with enabled:open gating, retry:false, and Array.isArray guards.

Series complete

Phases 1–3 of the slot-config UX redesign are now all up. Follow-up #799 (MTP bundle bench-retune) remains.

🤖 Generated with Claude Code

thinmintdev and others added 12 commits June 14, 2026 06:32
Per-slot mtp override in container flag-building + capability-gated MTP pill.
2 TDD tasks + a bench-retune issue. Bases on Phase 1 (#796).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add SlotConfig.mtp (bool | None) so a slot can force MTP on/off
independently of its profile. resolve_profile_flags gains an
mtp_override param (None = inherit profile). _profile_image_and_flags
accepts the override and recomputes flags from the profile's raw .flags
when set, bypassing any pre-expanded resolved_flags on ResolvedProfile.
container_spec passes slot_cfg.get("mtp") through.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds an MTP PillToggle inside the Inference FieldGroup (after Reasoning).
Renders only when slot.device starts with "gpu-rocm" AND the loaded model's
tags include "mtp". Toggle writes { mtp: next } via PUT /config then fires
a non-blocking restart (mirrors profile-change pattern). Tests C7i/C7j added
to slot-drawer-profile-v3.spec.ts; Page type re-exported from apiMock fixture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Model-level + slot-override chat templates, store-backed library, container
--chat-template-file emission. 5 TDD tasks. Container load needs CT105 validation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add chat_template to ModelDefaults and SlotConfig (slot override wins
over model default; 'auto'/None both resolve to None for GGUF-embedded
template). Add module-level resolve_chat_template() near
resolve_profile_flags. 3 TDD tests green, ruff clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…eeding

Adds GET/POST /api/chat-templates backed by <model_store_root>/chat-templates/,
seeds chatml and llama3 bundled templates at startup (absent-only, non-fatal on
read-only store), and validates custom template ids against a strict regex to
prevent path traversal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
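
The id validation and absent-only seeding described in this commit might be sketched as follows. Everything here is hypothetical except the chat-templates/ directory name and the .jinja suffix: the commit only says "strict regex", so the pattern below is an assumed example, and is_valid_template_id and seed_templates are illustrative names rather than the PR's functions.

```python
import re
from pathlib import Path

# Assumed pattern (the commit does not show the real one): lowercase letters,
# digits, '-' and '_', starting with a letter or digit, at most 64 characters.
# No '/', '\\' or '.' can match, so an id can never escape the directory.
_TEMPLATE_ID_RE = re.compile(r"[a-z0-9][a-z0-9_-]{0,63}")


def is_valid_template_id(template_id: str) -> bool:
    """True only if the whole id matches the allow-list pattern."""
    return _TEMPLATE_ID_RE.fullmatch(template_id) is not None


def seed_templates(store_root: Path, bundled: dict[str, str]) -> list[str]:
    """Copy bundled templates into the store; return the ids actually written.

    Absent-only: an existing file is never overwritten. Any OSError (for
    example a read-only store) is swallowed so startup is not blocked.
    """
    target = store_root / "chat-templates"
    written: list[str] = []
    try:
        target.mkdir(parents=True, exist_ok=True)
    except OSError:
        return written  # cannot create the directory: skip seeding silently
    for template_id, body in bundled.items():
        dest = target / f"{template_id}.jinja"
        if dest.exists():
            continue  # keep whatever is already on disk
        try:
            dest.write_text(body, encoding="utf-8")
        except OSError:
            continue  # non-fatal: move on to the next template
        written.append(template_id)
    return written
```

An allow-list with fullmatch is used here instead of rejecting known-bad substrings such as "..", since it leaves no room for encodings or separators the deny-list forgot.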
…template

Add `chat_template_path: str | None` param to `_llama_launch_plan`; when
set, inserts `["--chat-template-file", path]` into the command after
flag_tokens but before extra_tokens so a manual override in
[server].extra_args still wins on a later duplicate.

In `container_spec`, resolve the effective template id via
`resolve_chat_template(slot_cfg, model_info)` (slot override >
model defaults.chat_template > None/'auto') and compute the host path
as `Path(model_store_root()) / "chat-templates" / f"{tmpl_id}.jinja"`,
then pass it through to `_llama_launch_plan`. The model store is mounted
identical-path into the container so the host path == container path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
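
The token ordering this commit describes (template flag after flag_tokens, before extra_tokens) can be illustrated with a small sketch. build_command and template_path are hypothetical stand-ins, not the real _llama_launch_plan or container_spec, and the base command and example flags are placeholders chosen for the demonstration.

```python
from pathlib import PurePosixPath
from typing import Optional


def template_path(store_root: str, template_id: Optional[str]) -> Optional[str]:
    """Map a resolved template id to <store>/chat-templates/<id>.jinja.

    Returns None when no template is selected, so no flag is emitted.
    PurePosixPath is used because the path is consumed inside a container.
    """
    if template_id is None:
        return None
    return str(PurePosixPath(store_root) / "chat-templates" / f"{template_id}.jinja")


def build_command(
    base: list[str],
    flag_tokens: list[str],
    extra_tokens: list[str],
    chat_template_path: Optional[str] = None,
) -> list[str]:
    """Assemble the argv: base, profile flags, template flag, then extra args."""
    cmd = [*base, *flag_tokens]
    if chat_template_path:
        cmd += ["--chat-template-file", chat_template_path]
    # Extra args come last, so a manually supplied duplicate appears later in
    # the command than the generated one.
    cmd += extra_tokens
    return cmd
```

For example, build_command(["llama-server"], ["--ctx-size", "4096"], ["--chat-template-file", "/custom.jinja"], template_path("/models", "chatml")) places the generated /models/chat-templates/chatml.jinja flag ahead of the manual /custom.jinja one. Whether the later duplicate actually takes effect depends on llama-server's argument parsing, which the commit message asserts but this sketch does not verify.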
Add a chat_template <select> to the RecipeEditorModal seeded from
GET /api/chat-templates; the selected value is written into
defaults.chat_template on Save (PUT /api/models/{id}).

- NEW ui/src/api/hooks/useChatTemplates.ts — useQuery wrapper
- endpoints.ts — chatTemplates: '/api/chat-templates'
- model-modals.jsx — chatTemplate state + <select> + onSave write
- hooks/index.ts — re-export useChatTemplates
- NEW tests/e2e/specs/model-recipe-template-v3.spec.ts — 2 tests (red→green)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Part A: surface chat_template from TOML config in slot-view aggregator
alongside mtp/enable_thinking (tri-valued: string/None/absent → null).
Tests: TestConfigEnrichment.test_config_fields_surfaced +
test_absent_config_fields_surface_as_defaults.

Part B: Template row in the Model FieldGroup of EditSlotDrawer.
- Read-only display shows model-level default (model.defaults.chat_template
  or "auto") with an [Override] button.
- [Override] reveals a select fed by useChatTemplates() (auto + catalogue).
- Override state seeds from slot.chat_template; starts in override mode
  when the slot has an existing override.
- Save includes chat_template in slotBody only when changed (dirty-tracks
  against slot.chat_template, mirrors profileChanged pattern).
- A chat_template change fires the non-blocking restart (mirrors MTP/profile).
E2E: C7k added to slot-drawer-profile-v3.spec.ts.

All gates: 49 pytest, 39 playwright (chromium), typecheck 0, build 0, ruff clean.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…-storm, Array.isArray guard

The recipe modal is mounted while a model is merely selected, so an ungated
useChatTemplates fired its (unmocked-in-test / failing) fetch on the models
list and detached row controls. And when the endpoint returns a non-array, the
'|| []' / '?? []' guards let a non-array through → '.filter is not a function'
crashed the modal. Fixes: enabled:open gating, retry:false, and Array.isArray
guards at both the recipe-editor and slot-drawer template selects. Restores the
models-v3 recipe + delete-cascade e2e (full suite 239 passed).