Skip to content

fix(adapter): tolerate missing role in TraceToMessages prompt entries#523

Open
itsuzef wants to merge 1 commit intomicrosoft:mainfrom
itsuzef:fix/trace-to-messages-missing-role
Open

fix(adapter): tolerate missing role in TraceToMessages prompt entries#523
itsuzef wants to merge 1 commit intomicrosoft:mainfrom
itsuzef:fix/trace-to-messages-missing-role

Conversation

@itsuzef
Copy link
Copy Markdown

@itsuzef itsuzef commented Apr 28, 2026

Summary

Some tracers (notably AgentOps when re-emitting prior turns inside gen_ai.prompt.N.*) drop the role key on assistant tool-call entries and tool-response entries. convert_to_openai_messages (agentlightning/adapter/messages.py:134) then crashed with KeyError: 'role' at the first such entry, taking down the entire rollout's adapter pass and silently dropping the trajectory's gradient signal during APO.

This change makes the adapter recover when the role is unambiguous, and otherwise log a warning and skip just the offending message:

  • tool_calls present, no role → infer assistant
  • tool_call_id present, no role → infer tool
  • otherwise → logger.warning(...) once and skip that single entry, preserving the rest of the trajectory's signal

The adapter is the contract between any OTel-emitting tracer and the algorithm layer, so the right place to absorb this kind of upstream variance is here — even if AgentOps tightens its serialization, third-party tracers (Phoenix, hand-rolled agl.emit_*, etc.) will keep producing the same shape.

No API change. Zero new pyright errors against messages.py.

Fixes #425
Fixes #311

Verification

Validated end-to-end inside a linux/amd64 container running the official ghcr.io/astral-sh/uv:python3.12-bookworm image with the project's own lockfile, mirroring the ubuntu-latest CI runners:

uv sync --group dev                                                              # 569 packages installed from uv.lock
uv run --no-sync pytest tests/adapter/ -v                                        # 23 passed (was 21; +2 new regression tests)
uv run --no-sync pyright agentlightning/adapter/messages.py \
                         tests/adapter/test_messages_adapter.py                  # 0 errors, 0 warnings, 0 informations
uv run --no-sync pre-commit run --files agentlightning/adapter/messages.py \
                                       tests/adapter/test_messages_adapter.py    # all hooks Passed (isort, black, prettier, eslint, stylelint, ...)

I also confirmed both new regression tests genuinely fail when the patch is removed, with the exact KeyError: 'role' at messages.py:134 reported in #425 — so the test coverage is meaningful and pinned to the user-reported repro shape.

Test plan

  • tests/adapter/test_messages_adapter.py::test_trace_messages_adapter_recovers_assistant_role_from_tool_calls — covers the AgentOps-style re-emitted assistant tool-call entry (no role, has tool_calls) and the matching tool response (no role, has tool_call_id)
  • tests/adapter/test_messages_adapter.py::test_trace_messages_adapter_skips_unidentifiable_prompt_entry — covers the unrecoverable case: garbage entry is skipped with a warning, the rest of the trajectory passes through
  • Pre-existing tests in tests/adapter/test_messages_adapter.py still pass (no behavior change for well-formed traces)

Some tracers (notably AgentOps when re-emitting prior turns inside
``gen_ai.prompt.N.*``) drop the ``role`` key on assistant tool-call
entries and tool-response entries. ``convert_to_openai_messages`` then
crashed with ``KeyError: 'role'`` at the first such entry, taking down
the entire rollout's adapter pass and silently dropping the trajectory's
gradient signal during APO.

Recover when the role is unambiguous (``tool_calls`` -> ``assistant``,
``tool_call_id`` -> ``tool``) and otherwise log a warning and skip just
the offending message, preserving the rest of the trajectory's signal.

Adds regression coverage in tests/adapter/test_messages_adapter.py for
both the inferred-role path and the unidentifiable-skip path.

Fixes microsoft#425
Fixes microsoft#311
@itsuzef
Copy link
Copy Markdown
Author

itsuzef commented Apr 28, 2026

@microsoft-github-policy-service agree

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant