fix(server): report idle sessions as not-running on reconnect by dimakis · Pull Request #402 · dimakis/mitzo

dimakis · 2026-06-26T13:34:52Z

Summary

Root cause: handleReconnect used isActive() (session exists in registry) to determine running state, but this returns true for sessions whose query loop is alive but idle between turns. The frontend interprets running=true as "agent is busy" and queues messages behind a 5-second fallback timer instead of sending immediately — causing the persistent "two messages needed after reattach" bug.
Fix: After the existing isActive/storeMeta.isActive checks, also check lastSpeaker from EventStore. lastSpeaker=assistant means the agent completed its turn (idle) → running=false. lastSpeaker=user means the agent hasn't answered yet → running=true.
Adds 2 test cases covering idle (lastSpeaker=assistant) and mid-turn (lastSpeaker=user) scenarios. All 136 tests pass.

Context

This bug has been the target of 13+ PRs (#320, #348, #353, #360, #362, #367, #368, #371, #372, #381, #384, #386, #392) fixing symptoms at different layers (SSE race conditions, iOS WS stale readyState, idempotency, state machine infra). None addressed the semantic mismatch: "query loop alive" ≠ "agent actively processing." The 5-second PENDING_SEND_TIMEOUT_MS timer in store.ts:355 is literally a band-aid for this — its comment says "stale running state after reconnect."

Test plan

All 136 ws-handler-v2 tests pass (134 existing + 2 new)
Manual: close app mid-session, reopen, send message — should respond immediately without needing a second "nudge" message
Verify agent mid-turn reconnect still shows running state correctly

🤖 Generated with Claude Code

handleReconnect conflated "query loop alive" with "agent actively processing a turn". isActive() returns true for sessions whose query loop is alive but idle between turns (waiting in `for await`). This caused the reconnected message to report running=true, making the frontend queue the first user message behind a 5-second fallback timer instead of sending it immediately — the "two messages to get a response after reattach" bug. Fix: check lastSpeaker from EventStore. When lastSpeaker=assistant the agent completed its last turn and is idle, so report running=false. When lastSpeaker=user the agent hasn't answered yet, so running=true. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

dimakis

Centaur Review

Found 3 issue(s) (1 warning).

`server/ws-handler-v2.ts`

Core fix is correct and well-tested, but the same idle-session running-state logic is missing from handleSwitchSession, which sends the same running field to the same client queue mechanism — leaving the 5-second delay bug reachable via the switch_session path.

🟡 regressions: handleSwitchSession (line 429) computes running without the new lastSpeaker idle check. Both reconnected and session_switched drive SET_RUNNING in the client store, which controls whether sendMessage queues behind a 5-second fallback timer. An idle session (lastSpeaker=assistant) will be reported as running=false on reconnect but running=true on switch_session, causing the same queuing delay this PR fixes — just via a different code path. [fixable]

`server/tests/ws-handler-v2.test.ts`

Core fix is correct and well-tested, but the same idle-session running-state logic is missing from handleSwitchSession, which sends the same running field to the same client queue mechanism — leaving the 5-second delay bug reachable via the switch_session path.

🔵 missing_tests (L1148): The new idle-session tests don't verify that reattachChat is NOT called when running is downgraded to false by the lastSpeaker check. The zombie test (line 1117) explicitly asserts expect(reattachChat).not.toHaveBeenCalled(), but the new idle tests skip this. Since the code at line 284 gates reattach on running, and running is now false for idle sessions, it would be good to confirm the reattach is correctly skipped — especially since the session is still alive (unlike zombies where it's removed). [fixable]
🔵 missing_tests (L1148): No test covers the case where storeMeta exists, isActive is true, but lastSpeaker is null (brand-new session that hasn't had any turns yet). The implementation handles it correctly (null !== 'assistant' → running stays true), but an explicit test would document this edge case and guard against future regressions. [fixable]

dimakis · 2026-06-26T13:37:52Z

    expect(reattachChat).not.toHaveBeenCalled();
  });

+  it('reports running=false when session is active but idle (lastSpeaker=assistant)', () => {


🔵 missing_tests: The new idle-session tests don't verify that reattachChat is NOT called when running is downgraded to false by the lastSpeaker check. The zombie test (line 1117) explicitly asserts expect(reattachChat).not.toHaveBeenCalled(), but the new idle tests skip this. Since the code at line 284 gates reattach on running, and running is now false for idle sessions, it would be good to confirm the reattach is correctly skipped — especially since the session is still alive (unlike zombies where it's removed). [fixable]

dimakis · 2026-06-26T13:37:52Z

    expect(reattachChat).not.toHaveBeenCalled();
  });

+  it('reports running=false when session is active but idle (lastSpeaker=assistant)', () => {


🔵 missing_tests: No test covers the case where storeMeta exists, isActive is true, but lastSpeaker is null (brand-new session that hasn't had any turns yet). The implementation handles it correctly (null !== 'assistant' → running stays true), but an explicit test would document this edge case and guard against future regressions. [fixable]

dimakis commented Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(server): report idle sessions as not-running on reconnect#402

fix(server): report idle sessions as not-running on reconnect#402
dimakis wants to merge 1 commit into
mainfrom
fix/reconnect-running-state

dimakis commented Jun 26, 2026

Uh oh!

dimakis left a comment

Uh oh!

dimakis Jun 26, 2026

Uh oh!

dimakis Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dimakis commented Jun 26, 2026

Summary

Context

Test plan

Uh oh!

dimakis left a comment

Choose a reason for hiding this comment

Centaur Review

server/ws-handler-v2.ts

server/__tests__/ws-handler-v2.test.ts

Uh oh!

dimakis Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

dimakis Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

`server/ws-handler-v2.ts`

`server/tests/ws-handler-v2.test.ts`