
Wyoming TTS: crash when client specifies a voice — SynthesizeVoice object passed where str expected #593

@crudiyay

Bug

agent_cli/server/tts/wyoming_handler.py (v0.101.0, lines 93/95) passes the raw Wyoming SynthesizeVoice object into the model manager:

await self._synthesize_streaming(manager, text, synthesize.voice)
...
await self._synthesize_complete(manager, text, synthesize.voice)

manager.synthesize(text, voice=...) expects str | None, so synthesis crashes for any client that specifies a voice:

RuntimeError: argument should be a str or an os.PathLike object where __fspath__ returns a str, not 'SynthesizeVoice'

The handler's exception path then returns empty audio, so the client just hears nothing.
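For anyone who wants to see the failure mode in isolation, here is a minimal sketch. It assumes the backend ends up treating the voice as a path-like value somewhere, which is what the error message suggests but which I have not traced through the code. `SynthesizeVoice` below is a local stand-in for the Wyoming class and `voice_path` is a hypothetical helper, neither is the real agent-cli code. Plain `pathlib` raises a `TypeError` here; the exact exception type and message in the real stack may differ.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SynthesizeVoice:
    """Local stand-in for the Wyoming voice object (illustration only)."""

    name: Optional[str] = None
    language: Optional[str] = None
    speaker: Optional[str] = None


def voice_path(voice: Optional[str]) -> Path:
    """Hypothetical backend helper that assumes `voice` is a str or None."""
    return Path(voice if voice is not None else "default").with_suffix(".pt")


# A plain string (or None) works as the signature promises.
print(voice_path("bm_george"))
print(voice_path(None))

# Passing the whole object instead of its name fails inside pathlib.
try:
    voice_path(SynthesizeVoice(name="bm_george"))  # type: ignore[arg-type]
except TypeError as exc:
    print(f"TypeError: {exc}")
```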

Impact

Home Assistant sends SynthesizeVoice whenever the user picks a voice in the pipeline settings, so selecting a voice in HA makes TTS fail silently; it works only when no voice is chosen. Combined with #592, this leaves the Kokoro Wyoming path effectively broken for HA out of the box.

Repro

  1. agent-cli server tts --backend kokoro --model am_adam --model bm_george (v0.101.0, macOS arm64, uv tool install)
  2. HA Wyoming integration → pipeline TTS = agent-cli-tts, voice = bm_george
  3. Trigger any TTS (pipeline response or assist_satellite.announce) → no audio; server log shows the RuntimeError above

Fix

voice = synthesize.voice.name if synthesize.voice else None

(Verified locally: with this one-liner, HA voice selection works, and SynthesizeVoice.speaker is ignored gracefully.)
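A self-contained sketch of the same one-liner wrapped in a helper, so it can be applied once before both the streaming and complete code paths. `resolve_voice` is a hypothetical name, and the two dataclasses are local stand-ins that mirror only the fields used here, not the actual Wyoming classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SynthesizeVoice:
    """Local stand-in for the Wyoming voice object (illustration only)."""

    name: Optional[str] = None
    language: Optional[str] = None
    speaker: Optional[str] = None


@dataclass
class Synthesize:
    """Local stand-in for the Wyoming synthesize event (illustration only)."""

    text: str
    voice: Optional[SynthesizeVoice] = None


def resolve_voice(synthesize: Synthesize) -> Optional[str]:
    """Return the voice name as str | None, which is what the manager expects."""
    return synthesize.voice.name if synthesize.voice else None


# No voice chosen: the manager falls back to its default.
print(resolve_voice(Synthesize(text="hello")))

# Voice chosen by the client: only the name is forwarded.
print(resolve_voice(Synthesize(text="hello", voice=SynthesizeVoice(name="bm_george"))))

# Speaker set without a name: the speaker is ignored and None is forwarded.
print(resolve_voice(Synthesize(text="hello", voice=SynthesizeVoice(speaker="0"))))
```

With a helper like this, the handler would compute `voice` once and pass it to both `_synthesize_streaming` and `_synthesize_complete` in place of `synthesize.voice`.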
