Feature Request: Add opus/ogg format support for TTS output #111

@stayif

Background

The MiniMax TTS API currently supports these output formats: `mp3`, `wav`, `flac`, `pcm`. However, `opus` (and `ogg`) is missing, which creates friction when integrating with messaging platforms.

Use Case

Many messaging platforms natively support voice messages using the opus codec:

  • Feishu/Lark: Sends `.opus`/`.ogg` files as native audio voice bubbles (`msg_type: audio`), but treats `.mp3` as file attachments that require download before playback.
  • Telegram: Opus is the native format for voice messages, enabling waveform preview and instant playback.
  • WhatsApp, Discord: Also prefer opus for voice messages.

Without opus support at the TTS output level, developers need an additional ffmpeg conversion step (`mp3 → opus`), which:

  1. Adds latency (~2-5s per conversion)
  2. Requires an external dependency (ffmpeg)
  3. Complicates serverless / lightweight deployment scenarios
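For reference, the workaround described above can be sketched roughly as follows. This is a minimal sketch, assuming ffmpeg is installed with libopus support; the helper names `build_ffmpeg_cmd` and `mp3_to_opus` and the 32k bitrate are illustrative choices, not part of any MiniMax SDK.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(src: Path, dst: Path, bitrate: str = "32k") -> list[str]:
    """Build the ffmpeg command that re-encodes an audio file to Opus.

    Hypothetical helper for illustration; the bitrate default is an assumption.
    """
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", str(src),    # input file (the mp3 returned by the TTS API)
        "-c:a", "libopus", # encode the audio stream with the Opus codec
        "-b:a", bitrate,   # target audio bitrate
        str(dst),          # output path; a .ogg suffix selects the Ogg container
    ]


def mp3_to_opus(src: Path, bitrate: str = "32k") -> Path:
    """Convert an mp3 file to Opus-in-Ogg next to the source file.

    Requires the external ffmpeg binary, which is exactly the dependency
    this feature request would like to avoid.
    """
    dst = src.with_suffix(".ogg")
    subprocess.run(build_ffmpeg_cmd(src, dst, bitrate), check=True, capture_output=True)
    return dst
```

Every TTS response has to pass through this extra process call before it can be uploaded as a voice message, which is where the added latency and the deployment constraint come from.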

Proposal

Add `opus` as a supported value for the `audio_setting.format` parameter in both HTTP and WebSocket TTS APIs.

Preferred implementation:

  • Direct opus output from the TTS pipeline (no post-conversion)
  • Support in both sync (HTTP/WebSocket) and async TTS endpoints

Minimum viable:

  • Even a container-level mp3→opus conversion on the server side would be helpful if it removes the external dependency

API References

  • HTTP: `POST /v1/t2a_v2` (`audio_setting.format` currently accepts: mp3, wav, flac, pcm)
  • WebSocket: `wss://api.minimaxi.com/v1/t2a_v2` (same `audio_setting.format` parameter)
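To illustrate what the proposal would look like from the client side, here is a rough sketch of a request body builder. Only `audio_setting.format` and the four currently accepted values come from this issue; the `text` field name and the `build_tts_payload` helper are assumptions made for illustration, and `opus` is the proposed value, not one the API accepts today.

```python
# Formats listed in this issue as currently accepted by audio_setting.format.
CURRENT_FORMATS = {"mp3", "wav", "flac", "pcm"}
# Value requested by this issue; NOT supported by the API at the time of writing.
PROPOSED_FORMATS = {"opus"}


def build_tts_payload(text: str, audio_format: str = "mp3") -> dict:
    """Build a request body for the TTS endpoint (hypothetical helper).

    The "text" key is an assumed field name; only audio_setting.format
    is taken from the issue.
    """
    if audio_format not in CURRENT_FORMATS | PROPOSED_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format!r}")
    return {
        "text": text,  # assumed field name for the input text
        "audio_setting": {"format": audio_format},
    }


# With the proposed change, a client could request Opus directly
# and skip the local ffmpeg step entirely.
payload = build_tts_payload("Hello from the bot", audio_format="opus")
```

With this in place, the only client-side change needed to switch from mp3 to native voice messages would be the value of `audio_setting.format`.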

Additional Context

This would significantly improve the developer experience for anyone building chatbots, AI assistants, or voice-enabled agents that need to deliver TTS results as native voice messages on modern messaging platforms.
