Streaming Audio
Stream audio in real time as it’s generated
POST
Get audio chunks progressively instead of waiting for the complete file. Perfect for real-time applications like chatbots and voice assistants.
Request
Streaming uses the same payload as regular speech generation, but audio arrives incrementally.
Parameters
- Bearer token: Bearer sk-your-api-key
- Text to convert (up to 5,000 characters). Audio streams as it’s generated.
- Voice identifier, such as lyra, kai, or zara. See all voices.
- Recommended models: turbo-3 for lowest latency or mini-2 for lightweight streaming. Other models are supported but send larger chunks.
Response
Returns chunked MP3 data with a Transfer-Encoding: chunked header, so playback can start before generation completes.
Common patterns
Stream and save
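A minimal sketch of this pattern, using only the Python standard library. The endpoint URL, the Authorization header name, and the JSON field names (text, voice, model) are assumptions made for illustration; take the real values from the request reference. The sketch reads the chunked response as it arrives and appends each chunk to a file.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Hypothetical endpoint: replace with the real streaming URL for your account.
STREAM_URL = "https://api.example.com/v1/speech/stream"


def stream_speech(
    text: str,
    voice: str = "lyra",
    model: str = "turbo-3",
    api_key: str = "sk-your-api-key",
    chunk_size: int = 4096,
) -> Iterator[bytes]:
    """Yield MP3 chunks as the server sends them.

    The field names "text", "voice", and "model" are assumed, not confirmed.
    """
    body = json.dumps({"text": text, "voice": voice, "model": model}).encode("utf-8")
    request = urllib.request.Request(
        STREAM_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header name
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        while True:
            # read1 returns whatever is available (up to chunk_size) instead of
            # waiting to fill the buffer; chunked framing is decoded for us.
            chunk = response.read1(chunk_size)
            if not chunk:
                break
            yield chunk


def save_stream(chunks: Iterable[bytes], path: str) -> int:
    """Write chunks to path as they arrive and return the total byte count."""
    total = 0
    with open(path, "wb") as out:
        for chunk in chunks:
            out.write(chunk)
            total += len(chunk)
    return total
```

With a valid key and endpoint, `save_stream(stream_speech("Hello there"), "hello.mp3")` would write the audio to disk while it is still being generated. Keeping the transport and the file writer separate also lets the same chunk iterator feed a player instead of a file.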
Play in browser
Handle interruptions
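One way to approach this in Python, assuming "interruption" means the user cutting playback short (for example, speaking over a voice assistant). The function name, the handler callback, and the stop flag are all illustrative and not part of the API. The sketch stops consuming chunks once the flag is set and closes the underlying iterator so the connection is not left open.

```python
import threading
from typing import Callable, Iterable


def consume_until_interrupted(
    chunks: Iterable[bytes],
    handle_chunk: Callable[[bytes], None],
    stop: threading.Event,
) -> bool:
    """Pass chunks to handle_chunk until the stream ends or stop is set.

    Returns True if the stream finished, False if it was interrupted.
    """
    iterator = iter(chunks)
    try:
        for chunk in iterator:
            if stop.is_set():
                # Drop the chunk we just received and stop reading.
                return False
            handle_chunk(chunk)
        return True
    finally:
        # If the source is a generator wrapping an HTTP response, closing it
        # runs the generator's cleanup and releases the connection early.
        close = getattr(iterator, "close", None)
        if close is not None:
            close()
```

Another thread (a UI handler or a voice-activity detector, say) can call `stop.set()` at any time; the loop notices on the next chunk. Chunks already handed to `handle_chunk` are not recalled, so a player would still need to flush its own buffer.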
Why streaming?
Streaming delivers audio faster because you don’t wait for complete generation. Users hear the first words within a second, making your app feel more responsive.
Use streaming when:
- Building conversational AI or chatbots
- Creating real-time voice experiences
- User experience matters more than file size
- You need low perceived latency
Use regular generation when:
- Generating audio files for storage
- Creating podcasts or audiobooks
- File size optimization matters
- You process audio in batches

