| | Async Pro v1.0 | Async Flash v1.5 | Async Flash v1.0 |
|---|---|---|---|
| Model ID | async_pro_v1.0 | async_flash_v1.5 | async_flash_v1.0 |
| Best for | Highest-quality, low-latency English speech | Low-latency streaming with built-in text normalization | Real-time multilingual applications |
| Languages | English | English, Spanish, French, German, Italian, Portuguese | English, French, Spanish, German, Italian, Portuguese, Arabic, Russian, Romanian, Japanese, Hebrew, Armenian, Turkish, Hindi, Chinese |
| Supported endpoints | Streaming, WebSocket | Streaming, WebSocket | All (Streaming, WebSocket, HTTP sync, Timestamps) |
| Text normalization | Built-in | Built-in | Not available |
| speed_control | Not available | Not available | 0.7 – 2.0 |
| stability | Not available | Not available | 0 – 100 |
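
When an application has to pick a model at runtime, the language row of the table can be encoded as data. The following is a minimal sketch rather than part of any official SDK: the `LANGUAGES` mapping copies the table, while the helper name `models_for_language` is hypothetical and exists only for illustration.

```python
# Languages per model, copied from the comparison table.
LANGUAGES = {
    "async_pro_v1.0": {"English"},
    "async_flash_v1.5": {
        "English", "Spanish", "French", "German", "Italian", "Portuguese",
    },
    "async_flash_v1.0": {
        "English", "French", "Spanish", "German", "Italian", "Portuguese",
        "Arabic", "Russian", "Romanian", "Japanese", "Hebrew", "Armenian",
        "Turkish", "Hindi", "Chinese",
    },
}


def models_for_language(language: str) -> list[str]:
    """Return the model IDs whose language list includes `language`.

    Hypothetical helper: results follow the table's column order,
    because Python dicts preserve insertion order.
    """
    return [model for model, langs in LANGUAGES.items() if language in langs]


if __name__ == "__main__":
    for lang in ("English", "German", "Japanese"):
        print(lang, "->", models_for_language(lang))
```

A language that no model lists simply yields an empty list, which the caller can treat as "unsupported".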

Supported endpoints per model:

- `async_pro_v1.0`: Streaming (`POST /text_to_speech/streaming`), WebSocket (`WSS /text_to_speech/websocket/ws`)
- `async_flash_v1.5`: Streaming (`POST /text_to_speech/streaming`), WebSocket (`WSS /text_to_speech/websocket/ws`)
- `async_flash_v1.0`: Streaming (`POST /text_to_speech/streaming`), WebSocket (`WSS /text_to_speech/websocket/ws`), HTTP sync (`POST /text_to_speech`), Timestamps (`POST /text_to_speech/with_timestamps`)

Parameters specific to `async_flash_v1.0`:

- `speed_control` (0.7 – 2.0): adjusts the speaking speed of the synthesized voice.
- `stability` (0 – 100): adjusts how stable or expressive the synthesized voice sounds.

Choosing a model:

- `async_pro_v1.0` produces the most natural-sounding speech, handles non-standard text out of the box, and is fast enough for real-time streaming use cases.
- `async_flash_v1.5` combines fast streaming with built-in text normalization for 6 languages.
- `async_flash_v1.0` covers 15 languages and is the only model available on the synchronous HTTP and timestamps endpoints.

Limitations:

- `async_pro_v1.0` and `async_flash_v1.5` are streaming and WebSocket only. They are not available on the synchronous `POST /text_to_speech` or `POST /text_to_speech/with_timestamps` endpoints.
- The `speed_control` and `stability` parameters are only supported by `async_flash_v1.0`. They have no effect on the other models.
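
The endpoint and parameter constraints described above can be checked on the client before a request is sent. The sketch below is illustrative only: `check_request` and its constant names are hypothetical, it makes no network calls, and it assumes `async_flash_v1.0` uses the same streaming and WebSocket paths as the other two models.

```python
# Endpoint paths as listed in this section.
STREAMING = "POST /text_to_speech/streaming"
WEBSOCKET = "WSS /text_to_speech/websocket/ws"
HTTP_SYNC = "POST /text_to_speech"
TIMESTAMPS = "POST /text_to_speech/with_timestamps"

# Assumption: async_flash_v1.0 shares the streaming/WebSocket paths above.
ENDPOINTS = {
    "async_pro_v1.0": {STREAMING, WEBSOCKET},
    "async_flash_v1.5": {STREAMING, WEBSOCKET},
    "async_flash_v1.0": {STREAMING, WEBSOCKET, HTTP_SYNC, TIMESTAMPS},
}

# Documented ranges for the parameters that only async_flash_v1.0 honors.
PARAM_RANGES = {"speed_control": (0.7, 2.0), "stability": (0, 100)}
TUNABLE_MODEL = "async_flash_v1.0"


def check_request(model_id, endpoint, params=None):
    """Return a list of problems; an empty list means no conflict was found."""
    if model_id not in ENDPOINTS:
        return [f"unknown model: {model_id}"]
    problems = []
    if endpoint not in ENDPOINTS[model_id]:
        problems.append(f"{model_id} is not available on {endpoint}")
    for name, value in (params or {}).items():
        if name not in PARAM_RANGES:
            continue  # parameters not covered by this section are skipped
        low, high = PARAM_RANGES[name]
        if model_id != TUNABLE_MODEL:
            problems.append(f"{name} has no effect on {model_id}")
        elif not low <= value <= high:
            problems.append(f"{name}={value} is outside {low} to {high}")
    return problems


if __name__ == "__main__":
    print(check_request("async_flash_v1.0", HTTP_SYNC, {"speed_control": 1.2}))
    print(check_request("async_pro_v1.0", HTTP_SYNC, {"stability": 50}))
```

Returning a list of messages instead of raising on the first problem lets a caller report every conflict at once, for example an unsupported endpoint together with an ignored parameter.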