The inworld/inworld-stt-1 model supports 30 languages for speech recognition.
Supported languages
| Language | Code |
|---|---|
| Arabic | ar |
| Cantonese | yue |
| Chinese | zh |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| English | en |
| Filipino | fil |
| Finnish | fi |
| French | fr |
| German | de |
| Greek | el |
| Hindi | hi |
| Hungarian | hu |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Korean | ko |
| Macedonian | mk |
| Malay | ms |
| Persian | fa |
| Polish | pl |
| Portuguese | pt |
| Romanian | ro |
| Russian | ru |
| Spanish | es |
| Swedish | sv |
| Thai | th |
| Turkish | tr |
| Vietnamese | vi |
Specifying a language
The language field is a language hint: it tells the model which language to prefer, but the hint is not guaranteed to be respected. The model automatically detects the spoken language from the audio, and you can switch languages in the middle of a conversation without changing the hint.
The field accepts the ISO 639 language codes listed in the table above. Most are two-letter ISO 639-1 codes (e.g., en, ja); Cantonese (yue) and Filipino (fil) use three-letter codes.
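As a rough illustration, a client could check a hint against the table before sending it. The sketch below keeps only the primary subtag of a value (so a region-tagged value such as en-US is reduced to en) and looks it up in the table. The helper name `normalize_language_hint` and the `SUPPORTED_CODES` set are hypothetical and not part of the Inworld API; the service is documented to handle region tags itself, so this is only an optional pre-flight check.

```python
from typing import Optional

# Codes copied from the supported-languages table above (30 entries).
SUPPORTED_CODES = frozenset({
    "ar", "yue", "zh", "cs", "da", "nl", "en", "fil", "fi", "fr",
    "de", "el", "hi", "hu", "id", "it", "ja", "ko", "mk", "ms",
    "fa", "pl", "pt", "ro", "ru", "es", "sv", "th", "tr", "vi",
})


def normalize_language_hint(tag: str) -> Optional[str]:
    """Return the base language code for a hint, or None if it is not in the table.

    Hypothetical client-side helper, not an Inworld API call.
    """
    # Keep only the primary subtag: "en-US" -> "en", "ja-JP" -> "ja".
    base = tag.strip().split("-", 1)[0].lower()
    return base if base in SUPPORTED_CODES else None
```

For example, `normalize_language_hint("en-US")` returns `"en"`, while a code that is not in the table returns `None`. Because the model detects the spoken language on its own, the caller could simply omit the hint in that case.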
BCP-47 codes (e.g., en-US, ja-JP) are also accepted and are automatically converted to the base ISO 639 language code; for example, en-US becomes en. Regional variants do not affect recognition behavior.
Third-party provider languages
The Inworld STT API also supports models from third-party providers, each with its own language coverage. See the provider documentation for details:
| Provider | Models | Language documentation |
|---|---|---|
| Groq | groq/whisper-large-v3 | Whisper — supported languages |
| AssemblyAI | assemblyai/universal-streaming-multilingual, assemblyai/u3-rt-pro, assemblyai/whisper-rt | AssemblyAI — supported languages |
| Soniox | soniox/stt-rt-v4 | Soniox — supported languages |
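Since language coverage depends on the model, a client may want to decide which language list applies to a given model ID. The sketch below assumes model IDs follow the `provider/model` pattern seen in the table and on this page; the function names are hypothetical helpers, not part of the Inworld API.

```python
def provider_of(model_id: str) -> str:
    """Return the provider prefix of a model ID such as 'groq/whisper-large-v3'."""
    # Assumption: the text before the first "/" names the provider.
    provider, sep, _model = model_id.partition("/")
    if not sep or not provider:
        raise ValueError(f"expected 'provider/model', got {model_id!r}")
    return provider


def uses_inworld_language_table(model_id: str) -> bool:
    """True if the supported-languages table on this page applies to the model."""
    return provider_of(model_id) == "inworld"
```

For any model where `uses_inworld_language_table` returns `False`, the provider's own language documentation is the place to check.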
Next steps
Developer Quickstart
Make your first STT API call and get a transcript.
API Reference
View the complete API specification.