Inworld provides a family of state-of-the-art TTS models, optimized for different use cases, quality levels, and performance requirements.

Inworld TTS 1.5 Max

Our flagship model, delivering the best balance of quality and speed.

  • Rich, expressive, contextually aware speech
  • Support for 15 languages
  • Optimized for real-time use (<200 ms median latency)
  • High-quality instant voice cloning
  • Enhanced timestamps with phonetic details and visemes

Inworld TTS 1.5 Mini

Our ultra-fast, most cost-efficient model, for when latency is the top priority.

  • Ultra-low latency (~120 ms median)
  • Support for 15 languages
  • Radically affordable pricing
  • High-quality instant voice cloning
  • Enhanced timestamps with phonetic details and visemes

Models overview

| Name | Model ID | Description | Supported languages |
| --- | --- | --- | --- |
| Inworld TTS 1.5 Max | inworld-tts-1.5-max | Flagship model, best balance of quality and speed, with enhanced timestamps | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Inworld TTS 1.5 Mini | inworld-tts-1.5-mini | Ultra-fast, most cost-efficient model, with enhanced timestamps | en, zh, ja, ko, ru, it, es, pt, fr, de, pl, nl, hi, he, ar |
| Inworld TTS Max | inworld-tts-1-max | Our most powerful previous generation model, with basic timestamps support | en, de, es, fr, it, ja, ko, nl, pl, pt, ru, zh, hi |
| Inworld TTS | inworld-tts-1 | Our fastest previous generation model, with basic timestamps support | en, de, es, fr, it, ja, ko, nl, pl, pt, ru, zh, hi |
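
As a rough illustration of how the overview above might be used in client code, the sketch below encodes the model IDs and language codes from the table as a lookup and picks a current-generation model for a given language. The helper names `models_for` and `pick_model` are hypothetical and not part of any Inworld SDK. The sketch makes no API calls and assumes language codes are passed in the lowercase two-letter form shown in the table.

```python
from typing import List, Optional

# Language codes copied from the models overview table.
_LANGS_1_5 = frozenset("en zh ja ko ru it es pt fr de pl nl hi he ar".split())
_LANGS_1 = frozenset("en de es fr it ja ko nl pl pt ru zh hi".split())

# Model ID -> supported language codes, in the same order as the table.
SUPPORTED_LANGUAGES = {
    "inworld-tts-1.5-max": _LANGS_1_5,
    "inworld-tts-1.5-mini": _LANGS_1_5,
    "inworld-tts-1-max": _LANGS_1,
    "inworld-tts-1": _LANGS_1,
}


def models_for(language: str) -> List[str]:
    """Hypothetical helper: list every model ID whose table row includes `language`."""
    code = language.lower()
    return [model for model, langs in SUPPORTED_LANGUAGES.items() if code in langs]


def pick_model(language: str, prioritize_latency: bool = False) -> Optional[str]:
    """Hypothetical helper: choose a 1.5-generation model ID, or None if unsupported.

    Mini is chosen when latency is the top priority, Max otherwise,
    mirroring the positioning described in the text above.
    """
    preferred = "inworld-tts-1.5-mini" if prioritize_latency else "inworld-tts-1.5-max"
    if language.lower() in SUPPORTED_LANGUAGES[preferred]:
        return preferred
    return None


if __name__ == "__main__":
    print(pick_model("ja"))                           # flagship model for Japanese
    print(pick_model("ar", prioritize_latency=True))  # Mini when latency matters most
    print(models_for("he"))                           # Hebrew is listed only for the 1.5 models
```

Keeping the table as data rather than hard-coding a single model ID makes it easy to reject unsupported languages before issuing a request, and to update the mapping when the model list changes.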