Integrations - Inworld AI Documentation

Inworld’s API is integrated with leading voice and real-time platforms for developers. These integrations make it easy to start building real-time voice agents and voice-based experiences at scale, powered by Inworld’s radically affordable, state-of-the-art TTS models.

Daily (Pipecat)

Pipecat is an open-source Python framework for building real-time voice and multimodal AI agents that can see, hear, and speak. It’s designed for developers who want full control over how AI services, network transports, and audio processing are orchestrated, enabling ultra-low-latency, natural-feeling conversations across custom pipelines, whether running locally or in production infrastructure. Inworld voices and text-to-speech models are supported via a built-in InworldTTSService, allowing you to stream high-quality audio or generate speech on demand from within your own runtime. To get started with Pipecat + Inworld, follow this guide.

LiveKit

LiveKit is an open-source platform for developers building real-time agents. It makes it easy to integrate audio, video, text, data, and AI models while offering scalable real-time infrastructure built on top of WebRTC. Inworld voices and text-to-speech models are available as a plugin for LiveKit Agents, a flexible framework for building real-time conversational agents. This makes it easier for developers to create real-time voice experiences such as multiplayer games, agentic NPCs, customer-facing avatars, live training simulations, and more at an accessible price. To get started with LiveKit + Inworld, follow this guide.

NLX

NLX is a no-code platform for developers and businesses to build, deploy, and manage conversational AI applications across a variety of channels. It enables the creation of sophisticated, multimodal experiences that can include chat, voice, and video. Inworld TTS is available through NLX as one of the default voice providers, or you can build a custom integration. To get started with NLX + Inworld, sign up for an NLX account, or dive right in with this how-to guide.

Stream (Vision Agents)

Stream (Vision Agents) is Stream’s open-source framework that helps developers quickly build low-latency vision AI applications. Since its initial launch, the project has expanded with additional plugins, better model support, and major improvements to latency, audio, and video handling. Stream (Vision Agents) integrates Inworld’s state-of-the-art TTS models directly into its platform, giving developers an out-of-the-box way to bring natural, expressive voice to their AI agents. To get started with Stream (Vision Agents) + Inworld, follow this guide.

Ultravox

Ultravox is a real-time voice AI infrastructure layer that delivers fast, natural, and scalable voice agents. Its purpose-built inference stack powers a best-in-class speech understanding model, while developer tools including easy-to-use APIs and client-side SDKs help teams deliver production voice agents faster. Inworld voices are natively integrated with the Ultravox platform and available for use in all accounts, making it easy to create natural, conversational agents with emotionally expressive voices. To get started with Ultravox + Inworld, follow these instructions.

Vapi

Vapi is a developer platform for building advanced voice AI agents. By handling the complex infrastructure, it enables developers to focus on creating great voice experiences. Inworld’s TTS is integrated with Vapi’s platform, giving you access to Inworld’s high-fidelity, emotionally expressive voices directly on Vapi. To get started with Vapi + Inworld, follow this guide.

Voximplant

Voximplant is a serverless voice AI orchestration platform and cloud communications stack for building real-time voice agents over the phone and the web. It combines programmable telephony (PSTN, SIP, WhatsApp), WebRTC, and client SDKs with a serverless JavaScript runtime (VoxEngine), so developers can efficiently orchestrate calls, speech services, and LLMs in one environment. Inworld’s TTS is natively integrated into Voximplant’s real-time speech synthesis APIs, enabling low-latency streaming of expressive Inworld voices into any Voximplant-powered call. With a single VoxEngine scenario, you can connect your agent logic to Inworld for speech generation, route calls globally, and rapidly scale from prototype to production. To get started with Voximplant + Inworld, check out this announcement.
