Inworld’s Realtime API (Speech-to-Speech) enables low-latency spoken conversations with voice agents. The API follows the OpenAI Realtime protocol, with extensions for additional customization.

WebSocket Quickstart

Build a voice agent with WebSocket, mic input, and audio playback.

WebRTC Quickstart

Build a voice agent with browser-native WebRTC, with no manual audio encoding.

API reference

See the full event schemas for the Realtime API.

JS examples

JavaScript examples for the Realtime API.

Python examples

Python examples for the Realtime API.

Inworld’s Realtime API is currently in research preview. Please share any feedback with us via the feedback form in Portal or on Discord.

Key Features

  • WebSocket and WebRTC transports: Connect over WebSocket or WebRTC with a standard event schema.
  • Automatic interruption handling and turn-taking: Your agent manages conversations naturally and stays resilient to user barge-in.
  • Router support: Use Inworld Routers to let a single agent dynamically handle different user cohorts, or to run A/B tests.
  • OpenAI compatibility: Drop-in replacement for the OpenAI Realtime API with a simple migration path.
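
Because the API follows the OpenAI Realtime protocol, client messages are JSON events sent over the chosen transport. The sketch below only builds and serializes three such events using event names from the OpenAI Realtime protocol (`session.update`, `input_audio_buffer.append`, `response.create`). It does not open a connection, and the helper function names are hypothetical. Which session fields Inworld accepts, and whether `response.create` is needed when automatic turn-taking is active, are assumptions to check against the API reference.

```python
import base64
import json


def session_update(instructions: str) -> str:
    """Serialize a `session.update` client event.

    The `instructions` field follows the OpenAI Realtime session schema;
    support for it on Inworld's side is assumed, not confirmed here.
    """
    return json.dumps(
        {"type": "session.update", "session": {"instructions": instructions}}
    )


def audio_append(pcm_chunk: bytes) -> str:
    """Serialize an `input_audio_buffer.append` event.

    Raw audio bytes are base64-encoded so they can travel inside JSON.
    """
    audio_b64 = base64.b64encode(pcm_chunk).decode("ascii")
    return json.dumps({"type": "input_audio_buffer.append", "audio": audio_b64})


def response_create() -> str:
    """Serialize a `response.create` event asking the agent to respond."""
    return json.dumps({"type": "response.create"})


if __name__ == "__main__":
    # Print the frames a client might send, in order, over an open connection.
    frames = (
        session_update("You are a helpful voice agent."),
        audio_append(b"\x00\x00" * 4),  # four silent 16-bit samples as a stand-in
        response_create(),
    )
    for frame in frames:
        print(frame)
```

Each returned string would be sent as one text message over the connection; the quickstarts cover the endpoint, authentication, and audio format.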

Guides

Using realtime models

Configure sessions, send input, and orchestrate responses.

Managing conversations

Session lifecycle and conversation events.

OpenAI migration

Step-by-step guide to switch from OpenAI to Inworld.

See the API reference for full event schemas.