Ref: [Main Index](https://docs.inworld.ai/llms.txt)
# Inworld AI Documentation - Full Content
This file contains every documentation page from docs.inworld.ai, generated from docs.json.
For the lightweight index, see [llms.txt](https://docs.inworld.ai/llms.txt).
## Home
#### Hello Inworld
Source: https://docs.inworld.ai/introduction
## Build with Inworld
State-of-the-art voice AI at a radically accessible price point
Low-latency, natural speech-to-speech conversations
Powerful model routing to optimize for every user and context
## Get Started
Learn how to make your first TTS API call.
Create your first LLM Router request
Build a voice agent that streams audio using WebSocket.
Get started with the Unreal AI Agent Runtime.
Get started with the Node.js Agent Runtime.
Create and chat with an AI character with Agent Runtime.
**Using AI to code?**
Paste [https://platform.inworld.ai/llms.txt](https://platform.inworld.ai/llms.txt) into Claude, ChatGPT, or Cursor to integrate Inworld TTS and Agent Runtime into your app quickly and reliably.
---
## TTS
### Get Started
#### Intro to TTS
Source: https://docs.inworld.ai/tts/tts
Inworld's text-to-speech (TTS) models offer ultra-realistic, context-aware speech synthesis, zero data retention, and precise voice cloning. They enable developers to build natural, engaging experiences with human-like speech quality at an accessible price point.
Our models can be accessed via [API](/api-reference/ttsAPI/texttospeech/synthesize-speech) or the [TTS Playground](https://platform.inworld.ai/tts-playground).
Learn how to make your first API call with a guided tutorial.
Try different TTS models and voice cloning in TTS Playground.
Browse ready-to-use GitHub samples for common use cases.
## Models
### Our flagship model, delivering the best balance of quality and speed
- Rich, expressive, contextually aware speech
- Support for 15 languages
- Optimized for real-time use (<200ms median latency)
- High quality instant voice cloning
### Our ultra-fast, most cost-efficient model. For when latency is the top priority.
- Ultra-low latency (~120ms median latency)
- Support for 15 languages
- Radically affordable pricing
- High quality instant voice cloning
## Features
| **Feature** | **TTS-1.5-Max** | **TTS-1.5-Mini** |
| :---- | :---- | :---- |
| Radically accessible pricing | $10/1M characters | $5/1M characters |
| Quality | #1 ranked, maximum stability | #1 ranked |
| P50 Latency | 200 ms | 120 ms |
| [Free instant voice cloning](/tts/voice-cloning) | | |
| Professional voice cloning | | |
| [Custom pronunciation](/tts/capabilities/custom-pronunciation) | | |
| [Multilingual](/tts/capabilities/generating-audio#language-support) | 15 languages | 15 languages |
| [Audio markups](/tts/capabilities/audio-markups) for emotion, style and non-verbals | | |
| [Timestamp alignment](/tts/capabilities/timestamps) | | |
| [On-premises deployments](/tts/on-premises) | | |
| [Zero data retention](/tts/zero-data-retention) | | |
---
#### Developer Quickstart
Source: https://docs.inworld.ai/quickstart-tts
The [TTS Playground](https://platform.inworld.ai/tts-playground) is the easiest way to experiment with Inworld’s Text-to-Speech models—try out different voices, adjust parameters, and preview instant voice clones. Once you’re ready to go beyond testing and build into a real-time application, the API gives you full access to advanced features and integration options.
In this quickstart, we’ll focus on the Text-to-Speech API, guiding you through your first request to generate high-quality, ultra-realistic speech from text.
## Make your first TTS API request
Create an [Inworld account](https://platform.inworld.ai/signup).
In [Inworld Portal](https://platform.inworld.ai/), generate an API key by going to [**Settings** > **API Keys**](https://platform.inworld.ai/api-keys). Copy the Base64 credentials.
Set your API key as an environment variable.
```shell macOS and Linux
export INWORLD_API_KEY='your-base64-api-key-here'
```
```shell Windows
setx INWORLD_API_KEY "your-base64-api-key-here"
```
On Windows, `setx` persists the variable for future sessions only; open a new terminal before running the examples.
This is the simplest way to try Inworld TTS and works well for many applications — batch audio generation, pre-rendered content, and anywhere latency isn't critical. If your application requires real-time, low-latency audio delivery, see the [streaming example](#stream-your-audio-output) in the next step.
For Python or JavaScript, create a new file called `inworld_quickstart.py` or `inworld_quickstart.js` and copy the corresponding code into it. For cURL, simply copy the command.
```python Python
import requests
import base64
import os

# Synchronous endpoint — returns complete audio in a single response.
# For low-latency or real-time use cases, use the streaming endpoint instead.
url = "https://api.inworld.ai/tts/v1/voice"

headers = {
    "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
    "Content-Type": "application/json",
}

payload = {
    "text": "What a wonderful day to be a text-to-speech model!",
    "voiceId": "Ashley",
    "modelId": "inworld-tts-1.5-max",
}

response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()

result = response.json()
audio_content = base64.b64decode(result["audioContent"])

with open("output.mp3", "wb") as f:
    f.write(audio_content)
```
```javascript JavaScript
const fs = require('fs');

async function main() {
  const url = 'https://api.inworld.ai/tts/v1/voice';

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      text: 'What a wonderful day to be a text-to-speech model!',
      voiceId: 'Ashley',
      modelId: 'inworld-tts-1.5-max',
    }),
  });

  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }

  const result = await response.json();
  const audioBuffer = Buffer.from(result.audioContent, 'base64');
  fs.writeFileSync('output.mp3', audioBuffer);
  console.log('Audio saved to output.mp3');
}

main();
```
```curl cURL
curl --request POST \
--url https://api.inworld.ai/tts/v1/voice \
--header "Authorization: Basic $INWORLD_API_KEY" \
--header 'Content-Type: application/json' \
--data '{
"text": "What a wonderful day to be a text-to-speech model!",
"voiceId": "Ashley",
"modelId": "inworld-tts-1.5-max"
}' \
| jq -r '.audioContent' | base64 -d > output.mp3
```
For Python, you may also need to install `requests` if it isn't already installed.
```bash Python
pip install requests
```
Run the code for Python or JavaScript, or enter the curl command into your terminal.
```bash Python
python inworld_quickstart.py
```
```bash JavaScript
node inworld_quickstart.js
```
You should see a saved file called `output.mp3`. You can play this file with any audio player.
## Stream your audio output
Now that you've made your first TTS API request, you can try streaming responses as well. Assuming you've already followed the instructions above to set up your API key:
First, create a new file called `inworld_stream_quickstart.py` for Python or `inworld_stream_quickstart.js` for JavaScript. Next, set your `INWORLD_API_KEY` as an environment variable. Finally, copy the following code into the file.
For this streaming example, we'll use Linear PCM format (instead of MP3), which we specify in the `audio_config`. We also include a `Connection: keep-alive` header to reuse the TCP+TLS connection across requests.
The first request to the API may be slower due to the initial TCP and TLS handshake. Subsequent requests on the same connection will be faster. Use `Connection: keep-alive` (and a persistent session in Python) to take advantage of connection reuse. See the [low-latency examples](https://github.com/inworld-ai/inworld-api-examples/tree/main/tts) in our API examples repo for more advanced techniques.
```python Python
import requests
import base64
import os
import json
import wave
import io
import time

url = "https://api.inworld.ai/tts/v1/voice:stream"

payload = {
    "text": "What a wonderful day to be a text-to-speech model! I'm super excited to show you how streaming works.",
    "voice_id": "Ashley",
    "model_id": "inworld-tts-1.5-max",
    "audio_config": {
        "audio_encoding": "LINEAR16",
        "sample_rate_hertz": 48000,
    },
}

# Use a persistent session for connection reuse (TCP+TLS keep-alive)
session = requests.Session()
session.headers.update({
    "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
    "Content-Type": "application/json",
    "Connection": "keep-alive",
})

start_time = time.time()
ttfb = None
raw_audio_data = io.BytesIO()

with session.post(url, json=payload, stream=True) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        if line.strip():
            try:
                chunk = json.loads(line)
                result = chunk.get("result")
                if result and "audioContent" in result:
                    audio_chunk = base64.b64decode(result["audioContent"])
                    if ttfb is None:
                        ttfb = time.time() - start_time
                    # Skip WAV header (first 44 bytes) from each chunk
                    if len(audio_chunk) > 44:
                        raw_audio_data.write(audio_chunk[44:])
                        print(f"Received {len(audio_chunk)} bytes")
            except json.JSONDecodeError:
                continue

total_time = time.time() - start_time

with wave.open("output_stream.wav", "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(payload["audio_config"]["sample_rate_hertz"])
    wf.writeframes(raw_audio_data.getvalue())

print("Audio saved to output_stream.wav")
print(f"Time to first chunk: {ttfb:.3f}s" if ttfb else "No chunks received")
print(f"Total time: {total_time:.3f}s")

session.close()
```
```javascript JavaScript
const fs = require('fs');

async function main() {
  const url = 'https://api.inworld.ai/tts/v1/voice:stream';
  const audioConfig = {
    audio_encoding: 'LINEAR16',
    sample_rate_hertz: 48000,
  };

  const startTime = Date.now();
  let ttfb = null;

  const response = await fetch(url, {
    method: 'POST',
    headers: {
      'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
      'Content-Type': 'application/json',
      'Connection': 'keep-alive',
    },
    body: JSON.stringify({
      text: "What a wonderful day to be a text-to-speech model! I'm super excited to show you how streaming works.",
      voice_id: 'Ashley',
      model_id: 'inworld-tts-1.5-max',
      audio_config: audioConfig,
    }),
  });

  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }

  const rawChunks = [];

  // Read the streaming response line by line
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() || '';
    for (const line of lines) {
      if (line.trim()) {
        try {
          const chunk = JSON.parse(line);
          if (chunk.result && chunk.result.audioContent) {
            const audioBuffer = Buffer.from(chunk.result.audioContent, 'base64');
            if (ttfb === null) {
              ttfb = (Date.now() - startTime) / 1000;
            }
            // Skip WAV header (first 44 bytes) from each chunk
            if (audioBuffer.length > 44) {
              rawChunks.push(audioBuffer.subarray(44));
              console.log(`Received ${audioBuffer.length} bytes`);
            }
          }
        } catch {
          continue;
        }
      }
    }
  }

  // Build WAV file from raw audio data
  const rawAudio = Buffer.concat(rawChunks);
  const header = Buffer.alloc(44);
  const sampleRate = audioConfig.sample_rate_hertz;
  const byteRate = sampleRate * 2; // 16-bit mono

  header.write('RIFF', 0);
  header.writeUInt32LE(36 + rawAudio.length, 4);
  header.write('WAVE', 8);
  header.write('fmt ', 12);
  header.writeUInt32LE(16, 16);
  header.writeUInt16LE(1, 20); // PCM
  header.writeUInt16LE(1, 22); // mono
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(byteRate, 28);
  header.writeUInt16LE(2, 32); // block align
  header.writeUInt16LE(16, 34); // bits per sample
  header.write('data', 36);
  header.writeUInt32LE(rawAudio.length, 40);

  fs.writeFileSync('output_stream.wav', Buffer.concat([header, rawAudio]));

  const totalTime = (Date.now() - startTime) / 1000;
  console.log('Audio saved to output_stream.wav');
  console.log(`Time to first chunk: ${ttfb?.toFixed(3)}s`);
  console.log(`Total time: ${totalTime.toFixed(3)}s`);
}

main();
```
Run the code for Python or JavaScript. The console prints progress as streamed chunks are received and written to the audio file.
```bash Python
python inworld_stream_quickstart.py
```
```bash JavaScript
node inworld_stream_quickstart.js
```
You should see a saved file called `output_stream.wav`. You can play this file with any audio player.
## Next Steps
Now that you've tried out Inworld's TTS API, you can explore more of Inworld's TTS capabilities.
Understand the capabilities of Inworld's TTS models.
Create a personalized voice clone with just 5 seconds of audio.
Learn tips and tricks for synthesizing high-quality speech.
---
#### TTS Models
Source: https://docs.inworld.ai/tts/tts-models
Inworld provides a family of state-of-the-art TTS models, optimized for different use cases, quality levels, and performance requirements.
---
### Build with TTS
#### Capabilities
#### Generating Audio
Source: https://docs.inworld.ai/tts/capabilities/generating-audio
## Voices
Inworld offers a variety of built-in voices across available languages that showcase a range of vocal characteristics and styles. These voices can be immediately tried out in TTS Playground and used in your applications.
For greater customization, we recommend [voice cloning](/tts/voice-cloning). Create distinct, personalized voices tailored to your experience, with as little as 5 seconds of audio.
Voices perform optimally when synthesizing text in the same language as the original voice. While cross-language synthesis is possible, you'll achieve the best quality, pronunciation, and naturalness by matching the voice's native language to your text content.
## Language Support
As a larger and more capable model, Inworld TTS 1.5 Max is better suited for multilingual applications, offering better pronunciation, more accurate intonation, and more natural-sounding speech.
Inworld's models offer support for the following languages:
- English (`en`)
- Arabic (`ar`)
- Chinese (`zh`)
- Dutch (`nl`)
- French (`fr`)
- German (`de`)
- Hebrew (`he`)
- Hindi (`hi`)
- Italian (`it`)
- Japanese (`ja`)
- Korean (`ko`)
- Polish (`pl`)
- Portuguese (`pt`)
- Russian (`ru`)
- Spanish (`es`)
## Supported Formats
Multiple audio formats are available via API to support different application requirements. The default is MP3.
- **MP3:** Popular compressed format with broad device and platform compatibility.
- Sample rate: 16kHz - 48kHz
- Bit rates: 32kbps - 320kbps
- **PCM (`PCM`):** Raw uncompressed 16-bit signed little-endian samples with no WAV header. Recommended for WebSocket use cases and real-time applications that process raw audio samples directly without needing container metadata.
- Sample rate: 8kHz - 48kHz
- Bit depth: 16-bit
- **WAV (`WAV`):** Uncompressed 16-bit signed little-endian samples with WAV header optimized for HTTP streaming. For non-streaming, the WAV header is included in the response. For HTTP streaming, the WAV header is included in the first audio chunk only, so all chunks in that response can be concatenated directly into a single valid WAV file. For WebSocket streaming, a WAV header is emitted at the first audio chunk of each `flush`/`flush_completed` event, so direct concatenation without processing is only valid within a single flush; to build one continuous WAV file across multiple flushes, clients must strip or rebuild the repeated headers between flushes.
- Sample rate: 8kHz - 48kHz
- Bit depth: 16-bit
- **Linear PCM (`LINEAR16`):** Uncompressed 16-bit signed little-endian samples with WAV header. Maintained for backward compatibility. For non-streaming, the WAV header is included in the response. For streaming (HTTP streaming or WebSocket), the WAV header is included in every audio chunk, so each chunk is a valid WAV file on its own. Clients must strip headers when concatenating chunks.
- Sample rate: 8kHz - 48kHz
- Bit depth: 16-bit
- **Opus:** High-quality compressed format optimized for low latency web and mobile applications.
- Sample rate: 8kHz - 48kHz
- Bit rates: 32kbps - 192kbps
- **μ-law:** Compressed telephony format ideal for voice applications with bandwidth constraints.
- Sample rate: 8kHz
- **A-law:** Compressed telephony format ideal for voice applications with bandwidth constraints.
- Sample rate: 8kHz
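The chunk-handling difference between `WAV` and `LINEAR16` over HTTP streaming can be sketched with a small helper. The function below is ours, not part of any SDK; it only encodes the header rules described above.

```python
def join_http_stream_chunks(chunks: list[bytes], encoding: str) -> bytes:
    """Join decoded audio chunks from a single HTTP streaming response.

    WAV: the 44-byte WAV header appears only in the first chunk, so direct
    concatenation yields one valid WAV file.
    LINEAR16: every chunk repeats the 44-byte WAV header, so strip them all
    and return raw PCM (then write your own header, as in the streaming
    quickstart).
    """
    if encoding == "WAV":
        return b"".join(chunks)
    if encoding == "LINEAR16":
        return b"".join(chunk[44:] for chunk in chunks)
    raise ValueError(f"unhandled encoding: {encoding}")
```

For `PCM` no header handling is needed at all, which is why it is the recommended choice for real-time pipelines that consume raw samples.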
## Additional Configurations
The following optional configurations can also be adjusted as needed when synthesizing audio:
- **Temperature**: Higher values increase variation, which can produce more diverse and expressive outputs but also increases the chance of bad generations and hallucinations. Lower values improve stability and speaker similarity, though values that are too low increase the chance of broken generations. The default is 1.0.
- **Talking Speed**: Controls how fast the voice speaks. 1.0 is the voice's native speed, 0.5 is half speed, and 1.5 is 1.5× the native speed.
- **Emphasis Markers**: Asterisks around a word (e.g. `*really*`) can be used to signal emphasis, prompting the voice to stress that word more strongly. This helps convey tone, intent, or emotion more clearly in spoken output.
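A hypothetical request payload combining these settings might look like the sketch below. The `text`, `voiceId`, and `modelId` fields match the quickstart; the `temperature` and `speakingRate` field names are assumptions here, so confirm the exact schema in the [synthesize-speech API reference](/api-reference/ttsAPI/texttospeech/synthesize-speech).

```python
# Sketch only: "temperature" and "speakingRate" are assumed field names;
# check the synthesize-speech API reference for the authoritative schema.
payload = {
    "text": "This launch is *really* important to us.",  # *...* marks emphasis
    "voiceId": "Ashley",
    "modelId": "inworld-tts-1.5-max",
    "temperature": 0.8,   # below the 1.0 default: favors stability over variation
    "speakingRate": 1.2,  # hypothetical name for talking speed (1.0 = native)
}
# POST this payload to https://api.inworld.ai/tts/v1/voice as in the quickstart.
```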
---
#### Voice Cloning
Source: https://docs.inworld.ai/tts/voice-cloning
Inworld's text-to-speech models offer best-in-class voice cloning capabilities, enabling developers to create distinct, personalized voices for their experiences.
There are three ways to clone a voice:
1. **Instant Voice Cloning** - Clone a voice in minutes, with only 5-15 seconds of audio. Also known as zero-shot cloning. Available to all users through Portal.
2. **Voice Cloning via API** - Instant voice cloning via API. Useful for workflow automation or enabling your users to clone their own voices.
3. **Professional Voice Cloning** - For the highest quality, fine-tune a model with 30+ minutes of audio.
Professional voice cloning is currently not publicly available. To get access, please [reach out to our sales team](https://inworld.ai/contact-sales).
Don't have audio samples? Use [Voice Design](/tts/voice-design) to create a voice from a text description instead.
## Instant Voice Cloning
In Portal, select **TTS Playground** from the left-hand side panel. In the TTS Playground, click **Create Voice** and select **Clone**.
Name your voice and select the language, which should match the audio samples. _Voices will work best when synthesizing text that matches the language of the original audio samples._
You can either upload or record audio:
- **Upload**: Drag and drop or browse to upload 1 audio file. Accepted formats: wav, mp3, webm. Maximum file size is 4MB. Audio samples longer than 15 seconds will be automatically trimmed to 15 seconds.
- **Record**: Click "Record audio" and record your audio. You can use the suggested scripts to help guide your recording, or use your own script. For best results, record in a quiet place to minimize background noise, avoid mic noise, and speak with a variety of emotions to capture the full range of the voice.
Enable "Remove background noise" if you wish to remove background noise from your audio. Confirm you have the rights to clone the voice, then click "Continue".
Check out our [Voice Cloning Best Practices](/tts/best-practices/voice-cloning) for helpful tips and tricks to improve the quality of your voice clones.
Once voice cloning completes, you'll see the "Try your cloned voice" interface. Enter text in the input field and press play to hear your cloned voice. You can test different phrases to ensure the voice sounds as expected.
If the voice doesn't sound quite right, you can delete the voice and start over, create another voice, or test it in the TTS Playground for more advanced testing options.
There is a default limit of 1,000 cloned voices stored per account. If you need a higher limit, please [contact our sales team](https://inworld.ai/contact-sales).
To use the cloned voice via API, copy the voice ID for your cloned voice in TTS Playground. Use that value for the `voiceId` when making an API call. See our [Quickstart](/quickstart-tts) to learn how to make your first API call.
Instant voice cloning may not perform well for less common voices, such as children's voices or unique accents. For those use cases, we recommend professional voice cloning.
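A cloned voice works exactly like a built-in one at synthesis time. The stdlib-only helper below is our sketch of the quickstart request; the voice ID in the commented call is a placeholder for the ID you copy from TTS Playground.

```python
import base64
import json
import os
import urllib.request

def synthesize(text: str, voice_id: str, model_id: str = "inworld-tts-1.5-max") -> bytes:
    """Synthesize speech with any voice ID, including a cloned voice ID
    copied from TTS Playground. Stdlib-only variant of the quickstart call."""
    request = urllib.request.Request(
        "https://api.inworld.ai/tts/v1/voice",
        data=json.dumps(
            {"text": text, "voiceId": voice_id, "modelId": model_id}
        ).encode("utf-8"),
        headers={
            "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return base64.b64decode(json.loads(response.read())["audioContent"])

# Requires INWORLD_API_KEY and a real voice ID ("your-cloned-voice-id" is a placeholder):
# audio = synthesize("Hello from my cloned voice!", "your-cloned-voice-id")
```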
## Voice Cloning API Reference And Examples
If you want to automate voice cloning (for example, to support creator onboarding at scale), use the Voice Cloning API.
- **API reference**: [Clone a voice](/api-reference/voiceAPI/voiceservice/clone-voice)
- **Python example**: [example_voice_clone.py](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_voice_clone.py)
- **JavaScript example**: [example_voice_clone.js](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_voice_clone.js)
Voice cloning has lower rate limits than regular speech synthesis. For details, see [Rate limits](/resources/rate-limits).
## Next Steps
Looking for more tips and tricks? Check out the resources below to get started!
Learn best practices for producing high-quality voice clones.
Learn best practices for synthesizing high-quality speech.
Explore Python and JavaScript code examples for TTS integration.
---
#### Voice Design
Source: https://docs.inworld.ai/tts/voice-design
Inworld's Voice Design lets you create a completely new voice from a text description. It is perfect for when you need a unique voice but can't find the right voice in our Voice Library and don't have existing audio recordings for voice cloning.
Voice Design uses a model to generate a voice based on the following two inputs:
1. **Voice description** - A text description of the voice you have in mind (e.g., age, gender, accent, tone, pitch).
2. **Script** - The text the voice will speak. This shapes the generated voice, so using a script that matches the intended voice produces the best results.
Each time you generate, we'll return up to three voice previews so you can listen, compare, and select the ones that work best for your project.
Voice Design is currently in [research preview](/tts/resources/support#what-do-experimental-preview-and-stable-mean). Please share any feedback with us via the feedback form in [Portal](https://platform.inworld.ai) or in [Discord](https://discord.gg/inworld).
To get started, there are two ways to use Voice Design:
1. **Through Inworld Portal** - Go to TTS Playground > Create Voice > Design and follow the guided flow.
2. **Via API** - Useful if you want to generate a lot of voices or expose this capability to your users.
## Design a Voice in Portal
In [Portal](https://platform.inworld.ai/), select [**TTS Playground**](https://platform.inworld.ai/tts-playground) from the left-hand side panel. Click **Create Voice** and select **Design**.
Describe the voice you want to create. The description must be in English and be between 30 and 250 characters.
Keep your description concise but specific, so the model can most accurately produce what you have in mind. A good voice description should include:
- **Gender and age range** (e.g., "a mid-20s to early 30s female voice", "a middle-aged male voice")
- **Accent** (e.g., "British accent", "Southern American accent")
- **Pitch and pace** (e.g., "low-pitched", "fast-paced", "steady pace")
- **Tone and emotion** (e.g., "warm and friendly", "authoritative and composed")
- **Timbre** (e.g., "rich and smooth", "slightly raspy", "clear and bright")
**Example**: "A middle-aged male voice with a clear British accent speaking at a steady pace and with a neutral tone."
Use the **Improve Description** button to automatically enhance your description based on best practices. This adds missing attributes like pitch, pace, tone, and timbre to help the model produce a more accurate voice.
Choose the language for your generated voice. If you're using the auto-generated script, the script will be written in your selected language.
Select how you want to provide the script that the voice will speak:
- **Auto-generate script** - The system automatically generates a script that matches your voice description in the selected language. This is the easiest option and works well for most use cases.
- **Write my own** - Write a custom script for the voice to speak. For best results, scripts should result in 5 to 15 seconds of audio, which is roughly between 50 and 200 characters in English.
The script shapes the voice that gets generated. Use a script that matches your imagined voice, and the model will tailor the voice to suit the content it's speaking.
Click **Generate voice**, which will create up to 3 voice previews. Listen to each preview by clicking the play button, then select the voice(s) you want to keep.
Each generation produces slightly different results. If the first set of voices doesn't sound right, click **Generate voice** again, or adjust your description and script to better match what you have in mind before regenerating.
Check out our [Voice Design Best Practices](/tts/best-practices/voice-design) guide for helpful tips and tricks to improve your designed voices.
After selecting one or more voices, give each voice a name, add optional tags, and save them to your voice library. Your designed voices will appear alongside your other voices in the TTS Playground.
To use your designed voice via API, copy the voice ID from the TTS Playground. Use that value for the `voiceId` when making an API call. See our [Quickstart](/quickstart-tts) to learn how to make your first API call.
## Design a Voice via API
When designing a voice via API, there are two steps:
1. **Design a voice** - Call this endpoint to generate up to three voice previews based on a voice description and script.
2. **Publish a voice** - Publish a preview voice to your library.
## Voice Design API Reference And Examples
If you want to automate voice design (for example, to support creator onboarding at scale), use the Voice Design API.
- **Design a voice API reference**: [Design a voice](/api-reference/voiceAPI/voiceservice/design-voice)
- **Publish a voice API reference**: [Publish a voice](/api-reference/voiceAPI/voiceservice/publish-voice)
- **Python example**: [example_voice_design_publish.py](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_voice_design_publish.py)
- **JavaScript example**: [example_voice_design_publish.js](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_voice_design_publish.js)
## Next Steps
Learn best practices for designing voices.
Clone an existing voice with just 5-15 seconds of audio.
Learn how to make your first TTS API call in minutes.
---
#### Voice Tags
Source: https://docs.inworld.ai/tts/capabilities/voice-tags
Voice tags provide descriptive metadata about each voice, helping you categorize and filter voices based on their characteristics. Tags describe properties like gender, age group, tone, and style, making it easier to find the right voice for your use case.
## Understanding voice tags
Each voice includes a `tags` array with descriptive labels such as:
- **Gender**: `male`, `female`, `non-binary`
- **Age group**: `young_adult`, `adult`, `middle-aged`, `elderly`
- **Vocal style**: `energetic`, `calm`, `professional`, `friendly`, `warm`
- **Voice quality**: `smooth`, `clear`, `expressive`, `conversational`
## Adding voice tags
To add custom tags to voices, use the voice cloning feature in the TTS Playground:
### Step-by-step process
1. **Navigate to the TTS Playground**
- Go to the [TTS playground](https://platform.inworld.ai/tts-playground)
- Select the "Clone Voice" option
2. **Configure voice parameters**
- Enter a voice name and description for your cloned voice
3. **Add voice tags**
- Press Enter after each tag entry to add it to the list
4. **Upload your audio sample**
5. **Submit and process**
6. **Verify tags in voice list**
- Your new voice will appear with the assigned tags
## Using voice tags
Voice tags are returned in the [List voices](https://docs.inworld.ai/api-reference/ttsAPI/texttospeech/list-voices) endpoint response.
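Once you have the voice list, you can select voices by tag. The sketch below assumes each entry exposes `voiceId` and `tags` fields, and the tag values shown are illustrative; check the List voices reference for the exact response shape.

```python
def filter_voices(voices: list[dict], required_tags: set[str]) -> list[str]:
    """Return the IDs of voices whose tags include every required tag.

    Assumes each voice entry carries a "voiceId" string and a "tags" array,
    per the tag categories described above.
    """
    return [
        voice["voiceId"]
        for voice in voices
        if required_tags.issubset(voice.get("tags", []))
    ]

# Illustrative data, not real API output:
voices = [
    {"voiceId": "Ashley", "tags": ["female", "adult", "warm", "clear"]},
    {"voiceId": "Alex", "tags": ["male", "young_adult", "energetic"]},
]
warm_female = filter_voices(voices, {"female", "warm"})  # ["Ashley"]
```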
---
#### Audio Markups
Source: https://docs.inworld.ai/tts/capabilities/audio-markups
Audio markups let you control how the model speaks—not only what it says, but pacing, emotion, and non-verbal sounds. This page covers two kinds: **SSML break tags** for inserting silences, and **emotion, delivery, and non-verbal markups**, bracket-style tags for expression and vocalizations.
## SSML break tags
*Use when you need precise control over silence duration and position.*
You can insert silences at specific points in the generated speech. The TTS API and Inworld Portal support SSML `<break time="..."/>` tags in text input for streaming, non-streaming, and WebSocket requests, in all languages. You can specify silences in milliseconds or seconds. For example, `<break time="1s"/>` and `<break time="1000ms"/>` produce the same result.
**Constraints:**
- Use well-formed SSML: include the slash and angle brackets—for example, `<break time="1s"/>`.
- Tag names and attributes are **case insensitive**; for example, `<BREAK TIME="1s"/>` works.
- Up to **20** break tags are supported per request. After the first 20 tags, the remaining ones will be ignored.
- Each break is at most **10 seconds**—for example, `time="10s"` or `time="10000ms"`.
**Example:**
```
One second pause <break time="1s"/> two seconds pause <break time="2s"/> this is the end.
```
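Since only the first 20 break tags per request take effect, you may want to validate text client-side before sending it. The break tags use standard SSML syntax (`<break time="1s"/>`); the regex and helper below are our sketch, not part of the API.

```python
import re

# Matches SSML break tags such as <break time="1s"/> or <BREAK TIME="500ms"/>
# (tag names and attributes are case insensitive, per the constraints above).
BREAK_TAG = re.compile(r'<break\s+time="\d+(?:ms|s)"\s*/>', re.IGNORECASE)

def count_break_tags(text: str) -> int:
    """Count break tags in a request; only the first 20 take effect."""
    return len(BREAK_TAG.findall(text))

text = 'Let me think. <break time="800ms"/> Yes, that should work.'
over_limit = count_break_tags(text) > 20  # False for this text
```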
## Emotion, delivery, and non-verbal markups
*Use when you want to control emotion, delivery style, or add sounds like sighs and laughs.*
These markups give you finer control over how the model speaks: emotional expression, delivery style such as whispering, and non-verbal vocalizations such as sighs and coughs.
These markups are currently [experimental](/tts/resources/support#what-do-experimental-preview-and-stable-mean) and only support English.
### Emotion and Delivery Style
Emotion and delivery style markups control the way a given text is spoken. These work best when used at the beginning of a text and apply to the text that follows.
- **Emotion**: `[happy]`, `[sad]`, `[angry]`, `[surprised]`, `[fearful]`, `[disgusted]`
- **Delivery Style**: `[laughing]`, `[whispering]`
For example:
```
[happy] I can't believe this is happening.
```
**Best practices:** Use only one emotion or delivery style markup at the **beginning** of your text. Using multiple emotion and delivery style markups or placing them mid-text may produce mixed results. Instead, split the text into separate requests with the markup at the start of each. See our [Best Practices](/tts/best-practices/generating-speech#audio-markups) guide for more details.
### Non-verbal Vocalization
Non-verbal vocalization markups add in non-verbal sounds based on where they are placed in the text.
- `[breathe]`, `[clear_throat]`, `[cough]`, `[laugh]`, `[sigh]`, `[yawn]`
For example:
```
[clear_throat] Did you hear what I said? [sigh] You never listen to me!
```
**Best practices:** You can use multiple non-verbal vocalizations within a single piece of text to add the appropriate vocal effects throughout the speech.
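Following the best practices above (one emotion or delivery-style markup at the start of each request, non-verbal vocalizations inline), a batch of markup requests can be sketched like this. The helper is ours; the payload fields match the quickstart.

```python
def build_markup_requests(segments: list[tuple[str, str]], voice_id: str) -> list[dict]:
    """Build one synthesis payload per segment, each beginning with a single
    emotion or delivery-style markup, per the best practice above.

    `segments` pairs a markup name (e.g. "happy") with the text it applies to.
    """
    return [
        {
            "text": f"[{markup}] {text}",
            "voiceId": voice_id,
            "modelId": "inworld-tts-1.5-max",
        }
        for markup, text in segments
    ]

requests_to_send = build_markup_requests(
    [
        ("happy", "I can't believe this is happening."),
        ("sad", "But tomorrow [sigh] it will all be over."),  # non-verbals can sit mid-text
    ],
    voice_id="Ashley",
)
```

Splitting the text this way avoids the mixed results that can come from multiple emotion markups in a single request.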
---
#### Custom Pronunciation
Source: https://docs.inworld.ai/tts/capabilities/custom-pronunciation
Sometimes you may need to ensure that a word is spoken with a specific pronunciation, especially for uncommon words such as company names, brand names, nicknames, geographic locations, medical terms, or legal terms that may not appear in the model’s training data. Custom pronunciation lets you precisely control how these words are spoken.
### How to Use
Inworld TTS supports inline IPA phoneme notation for custom pronunciation. Use the [International Phonetic Alphabet (IPA)](https://www.vocabulary.com/resources/ipa-pronunciation/) format, wrapped in slashes (`/ /`).
For example:
- Suppose you are building an AI travel agent, and it is recommending the destination Crete, which is pronounced /kriːt/ (“kreet”) in English.
- You can ensure the correct pronunciation by passing it inline: `Your interests are a perfect match for a honeymoon in /kriːt/.`
The model will substitute the IPA pronunciation wherever it appears inline in your text. If the text is generated by an LLM, you can simply replace the original spelling with the IPA transcription before passing it to the TTS model.
### Finding the Right IPA Phonemes
If you are unsure of the correct phonemes, there are several ways to find them:
- **Ask an LLM like ChatGPT**: For example, you can ask:
> “What are the IPA phonemes for the word Crete, pronounced like ‘kreet’?”
- **Use reference websites**: Resources such as [Vocabulary.com’s IPA Pronunciation Guide](https://www.vocabulary.com/resources/ipa-pronunciation/) provide tables of symbols with example words.
Once you have the correct phonemes, you can embed them directly into your TTS request: `Your adventure in /kriːt/ begins today.`
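Because the substitution is inline, it amounts to a plain string replacement you can apply before calling the API. A minimal sketch:

```python
def apply_ipa(text, pronunciations):
    """Swap each listed word for its IPA transcription wrapped in slashes,
    so the TTS model uses the phonemes instead of the spelling."""
    for word, ipa in pronunciations.items():
        text = text.replace(word, f"/{ipa}/")
    return text

line = apply_ipa("Your adventure in Crete begins today.", {"Crete": "kriːt"})
# → "Your adventure in /kriːt/ begins today."
```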
---
#### Timestamps
Source: https://docs.inworld.ai/tts/capabilities/timestamps
Timestamp alignment supports English and Spanish; other languages are experimental.
Timestamp alignment lets you retrieve timing information that matches the generated audio, which is useful for experiences like word highlighting, karaoke‑style captions, and lipsync.
Set the `timestampType` request parameter to control granularity:
- `WORD`: Return timestamps for each word, including detailed phoneme-level timing with viseme symbols
- `CHARACTER`: Return timestamps for each character or punctuation
Enabling timestamp alignment can increase latency (especially for the non-streaming endpoint).
When enabled, the response includes timestamp arrays:
- `WORD`: `timestampInfo.wordAlignment` with `words`, `wordStartTimeSeconds`, `wordEndTimeSeconds`
  - For TTS 1.5 models, `phoneticDetails` containing detailed phoneme-level timing with viseme symbols
- `CHARACTER`: `timestampInfo.characterAlignment` with `characters`, `characterStartTimeSeconds`, `characterEndTimeSeconds`
Phoneme and viseme timings (`phoneticDetails`) are currently only returned for **WORD** alignment (not CHARACTER).
See the [API reference](https://docs.inworld.ai/api-reference/ttsAPI/texttospeech/synthesize-speech) for full details.
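Once a response is parsed, the three parallel arrays in `wordAlignment` can be zipped into a caption timeline. A minimal sketch using the field names above:

```python
def word_timeline(timestamp_info):
    """Pair each word with its (start, end) window from the parallel arrays."""
    wa = timestamp_info["wordAlignment"]
    return list(zip(wa["words"],
                    wa["wordStartTimeSeconds"],
                    wa["wordEndTimeSeconds"]))

sample = {"wordAlignment": {
    "words": ["Hello,", "world"],
    "wordStartTimeSeconds": [0, 0.28],
    "wordEndTimeSeconds": [0.28, 0.8],
}}
for word, start, end in word_timeline(sample):
    print(f"{start:4.2f}s-{end:4.2f}s  {word}")  # e.g. drive word highlighting
```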
## Streaming behavior
You can control how timestamp data is delivered alongside audio using [`timestampTransportStrategy`](/api-reference/ttsAPI/texttospeech/synthesize-speech-stream#body-timestamp-transport-strategy).
### Sync (default)
Audio and alignment arrive together in each chunk. Every chunk contains both audio data and its corresponding timestamps.
```
Chunk 1: audio + timestamps for chunk 1
Chunk 2: audio + timestamps for chunk 2
Chunk 3: audio + timestamps for chunk 3
```
This is the simplest approach, but the first audio chunk arrives slightly later because the server computes alignment before sending it.
### Async
Audio chunks arrive first, followed by separate trailing messages containing only timestamp data. This reduces time-to-first-audio with TTS 1.5 models, since the server doesn't need to wait for alignment computation before sending audio.
```
Chunk 1: audio only
Chunk 2: audio only
Chunk 3: audio only
Chunk 4: timestamps only (alignment for chunks 1–3)
Chunk 5: timestamps only
...
```
Use async when you prioritize playback speed and can handle timestamps arriving after their corresponding audio. Use sync when you need audio and timestamps together in each chunk (e.g., for real-time lip-sync or word highlighting during playback).
Set `timestampTransportStrategy` to `SYNC` or `ASYNC` in your request. See the [API reference](/api-reference/ttsAPI/texttospeech/synthesize-speech-stream#body-timestamp-transport-strategy) for details.
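In `ASYNC` mode, a client can route chunks as they arrive: play audio immediately and fold in alignment when the trailing messages land. A minimal sketch over already-parsed chunks (the `audioContent` field name is an assumption; `timestampInfo` matches the responses shown below):

```python
def split_stream(chunks):
    """Route ASYNC-mode chunks: audio is queued for immediate playback,
    trailing timestamp-only messages are collected as they arrive."""
    audio, alignments = [], []
    for chunk in chunks:
        if "audioContent" in chunk:           # assumed audio field name
            audio.append(chunk["audioContent"])
        if "timestampInfo" in chunk:          # trailing alignment message
            alignments.append(chunk["timestampInfo"])
    return audio, alignments

stream = [
    {"audioContent": b"chunk-1"},
    {"audioContent": b"chunk-2"},
    {"timestampInfo": {"wordAlignment": {"words": ["Hi"]}}},
]
audio, alignments = split_stream(stream)
# Audio plays first; the alignment for chunks 1-2 arrives afterwards.
```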
### Response structure
#### TTS 1.5 models (`inworld-tts-1.5-mini`, `inworld-tts-1.5-max`)
Returns enhanced alignment data with **phonetic details**: detailed phoneme-level timing with viseme symbols for precise lip-sync animation.
```json
{
"timestampInfo": {
"wordAlignment": {
"words": ["Hello,", "world,", "this", "will", "be", "saved"],
"wordStartTimeSeconds": [0, 0.28, 0.96, 1.25, 1.38, 1.5],
"wordEndTimeSeconds": [0.28, 0.8, 1.25, 1.38, 1.5, 1.99],
"phoneticDetails": [
{
"wordIndex": 0,
"phones": [
{"phoneSymbol": "h", "startTimeSeconds": 0, "durationSeconds": 0.07, "visemeSymbol": "aei"},
{"phoneSymbol": "ə", "startTimeSeconds": 0.07, "durationSeconds": 0.030000001, "visemeSymbol": "aei"},
{"phoneSymbol": "l", "startTimeSeconds": 0.1, "durationSeconds": 0.089999996, "visemeSymbol": "l"},
{"phoneSymbol": "oʊ1", "startTimeSeconds": 0.19, "durationSeconds": 0.09, "visemeSymbol": "o"}
],
"isPartial": false
},
{
"wordIndex": 1,
"phones": [
{"phoneSymbol": "w", "startTimeSeconds": 0.28, "durationSeconds": 0.18, "visemeSymbol": "qw"},
{"phoneSymbol": "ɝ1", "startTimeSeconds": 0.46, "durationSeconds": 0.119999975, "visemeSymbol": "r"},
{"phoneSymbol": "l", "startTimeSeconds": 0.58, "durationSeconds": 0.08000004, "visemeSymbol": "l"},
{"phoneSymbol": "d", "startTimeSeconds": 0.66, "durationSeconds": 0.13999999, "visemeSymbol": "cdgknstxyz"}
],
"isPartial": false
},
{
"wordIndex": 2,
"phones": [
{"phoneSymbol": "ð", "startTimeSeconds": 0.96, "durationSeconds": 0.14000005, "visemeSymbol": "th"},
{"phoneSymbol": "ɪ1", "startTimeSeconds": 1.1, "durationSeconds": 0.06999993, "visemeSymbol": "ee"},
{"phoneSymbol": "s", "startTimeSeconds": 1.17, "durationSeconds": 0.08000004, "visemeSymbol": "cdgknstxyz"}
],
"isPartial": false
}
]
}
}
}
```
##### Phonetic details structure
Each entry in `phoneticDetails` contains:
| Field | Description |
| :---- | :---- |
| `wordIndex` | Index of the word this phonetic detail belongs to (0-based). |
| `phones` | Array of phonemes that make up this word. |
| `isPartial` | True when the server considers the word potentially unstable (e.g., last word in a non-final streaming update). Clients may choose to delay processing partial words until `isPartial` becomes `false`. |
Each phone entry contains:
| Field | Description |
| :---- | :---- |
| `phoneSymbol` | The phoneme symbol in IPA notation. |
| `startTimeSeconds` | Start time of the phoneme in seconds. May be omitted for the first phoneme of a word. |
| `durationSeconds` | Duration of the phoneme in seconds. |
| `visemeSymbol` | The viseme symbol for lip-sync animation. |
##### Viseme symbols
The following viseme symbols are used for lip-sync animation:
| Viseme | Description |
| :---- | :---- |
| `aei` | Open mouth vowels (a, e, i, ə, ʌ, æ, ɑ, etc.) |
| `o` | Rounded vowels (o, ʊ, əʊ, oʊ, etc.) |
| `ee` | Front vowels (i, ɪ, eɪ, etc.) |
| `bmp` | Bilabial consonants (b, m, p) |
| `fv` | Labiodental consonants (f, v) |
| `l` | Lateral consonant (l) |
| `r` | Rhotic sounds (r, ɝ, ɚ) |
| `th` | Dental fricatives (θ, ð) |
| `qw` | Rounded consonants (w, ʍ) |
| `cdgknstxyz` | Alveolar/velar consonants (c, d, g, k, n, s, t, x, y, z) |
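For lip-sync, the nested `phoneticDetails` structure can be flattened into a single time-ordered viseme track. A minimal sketch over the response shape shown above:

```python
def viseme_track(word_alignment):
    """Flatten phoneticDetails into (start, duration, viseme) tuples."""
    track = []
    for detail in word_alignment.get("phoneticDetails", []):
        if detail.get("isPartial"):
            continue  # skip words the server may still revise
        for phone in detail["phones"]:
            track.append((phone.get("startTimeSeconds", 0.0),
                          phone["durationSeconds"],
                          phone["visemeSymbol"]))
    return track

hello = {"phoneticDetails": [{
    "wordIndex": 0,
    "phones": [
        {"phoneSymbol": "h", "startTimeSeconds": 0, "durationSeconds": 0.07, "visemeSymbol": "aei"},
        {"phoneSymbol": "l", "startTimeSeconds": 0.1, "durationSeconds": 0.09, "visemeSymbol": "l"},
    ],
    "isPartial": False,
}]}
track = viseme_track(hello)
# → [(0, 0.07, "aei"), (0.1, 0.09, "l")]
```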
#### TTS 1 models (`inworld-tts-1`, `inworld-tts-1-max`)
Returns basic word/character timing arrays:
```json
{
"timestampInfo": {
"wordAlignment": {
"words": ["Hello", "world,", "this", "will", "be", "saved"],
"wordStartTimeSeconds": [0, 0.33, 0.69, 0.89, 1.1, 1.26],
"wordEndTimeSeconds": [0.28, 0.63, 0.87, 1.05, 1.16, 1.6]
}
}
}
```
---
#### Long Text Input
Source: https://docs.inworld.ai/tts/capabilities/long-text-input
The TTS API accepts up to **2,000 characters** per request. For longer content — articles, book chapters, scripts — you need to split the text into chunks, synthesize each one, and stitch the resulting audio back together.
We provide ready-to-run scripts in **[Python](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_tts_long_input.py)** and **[JavaScript](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_tts_long_input_compressed.js)** that handle this entire pipeline for you.
## How It Works
The input is split into segments under the 2,000-character API limit. The chunking algorithm looks for natural break points in the following priority order:
1. Paragraph breaks (`\n\n`)
2. Line breaks (`\n`)
3. Sentence endings (`.` `!` `?`)
4. Last space (fallback)
This ensures audio segments end at natural pauses, producing smooth-sounding output.
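The priority-order search can be sketched as follows (a simplified illustration, not the exact code from the linked scripts):

```python
def chunk_text(text, min_size=500, max_size=1900):
    """Split text into chunks under max_size, preferring natural break points
    in priority order: paragraph breaks, line breaks, sentence ends, spaces."""
    separators = ["\n\n", "\n", ". ", "! ", "? ", " "]
    chunks = []
    while len(text) > max_size:
        window = text[:max_size]
        cut = -1
        for sep in separators:
            pos = window.rfind(sep)
            if pos >= min_size:          # only accept breaks past min_size
                cut = pos + len(sep)
                break
        if cut == -1:
            cut = max_size               # hard cut as a last resort
        chunks.append(text[:cut].strip())
        text = text[cut:]
    if text.strip():
        chunks.append(text.strip())
    return chunks
```

Every chunk then stays under the 2,000-character API limit while ending at the most natural pause available.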
Each chunk is sent to the TTS API with controlled concurrency and automatic retry logic for rate limits. Chunks are processed in parallel (default: 2 concurrent requests) to speed up synthesis while respecting API rate limits.
The individual audio responses are combined into a single output file. The Python script produces a **WAV** file with configurable silence between segments, while the JavaScript script produces an **MP3** file and uses **ffmpeg** to merge segments with correct duration metadata.
## Configuration
Both scripts share the same tunable parameters:
| Parameter | Default | Description |
| :--- | :--- | :--- |
| `MIN_CHUNK_SIZE` | 500 | Minimum characters before looking for a break point |
| `MAX_CHUNK_SIZE` | 1,900 | Maximum chunk size (stays under the 2,000-char API limit) |
| `MAX_CONCURRENT_REQUESTS` | 2 | Parallel API requests (increase with caution to avoid rate limits) |
| `MAX_RETRIES` | 3 | Retry attempts for rate-limited requests with exponential backoff |
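The retry behavior can be sketched as a small wrapper (a simplified illustration; real code would catch your HTTP client's 429 response rather than a generic `RuntimeError`):

```python
import time

def with_retries(send, max_retries=3, base_delay=1.0):
    """Call send(), retrying with exponential backoff when it raises
    a rate-limit error (modeled here as RuntimeError)."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RuntimeError:
            if attempt == max_retries:
                raise                        # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```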
## Running the Scripts
### Prerequisites
- An **Inworld API key** set as the `INWORLD_API_KEY` environment variable
- A text file with your long-form content
- **Python 3** (for the Python script) or **Node.js** (for the JavaScript script)
- **ffmpeg** (optional, for the JS script — produces correct MP3 duration metadata)
### Python
```bash
export INWORLD_API_KEY=your_api_key_here
pip install requests python-dotenv
python example_tts_long_input.py
```
The script reads the input text file, chunks it, synthesizes all chunks with the Inworld TTS API, and saves the combined audio as a **WAV** file. It also prints a splice report showing the exact timestamps where chunks were joined, useful for quality checking.
### JavaScript
```bash
export INWORLD_API_KEY=your_api_key_here
node example_tts_long_input_compressed.js
```
The script follows the same chunking and synthesis pipeline, outputting a compressed **MP3** file. When **ffmpeg** is available, it merges segments with correct duration metadata. Otherwise, it falls back to raw concatenation.
## Code Examples
WAV output with splice report and configurable silence between segments
Compressed MP3 output with ffmpeg-based segment merging
## Next Steps
Learn about the standard (non-streaming) synthesis API.
Use streaming for real-time playback of shorter content.
Optimize time-to-first-audio for real-time use cases.
---
#### TTS Playground
Source: https://docs.inworld.ai/tts/tts-playground
The TTS Playground is an interactive environment for trying out Inworld's TTS capabilities. Use it to find the perfect voice for your project, test different text inputs, adjust voice settings, and experiment with audio markup tags.
## Get Started
In [Portal](https://platform.inworld.ai/), select **TTS Playground** from the left-hand side panel.
Enter the text you want to convert to speech. If you need some ideas, you can select one of the suggestion chips at the bottom of the screen. Note that the playground accepts up to 2,000 characters per request. For longer content, see [Long Text Input](/tts/capabilities/long-text-input).
On the right-hand side, click on the voice dropdown to browse available voices. You can filter by language or search by name, and click the play button next to each voice to hear how the voice sounds. Select a voice.
Click the "Generate" button on the bottom right. The audio will automatically start playing once it's been generated. You can also download the clip to save it.
## Advanced Features
For greater control over the generated audio, you can try out the following:
1. **Try a different model** - Select a different model from the right-hand side panel to see how it compares. See [Models](/tts/tts-models) for more information about each model.
2. **Adjust configurations** - Use the sliders on the right-hand side panel to adjust Temperature and Talking Speed. See [here](/tts/capabilities/generating-audio#additional-configurations) for more information.
3. **Experiment with audio markups** - Try adding audio markups, such as `[happy]` or `[cough]`, to your input text to control emotional expression, delivery style, and non-verbal vocalizations. See [here](/tts/capabilities/audio-markups) for more information.
In addition, you can enable the **Highlight words** toggle in the right-hand panel to visualize word-level [timestamps](/tts/capabilities/timestamps) during playback. This feature is currently only available for English when using the `inworld-tts-1.5-mini` and `inworld-tts-1.5-max` models.
## Create a Voice
In the TTS Playground, click **+ Create a Voice** to design a new voice from a text description or clone one from audio:
- **[Voice Design](/tts/voice-design)** — Describe the voice you want in text (age, accent, tone, etc.) and get AI-generated voice candidates
- **[Voice Cloning](/tts/voice-cloning)** — Clone a voice from 5–15 seconds of audio samples
## Next Steps
Ready for more? Whether you're looking to clone a voice, design one from text, or start building with our API, we've got you covered.
Create a voice from a text description—no audio needed.
Create a personalized voice clone with just 5 seconds of audio.
Learn tips and tricks for synthesizing high-quality speech.
Learn how to make your first API call in minutes.
---
#### Synthesize Speech
Source: https://docs.inworld.ai/tts/synthesize-speech
You send text; the server generates the entire audio and returns it in a single HTTP response. No streaming, no open connections — just one request and one response with the complete audio file.
Best for batch or offline work like audiobooks, voiceovers, and podcasts, or any workflow where you can wait for the full file before playback.
For real-time playback or low-latency scenarios, use the [Streaming API](/tts/synthesize-speech-streaming) or [WebSocket API](/tts/synthesize-speech-websocket). See the [latency best practices](/tts/best-practices/latency) for details.
## Code Examples
View our JavaScript implementation example
View our Python implementation example
## API Reference
View the complete API specification
## Next Steps
Learn best practices for producing high-quality voice clones.
Learn best practices for synthesizing high-quality speech.
Explore Python and JavaScript code examples for TTS integration.
---
#### Synthesize Speech (Streaming)
Source: https://docs.inworld.ai/tts/synthesize-speech-streaming
You send text; the server returns audio chunks over HTTP as they are generated. Playback can begin before the full synthesis is complete, significantly reducing time-to-first-audio.
Best for real-time applications, conversational AI, and long-form content — anywhere you want low-latency playback without managing a persistent connection.
For even lower latency with multiple requests in a session, consider the [WebSocket API](/tts/synthesize-speech-websocket). For tips on optimizing latency, see the [latency best practices guide](/tts/best-practices/latency).
## Timestamp Transport Strategy
When using [timestamp alignment](/tts/capabilities/timestamps), you can choose how timestamps are delivered alongside audio using `timestampTransportStrategy`:
- **`SYNC`** (default): Each chunk contains both audio and timestamps together.
- **`ASYNC`**: Audio chunks arrive first, with timestamps following in separate trailing messages. This reduces time-to-first-audio with TTS 1.5 models.
See [Timestamps](/tts/capabilities/timestamps#streaming-behavior) for details on how each mode works.
## Code Examples
View our JavaScript implementation example
View our Python implementation example
## API Reference
View the complete API specification
## Next Steps
Learn best practices for producing high-quality voice clones.
Learn best practices for synthesizing high-quality speech.
Explore Python and JavaScript code examples for TTS integration.
---
#### Synthesize Speech (WebSocket)
Source: https://docs.inworld.ai/tts/synthesize-speech-websocket
You open a persistent WebSocket connection and send text messages. The server streams audio chunks back over the same connection — no per-request overhead, no repeated handshakes. This gives you the lowest possible latency.
Best for voice agents and interactive applications that send multiple synthesis requests in a session, where avoiding connection setup on every call makes a measurable difference.
If you only need a single request-response with chunked audio, the [Streaming API](/tts/synthesize-speech-streaming) is simpler to integrate.
For tips on optimizing latency, see the [latency best practices guide](/tts/best-practices/latency).
## Timestamp Transport Strategy
When using [timestamp alignment](/tts/capabilities/timestamps), you can choose how timestamps are delivered alongside audio using `timestampTransportStrategy`:
- **`SYNC`** (default): Each chunk contains both audio and timestamps together.
- **`ASYNC`**: Audio chunks arrive first, with timestamps following in separate trailing messages. This reduces time-to-first-audio with TTS 1.5 models.
See [Timestamps](/tts/capabilities/timestamps#streaming-behavior) for details on how each mode works.
## Code Examples
View our JavaScript implementation example
View our Python implementation example
## API Reference
View the complete API specification
## Next Steps
Learn best practices for producing high-quality voice clones.
Learn best practices for synthesizing high-quality speech.
Explore Python and JavaScript code examples for TTS integration.
---
#### Integrations
Source: https://docs.inworld.ai/tts/integrations
Inworld’s API is integrated with leading voice and real-time platforms for developers. This makes it easy to get started building real-time voice agents and voice-based experiences at scale powered by Inworld’s radically affordable, state-of-the-art TTS models.
## Daily (Pipecat)
[Pipecat](https://docs.pipecat.ai/getting-started/introduction) is an open source Python framework for building real-time voice and multimodal AI agents that can see, hear, and speak. It’s designed for developers who want full control over how AI services, network transports, and audio processing are orchestrated—enabling ultra-low latency, natural-feeling conversations across custom pipelines, whether running locally or in production infrastructure.
Inworld voices and text-to-speech models are supported via a built-in `InworldTTSService`, allowing you to stream high-quality audio or generate speech on demand from within your own runtime.
To get started with Pipecat + Inworld, follow this [guide](https://docs.pipecat.ai/server/services/tts/inworld).
## LiveKit
[LiveKit](https://livekit.io/) is an open source platform for developers building realtime agents. It makes it easy to integrate audio, video, text, data, and AI models while offering scalable realtime infrastructure built on top of WebRTC.
Inworld voices and text-to-speech models are available as a plugin for LiveKit Agents, a flexible framework for building real-time conversational agents. This makes it easier for developers to create previously unimaginable, real-time voice experiences such as multiplayer games, agentic NPCs, customer-facing avatars, live training simulations, and more at an accessible price.
To get started with LiveKit + Inworld, follow this [guide](https://docs.livekit.io/agents/integrations/tts/inworld/).
## NLX
NLX is a no-code platform for developers and businesses to build, deploy, and manage conversational AI applications across a variety of channels. It enables the creation of sophisticated, multimodal experiences that can include chat, voice, and video.
Inworld TTS is available through NLX as one of the default voice providers, or you can build a custom integration.
Kickstart your journey with NLX + Inworld by signing up for an NLX account, or dive right in with this how-to [guide](https://docs.nlx.ai/platform/build/integrations/text-to-speech-providers/inworld).
## Stream (Vision Agents)
[Stream (Vision Agents)](https://visionagents.ai/) is Stream's [open-source framework](https://github.com/GetStream/vision-agents) that helps developers quickly build low-latency vision AI applications. Since its initial launch, the project has expanded with additional plugins, better model support, and major improvements to latency, audio, and video handling.
Stream (Vision Agents) integrates Inworld's state-of-the-art TTS models directly into their platform, giving developers an out-of-the-box way to bring natural, expressive voice to their AI agents.
To get started with Stream (Vision Agents) + Inworld, follow this [guide](https://github.com/GetStream/Vision-Agents/tree/main/plugins/inworld).
## Ultravox
[Ultravox](https://ultravox.ai) is a real-time voice AI infrastructure layer that delivers fast, natural, and scalable voice agents. Its purpose-built inference stack powers a best-in-class speech understanding model, while developer tools including easy-to-use APIs and client-side SDKs help teams deliver production voice agents faster.
Inworld voices are natively integrated with the Ultravox platform and available for use in all accounts, making it easy to create natural, conversational agents with emotionally expressive voices.
To get started with Ultravox + Inworld, follow these [instructions](https://docs.ultravox.ai/voices/bring-your-own#inworld).
## Vapi
[Vapi](https://vapi.ai/) is a developer platform for building advanced voice AI agents. By handling the complex infrastructure, they enable developers to focus on creating great voice experiences.
Inworld's TTS is integrated with Vapi's platform, giving you access to Inworld's high-fidelity, emotionally expressive voices seamlessly on Vapi.
To get started with Vapi + Inworld, follow this [guide](https://docs.vapi.ai/api-reference/assistants/create#request.body.voice).
## Voximplant
[Voximplant](https://voximplant.ai/) is a serverless Voice AI orchestration platform and cloud communications stack for building real-time voice agents over the phone and the web. It combines programmable telephony (PSTN, SIP, WhatsApp), WebRTC, and client SDKs with a serverless JavaScript runtime (VoxEngine), so developers can efficiently orchestrate calls, speech services, and LLMs in one environment.
Inworld's TTS is natively integrated into Voximplant's realtime speech synthesis APIs, enabling low-latency streaming of expressive Inworld voices into any Voximplant-powered call. With a single VoxEngine scenario, you can connect your agent logic to Inworld for speech generation, route calls globally, and rapidly scale from prototype to production.
To get started with Voximplant + Inworld, check out this [announcement](https://voximplant.com/blog/inworld-text-to-speech-now-available-in-voximplant).
---
#### On-Prem
#### TTS On-Premises
Source: https://docs.inworld.ai/tts/on-premises
Inworld TTS On-Premises lets organizations run high-quality text-to-speech models locally — without sending text or audio data to the cloud. It's built for enterprises that require strict data control, low latency, and compliance with internal or regulatory standards.
Inworld TTS On-Premises is available for both the **Inworld TTS-1.5 Mini** and **Inworld TTS-1.5 Max** models.
To get started with TTS On-Premises, contact [sales@inworld.ai](mailto:sales@inworld.ai) for pricing and access to the container registry.
## Why TTS On-Premises
No outbound data transfer. Full ownership of text and audio.
Optimized for production workloads and interactive applications.
Suitable for air-gapped, private, and compliance-sensitive deployments.
Containerized architecture designed for operational stability.
## How it works
Inworld TTS On-Premises is delivered as a GPU-accelerated, Docker-containerized version of the Inworld TTS API. It exposes both REST and gRPC APIs for easy integration.
| Port | Protocol | Description |
|------|----------|-------------|
| **8081** | HTTP | REST API (recommended) |
| **9030** | gRPC | For gRPC clients |
### Performance
- **Latency:** Real-time streaming on supported NVIDIA GPUs
- **Throughput:** Multiple concurrent sessions, depending on the GPU in use
Contact [sales@inworld.ai](mailto:sales@inworld.ai) to get a detailed performance report for your specific hardware.
## System requirements
Inworld TTS supports all modern cloud NVIDIA GPUs: A100, H100, H200, B200, and B300. If you have a specific target hardware platform not on this list, please reach out for custom support.
The minimum inference machine requirements are as follows:
| Component | Requirement |
|-----------|-------------|
| **GPU** | NVIDIA H100 SXM5 (80GB) |
| **RAM** | 64GB+ system memory |
| **CPU** | 8+ cores |
| **Disk** | 50GB free space |
| **OS** | Ubuntu 22.04 LTS |
| **Software** | Docker + NVIDIA Container Toolkit |
| **Software** | Google Cloud SDK (gcloud CLI) |
| **CUDA** | 13.0+ |
## Prerequisites
Before deploying TTS On-Premises, ensure the following software is installed on your Ubuntu 22.04 LTS machine.
### NVIDIA drivers
Install the latest NVIDIA drivers for your GPU. Follow the official guide at [nvidia.com/drivers](https://www.nvidia.com/en-us/drivers), or use the following commands on Ubuntu:
```bash
# Update packages
sudo apt-get update
# Install basic toolchain and kernel headers
sudo apt-get install -y gcc make wget linux-headers-$(uname -r)
# Install NVIDIA driver (check https://www.nvidia.com/en-us/drivers for the latest version)
sudo apt-get install -y nvidia-driver-580
```
### Docker
Install Docker Engine by following the official guide: [Install Docker Engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/).
Optionally, add the current user to the `docker` group so you can run Docker without `sudo`: [Linux post-installation steps](https://docs.docker.com/engine/install/linux-postinstall/).
### NVIDIA Container Toolkit
Install the NVIDIA Container Toolkit to enable GPU access from Docker containers. Follow both the **Installation** and **Configuration** sections of the official guide: [NVIDIA Container Toolkit install guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).
### Google Cloud SDK
Install the gcloud CLI by following the official guide: [Install the gcloud CLI](https://cloud.google.com/sdk/docs/install#deb).
### Verify prerequisites
Run the following command to verify that Docker, NVIDIA drivers, and the NVIDIA Container Toolkit are all correctly installed:
```bash
docker run --rm --gpus all nvidia/cuda:13.0.0-base-ubuntu22.04 nvidia-smi
```
You should see your GPU listed in the output alongside the driver version and CUDA version. If this command succeeds, your environment is ready for TTS On-Premises deployment.
### Firewall requirements
The TTS On-Premises container listens on the following ports for inbound traffic:
| Port | Protocol | Description |
|------|----------|-------------|
| **8081** | HTTP | REST API |
| **9030** | gRPC | gRPC API |
You will also need to allow the following outbound traffic:
- `us-central1-docker.pkg.dev` on port **443** — GCP Artifact Registry for pulling container images
## Quick start
### 1. Create a GCP service account
Create a service account in your GCP project and generate a key file:
```bash
# Create the service account
gcloud iam service-accounts create inworld-tts-onprem \
  --project=YOUR_PROJECT_ID \
  --display-name="Inworld TTS On-Prem" \
  --description="Service account for Inworld TTS on-prem container"
# Create a key file
gcloud iam service-accounts keys create service-account-key.json \
  --iam-account=inworld-tts-onprem@YOUR_PROJECT_ID.iam.gserviceaccount.com \
  --project=YOUR_PROJECT_ID
```
### 2. Share the service account email with Inworld
Send the service account email (e.g., `inworld-tts-onprem@YOUR_PROJECT_ID.iam.gserviceaccount.com`) to your Inworld contact. Inworld will provide your **Customer ID**.
### 3. Authenticate to the container registry
```bash
gcloud auth activate-service-account \
--key-file=service-account-key.json
gcloud auth configure-docker us-central1-docker.pkg.dev
```
For more authentication options, see [Configure authentication to Artifact Registry for Docker](https://cloud.google.com/artifact-registry/docs/docker/authentication#gcloud-helper).
### 4. Configure
```bash
cp onprem.env.example onprem.env
```
Edit `onprem.env` with your values:
```bash
INWORLD_CUSTOMER_ID=YOUR_CUSTOMER_ID
TTS_IMAGE=us-central1-docker.pkg.dev/inworld-ai-registry/tts-onprem/tts-1.5-mini-h100-onprem:IMAGE_TAG
KEY_FILE=./service-account-key.json
```
### 5. Start
```bash
./run.sh
```
The script will:
1. Check prerequisites (Docker, GPU, NVIDIA Container Toolkit)
2. Validate your configuration
3. Fix key file permissions if needed
4. Pull the Docker image
5. Start the container
6. Wait for services to be ready (~3 minutes)
The ML model takes approximately 3 minutes to load on first startup. This is normal.
### 6. Verify the deployment
Check that the container is running and services are healthy:
```bash
./run.sh status
```
### 7. Send a test request
```bash
curl -X POST http://localhost:8081/tts/v1/voice \
-H "Content-Type: application/json" \
-d '{
"text": "Hello, this is a test of the on-premises TTS system.",
"voice_id": "Craig",
"model_id": "inworld-tts-1.5-mini",
"audio_config": {
"audio_encoding": "LINEAR16",
"sample_rate_hertz": 48000
}
}'
```
### List available voices
```bash
curl http://localhost:8081/tts/v1/voices
```
For the full API specification, see the [Synthesize Speech API reference](/api-reference/ttsAPI/texttospeech/synthesize-speech).
## Lifecycle commands
```bash
./run.sh # Start the container
./run.sh stop # Stop and remove the container
./run.sh status # Check container and service health
./run.sh logs # Show recent logs from all services
./run.sh logs -f # Tail all service logs live
./run.sh logs export # Export all logs to a timestamped folder
./run.sh restart # Restart the container
```
## Available images
| Image | Model | GPU |
|-------|-------|-----|
| `tts-1.5-mini-h100-onprem` | 1B (mini) | H100 |
| `tts-1.5-max-h100-onprem` | 8B (max) | H100 |
Registry: `us-central1-docker.pkg.dev/inworld-ai-registry/tts-onprem/`
## Configuration
### onprem.env
| Variable | Required | Description |
|----------|----------|-------------|
| `INWORLD_CUSTOMER_ID` | Yes | Your customer ID |
| `TTS_IMAGE` | Yes | Docker image URL (see [Available Images](#available-images)) |
| `KEY_FILE` | Yes | Path to your GCP service account key file |
## Logs
```bash
# Show recent logs from all services (last 20 lines each)
./run.sh logs
# Tail all service logs live
./run.sh logs -f
# Export all logs to a timestamped folder
./run.sh logs export
```
Individual service logs:
```bash
docker exec inworld-tts-onprem tail -f /var/log/tts-v3-trtllm.log # ML server
docker exec inworld-tts-onprem tail -f /var/log/tts-normalization.log # Text normalization
docker exec inworld-tts-onprem tail -f /var/log/public-tts-service.log # TTS service
docker exec inworld-tts-onprem tail -f /var/log/grpc-gateway.log # HTTP gateway
docker exec inworld-tts-onprem tail -f /var/log/w-proxy.log # gRPC proxy
docker exec inworld-tts-onprem tail -f /var/log/supervisord.log # Supervisor
```
## Troubleshooting
| Issue | Solution |
|-------|----------|
| "INWORLD_CUSTOMER_ID is required" | Set `INWORLD_CUSTOMER_ID` in `onprem.env` |
| "GCP credentials file not found" | Check that `KEY_FILE` in `onprem.env` points to a valid file |
| "Credentials file is not readable" | Fix permissions on host: `chmod 644 service-account-key.json` |
| "Topic not found" | Verify your `INWORLD_CUSTOMER_ID` matches the PubSub topic name |
| "Permission denied for topic" | Ensure Inworld has granted your service account publish access |
| Slow startup (~3 min) | Normal — text processing grammars take time to initialize |
```bash
# Check service status
docker exec inworld-tts-onprem supervisorctl -s unix:///tmp/supervisor.sock status
# Export logs for support
./run.sh logs export
```
Share the exported logs folder with [Inworld support](mailto:support@inworld.ai) when reporting issues.
## Advanced: manual Docker run
For users who prefer to run Docker directly without `run.sh`:
```bash
docker run -d \
--gpus all \
--name inworld-tts-onprem \
-p 8081:8081 \
-p 9030:9030 \
-e INWORLD_CUSTOMER_ID= \
-v $(pwd)/service-account-key.json:/app/gcp-credentials/service-account.json:ro \
us-central1-docker.pkg.dev/inworld-ai-registry/tts-onprem/tts-1.5-mini-h100-onprem:
```
- Ensure your key file has 644 permissions: `chmod 644 service-account-key.json`
- The container exposes port 8081 (HTTP) and 9030 (gRPC)
- Use `docker ps` to check container health — STATUS will show `healthy` when ready
```bash
# Stop and remove
docker stop inworld-tts-onprem && docker rm inworld-tts-onprem
# View logs
docker logs inworld-tts-onprem
# Check service status
docker exec inworld-tts-onprem supervisorctl -s unix:///tmp/supervisor.sock status
```
## Benchmarking
For performance testing, see the [Benchmarking](/tts/on-premises-benchmarking) guide.
## FAQs
**Is the on-premises container production-ready?** Yes. The on-premises container is designed for production workloads. To get started, contact [sales@inworld.ai](mailto:sales@inworld.ai) for access to the repository.
**Why would I deploy TTS on-premises?** For complete data control, low latency, and compliance with strict security or regulatory requirements.
**Does any data leave my environment?** No. All text and audio processing occurs entirely within your environment.
**How long does deployment take?** Deployment takes just a few minutes, with a brief model warm-up (~200 seconds).
**Who is on-premises TTS for?** Enterprises, governments, and regulated industries that cannot use cloud-based TTS.
**In-scope:**
- API compatibility with the Inworld public API
- All built-in voices in Inworld's Voice Library
- The following model capabilities: text normalization, timestamps, and audio pre- and post-processing settings
- Deployment guides and scripts to reproduce the latency benchmarks
**Out-of-scope:**
- Instant voice cloning features and their APIs
- Voice design and its API
---
#### Benchmarking
Source: https://docs.inworld.ai/tts/on-premises-benchmarking
A comprehensive load-testing tool for TTS On-Premises measures performance metrics including latency, throughput, and streaming characteristics across different QPS (queries per second) loads.
## Overview
The tool simulates realistic TTS workloads by sending requests at specified rates with configurable burstiness patterns. It measures:
- End-to-end latency
- Audio generation latency per second
- Streaming metrics (first chunk, 4th chunk, average chunk latencies)
- Request success rates
- Server performance under different load conditions
## Quick start
```bash
# Install the load test tool
pip install -e .
# Basic load test with streaming
python load-test.main \
--host http://localhost:8081 \
--stream \
--min-qps 1.0 \
--max-qps 7.0 \
--qps-step 2.0 \
--number-of-samples 300
```
## Parameters
### Required
| Parameter | Description | Example |
|---|---|---|
| `--host` | Base address of the On-Premises TTS server (endpoint auto-appended) | `http://localhost:8081` |
### Load configuration
| Parameter | Default | Description |
|---|---:|---|
| `--min-qps` | `1.0` | Minimum requests per second to test |
| `--max-qps` | `10.0` | Maximum requests per second to test |
| `--qps-step` | `2.0` | Step size for QPS increments |
| `--number-of-samples` | `1` | Total number of texts to synthesize per QPS level |
| `--burstiness` | `1.0` | Request timing pattern (`1.0` = Poisson, `< 1.0` = bursty, `> 1.0` = uniform) |
### TTS configuration
| Parameter | Default | Description |
|---|---:|---|
| `--stream` | `False` | Use streaming synthesis (`/SynthesizeSpeechStream`) vs non-streaming (`/SynthesizeSpeech`) |
| `--max_tokens` | `400` | Maximum tokens to synthesize (~8s audio at 50 tokens/s) |
| `--voice-ids` | `["Olivia", "Remy"]` | Voice IDs to use (can specify multiple) |
| `--model_id` | `None` | Model ID for TTS synthesis (optional) |
| `--text_samples_file` | `scripts/tts_load_testing/text_samples.json` | File containing text samples |
### Output and analysis
| Parameter | Default | Description |
|---|---:|---|
| `--benchmark_name` | auto-generated | Name for the benchmark run (affects output files) |
| `--plot_only` | `False` | Only generate plots from existing results (skip testing) |
| `--verbose` | `False` | Enable verbose output for debugging |
## Examples
### Streaming vs non-streaming comparison
```bash
# Non-streaming test
python load-test.main \
--host http://localhost:8081 \
--min-qps 10.0 \
--max-qps 50.0 \
--qps-step 10.0 \
--number-of-samples 500 \
--benchmark_name non-streaming-test
# Streaming test
python load-test.main \
--host http://localhost:8081 \
--stream \
--min-qps 10.0 \
--max-qps 50.0 \
--qps-step 10.0 \
--number-of-samples 500 \
--benchmark_name streaming-test
```
### Plot-only mode
Generate plots from existing results without re-running tests:
```bash
./scripts/tts-load-test \
--plot_only \
--benchmark_name prod-stress-test
```
## Understanding results
The tool generates comprehensive metrics for each QPS level.
### Latency metrics
- **E2E Latency:** Complete request-response time
- **Audio Generation Latency:** Time per second of generated audio
- **First Chunk Latency:** Time to first audio chunk (streaming only)
- **4th Chunk Latency:** Time to 4th audio chunk (streaming only)
- **Average Chunk Latency:** Mean time between chunks (streaming only)
### Percentiles
Results include P50, P90, P95, and P99 percentiles for all latency metrics.
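These percentiles are typically computed with the nearest-rank convention; the sketch below shows that convention for reference (the tool itself may use a different interpolation method):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    xs = sorted(samples)
    k = max(1, math.ceil(p / 100 * len(xs)))
    return xs[k - 1]

latencies_ms = [120, 85, 95, 110, 250, 90, 105, 100, 98, 300]
print(percentile(latencies_ms, 50))  # 100
print(percentile(latencies_ms, 99))  # 300
```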
### Output files
Results are saved in `benchmark_result/{benchmark_name}/`:
- `result.json` — Raw performance data
- `{benchmark_name}_*.png` — Performance charts
## Burstiness parameter
The burstiness parameter controls request timing distribution:
| Value | Behavior |
|---|---|
| `1.0` | Poisson process (natural randomness) |
| `< 1.0` | More bursty (requests come in clusters) |
| `> 1.0` | More uniform (evenly spaced requests) |
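Load generators commonly realize this kind of burstiness knob by drawing inter-arrival gaps from a gamma distribution whose shape equals the burstiness value; a minimal sketch of that approach (not necessarily this tool's exact implementation):

```python
import random

def interarrival_gaps(qps, burstiness, n, seed=0):
    """Sample n inter-request gaps (seconds) for a given QPS level.

    Gaps follow a gamma distribution with shape=burstiness and mean
    1/qps: shape 1.0 reduces to an exponential (Poisson arrivals),
    smaller shapes cluster requests, larger shapes space them evenly.
    """
    rng = random.Random(seed)
    scale = 1.0 / (burstiness * qps)  # keeps the mean gap at 1/qps
    return [rng.gammavariate(burstiness, scale) for _ in range(n)]
```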
## Performance tips
1. **Start small** — Begin with low QPS and small sample sizes
2. **Use appropriate text samples** — Match your production text length distribution
3. **Monitor server resources** — Watch CPU, memory, and network during tests
4. **Consider burstiness** — Real-world traffic is often bursty (try 0.7–0.9)
5. **Test both modes** — Compare streaming vs non-streaming for your use case
## Troubleshooting
### Common issues
| Issue | Solution |
|---|---|
| Connection errors | Verify server address and network connectivity |
| Authentication errors | Set `INWORLD_API_KEY` for external APIs |
| High latency | Check server load and network conditions |
| Memory issues | Reduce `number-of-samples` for high QPS tests |
### Debug mode
Use the `--verbose` flag for detailed request/response logging:
```bash
./scripts/tts-load-test --verbose --host ... # other params
```
## Architecture
The tool uses:
- **Async/await:** Efficient concurrent request handling
- **Pausable timers:** Accurate server-only timing measurements
- **Multiple protocols:** gRPC, HTTP REST API support
- **Configurable clients:** Pluggable client architecture
- **Real-time progress:** Live progress bars and status updates
---
### Best Practices
#### Generating speech
Source: https://docs.inworld.ai/tts/best-practices/generating-speech
This guide covers techniques and best practices for generating high-quality, natural-sounding speech for your applications.
If you're using an LLM to generate text for TTS, see our dedicated guide on [Prompting for TTS](/tts/best-practices/prompting-for-tts) for prompt templates and techniques.
## General Best Practices
1. **Pick a suitable voice** - Different voices will be better suited for different applications. Choose a voice that matches the emotional range and expression you're looking for. For example, for a meditation app, select a more steady and calm voice. For an encouraging fitness coach, select a more expressive and excited voice.
2. **Pay attention to punctuation** - Punctuation matters! Use exclamation points (!) to make the voice more emphatic and excited. Use periods to insert natural pauses. Where possible, make sure to include punctuation at the end of the sentence.
3. **Use asterisks for emphasis** - You can emphasize specific words by surrounding them with asterisks. For example, writing "We \*need\* a beach vacation" will cause the voice to stress the word "need" when speaking, whereas "We need a \*beach\* vacation" will emphasize the word "beach". This can help clarify tone or intent in nuanced dialogue.
4. **Match the voice to the text language** - Voices perform optimally when synthesizing text in the same language as the original voice. While cross-language synthesis is possible, you'll achieve the best quality, pronunciation, and naturalness by matching the voice's native language to your text content.
5. **Normalize complex text** - If you find that the model is mispronouncing certain complex phrases like phone numbers or dollar amounts, it can help to normalize the text. This may be particularly helpful for non-English languages. Some examples of normalization include:
- **Phone numbers**: "(123)456-7891" -> "one two three, four five six, seven eight nine one"
- **Dates**: 5/6/2025 -> "may sixth twenty twenty five" *(helpful since date formats may vary)*
- **Times**: "12:55 PM" -> "twelve fifty-five PM"
- **Emails**: test@example.com -> "test at example dot com"
- **Monetary values**: $5,342.29 -> "five thousand three hundred and forty-two dollars and twenty-nine cents"
- **Symbols**: 2+2=4 -> "two plus two equals four"
6. **Tune the temperature** - The temperature controls the variation in audio output. Higher values increase variation, which can produce more diverse outputs with desirable outcomes but also increases the chances of bad generations and hallucinations. This can be useful for generating barks, demo clips, or other non-real-time use cases. Lower temperatures improve stability and speaker similarity, though going too low increases the chances of broken generation. For real-time use cases, we recommend keeping the temperature between 0.8 and 1, with the default being 1.0.
## Latency
For realtime use cases, minimizing latency is critical. Here are some tips and techniques you can use:
1. **Stream TTS output** - Instead of waiting for the entire generation (which may take some time if it is long), you can start playback as soon as the first chunk arrives so that the user doesn't have to wait. Inworld's [websocket streaming](/api-reference/ttsAPI/texttospeech/synthesize-speech-websocket) should be the lowest-latency option, but [streaming over HTTP](/api-reference/ttsAPI/texttospeech/synthesize-speech-stream) will also be superior to a [non-streaming setup](/api-reference/ttsAPI/texttospeech/synthesize-speech).
2. **Chunk TTS input** - Instead of sending a large request to the TTS model (whether it's pre-written or generated by an LLM), consider breaking it into sentence chunks and sending them one by one. The Inworld Agent Runtime provides [built-in tools](/node/runtime-reference/classes/graph_dsl_nodes_text_chunking_node.TextChunkingNode) to handle this in a performant manner. For synthesizing text longer than 2,000 characters, see our ready-to-run scripts in the [Long Text Input](/tts/capabilities/long-text-input) guide.
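The Agent Runtime's TextChunkingNode handles chunking for you; outside the runtime, a minimal sentence chunker might look like the sketch below (the 250-character cap is an arbitrary illustration, not an API limit):

```python
import re

def sentence_chunks(text, max_chars=250):
    """Greedily group sentences into chunks of at most max_chars so each
    TTS request stays short without splitting mid-sentence. A single
    sentence longer than max_chars is emitted as its own chunk."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}" if current else s
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as its own synthesis request while earlier chunks are already playing back.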
## Advanced Tips
### Natural, Conversational Speech
Natural human conversation is not perfect. It's full of filler words, pauses, and other natural speech patterns that make it sound more human.
Our TTS models are trained to generate the requested text as is, in order to produce the most accurate and consistent output that can be used for a wide range of applications. After all, not all applications want to have a bunch of filler words inserted into the speech!
To generate natural, conversational speech, you can use the following techniques:
1. Insert filler words like `uh`, `um`, `well`, `like`, and `you know` in the text. For example, instead of:
```
I'm not too sure about that.
```
change it to:
```
Uh, I'm not uh too sure about that.
```
If the text is already being generated using an LLM, you can add instructions in the prompt to insert filler words in the response. Alternatively, you can use a small LLM to insert filler words given a piece of text.
2. Use [audio markups](/tts/capabilities/audio-markups) to add non-verbal vocalizations like `[sigh]`, `[breathe]`, `[clear_throat]`. These natural speech patterns can make the speech sound more natural.
### Audio Markups
This feature is currently [experimental](/tts/resources/support#what-do-experimental%2C-preview%2C-and-stable-mean%3F), and is not recommended for real-time, production use cases.
When using audio markups, there are a number of techniques for producing the best results.
1. **Choose contextually appropriate markups** - Markups will work best when they make sense with the text content. When markups conflict with the text, the model may struggle to handle the contradiction. For example, the following phrase can be challenging:
```
[angry] I appreciate your help and I’m really grateful for your kindness.
```
The text is clearly grateful and sincere, which contradicts the angry markup.
2. **Avoid conflicting markups** - When using multiple markups for a single text, ensure they don't conflict with each other. For example, this markup can be problematic:
```
[angry] I can't believe you did that. [yawn] You never listen.
```
Yawning typically indicates boredom or tiredness, which rarely occurs alongside anger.
3. **Break up the text** - Emotion and delivery style markups work best when placed at the beginning of text with a single markup per request. Using multiple emotion and delivery style markups or placing them mid-text may produce mixed results. Instead of making one request like this:
```
[angry] I can't believe you didn't save the last bite of cake for me. [laughing] Got you! I was just kidding.
```
Break it into two requests:
```
[angry] I can't believe you didn't save the last bite of cake for me.
```
```
[laughing] Got you! I was just kidding.
```
4. **Repeat non-verbal vocalizations if necessary** - If a non-verbal vocalization is consistently being omitted, it may help to repeat the markup to ensure that it is vocalized. This works best for vocalizations where repetition sounds natural, such as `[laugh] [laugh]` or `[cough] [cough]`.
---
#### Latency
Source: https://docs.inworld.ai/tts/best-practices/latency
For realtime use cases, minimizing latency is critical. Here are some tips and techniques you can use:
1. **Stream TTS output** - Instead of waiting for the entire generation (which may take some time if it is long), you can start playback as soon as the first chunk arrives so that the user doesn't have to wait. Inworld's [websocket streaming](/api-reference/ttsAPI/texttospeech/synthesize-speech-websocket) should be the lowest-latency option, but [streaming over HTTP](/api-reference/ttsAPI/texttospeech/synthesize-speech-stream) will also be superior to a [non-streaming setup](/api-reference/ttsAPI/texttospeech/synthesize-speech).
2. **Chunk streaming LLM output into TTS** - For the fastest time to first audio, consider breaking streaming LLM output into sentence chunks and sending them one by one to TTS. The Inworld Agent Runtime provides [built-in tools](/node/runtime-reference/classes/graph_dsl_nodes_text_chunking_node.TextChunkingNode) to handle this in a performant manner.
3. **Use JWT authentication to stream directly to the client** - For applications like mobile apps or browser-based experiences, use [JWT authentication](/api-reference/introduction#jwt-authentication) to stream TTS directly to the client rather than proxying through your server and adding extra latency.
4. **Reuse connections with keep-alive** - The first request to the API incurs a TCP and TLS handshake. Use `Connection: keep-alive` (and persistent sessions in Python) to reuse the established connection on subsequent requests. See our [low-latency Python](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_tts_low_latency_http.py) and [JavaScript](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_tts_low_latency_http.js) examples for this technique in practice.
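As a sketch of the keep-alive technique, the snippet below holds one HTTPS connection open across requests using Python's stdlib. The endpoint path, header scheme, and payload fields are illustrative assumptions; consult the TTS API reference and the linked examples for the exact request format.

```python
import http.client
import json
import os

# One HTTPS connection, reused across requests: the TCP and TLS
# handshakes happen only on the first call. Endpoint path, auth
# header, and payload fields below are illustrative placeholders.
conn = http.client.HTTPSConnection("api.inworld.ai")

def synthesize(text):
    body = json.dumps({"text": text, "voiceId": "Olivia"})
    conn.request(
        "POST",
        "/tts/v1/voice",
        body,
        headers={
            "Authorization": f"Basic {os.environ['INWORLD_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    resp = conn.getresponse()
    return resp.read()  # the connection stays open for the next call
```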
## Next Steps
Looking for more tips and tricks? Check out the resources below to get started!
Learn best practices for producing high-quality voice clones.
Learn best practices for synthesizing high-quality speech.
Explore Python and JavaScript code examples for TTS integration.
---
#### Prompting for TTS
Source: https://docs.inworld.ai/tts/best-practices/prompting-for-tts
When an LLM generates text that gets fed into TTS, the default output often sounds flat and unnatural. LLMs tend to produce clean, well-formatted text, but clean text isn't the same as *speakable* text. Dates stay as `12/04`, acronyms aren't expanded, and there are no cues for emphasis, pauses, or emotion.
This guide shows you what to add to your LLM system prompt so that its output is optimized for Inworld TTS.
## Quality Dimensions
### Emphasis
Use asterisks around words to make TTS stress them. Exclamation marks add energy, and ellipses create trailing-off effects.
**Prompt snippet:**
```
Use asterisks (*word*) to emphasize key words in your response — focus on
prices, deadlines, action items, or any word the listener needs to catch.
Use punctuation to convey tone:
- Exclamation marks for excitement or urgency
- Ellipsis (...) for trailing off, hesitation, or leaving a thought unfinished
Example: "I thought it would work, but..."
```
**Before (no emphasis guidance):**
> I think this is a really important point and you should consider it carefully.
**After (with emphasis guidance):**
> I think this is a \*really\* important point, and you should consider it \*carefully\*.
Use single asterisks only (`*word*`). Double asterisks (`**word**`) will cause TTS to read the asterisk characters aloud instead of emphasizing the word.
### Pronunciation
For uncommon words like brand names, proper nouns, and technical terms, Inworld TTS supports inline [IPA phoneme notation](/tts/capabilities/custom-pronunciation). You can provide a pronunciation dictionary in your system prompt that the LLM substitutes inline.
**Prompt snippet:**
```
When you use any of the following words, replace them with their IPA pronunciation
inline using slash notation:
- "Crete" → /kriːt/
- "Yosemite" → /joʊˈsɛmɪti/
- "Nguyen" → /ŋwɪən/
- "Acai" → /ɑːsɑːˈiː/
```
**Before (no pronunciation guidance):**
> You should visit Crete for your honeymoon.
**After (with IPA substitution):**
> You should visit /kriːt/ for your honeymoon.
Inworld TTS reads the IPA notation and produces the correct pronunciation. See [Custom Pronunciation](/tts/capabilities/custom-pronunciation) for details on finding the right IPA phonemes.
Another common approach is to use a string parser that replaces terms from your pronunciation dictionary with their IPA forms before passing the text to TTS. This works well as a post-processing step when you don't want to add IPA instructions to your LLM prompt, or when the same dictionary needs to be applied consistently across multiple LLM providers.
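Such a parser can be sketched in a few lines; the dictionary entries below reuse the illustrative terms from the prompt snippet above:

```python
import re

# Illustrative pronunciation dictionary; extend with your own terms.
PRONUNCIATIONS = {
    "Crete": "/kriːt/",
    "Yosemite": "/joʊˈsɛmɪti/",
    "Nguyen": "/ŋwɪən/",
}

def apply_pronunciations(text):
    """Replace known terms with inline IPA before sending text to TTS."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, PRONUNCIATIONS)) + r")\b"
    )
    return pattern.sub(lambda m: PRONUNCIATIONS[m.group(1)], text)
```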
### Pauses and Pacing
Punctuation controls pacing in TTS. Periods create natural pauses between thoughts. Commas insert shorter breaks. Sentence length affects overall rhythm: short sentences speed things up, longer sentences slow them down.
**Prompt snippet:**
```
Control pacing through punctuation and sentence structure:
- Use periods to separate thoughts and create pauses
- Use commas for shorter breaks within sentences
- Use ellipsis (...) to create a lingering pause or beat
- Use short sentences for emphasis and urgency
- Use longer sentences for calm, measured delivery
```
**Before (flat pacing):**
> The results are in and we exceeded our target by 40 percent so this is the best quarter we have ever had.
**After (with pacing guidance):**
> The results are in. We exceeded our target... by \*forty percent\*. This is the \*best\* quarter we have ever had.
### Non-verbal Vocalizations
Inworld TTS supports non-verbal tokens that add human-like sounds: `[sigh]`, `[laugh]`, `[breathe]`, `[cough]`, `[clear_throat]`, `[yawn]`. These make speech sound more natural and emotionally grounded.
Audio markups are currently [experimental](/tts/resources/support#what-do-experimental%2C-preview%2C-and-stable-mean%3F) and only support English.
**Prompt snippet:**
```
Insert non-verbal vocalizations where they would naturally occur in conversation:
- [sigh] for frustration, relief, or resignation
- [laugh] for amusement or warmth
- [breathe] before delivering important or emotional statements
- [cough] or [clear_throat] for naturalistic transitions
- [yawn] for tiredness
Place these tokens inline in your text, e.g.: "[sigh] I really thought that would work."
```
**Before (no vocalizations):**
> I really thought that would work. Oh well, let's try again.
**After (with vocalizations):**
> [sigh] I \*really\* thought that would work. [laugh] Oh well, let's try again.
See [Audio Markups](/tts/capabilities/audio-markups) for the full list of supported markups including emotion and delivery style tags.
### Conversational Naturalness
Natural human speech is full of filler words like `uh`, `um`, `well`, `like`, `you know`. Adding these to LLM output makes TTS sound less robotic and more conversational.
**Prompt snippet:**
```
To sound natural and conversational, include filler words where a human speaker
would naturally use them:
- "uh" and "um" for thinking moments
- "well" and "so" for transitions
- "like" and "you know" for casual emphasis
Example: "So, uh, I was thinking we could, you know, try a different approach."
```
**Before (no fillers):**
> I was thinking we could try a different approach.
**After (with fillers):**
> So, uh, I was thinking we could, you know, try a \*different\* approach.
Filler words are best for casual, conversational use cases. Skip them for formal applications like news reading, professional narration, or customer support.
### Output Length
LLMs tend to be verbose. A detailed paragraph may read well on screen, but sounds unnatural and exhausting when spoken aloud. Keeping responses short produces better-sounding speech and reduces latency.
A good default is to ask your LLM to respond in 1–2 sentences unless the user's query specifically demands a longer answer. Use sentences as your length unit, not words or characters. LLMs operate on tokens, so word and character counts are unreliable constraints.
**Prompt snippet:**
```
Keep your responses to 1-2 sentences unless the user's question specifically
requires a longer explanation. Prefer concise, direct answers.
```
**Before (too verbose):**
> Well, the weather forecast for tomorrow is showing that there will be partly cloudy skies throughout the morning hours, with temperatures expected to reach a high of around seventy-five degrees Fahrenheit by the early afternoon, and then cooling down to approximately sixty degrees in the evening.
**After (concise):**
> Tomorrow looks like partly cloudy skies, with a high around \*seventy-five\* and cooling to sixty by evening.
## Example Prompt Templates
Below are complete, copyable system prompt blocks tailored for common use cases. Each template combines the techniques above into a ready-to-use prompt.
Use this template for chatbots, AI companions, virtual friends, and other informal conversational applications.
```
## Speech Output Rules
Your responses will be converted to speech using TTS. Follow these
rules to produce natural, expressive spoken output:
### Expressiveness
- Use *asterisks* to emphasize key words
- Use exclamation marks for excitement, ellipsis for trailing off
- Insert non-verbal vocalizations where natural:
[sigh], [laugh], [breathe], [cough], [clear_throat], [yawn]
Example: "[laugh] That's *exactly* what I was thinking!"
### Naturalness
- Include filler words (uh, um, well, like, you know) where a human would naturally pause
- Vary sentence length for natural rhythm
- Use contractions (don't, can't, I'm, we're) instead of formal forms
### Pronunciation
- Replace uncommon proper nouns with IPA: e.g., /kriːt/ for Crete
[Add your pronunciation dictionary here]
### Text Formatting
- Write numbers in spoken form: "twenty-three" not "23"
- Write dates in spoken form: "march fifteenth" not "3/15"
- Never use markdown formatting, bullet points, or structured text
- Never use emojis or special characters
- Write everything as natural spoken sentences
```
Use this template for customer support agents, sales assistants, and other professional conversational applications.
```
## Speech Output Rules
Your responses will be converted to speech using TTS. Follow these
rules to produce clear, professional spoken output:
### Clarity
- Use *asterisks* sparingly to emphasize critical information (prices, deadlines, action items)
- Use short, clear sentences for important details
- Use periods to separate distinct points
### Professionalism
- Do NOT use filler words (uh, um, like, you know)
- Do NOT use non-verbal vocalizations ([sigh], [laugh], etc.)
- Maintain a warm but professional tone
- Use contractions naturally (don't, we'll, you're)
### Numbers and Data
- Speak account numbers digit by digit: "one two three four five six" not "123456"
- Speak prices naturally: "forty-nine ninety-nine" or "forty-nine dollars and ninety-nine cents"
- Speak dates fully: "january fifteenth, twenty twenty-five" not "1/15/2025"
- Speak phone numbers in groups: "five five five, one two three, four five six seven"
### Pronunciation
- Replace product names and brand terms with IPA where needed
[Add your pronunciation dictionary here]
### Text Formatting
- Never use markdown formatting, bullet points, or structured text
- Never use emojis or special characters
- Write everything as natural spoken sentences
```
Use this template for coding assistants, documentation readers, technical narrators, and developer-facing tools.
```
## Speech Output Rules
Your responses will be converted to speech using TTS. Follow these
rules to produce accurate, well-paced technical speech:
### Technical Accuracy
- Spell out acronyms on first use: "AWS, or Amazon Web Services"
- For common acronyms after first use, speak them as words if pronounceable
(e.g., "NASA") or spell them out if not (e.g., "A-P-I")
- Speak URLs by component: "github dot com slash inworld dash AI"
- Speak code identifiers in plain English: "the getUserName function" not "getUserName()"
- Speak version numbers naturally: "version three point two" not "v3.2"
### Pronunciation
- Replace technical proper nouns with IPA:
[Add your pronunciation dictionary here, e.g.:]
- "Kubernetes" → /kuːbərˈnɛtiːz/
- "Nginx" → /ˈɛndʒɪnɛks/
- "PostgreSQL" → /ˈpoʊstɡrɛsˌkjuːˈɛl/
### Pacing
- Use measured, even pacing. Avoid rushing through technical content.
- Insert periods before key technical terms to create natural pauses
- Keep sentences moderate length
- Do NOT use filler words (uh, um, like, you know)
### Text Formatting
- Write all numbers in spoken form: "forty-two" not "42"
- Never use markdown formatting, bullet points, or code blocks
- Never use emojis or special characters
- Write everything as natural spoken sentences
```
## Notes on Normalization
Inworld TTS includes an optional **normalization** step that automatically expands dates, numbers, emails, currencies, and symbols into their spoken forms before synthesis. Understanding how normalization interacts with your LLM output is important for getting the best results.
Toggle normalization with the `applyTextNormalization` parameter in your [TTS API request](/api-reference/ttsAPI/texttospeech/synthesize-speech-stream#body-apply-text-normalization):
- `ON` — always normalize
- `OFF` — skip normalization entirely
- `APPLY_TEXT_NORMALIZATION_UNSPECIFIED` (default) — TTS decides per-request
Normalization adds slight latency to each TTS request. For latency-sensitive applications, consider having your LLM handle text expansion directly and setting `applyTextNormalization` to `OFF`.
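For example, a request that turns normalization off might carry a payload like the one below. `applyTextNormalization` is the documented parameter; the surrounding field names are illustrative, so check the API reference for the exact request shape.

```python
# "applyTextNormalization" is the documented parameter; the other
# fields are illustrative placeholders for a synthesis request.
payload = {
    "text": "Your total is forty-nine dollars and ninety-nine cents.",
    "voiceId": "Olivia",
    "applyTextNormalization": "OFF",  # the LLM has already expanded everything
}
```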
### With Normalization On
Inworld TTS handles common expansions automatically. Your LLM prompt still benefits from guiding edge cases that normalization may not cover:
- **Ambiguous dates**: `01/02/2025` could be January 2nd or February 1st depending on locale
- **Domain-specific abbreviations**: `RDS`, `k8s`, `HIPAA` may not expand as expected
- **Uncommon acronyms**: Industry-specific terms that aren't in common usage
### With Normalization Off
The LLM must handle **all** text expansion. Your prompt must instruct the LLM to write everything in spoken form: no digits, no symbols, no shorthand.
### Comparison Table
| Raw Text | Normalization Produces | LLM Should Produce (Normalization Off) |
|---|---|---|
| `12/04/2025` | "twelve oh four twenty twenty-five" | "december fourth, twenty twenty-five" |
| `(555) 123-4567` | "five five five, one two three, four five six seven" | "five five five, one two three, four five six seven" |
| `$1,249.99` | "one thousand two hundred forty-nine dollars and ninety-nine cents" | "twelve hundred forty-nine dollars and ninety-nine cents" |
| `3:45 PM` | "three forty-five PM" | "three forty-five PM" |
| `test@example.com` | "test at example dot com" | "test at example dot com" |
| `2 + 2 = 4` | "two plus two equals four" | "two plus two equals four" |
### When to Use Each
- **Normalization on** (recommended for most cases): Less prompt engineering required. Inworld TTS handles standard expansions and you only need to guide edge cases.
- **Normalization off**: Use when you need full control over how text is spoken, or when your domain has specific pronunciation requirements that conflict with default expansion rules.
**Prompt snippet for normalization off:**
```
CRITICAL: Write ALL text in fully spoken form. Never use digits, symbols, or abbreviations.
- Dates: "december fourth, twenty twenty-five" not "12/04/2025"
- Phone numbers: "five five five, one two three, four five six seven" not "(555) 123-4567"
- Currency: "forty-nine dollars and ninety-nine cents" not "$49.99"
- Times: "three forty-five PM" not "3:45 PM"
- Emails: "john at example dot com" not "john@example.com"
- Symbols: "two plus two equals four" not "2+2=4"
```
## Tips for Iterating
- **Test with the TTS Playground**: Use the [TTS Playground](/tts/tts-playground) to quickly hear how your LLM output sounds when synthesized. Paste in sample outputs and iterate on your prompt until the speech quality meets your needs.
- **Tune LLM temperature for consistency**: Lower temperatures produce more consistent output that follows your formatting rules reliably. Higher temperatures can produce more expressive text but may ignore specific instructions. Start around `0.7` and adjust based on results.
- **Iterate on your pronunciation dictionary**: Start with a small set of terms and expand as you discover mispronunciations during testing. Ask an LLM to generate IPA for new terms.
## Next Steps
Best practices for synthesizing high-quality speech, including punctuation, emphasis, and temperature tuning.
Control emotion, delivery style, and non-verbal vocalizations with markup tags.
Define exact pronunciations for uncommon words using inline IPA notation.
---
#### Voice Cloning
Source: https://docs.inworld.ai/tts/best-practices/voice-cloning
This guide walks through best practices and techniques for generating high-quality voice clones. For more information on how to create a voice clone, check out this [guide](/tts/voice-cloning).
Inworld offers two types of voice cloning: instant voice cloning (available via [Inworld Portal](https://platform.inworld.ai)) and professional voice cloning (please [reach out](https://inworld.ai/contact-sales) for more information). This guide starts with general best practices that apply to all voice clones, then covers specific recommendations for each type of cloning.
## General Best Practices
1. **Capture the full range of expression** - Make sure your script and delivery cover the emotions and expressiveness you want the voice to capture. The more variety you include, the better the model will be at recreating those feelings. If the audio is flat, the resulting voice will usually sound monotone as well. Below are some scripts you can use that we've found work well:
- *Are you ready to save big? Get set for the sale of the century! Deals and discounts like never before! You won't want to miss this.*
- *Every challenge we face is an opportunity in disguise. Wouldn't you agree? So cheer up! It'll all be okay.*
- *How have you been? It's been way too long since we last caught up. By the way, I heard about your recent promotion. Congratulations! I'm so excited for you!*
2. **Speak clearly and consistently** - Pronounce each word carefully and avoid filler sounds like sighs or coughs. Try not to have unnaturally long pauses in the middle of your recording, as this can affect the flow of the cloned voice.
3. **Minimize noise** - Record in a quiet environment and keep a reasonable distance from the microphone to reduce echo, plosives, and device noise. After recording, listen back to ensure your audio is clean and free of any unwanted sounds.
## Best Practices for Instant Voice Cloning
1. **Keep the final clip short** - Use a 5-15 second total length: long enough to give the model context, short enough to keep the voice consistent.
2. **Use high-quality audio** - Record with at least a 22 kHz sample rate and 16-bit depth.
3. **Vary emotion and delivery** - Combine a few short clips that show different expressions into your final clip; use short pauses or crossfades between clips to avoid abrupt cuts.
4. **Use clean audio** - Avoid artifacts, background noise, and non-speech sounds.
5. **Normalize volume** - Keep levels fairly consistent, allowing for normal voice variation; avoid clipping from overly high levels.
6. **Avoid mid-word cuts** - Don’t use samples that break in the middle of words.
Instant voice cloning may not perform well for less common voices, such as children's voices or unique accents. For those use cases, we recommend professional voice cloning.
## Best Practices for Professional Voice Cloning
1. **Follow the optimal recording specifications** - For the best voice quality, we recommend recording audio with the following specifications:
- Audio Format: .wav
- Sampling Frequency: 48 kHz
- Bit Depth: 24 bits
- Codec: Linear PCM (uncompressed)
- Channel(s): 1 (mono)
- Loudness Level: -23 LUFS ±0.5 LU (compliant with ITU-R BS.1770-3)
- Peak Level (Max): -5 dBFS True Peak (compliant with ITU-R BS.1770-3)
- Noise Floor Level (Max): -60 dB
2. **Maintain consistent voice delivery** - Keep your voice consistent throughout all recordings. It’s fine to reflect natural variation based on the script (such as hesitations, questions, or exclamations), but avoid major changes in accent or style between samples.
3. **Provide ample, high-quality samples** - While the minimum required audio is only 30 samples (5–20 seconds each, totaling about 5 minutes), we recommend at least 120 samples (totaling about 20 minutes) for the best results. There’s no upper limit to the number of samples you can provide—more clean, high-quality recordings will generally lead to higher quality clones.
4. **Include transcripts where possible** - Text transcripts are not strictly necessary, but we recommend providing them if available—especially for uncommon words, product names, or company terms. This ensures accurate pronunciation in the final voice clone.
## Automation via API
If you need to clone multiple voices (for example, to support a batch of creators or a pipeline workflow), you can automate voice cloning via the API.
- **API reference**: [Clone a voice](/api-reference/voiceAPI/voiceservice/clone-voice)
- **Python example**: [example_voice_clone.py](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_voice_clone.py)
- **JavaScript example**: [example_voice_clone.js](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_voice_clone.js)
Voice cloning has lower rate limits than regular speech synthesis. For details, see [Rate limits](/resources/rate-limits).
---
#### Voice Design
Source: https://docs.inworld.ai/tts/best-practices/voice-design
This guide walks through best practices and techniques for generating high-quality voices using Voice Design. For a step-by-step walkthrough on how to design a voice, check out the [Voice Design guide](/tts/voice-design).
Voice Design is currently in [research preview](/tts/resources/support#what-do-experimental-preview-and-stable-mean). Please share any feedback with us via the feedback form in [Portal](https://platform.inworld.ai) or in [Discord](https://discord.gg/inworld).
## Voice Description Best Practices
The voice description helps the model understand the type of voice you want to generate. The following best practices will help you write descriptions that produce better voices:
1. **Be specific in your description** - Vague descriptions like "a fun voice" may produce less consistent results. Include details about age, gender, language (if not English), accent, pitch, pace, timbre, tone, and emotional quality.
We generally recommend structuring your description in this order: *Distinctive Qualities → Gender → Language / Accent → Age → Tone → Delivery Style → Pacing → Additional Qualities → Audio Quality*
For example:
> *"A soothing, calming female voice with soft American accent, 30-45 years old. Gentle, flowing delivery with natural pauses and smooth transitions. Warm, peaceful tone that creates relaxation without sounding robotic. Perfect broadcast quality audio."*
2. **Be specific with age** - If more general terms like "young" and "old" are not producing the desired voice, use more specific age ranges like "mid-20s to early 30s" or "late 60s to early 70s".
- For child voices, try specifying exact ages (e.g., "8-10 years old") and emphasize "natural" and "age-appropriate" to avoid over-cutesy results.
- For elderly voices, include both the age range and specific texture descriptors ("gravelly," "weathered") along with pacing cues ("slower, deliberate").
3. **For regional accents, specify the city or region** - For regional accents, always include the specific city or region. For example, write "Boston accent" rather than "Northeast accent."
4. **Describe vocal texture in the middle** - Place descriptions of the vocal texture and timbre (e.g., "raspy," "breathy," "nasally") in the middle of your voice description, never at the end. Use modifiers like "slight," "subtle," or "natural" to prevent over-exaggeration.
5. **End with audio quality** - For the clearest audio quality, include the phrase "Perfect broadcast quality audio." at the end of your description. This can be especially helpful if the voice includes descriptions like "gravelly", "breathy", or "scratchy" that may be misinterpreted as audio degradation.
6. **Avoid conflicting descriptors** - Don't use conflicting descriptors (e.g., "fast-paced" with "slow, deliberate"), as that may confuse the model.
7. **Experiment with multiple generations** - Each generation produces slightly different results. Especially for less common voices (e.g., children, elderly, or specific regional accents), you may need to generate a few times to get a successful voice.
## Voice Script Best Practices
The script shapes the voice that gets generated, as the model will tailor the voice to suit the content it's speaking. If writing your own script, the following best practices will help ensure the best results.
1. **Match the script to the voice** - The model will tailor the voice to the script. Write a script that matches your voice and desired use case. For example, if you're designing a customer support voice, use a script that sounds like a customer support conversation. For accented voices, use words and phrasing typical of that accent. For example, for a British voice, use words like "brilliant," "proper," or "spot on."
2. **Aim for 5-15 seconds** - Aim for a script that will generate 5-15 seconds of audio (50-200 characters in English), so that your resulting voice has enough generated audio to reference for how the voice should sound in your future audio generations.
3. **Match the desired language** - Make sure the script is in the desired language (e.g., write a Chinese script if you want the voice to speak Chinese).
## Next Steps
Follow the step-by-step guide to design your first voice.
Learn best practices for synthesizing high-quality speech with your designed voices.
Clone an existing voice with just 5-15 seconds of audio.
---
### Resources
#### Release Notes
Source: https://docs.inworld.ai/release-notes/tts
## Inworld TTS 1.5
Launched [Inworld TTS 1.5](https://inworld.ai/blog/inworld-tts-1-5-the-world-s-best-realtime-text-to-speech-model), our newest generation of realtime TTS models featuring:
* **Two New Models:** Our flagship model `inworld-tts-1.5-max` is ideal for most use cases, with the best balance of quality and speed. For use cases where latency is the top priority, we also offer `inworld-tts-1.5-mini`.
* **Latency Improvements:** Our new TTS-1.5 models achieve P90 latency for first audio chunk delivery under 250ms for our Max model and under 160ms for our Mini model, a 4x improvement compared to TTS-1.
* **More Expressive and More Stable:** TTS 1.5 is 20% more expressive than prior generations and demonstrates a 25% reduction in word error rate.
* **Additional Languages:** We've added support for additional languages, including Hindi, Arabic, and Hebrew, bringing total languages supported to 15.
## Updates to Inworld TTS
Released an upgraded version of the Inworld TTS models with higher overall quality.
* **Speech Quality:** Clearer, more natural speech with smoother pacing and more accurate pronunciation.
* **Voice Similarity:** Cloned voices sound closer to the originals, preserving each voice’s unique style.
* **Non-English Languages:** More consistent, reliable output across supported non-English languages.
* **Custom Pronunciation:** New support for inline IPA, giving you control over exact word pronunciations. See [Custom Pronunciation](/tts/capabilities/custom-pronunciation) for details.
---
#### Billing
Source: https://docs.inworld.ai/portal/billing
---
#### Usage
Source: https://docs.inworld.ai/portal/usage
---
#### Zero Data Retention
Source: https://docs.inworld.ai/tts/resources/zero-data-retention
Enterprise customers in healthcare, finance, and other regulated industries can now ensure that no customer text inputs or audio outputs are persistently stored in our systems after processing completes.
## What products are configured for Zero Data Retention Mode?
| Product | Type | Eligible for Zero Data Retention? |
| :---- | :---- | :---- |
| Text-to-Speech | Text Input | Yes |
| Text-to-Speech | Audio Output | Yes |
| Voice Cloning | Audio Samples | No |
| Voice Design | Voice Description/Script | No |
## How Zero Data Retention works
When enabled, all customer text inputs sent to our text-to-speech engine are processed in memory to generate audio. Once complete, both the text and the generated audio are immediately redacted from our logging systems. This applies to all text sent for speech synthesis, including requests using cloned voices.
## Supported models
TTS Zero Data Retention currently supports **TTS-1.5-Mini** and **TTS-1.5-Max** only. Legacy TTS-1 and TTS-1-Max models are not supported.
## Enterprise-level security
Zero Data Retention is configured at the workspace level, allowing customers with strict data privacy requirements to maintain compliance.
Whether you're handling protected health information or sensitive financial data, you get the security guarantees needed for production deployment at scale.
## FAQs
TTS Zero Data Retention protects privacy by redacting all input texts from logs — including text sent to Cloned Voices — while only retaining the initial source audio required to build the voice itself. This is currently a workspace-level configuration managed by the Inworld Engineering team.
Yes. Zero Data Retention is configured per workspace. Contact our sales team to enable it for specific workspaces.
With Zero Data Retention enabled, original input text and audio output cannot be retrieved. Debugging and troubleshooting capabilities are limited to non-sensitive metadata only.
No. The TTS engine still receives the text to generate the audio. The redaction happens only at the logging layer, ensuring the record of what was said is not saved.
---
#### ElevenLabs Migration
Source: https://docs.inworld.ai/tts/resources/elevenlabs-migration
Batch-migrate your existing ElevenLabs voice clones into Inworld using our open-source migration tool: [github.com/inworld-ai/voice-migration-tool](https://github.com/inworld-ai/voice-migration-tool)
The tool runs locally on your machine and communicates directly with the ElevenLabs and Inworld APIs; it does not proxy your data through any additional intermediary servers.
Requires **Node.js** 18+ and **ffmpeg** installed on your machine.
## Steps
1. **Clone and start the tool** — Run: `git clone https://github.com/inworld-ai/voice-migration-tool.git && cd voice-migration-tool && npm install && npm run dev` then open http://localhost:3000.
2. **Connect your accounts** — Enter your ElevenLabs API key, Inworld API key, and Inworld workspace name. Only voices you created yourself are shown.
3. **Select voices and migrate** — Select voices and click **Migrate Selected**. Audio samples are automatically converted to WAV, padded to 5s minimum, and trimmed to 15s maximum.
4. **Preview your migrated voices** — Click Preview on any voice to generate a sample utterance and confirm the clone works.
---
#### Support
Source: https://docs.inworld.ai/tts/resources/support
---
## STT
### Get Started
#### Intro to STT
Source: https://docs.inworld.ai/stt/overview
Inworld's Speech-to-Text (STT) API provides a unified integration point for industry-leading transcription providers. You get consistent authentication, request formatting, and response handling across providers — without managing multiple SDKs or credentials.
The API supports both synchronous transcription for complete audio files and real-time bidirectional streaming over WebSocket for live audio.
Make your first STT API call and get a transcript.
View the complete API specification.
Browse ready-to-use GitHub samples for sync and real-time STT.
## Supported Providers
### Groq
| **Model ID** | **Endpoints** | **Best for** |
| :--- | :--- | :--- |
| `groq/whisper-large-v3` | Sync API only | General-purpose transcription for recorded audio |
### AssemblyAI
| **Model ID** | **Endpoints** | **Best for** |
| :--- | :--- | :--- |
| `assemblyai/universal-streaming-multilingual` | WebSocket only | Multilingual streaming (English, Spanish, French, German, Italian, Portuguese) |
| `assemblyai/universal-streaming-english` | WebSocket only | English-optimized streaming |
AssemblyAI models currently support the WebSocket streaming endpoint only. Sync HTTP support is coming soon.
For pricing details, see [inworld.ai/pricing](https://inworld.ai/pricing).
## Features
| **Feature** | **groq/whisper-large-v3** | **assemblyai/universal-streaming-multilingual** | **assemblyai/universal-streaming-english** |
| :--- | :--- | :--- | :--- |
| Pricing | $0.111/hour | $0.15/hour | $0.15/hour |
| Endpoint | Sync API only | WebSocket only | WebSocket only |
| Real-time streaming | No | Yes | Yes |
| Best for | General-purpose transcription for recorded audio | Multilingual streaming (English, Spanish, French, German, Italian, Portuguese) | English-optimized streaming |
| Languages | 100+ (Whisper) | 6 languages | English |
## Supported Audio Formats
| **Format** | **Sync API** | **WebSocket Streaming** |
| :--- | :--- | :--- |
| `LINEAR16` (PCM) | | |
| `MP3` | | |
| `OGG_OPUS` | | |
| `FLAC` | | |
| `AUTO_DETECT` | | |
Recommended defaults: 16,000 Hz sample rate, 16-bit depth, mono. For container formats (MP3, FLAC, OGG_OPUS), `sampleRateHertz` is optional — the API auto-detects it from the file header.
## Endpoints
| **Endpoint** | **Method** | **Description** |
| :--- | :--- | :--- |
| [`/stt/v1/transcribe`](/api-reference/sttAPI/speechtotext/transcribe) | POST | Send complete audio, receive full transcript |
| [`/stt/v1/transcribe:streamBidirectional`](/api-reference/sttAPI/speechtotext/transcribe-stream-websocket) | WebSocket | Stream audio in real time, receive transcription chunks as they become available |
---
#### Developer Quickstart
Source: https://docs.inworld.ai/stt/quickstart
In this quickstart, you'll send an audio file to the STT API and receive a transcript.
## Make your first STT API request
Create an [Inworld account](https://platform.inworld.ai/signup).
(See Authentication section above for API key setup.)
Set your API key as an environment variable.
```shell macOS and Linux
export INWORLD_API_KEY='your-base64-api-key-here'
```
```shell Windows
setx INWORLD_API_KEY "your-base64-api-key-here"
```
The STT API accepts base64-encoded audio. Prepare your audio file (e.g., `input.mp3`) and encode it:
```shell macOS
export AUDIO_BASE64=$(base64 input.mp3 | tr -d '\n')
```
```shell Linux
export AUDIO_BASE64=$(base64 -w0 input.mp3)
```
Recommended audio settings: 16,000 Hz sample rate, mono, 16-bit depth. See [Supported Audio Formats](/stt/overview#supported-audio-formats) for all options.
```curl cURL
curl --request POST \
--url https://api.inworld.ai/stt/v1/transcribe \
--header "Authorization: Basic $INWORLD_API_KEY" \
--header "Content-Type: application/json" \
--data "{
\"transcribeConfig\": {
\"modelId\": \"groq/whisper-large-v3\",
\"audioEncoding\": \"MP3\"
},
\"audioData\": {
\"content\": \"$AUDIO_BASE64\"
}
}"
```
Set `audioEncoding` to match your file format (`MP3`, `LINEAR16`, `OGG_OPUS`, `FLAC`), or use `AUTO_DETECT` to let the API infer it from the audio header.
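The same request can be issued from Node.js. This is a minimal sketch using the endpoint and fields from the cURL example above; `buildTranscribeRequest` is a local helper name, not part of any Inworld SDK:

```javascript
// Build the transcribe request as a plain { url, options } object so it can
// be passed straight to fetch (Node 18+).
function buildTranscribeRequest(audioBuffer, apiKey, modelId = 'groq/whisper-large-v3') {
  return {
    url: 'https://api.inworld.ai/stt/v1/transcribe',
    options: {
      method: 'POST',
      headers: {
        Authorization: `Basic ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        transcribeConfig: { modelId, audioEncoding: 'MP3' },
        // base64-encode the raw file bytes, matching the $AUDIO_BASE64 step above
        audioData: { content: audioBuffer.toString('base64') },
      }),
    },
  };
}

// Usage: read a local MP3 and send it.
// const { readFileSync } = require('fs');
// const { url, options } = buildTranscribeRequest(readFileSync('input.mp3'), process.env.INWORLD_API_KEY);
// const res = await fetch(url, options);
// console.log((await res.json()).transcription.transcript);
```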
A successful response contains the transcript:
```json
{
"transcription": {
"transcript": "Hey, I just wanted to check in on the delivery status for my order.",
"isFinal": true,
"wordTimestamps": []
},
"usage": null
}
```
| Field | Description |
| :--- | :--- |
| `transcription.transcript` | The transcribed text |
| `transcription.isFinal` | Whether the result is finalized |
| `transcription.wordTimestamps` | Per-word timing data (coming soon) |
| `usage` | Usage metrics for billing (coming soon) |
## Next Steps
Learn about supported providers, audio formats, and endpoints.
View the complete API specification.
---
#### Voice Profiles
Source: https://docs.inworld.ai/stt/voice-profiles
Voice Profile analyzes vocal characteristics of the speaker alongside transcription. It returns structured classification data for **Age**, **Emotion**, **Pitch**, **Vocal Style**, and **Accent**, each with confidence scores ranging from 0.0 to 1.0.
Voice Profile is available across all STT models on the Inworld STT API. By understanding _who_ is speaking and _how_ they are speaking, applications can adapt responses, adjust tone, route conversations, or trigger context-sensitive behaviors in real time.
## Use cases
- **Voice agents and NPCs** — Adapt responses based on the speaker's detected emotion or vocal style.
- **Accessibility** — Detect age category or vocal style to adjust UI, pacing, or interaction complexity.
- **Content moderation** — Flag unusual vocal patterns (shouting, crying) for escalation or review.
- **Analytics and insights** — Aggregate emotion and vocal style data across sessions.
- **Localization** — Use accent detection to dynamically select language models or localized content.
## How it works
Voice Profile analysis runs automatically when configured via `inworldConfig.voiceProfileThreshold` in your request. The confidence threshold controls which labels are returned — only labels at or above the threshold are included. Default: `0.5`. Range: 0.0–1.0.
## Classification categories
### Age
| Label | Description |
| :--- | :--- |
| `young` | Young adult / teenager |
| `adult` | Adult speaker |
| `kid` | Child speaker |
| `old` | Elderly speaker |
| `unclear` | Age could not be determined |
### Emotion
| Label | Description |
| :--- | :--- |
| `tender` | Soft, gentle, caring tone |
| `sad` | Sorrowful or melancholy tone |
| `calm` | Relaxed, even-tempered delivery |
| `neutral` | No strong emotional signal |
| `happy` | Cheerful, upbeat tone |
| `angry` | Frustrated, aggressive tone |
| `fearful` | Anxious or frightened tone |
| `surprised` | Startled or astonished tone |
| `disgusted` | Revulsion or strong disapproval |
| `unclear` | Emotion could not be determined |
### Pitch
| Label | Description |
| :--- | :--- |
| `low` | Low-pitched voice |
| `medium` | Medium-pitched voice |
| `high` | High-pitched voice |
### Vocal Style
| Label | Description |
| :--- | :--- |
| `whispering` | Hushed, breathy delivery |
| `normal` | Standard conversational speech |
| `singing` | Melodic or musical delivery |
| `mumbling` | Unclear, low-articulation speech |
| `crying` | Speech accompanied by crying |
| `laughing` | Speech accompanied by laughter |
| `shouting` | Loud, raised-voice delivery |
| `monotone` | Flat, unvaried pitch delivery |
| `unclear` | Vocal style could not be determined |
### Accent
Detects the speaker's accent using BCP-47 locale codes. Supported codes include: `en-US`, `en-GB`, `en-AU`, `zh-CN`, `fr-FR`, `es-ES`, `es-419`, `es-MX`, `ar-EG`, and more.
## Configuration
Set `voiceProfileThreshold` inside `inworldConfig`:
```json
{
"transcribeConfig": {
"modelId": "",
"language": "en-US",
"audioEncoding": "MP3",
"inworldConfig": {
"voiceProfileThreshold": 0.5
}
}
}
```
## Response structure
The `voiceProfile` object is returned alongside `transcription` and `usage` in both sync and streaming responses. Each category is an array of `{ label, confidence }` objects, sorted by descending confidence.
The JSON below shows the **normalized** response shape (camelCase throughout). Raw API payloads may use `snake_case` for the same fields (for example `vocal_style`, `transcribed_audio_ms`, `model_id`). Prefer representing one layer per example in your own docs and client code — either the raw API shape or the normalized shape — not a mix of both.
### Example response (sync, normalized shape)
```json
{
"transcription": {
"transcript": "Hey, I just wanted to check in on the delivery status.",
"isFinal": true
},
"voiceProfile": {
"age": [
{ "label": "young", "confidence": 0.78 }
],
"emotion": [
{ "label": "tender", "confidence": 0.97 },
{ "label": "sad", "confidence": 0.03 }
],
"pitch": [
{ "label": "medium", "confidence": 0.85 }
],
"vocalStyle": [
{ "label": "whispering", "confidence": 0.97 },
{ "label": "normal", "confidence": 0.03 }
],
"accent": [
{ "label": "en-US", "confidence": 0.48 }
]
},
"usage": {
"transcribedAudioMs": 3200,
"modelId": "inworld/inworld-stt-1"
}
}
```
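If your client receives the raw snake_case payload, a small normalizer can map it to the camelCase shape shown above. This is an illustrative sketch, not an official SDK utility:

```javascript
// Convert a snake_case key (e.g. vocal_style, transcribed_audio_ms) to
// camelCase (vocalStyle, transcribedAudioMs).
function toCamel(key) {
  return key.replace(/_([a-z0-9])/g, (_, c) => c.toUpperCase());
}

// Recursively normalize all object keys; values and array order are untouched.
function normalizeKeys(value) {
  if (Array.isArray(value)) return value.map(normalizeKeys);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [toCamel(k), normalizeKeys(v)])
    );
  }
  return value;
}

// Example:
// normalizeKeys({ vocal_style: [{ label: 'whispering', confidence: 0.97 }] })
// → { vocalStyle: [{ label: 'whispering', confidence: 0.97 }] }
```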
## Best practices
- **Start with the default threshold (0.5)** — Filters out low-confidence noise while keeping useful labels.
- **Use emotion and vocal style together** — Combining both gives a richer picture.
- **Handle missing fields gracefully** — Fields may be absent if classification confidence is insufficient.
- **Accent is probabilistic** — Use it as a signal rather than a hard routing decision.
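The "handle missing fields gracefully" point can be sketched as a small accessor. It relies on the documented guarantee that each category array is sorted by descending confidence; `topLabel` is our own helper name:

```javascript
// Return the highest-confidence label for a voice profile category, or a
// fallback when the category is absent or empty (e.g. everything was
// filtered out by voiceProfileThreshold).
function topLabel(voiceProfile, category, fallback = 'unclear') {
  const entries = voiceProfile && voiceProfile[category];
  if (!entries || entries.length === 0) return fallback;
  return entries[0].label; // arrays are sorted by descending confidence
}

// With the example response above:
// topLabel(response.voiceProfile, 'emotion')  // 'tender'
// topLabel(response.voiceProfile, 'accent')   // 'en-US'
// topLabel(undefined, 'emotion')              // 'unclear'
```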
---
### Resources
#### Billing
Source: https://docs.inworld.ai/stt/resources/billing
---
#### Usage
Source: https://docs.inworld.ai/stt/resources/usage
---
#### Support
Source: https://docs.inworld.ai/stt/resources/support
---
## Realtime API
### Overview
#### Intro to Realtime API (Speech-to-Speech)
Source: https://docs.inworld.ai/realtime/overview
Inworld's Realtime API (Speech-to-Speech) enables low-latency, speech-to-speech interactions with voice agents.
The API follows the OpenAI Realtime protocol, extended to enable additional customization.
Build a voice agent with WebSocket, mic input, and audio playback.
Build a voice agent with browser-native WebRTC — no manual audio encoding.
See the full event schemas for the Realtime API.
JavaScript examples for the Realtime API.
Python examples for the Realtime API.
Inworld's Realtime API is currently in [research preview](/tts/resources/support#what-do-experimental-preview-and-stable-mean). Please share any feedback with us via the feedback form in [Portal](https://platform.inworld.ai) or in [Discord](https://discord.gg/inworld).
## Key Features
- **WebSocket and WebRTC transports**: Connect over [WebSocket](/realtime/connect/websocket) or [WebRTC](/realtime/connect/webrtc) with a standard event schema.
- **Automatic interruption-handling and turn-taking**: Your agent will manage conversations naturally and be resilient to user barge-in.
- **Router support**: Utilize Inworld Routers to enable a single agent to dynamically handle different user cohorts, or to facilitate A/B tests.
- **OpenAI compatibility**: Drop-in replacement for the OpenAI Realtime API with a simple [migration path](/realtime/openai-migration).
## Guides
Configure sessions, send input, and orchestrate responses.
Session lifecycle and conversation events.
Step-by-step guide to switch from OpenAI to Inworld.
See the [API reference](/api-reference/realtimeAPI/realtime/realtime-websocket) for full event schemas.
---
#### WebSocket Quickstart
Source: https://docs.inworld.ai/realtime/quickstart-websocket
Build a browser-based voice agent that streams audio to the Inworld Realtime API using WebSocket.
The WebSocket transport is best for server-side and proxied connections where you can set custom headers. For browser-native voice with lower latency, see the [WebRTC Quickstart](/realtime/quickstart-webrtc).
## Get Started
Create an [Inworld account](https://platform.inworld.ai/signup).
(See Authentication section above for API key setup.)
Set your API key as an environment variable.
```shell macOS and Linux
export INWORLD_API_KEY='your-base64-api-key-here'
```
```shell Windows
setx INWORLD_API_KEY "your-base64-api-key-here"
```
Create `server.js`. It proxies WebSocket events between the browser and Inworld, configures the voice session, and triggers an initial greeting.
```javascript server.js
import { readFileSync } from 'fs';
import { createServer } from 'http';
import { WebSocketServer, WebSocket } from 'ws';
const html = readFileSync('index.html');
const server = createServer((req, res) => {
res.writeHead(200, { 'Content-Type': 'text/html' });
res.end(html);
});
const wss = new WebSocketServer({ server, path: '/ws' });
const SESSION_CFG = JSON.stringify({
type: 'session.update',
session: {
instructions: 'You are a friendly voice assistant. Keep responses brief.',
}
});
const GREET = JSON.stringify({
type: 'conversation.item.create',
item: { type: 'message', role: 'user', content: [{ type: 'input_text', text: 'Greet the user' }] }
});
wss.on('connection', (browser) => {
let setup = 0;
const api = new WebSocket(
`wss://api.inworld.ai/api/v1/realtime/session?key=voice-${Date.now()}&protocol=realtime`,
{ headers: { Authorization: `Basic ${process.env.INWORLD_API_KEY}` } }
);
api.on('message', (raw) => {
if (setup < 2) {
const t = JSON.parse(raw.toString()).type;
if (t === 'session.created') { api.send(SESSION_CFG); setup = 1; }
else if (t === 'session.updated' && setup === 1) { api.send(GREET); api.send('{"type":"response.create"}'); setup = 2; }
}
if (browser.readyState === WebSocket.OPEN) browser.send(raw.toString());
});
browser.on('message', (msg) => { if (api.readyState === WebSocket.OPEN) api.send(msg.toString()); });
browser.on('close', () => api.close());
api.on('close', () => { if (browser.readyState === WebSocket.OPEN) browser.close(); });
api.on('error', (e) => console.error('API error:', e.message));
});
let port = 3000;
server.on('error', (e) => {
if (e.code === 'EADDRINUSE') { console.warn(`Port ${port} in use, trying ${++port}…`); server.listen(port); }
else throw e;
});
server.listen(port, () => console.log(`Open http://localhost:${port}`));
```
Create `index.html` in the same directory. It captures microphone audio, plays agent audio, and displays transcripts that fade after each turn.
```html index.html
Voice Agent
Start Conversation
```
```bash
npm init -y && npm pkg set type=module
npm install ws
node server.js
```
Open [http://localhost:3000](http://localhost:3000) and click **Start Conversation**. The agent greets you with audio.
## How It Works
| Component | Role |
| --- | --- |
| **Browser** | Captures mic audio (PCM16, 24 kHz), plays agent audio |
| **Server** | Proxies events between browser and Inworld, holds the API key server-side |
| **Inworld Realtime API** | Handles speech-to-text, LLM processing, and text-to-speech in one WebSocket session |
Key events used:
- `input_audio_buffer.append` — streams mic audio to Inworld
- `response.output_audio.delta` — agent audio chunks for playback
- `input_audio_buffer.speech_started` — triggers interruption (stops agent playback)
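To illustrate the first of these events, here is a sketch of packing mic samples (Float32, as delivered by Web Audio) into PCM16 and wrapping them in an `input_audio_buffer.append` event. The event type and base64 `audio` field follow the OpenAI Realtime schema the API adopts; the conversion code itself is our own, and a browser would base64-encode the bytes without Node's `Buffer`:

```javascript
// Convert Float32 samples in [-1, 1] to little-endian PCM16 and wrap them in
// an input_audio_buffer.append event with base64-encoded audio.
function buildAppendEvent(float32Samples) {
  const pcm = new Int16Array(float32Samples.length);
  for (let i = 0; i < float32Samples.length; i++) {
    const s = Math.max(-1, Math.min(1, float32Samples[i])); // clamp to [-1, 1]
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;               // scale to int16 range
  }
  // Node sketch; in the browser, base64-encode pcm's bytes instead of using Buffer.
  const audio = Buffer.from(pcm.buffer).toString('base64');
  return { type: 'input_audio_buffer.append', audio };
}
```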
## Next Steps
Full connection details, session config, and event handling.
Configure the key elements of your voice agent.
---
#### WebRTC Quickstart
Source: https://docs.inworld.ai/realtime/quickstart-webrtc
Build a browser-based voice agent that streams audio to the Inworld Realtime API using WebRTC. Audio is handled natively by the browser — no manual PCM encoding or base64 conversion needed.
WebRTC is ideal for browser voice apps with low latency. For server-side integrations, see the [WebSocket Quickstart](/realtime/quickstart-websocket).
## Get Started
Create an [Inworld account](https://platform.inworld.ai/signup).
(See Authentication section above for API key setup.)
Create a `.env` file:
```shell .env
INWORLD_API_KEY=your-base64-api-key-here
```
Create `server.js`. It serves the page and provides a `/api/config` endpoint that fetches ICE servers from the WebRTC proxy and returns the connection details (including the API key) to the page. This keeps the quickstart simple; in production, mint a short-lived token on your backend rather than exposing the API key to the browser.
```javascript server.js
import 'dotenv/config';
import { readFileSync } from 'fs';
import { createServer } from 'http';
const html = readFileSync('index.html');
const API_KEY = process.env.INWORLD_API_KEY || '';
const PROXY = 'https://api.inworld.ai';
const server = createServer(async (req, res) => {
if (req.url === '/api/config') {
let ice = [];
try {
const r = await fetch(`${PROXY}/v1/realtime/ice-servers`, {
headers: { Authorization: `Bearer ${API_KEY}` },
});
if (r.ok) ice = (await r.json()).ice_servers || [];
} catch {}
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ api_key: API_KEY, ice_servers: ice, url: `${PROXY}/v1/realtime/calls` }));
return;
}
res.writeHead(200, { 'Content-Type': 'text/html' });
res.end(html);
});
let port = 3000;
server.on('error', (e) => {
if (e.code === 'EADDRINUSE') { console.warn(`Port ${port} in use, trying ${++port}…`); server.listen(port); }
else throw e;
});
server.listen(port, () => console.log(`Open http://localhost:${port}`));
```
Create `index.html` in the same directory. It connects via WebRTC, streams mic audio automatically, and plays agent audio through an RTP track.
```html index.html
WebRTC Voice Agent
Start Conversation
```
```bash
npm init -y && npm pkg set type=module
npm install dotenv
node server.js
```
Open [http://localhost:3000](http://localhost:3000) and click **Start Conversation**. The agent greets you with audio.
## Option 2: Using OpenAI Agents SDK
If you're building a more advanced voice agent with features like agent handoffs, tool calling, and guardrails, you can use the [OpenAI Agents SDK](https://github.com/openai/openai-agents-js) with Inworld's WebRTC proxy. We provide a ready-to-run playground based on OpenAI's realtime agents demo.
```bash
git clone https://github.com/inworld-ai/experimental-oai-realtime-agents-playground.git
cd experimental-oai-realtime-agents-playground
npm install
```
If you are unable to access this repository, please contact **support@inworld.ai** for access.
Open `.env` and set `OPENAI_API_KEY` to your **Inworld** API key (the same Base64 credentials from [Inworld Portal](https://platform.inworld.ai/)):
```shell .env
OPENAI_API_KEY=your-inworld-base64-api-key-here
```
Despite the variable name `OPENAI_API_KEY`, this must be your **Inworld** API key — not an OpenAI key. The SDK uses this variable name by convention, but the playground routes all traffic through the Inworld WebRTC proxy.
```bash
npm run dev
```
Open [http://localhost:3000](http://localhost:3000). Select a scenario from the **Scenario** dropdown and start talking.
The playground includes two agentic patterns:
- **Chat-Supervisor** — A realtime chat agent handles basic conversation while a more capable text model (e.g. `gpt-4.1`) handles tool calls and complex responses.
- **Sequential Handoff** — Specialized agents transfer the user between them to handle specific intents (e.g. authentication → returns → sales).
For full details on customizing agents, see the playground's README.
---
## How It Works
| Component | Role |
| --- | --- |
| **Browser** | Captures mic audio via WebRTC, plays agent audio from RTP track |
| **Node.js server** | Serves the page and `/api/config` (ICE servers + API key) |
| **WebRTC proxy** | Bridges WebRTC ↔ WebSocket, transcodes OPUS ↔ PCM16 |
| **Inworld Realtime API** | Handles speech-to-text, LLM processing, and text-to-speech |
Key differences from WebSocket:
- Audio flows via **RTP tracks** (no base64 encoding)
- Events flow via **DataChannel** (same JSON schema)
- Browser handles **OPUS codec** natively
## Next Steps
Full connection details, session config, and SDK integration.
VAD configuration, audio formats, and conversation flow.
Migrate from OpenAI Realtime API to Inworld.
---
### Build with Realtime API
#### WebSocket
Source: https://docs.inworld.ai/realtime/connect/websocket
Connect via WebSocket. For browser-native, low-latency voice, see [WebRTC](/realtime/connect/webrtc).
## Endpoint
```
wss://api.inworld.ai/api/v1/realtime/session?key=<session-id>&protocol=realtime
```
| Parameter | Required | Description |
| --- | --- | --- |
| `key` | Yes | Session ID from your app |
| `protocol` | Yes | `realtime` |
## Authentication
| Environment | Header | Notes |
| --- | --- | --- |
| **Server-side (Node.js)** | `Authorization: Basic <api-key>` | The API key from [Inworld Portal](https://platform.inworld.ai/) is already Base64-encoded |
| **Client-side (browser)** | `Authorization: Bearer <jwt>` | Mint a JWT on your backend. See the [JWT sample app](https://github.com/inworld-ai/inworld-nodejs-jwt-sample-app) for a complete example |
## Flow
1. Connect → receive `session.created`
2. Send `session.update` (instructions, audio config, tools)
3. Stream audio (`input_audio_buffer.append`) or text (`conversation.item.create`)
4. `response.create` → handle `response.output_*` until `response.done`
## Session Config
`session.update` accepts partial updates, so you can dynamically change your prompt, voice, model, tools, and more during the conversation.
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
model: 'openai/gpt-4o-mini',
instructions: 'You are a concise concierge.',
output_modalities: ['audio', 'text'],
audio: {
input: {
turn_detection: {
type: 'semantic_vad',
eagerness: 'medium',
create_response: true,
interrupt_response: true
}
},
output: {
voice: 'Clive',
model: 'inworld-tts-1.5-mini',
speed: 1.0
}
},
tools: [{
type: 'function',
name: 'get_weather',
description: 'Fetch weather for a location',
parameters: {
type: 'object',
properties: { location: { type: 'string' } },
required: ['location']
}
}]
}
}));
```
## Audio
Input and output audio should be PCM16, 24 kHz mono, base64 encoded. Recommended chunk size is 100-200ms.
```javascript
ws.send(JSON.stringify({ type: 'input_audio_buffer.append', audio: base64PcmChunk }));
```
Use `input_audio_buffer.clear` to discard unwanted audio.
## Text
The Realtime API can accept text as well as audio. Send it from your client using `conversation.item.create`.
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{
type: 'input_text',
text: 'Can you summarize the notes I sent?'
}]
}
}));
```
## Events
Speech-to-speech conversations are driven by WebSocket events: client events that you send to the API, and server events that you receive and react to.
- **Session:** `session.created`, `session.updated`
- **Conversation:** `conversation.item.added/done/retrieved/deleted/truncated`, transcription deltas/completions
- **Responses:** `response.created`, `response.output_item.added/done`, `response.output_text.delta/done`, `response.output_audio.delta/done`, `response.done`
- **Audio/VAD:** `input_audio_buffer.speech_started`, `input_audio_buffer.speech_stopped`, `response.output_audio_transcript.delta`
- **Errors:** `error`
The full list of events and their schemas is available in the [API reference](/api-reference/realtimeAPI/realtime/realtime-websocket).
## Node.js websocket server example
Server-side Node.js example using the `ws` library with Basic auth.
```javascript
const sessionId = 'your-session-id';
const credentials = process.env.INWORLD_API_KEY;
const ws = new WebSocket(`wss://api.inworld.ai/api/v1/realtime/session?key=${sessionId}&protocol=realtime`, {
headers: {
Authorization: `Basic ${credentials}`
}
});
ws.on('open', () => {
console.log('WebSocket connected');
});
ws.on('message', (buffer) => {
const message = JSON.parse(buffer.toString());
switch (message.type) {
case 'session.created':
console.log('Session created:', message.session.id);
updateSession();
break;
case 'session.updated':
console.log('Session updated');
sendMessage('Hello!');
break;
case 'conversation.item.added':
console.log('Conversation item added:', message.item.id);
break;
case 'conversation.item.done':
console.log('Conversation item done');
createResponse();
break;
case 'input_audio_buffer.speech_started':
console.log('Speech started at', message.audio_start_ms, 'ms');
break;
case 'input_audio_buffer.speech_stopped':
console.log('Speech stopped at', message.audio_end_ms, 'ms');
break;
case 'conversation.item.input_audio_transcription.delta':
console.log('Transcription delta:', message.delta);
break;
case 'conversation.item.input_audio_transcription.completed':
console.log('Transcription complete:', message.transcript);
break;
case 'response.created':
console.log('Response created:', message.response.id);
break;
case 'response.output_item.added':
console.log('Output item added:', message.item.id);
break;
case 'response.output_text.delta':
console.log('Text delta:', message.delta);
break;
case 'response.output_audio.delta':
// Decode and play audio chunk
const audioBuffer = Buffer.from(message.delta, 'base64');
playAudio(audioBuffer);
break;
case 'response.output_audio_transcript.delta':
console.log('Audio transcript delta:', message.delta);
break;
case 'response.done':
console.log('Response complete, status:', message.response.status);
break;
case 'error':
console.error('Error:', message.error.message, message.error.code);
break;
}
});
function updateSession() {
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
output_modalities: ['text', 'audio'],
instructions: 'You are a helpful AI assistant.',
audio: {
input: {
turn_detection: {
type: 'semantic_vad',
eagerness: 'medium',
create_response: true,
interrupt_response: true
}
},
output: {
voice: 'Clive'
}
}
}
}));
}
function sendMessage(text) {
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{ type: 'input_text', text }]
}
}));
}
function createResponse() {
ws.send(JSON.stringify({
type: 'response.create',
response: {
output_modalities: ['text', 'audio']
}
}));
}
function cancelResponse() {
ws.send(JSON.stringify({ type: 'response.cancel' }));
}
function sendAudioChunk(audioChunk) {
ws.send(JSON.stringify({
type: 'input_audio_buffer.append',
audio: audioChunk // base64-encoded audio data
}));
}
function clearAudioBuffer() {
ws.send(JSON.stringify({ type: 'input_audio_buffer.clear' }));
}
```
[API reference](/api-reference/realtimeAPI/realtime/realtime-websocket) for full schemas.
---
#### WebRTC
Source: https://docs.inworld.ai/realtime/connect/webrtc
Connect via WebRTC for browser-native, low-latency voice. A WebRTC proxy bridges your peer connection to the same realtime service used by the [WebSocket](/realtime/connect/websocket) transport, transcoding OPUS ↔ PCM16 and forwarding events transparently.
## Endpoint
```
https://api.inworld.ai
```
| Endpoint | Method | Description |
| --- | --- | --- |
| `/v1/realtime/calls` | POST | SDP offer/answer exchange |
| `/v1/realtime/ice-servers` | GET | STUN/TURN server configuration |
## Authentication
Pass your Inworld API key as a Bearer token. The proxy forwards it to the realtime service.
```
Authorization: Bearer <api-key>
```
Keep the API key server-side. Serve it to the browser via a backend endpoint (see examples below).
## Flow
1. Fetch config from your server (API key + ICE servers)
2. Create `RTCPeerConnection` with ICE servers
3. Create data channel `oai-events` + add microphone track
4. Create SDP offer → POST to `/v1/realtime/calls` → set SDP answer
5. Data channel opens → send `session.update` → start conversation
Audio flows via RTP tracks (no manual encode/decode). Events flow via data channel using the same JSON schema as [WebSocket](/realtime/connect/websocket).
## Session Config
Same `session.update` as WebSocket, sent through the data channel. See [model, voice, and TTS configuration](/realtime/usage/using-realtime-models#choose-an-llm) for details.
```javascript
dc.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
model: 'openai/gpt-4o-mini',
instructions: 'You are a concise concierge.',
output_modalities: ['audio', 'text'],
audio: {
input: {
turn_detection: {
type: 'semantic_vad',
eagerness: 'medium',
create_response: true,
interrupt_response: true
}
},
output: {
voice: 'Clive',
model: 'inworld-tts-1.5-mini',
speed: 1.0
}
}
}
}));
```
## Audio
Unlike WebSocket (manual base64 PCM), WebRTC handles audio natively:
- **Input**: browser captures mic and sends OPUS over RTP automatically
- **Output**: proxy sends AI audio back as an RTP track — attach it to an `<audio>` element to play
```javascript
pc.ontrack = (e) => {
const audio = document.createElement('audio');
audio.autoplay = true;
audio.srcObject = new MediaStream([e.track]);
document.body.appendChild(audio);
};
```
`response.output_audio.delta` events are **not** sent through the data channel — audio is delivered via the RTP track instead.
## Text & Responses
Same as WebSocket, but sent through the data channel:
```javascript
dc.send(JSON.stringify({
type: 'conversation.item.create',
item: { type: 'message', role: 'user', content: [{ type: 'input_text', text: 'Hello!' }] }
}));
dc.send(JSON.stringify({ type: 'response.create' }));
```
## Events
Same event types as [WebSocket](/realtime/connect/websocket#events), received on the data channel.
## Option 1: Direct WebRTC
Server — serves the page and a `/api/config` endpoint that fetches ICE servers and supplies the API key at runtime, keeping it out of your client code:
```javascript
const html = readFileSync('index.html');
const API_KEY = process.env.INWORLD_API_KEY || '';
const PROXY = 'https://api.inworld.ai';
const server = createServer(async (req, res) => {
if (req.url === '/api/config') {
let ice = [];
try {
const r = await fetch(`${PROXY}/v1/realtime/ice-servers`, {
headers: { Authorization: `Bearer ${API_KEY}` },
});
if (r.ok) ice = (await r.json()).ice_servers || [];
} catch {}
res.writeHead(200, { 'Content-Type': 'application/json' });
res.end(JSON.stringify({ api_key: API_KEY, ice_servers: ice, url: `${PROXY}/v1/realtime/calls` }));
return;
}
res.writeHead(200, { 'Content-Type': 'text/html' });
res.end(html);
});
let port = 3000;
server.on('error', (e) => {
if (e.code === 'EADDRINUSE') { console.warn(`Port ${port} in use, trying ${++port}…`); server.listen(port); }
else throw e;
});
server.listen(port, () => console.log(`http://localhost:${port}`));
```
Client — full WebRTC flow in the browser:
```javascript
const cfg = await (await fetch('/api/config')).json();
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const pc = new RTCPeerConnection({ iceServers: cfg.ice_servers });
const dc = pc.createDataChannel('oai-events', { ordered: true });
stream.getAudioTracks().forEach(t => pc.addTrack(t, stream));
pc.ontrack = (e) => {
const audio = document.createElement('audio');
audio.autoplay = true;
audio.srcObject = new MediaStream([e.track]);
document.body.appendChild(audio);
};
dc.onopen = () => {
dc.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
model: 'openai/gpt-4o-mini',
instructions: 'You are a helpful voice assistant.',
output_modalities: ['audio', 'text'],
audio: {
input: { turn_detection: { type: 'semantic_vad', eagerness: 'medium', create_response: true, interrupt_response: true } },
output: { voice: 'Clive', model: 'inworld-tts-1.5-mini' }
}
}
}));
};
dc.onmessage = (e) => {
const msg = JSON.parse(e.data);
if (msg.type === 'response.output_text.delta') console.log(msg.delta);
if (msg.type === 'error') console.error(msg.error?.message);
};
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);
// wait for ICE gathering to complete so the offer includes all candidates
await new Promise((resolve) => {
  if (pc.iceGatheringState === 'complete') return resolve();
  pc.addEventListener('icegatheringstatechange', () => {
    if (pc.iceGatheringState === 'complete') resolve();
  });
});
const res = await fetch(cfg.url, {
method: 'POST',
headers: { 'Content-Type': 'application/sdp', Authorization: `Bearer ${cfg.api_key}` },
body: pc.localDescription.sdp,
});
await pc.setRemoteDescription({ type: 'answer', sdp: await res.text() });
```
## Option 2: OpenAI Agents SDK
The [OpenAI Agents SDK](https://github.com/openai/openai-agents-js) manages the full WebRTC lifecycle — peer connection, SDP exchange, mic, and audio playback:
```javascript
const agent = new RealtimeAgent({
name: 'assistant',
instructions: 'You are a helpful voice assistant.',
model: 'openai/gpt-4o-mini',
});
const cfg = await (await fetch('/api/config')).json();
const audioEl = document.createElement('audio');
audioEl.autoplay = true;
const session = new RealtimeSession(agent, {
transport: new OpenAIRealtimeWebRTC({
useInsecureApiKey: true,
audioElement: audioEl,
changePeerConnection: async (pc) => {
if (cfg.ice_servers?.length) pc.setConfiguration({ iceServers: cfg.ice_servers });
return pc;
},
}),
model: 'gpt-4o-realtime-preview-2025-06-03',
});
await session.connect({ url: cfg.url, apiKey: cfg.api_key });
session.sendMessage('Hello!');
```
The server-side `/api/config` endpoint is identical to Option 1.
## WebSocket vs WebRTC
| | WebSocket | WebRTC |
| --- | --- | --- |
| **Audio** | PCM16 base64 (manual) | OPUS via RTP (native) |
| **Latency** | Higher | Lower (UDP) |
| **NAT** | Not needed | ICE (STUN/TURN) |
| **Events** | WS messages | DataChannel (same schema) |
| **Best for** | Server-side / Node.js | Browser voice apps |
[API reference](/api-reference/realtimeAPI/realtime/realtime-websocket) for full event schemas.
---
### Guides
#### Configuring Models
Source: https://docs.inworld.ai/realtime/usage/using-realtime-models
The Inworld Realtime API uses an OpenAI Realtime API-compatible event system to facilitate voice experiences. This guide summarizes the key building blocks; full event schemas are in the [API reference](/api-reference/realtimeAPI/realtime/realtime-websocket).
## Configure a Session
For WebSocket, the connection starts with a `session.created` event. For WebRTC, send `session.update` as soon as the data channel opens. In both cases, use `session.update` to configure your session. Here you can set:
- `model` — LLM provider and model (e.g. `openai/gpt-4.1-nano`) or router (e.g. `inworld/latency-optimizer-ab-test`)
- `instructions`
- `output_modalities` (`["audio", "text"]`, `["audio"]`, or `["text"]`)
- Audio input and output configuration — voice, TTS model, PCM format, speed
- `max_output_tokens` (`"inf"` or a numeric ceiling)
- `tools` (function definitions) and `tool_choice` settings
Partial updates are supported, so you can adjust the LLM, voice, TTS model, temperature, or tool lists mid-session without rebuilding the socket.
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
model: 'openai/gpt-4o-mini',
instructions: 'You are a friendly narrator.',
output_modalities: ['audio', 'text'],
temperature: 0.8,
audio: {
input: {
turn_detection: {
type: 'semantic_vad',
create_response: true,
interrupt_response: true
}
},
output: {
voice: 'Clive',
model: 'inworld-tts-1.5-mini',
speed: 1.0
}
}
}
}));
```
## Choose a Router or LLM
Set `model` in `session.update` to select which [Router](/router/introduction) or LLM handles the conversation. The format is `provider/modelName` or `inworld/routerId`:
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
model: 'openai/gpt-4o-mini'
}
}));
```
If you omit `model`, the default model (`google-ai-studio/gemini-2.5-flash`) is used. You can change the model mid-session with a partial update — the new model takes effect on the next response.
## Choose a Voice
Set `audio.output.voice` to control the agent's speaking voice:
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
audio: { output: { voice: 'Olivia' } }
}
}));
```
The default voice is `Dennis`. Browse available voices in the [TTS Playground](/tts/tts-playground) or list them programmatically with the [List Voices API](/api-reference/voiceAPI/voiceservice/list-voices).
## Choose a TTS Model
Set `audio.output.model` to select the text-to-speech model:
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
audio: { output: { model: 'inworld-tts-1.5-max' } }
}
}));
```
| Model | Size | Notes |
| --- | --- | --- |
| `inworld-tts-1.5-mini` | 1B | Faster inference, lower latency (default) |
| `inworld-tts-1.5-max` | 8B | Higher quality audio |
The default is `inworld-tts-1.5-mini`. You can change the TTS model mid-session alongside voice or independently.
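For instance, a single partial update can switch voice and TTS model at once; fields you omit keep their current values:

```javascript
// Switch voice and TTS model together mid-session; other settings are untouched.
const outputUpdate = {
  type: 'session.update',
  session: {
    audio: { output: { voice: 'Olivia', model: 'inworld-tts-1.5-max' } }
  }
};
// ws.send(JSON.stringify(outputUpdate));
```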
## Send Input
### Audio
There are two ways to send audio input:
**Method 1: Streaming Audio (Real-time)**
Use `input_audio_buffer.*` events for streaming real-time audio from a microphone:
1. Convert microphone data to PCM16, 24 kHz, mono.
2. Send chunks via `input_audio_buffer.append`.
3. VAD automatically detects speech boundaries and commits the buffer.
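Step 1 can be sketched in Node.js as below. The helper name `float32ToPcm16Base64`, and the assumption that your capture pipeline hands you Float32 samples (already at 24 kHz mono), are illustrative:

```javascript
// Minimal sketch: convert Float32 samples in [-1, 1] (24 kHz mono) into
// base64-encoded little-endian PCM16, ready for input_audio_buffer.append.
function float32ToPcm16Base64(float32) {
  const pcm = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i])); // clamp to avoid overflow
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return Buffer.from(pcm.buffer, pcm.byteOffset, pcm.byteLength).toString('base64');
}
// ws.send(JSON.stringify({ type: 'input_audio_buffer.append', audio: float32ToPcm16Base64(chunk) }));
```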
**Method 2: Pre-recorded Audio**
Use `conversation.item.create` with `input_audio` content type for pre-recorded audio chunks:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{
type: 'input_audio',
audio: base64AudioData // Base64-encoded PCM16 or OPUS
}]
}
}));
```
### Text
Create explicit conversation items for text turns:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{
type: 'input_text',
text: 'Give me a two-sentence summary.'
}]
}
}));
```
## Function Calling
The Realtime API supports function calling so your agent can fetch live data or trigger actions mid-conversation. Define functions in `session.tools`, then handle calls as they arrive.
### 1. Register a tool
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
tools: [{
type: 'function',
name: 'get_horoscope',
description: 'Get the horoscope for a zodiac sign',
parameters: {
type: 'object',
properties: {
sign: {
type: 'string',
description: 'Zodiac sign, e.g. Aries, Taurus'
}
},
required: ['sign']
}
}],
tool_choice: 'auto'
}
}));
```
### 2. Handle the function call
When the model decides to call a function, you receive a `response.function_call_arguments.done` event with the `call_id`, function `name`, and serialized `arguments`. Execute your logic, then return the result:
```javascript
ws.on('message', (buffer) => {
const event = JSON.parse(buffer.toString());
if (event.type === 'response.function_call_arguments.done') {
const { call_id, name, arguments: argsJson } = event;
const args = JSON.parse(argsJson);
// Run your business logic
let result;
if (name === 'get_horoscope') {
result = fetchHoroscope(args.sign);
}
// Send the function result back
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'function_call_output',
call_id,
output: JSON.stringify(result)
}
}));
// Tell the model to continue with the result
ws.send(JSON.stringify({ type: 'response.create' }));
}
});
```
### 3. What happens next
After `response.create`, the model incorporates the function output and continues the conversation — speaking the horoscope aloud (if `output_modalities` includes `audio`) or streaming text deltas. The user hears the answer without any gap in the conversation flow.
You can register multiple tools and the model will call them as needed. Each call arrives as a separate `response.function_call_arguments.done` event with its own `call_id`.
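With several tools registered, a small dispatch table keeps the handler tidy. This is a sketch: the tool implementations are hypothetical stand-ins for your own business logic, and `send` abstracts over `ws.send(JSON.stringify(...))`:

```javascript
// Map tool names to handlers; each receives the parsed arguments object.
const toolHandlers = {
  get_horoscope: ({ sign }) => ({ horoscope: `${sign}: a good day to refactor.` }),
  get_weather: ({ location }) => ({ location, forecast: 'sunny' }),
};

// Call this for each response.function_call_arguments.done event.
function handleFunctionCall(event, send) {
  const handler = toolHandlers[event.name];
  const output = handler
    ? handler(JSON.parse(event.arguments))
    : { error: `unknown tool: ${event.name}` };
  send({
    type: 'conversation.item.create',
    item: { type: 'function_call_output', call_id: event.call_id, output: JSON.stringify(output) }
  });
  send({ type: 'response.create' }); // let the model continue with the result
}
```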
## Manage Conversation State
Use conversation events to keep context lean:
- `conversation.item.retrieve`: pull any prior item by ID.
- `conversation.item.delete`: remove items that should not remain in context.
Pair these with `max_output_tokens` and `response.cancel` to control overall cost ([conversation management guide](/realtime/usage/managing-conversations)).
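For example, capping response length is a one-line partial update (the value 250 is illustrative):

```javascript
// Cap each response's length to control cost; use "inf" to remove the cap.
const capUpdate = {
  type: 'session.update',
  session: { type: 'realtime', max_output_tokens: 250 }
};
// ws.send(JSON.stringify(capUpdate));
```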
## Monitor Errors
Handle `error` events (with `type`, `code`, and `param`) and implement a reconnection/backoff strategy for transient failures. See the [API reference](/api-reference/realtimeAPI/realtime/realtime-websocket) for error event schemas.
```javascript
ws.on('message', (buffer) => {
const event = JSON.parse(buffer.toString());
if (event.type === 'error') {
handleError(event.error);
}
});
```
---
#### Managing Conversations
Source: https://docs.inworld.ai/realtime/usage/managing-conversations
## Conversation Items
Conversation items represent messages and interactions in your conversation. Each item has:
- **ID**: Unique identifier
- **Type**: `message`, `function_call`, `function_call_output`
- **Role**: `user`, `assistant`, or `tool`
- **Content**: The actual content of the item (array of content parts)
### Content Types
Conversation items support different content types depending on direction:
**Input Content Types** (for user messages):
- `input_text` - Plain text input from the user
- `input_audio` - Base64-encoded audio input from the user
**Output Content Types** (for assistant responses):
- `text` - Text output from the assistant
- `audio` - Audio output from the assistant
You can mix multiple content parts in a single conversation item. For example, you can combine text and audio in the same message.
## Creating Conversation Items
### Text Messages
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [
{
type: 'input_text',
text: 'Hello, how are you?'
}
]
}
}));
```
### Audio Messages
There are two ways to send audio input:
**Method 1: Streaming Audio (Real-time)**
Use `input_audio_buffer.append` for streaming real-time audio from a microphone:
```javascript
// Stream audio chunks in real-time
ws.send(JSON.stringify({
type: 'input_audio_buffer.append',
audio: base64AudioData
}));
// VAD automatically detects speech boundaries and commits the buffer
```
**Method 2: Pre-recorded Audio Chunks**
Use `conversation.item.create` with `input_audio` for pre-recorded audio chunks:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{
type: 'input_audio',
audio: base64AudioData // Base64-encoded PCM16 or OPUS audio
}]
}
}));
```
**When to use each method:**
- **Streaming (`input_audio_buffer.append`)**: Use for real-time microphone input, voice conversations, live audio streaming
- **Pre-recorded (`conversation.item.create` with `input_audio`)**: Use for pre-recorded audio files, batch processing, or when you have complete audio chunks ready
### Mixed Content
You can combine multiple content types in a single conversation item:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [
{
type: 'input_text',
text: 'Here is some context about the audio:'
},
{
type: 'input_audio',
audio: base64AudioData
},
{
type: 'input_text',
text: 'And here is additional context.'
}
]
}
}));
```
## Receiving Conversation Items
When items are added to the conversation, you'll receive events:
```javascript
ws.on('message', (data) => {
const event = JSON.parse(data);
if (event.type === 'conversation.item.added') {
console.log('Item added:', event.item.id);
console.log('Content:', event.item.content);
}
if (event.type === 'conversation.item.done') {
console.log('Item processing complete:', event.item.id);
}
});
```
## Retrieving Conversation Items
Retrieve specific conversation items:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.retrieve',
item_id: 'item-id-here'
}));
```
The server will respond with:
```javascript
{
type: 'conversation.item.retrieved',
item: {
id: 'item-id-here',
type: 'message',
role: 'user',
content: [...]
}
}
```
## Deleting Conversation Items
Remove items from the conversation:
```javascript
ws.send(JSON.stringify({
type: 'conversation.item.delete',
item_id: 'item-id-here'
}));
```
You'll receive a confirmation:
```javascript
{
type: 'conversation.item.deleted',
item_id: 'item-id-here'
}
```
## Function Calling
The Realtime API supports function calling, allowing the assistant to invoke tools you define. Configure functions in `session.update` and handle function call events.
### Defining Functions
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
tools: [{
type: 'function',
name: 'get_weather',
description: 'Get the weather for a location',
parameters: {
type: 'object',
properties: {
location: {
type: 'string',
description: 'The city and state, e.g. San Francisco, CA'
}
},
required: ['location']
}
}],
tool_choice: 'auto'
}
}));
```
### Handling Function Calls
```javascript
ws.on('message', (data) => {
const event = JSON.parse(data);
if (event.type === 'response.function_call_arguments.done') {
const result = executeFunction(event.name, JSON.parse(event.arguments));
ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'function_call_output',
call_id: event.call_id,
output: JSON.stringify(result)
}
}));
ws.send(JSON.stringify({
type: 'response.create'
}));
}
});
```
## Voice Activity Detection
Voice Activity Detection (VAD) automatically detects when speech starts and stops, enabling natural turn-taking in conversations. Configure VAD through `session.update`.
### Configuring VAD
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
audio: {
input: {
turn_detection: {
type: 'semantic_vad',
eagerness: 'medium',
create_response: true,
interrupt_response: true
}
}
}
}
}));
```
### VAD Types
- **`semantic_vad`**: Uses conversational awareness to detect natural speech boundaries. Adjust `eagerness` (`low`, `medium`, `high`) to control responsiveness.
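Since `session.update` accepts partial updates, you can also adjust `eagerness` mid-conversation, for example when your UI enters a rapid-fire Q&A mode (the scenario is illustrative):

```javascript
// Raise VAD eagerness so the agent responds on shorter pauses.
const vadUpdate = {
  type: 'session.update',
  session: {
    type: 'realtime',
    audio: { input: { turn_detection: { type: 'semantic_vad', eagerness: 'high' } } }
  }
};
// ws.send(JSON.stringify(vadUpdate));
```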
### VAD Events
```javascript
ws.on('message', (data) => {
const event = JSON.parse(data);
if (event.type === 'input_audio_buffer.speech_started') {
console.log('Speech detected');
// Update UI to show user is speaking
}
if (event.type === 'input_audio_buffer.speech_stopped') {
console.log('Speech ended');
// Update UI, prepare for response
}
});
```
## Error Handling
The Realtime API emits `error` events for various failure scenarios. Handle these events to provide robust error recovery and user feedback.
### Error Event Structure
```javascript
ws.on('message', (data) => {
const event = JSON.parse(data);
if (event.type === 'error') {
const error = event.error;
switch (error.type) {
case 'invalid_request_error':
console.error('Invalid request:', error.message);
if (error.param) {
console.error('Parameter:', error.param);
}
break;
case 'server_error':
console.error('Server error:', error.message);
// Implement retry logic
break;
case 'rate_limit_error':
console.error('Rate limit exceeded');
// Pause requests, implement backoff
break;
}
}
});
```
### Error Types
- **`invalid_request_error`**: Invalid parameters or malformed requests. Check `error.param` for the specific field.
- **`server_error`**: Transient server-side failures. Implement retry logic with exponential backoff.
- **`rate_limit_error`**: Rate limit exceeded. Throttle requests and retry with exponential backoff.
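A minimal reconnect helper for `server_error` and `rate_limit_error` might look like this; the retry cap, base delay, and jitter are illustrative choices, not values prescribed by the API:

```javascript
// Retry an async connect/request function with exponential backoff and jitter.
async function withBackoff(fn, { maxRetries = 5, baseMs = 1000, capMs = 30000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // 1s, 2s, 4s, ... capped at capMs, plus jitter to avoid thundering herds
      const delay = Math.min(baseMs * 2 ** attempt, capMs) + Math.random() * baseMs;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

Wrap your WebSocket reconnect (or any REST call) in `withBackoff`, and reset any per-session state on each attempt.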
## Interruption Handling
Interrupt active responses when new user input arrives.
### Interrupting Responses
Cancel an in-progress response when the user starts speaking again:
```javascript
ws.on('message', (data) => {
const event = JSON.parse(data);
if (event.type === 'input_audio_buffer.speech_started') {
// User started speaking, cancel current response
ws.send(JSON.stringify({
type: 'response.cancel'
}));
}
});
```
When `interrupt_response: true` is set in VAD configuration, the server automatically cancels responses when new speech is detected.
## Managing Context
### Session Instructions
Update session instructions to guide the conversation:
```javascript
ws.send(JSON.stringify({
type: 'session.update',
session: {
type: 'realtime',
instructions: 'You are a helpful assistant. Be concise and friendly.'
}
}));
```
### Conversation History
The API automatically maintains conversation history. You can:
1. **Keep full history**: Let the conversation grow naturally
2. **Selective deletion**: Remove specific items that aren't needed
3. **Session resets**: Start a new session when you need a clean context window
## Example: Conversation Manager
Here's a complete example of managing conversations:
```javascript
class ConversationManager {
constructor(ws) {
this.ws = ws;
this.items = new Map();
this.setupListeners();
}
setupListeners() {
this.ws.on('message', (data) => {
const event = JSON.parse(data);
switch (event.type) {
case 'conversation.item.added':
this.items.set(event.item.id, event.item);
break;
case 'conversation.item.deleted':
this.items.delete(event.item_id);
break;
}
});
}
sendMessage(text) {
this.ws.send(JSON.stringify({
type: 'conversation.item.create',
item: {
type: 'message',
role: 'user',
content: [{
type: 'input_text',
text: text
}]
}
}));
}
deleteItem(itemId) {
this.ws.send(JSON.stringify({
type: 'conversation.item.delete',
item_id: itemId
}));
}
getConversationHistory() {
return Array.from(this.items.values());
}
}
```
## Best Practices
1. **Monitor Context Length**: Keep track of conversation length to avoid exceeding limits
2. **Strategic Deletion**: Remove old context that's no longer relevant
3. **Item Tracking**: Maintain a local map of conversation items for quick access
4. **Error Handling**: Handle cases where items might not exist when deleting/retrieving
5. **Context Management**: Use session instructions to guide conversation behavior
## Use Cases
- **Long Conversations**: Delete old context to maintain performance
- **Error Recovery**: Delete incorrect items and resend
- **Context Switching**: Clear conversation context when changing topics
- **Memory Management**: Remove items that are no longer needed
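The "Long Conversations" case above can be handled with a small pruning helper. The item-tracking `Map` (oldest-first insertion order, as in the `ConversationManager` example) and the `maxItems` budget are illustrative; tune the budget to your model's context window:

```javascript
// Delete the oldest tracked items once the conversation exceeds a budget.
// `items` is a Map of item_id -> item in insertion (oldest-first) order.
function pruneHistory(ws, items, maxItems = 50) {
  const excess = Math.max(0, items.size - maxItems);
  const ids = Array.from(items.keys()).slice(0, excess);
  for (const id of ids) {
    ws.send(JSON.stringify({ type: 'conversation.item.delete', item_id: id }));
    items.delete(id); // the server also confirms with conversation.item.deleted
  }
  return ids;
}
```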
---
#### OpenAI migration
Source: https://docs.inworld.ai/realtime/openai-migration
If you're already using the [OpenAI Realtime API](https://developers.openai.com/api/docs/guides/realtime), you can switch to Inworld with minimal code changes. The event schema, session structure, and client/server events are compatible.
## What stays the same
- Event types: `session.update`, `conversation.item.create`, `response.create`, streaming deltas, etc.
- Session config: `instructions`, `audio.input`, `audio.output`, `tools`, etc.
- Client and server event shapes
- Streaming audio in and out, with interruption handling enabled out of the box
## What changes
### Router
With the Inworld Realtime API you can leverage your [routers](/router/introduction) to have a single voice agent dynamically handle many different user cohorts.
### Custom Voices
Cloned or designed voices are usable by your realtime agents and are configurable with a `session.update` event.
### Endpoints
| Transport | OpenAI | Inworld |
| --- | --- | --- |
| WebSocket | `wss://api.openai.com/v1/realtime?model=...` | `wss://api.inworld.ai/api/v1/realtime/session?key=<session-id>&protocol=realtime` |
| WebRTC SDP | `https://api.openai.com/v1/realtime/calls` | `https://api.inworld.ai/v1/realtime/calls` |
### Session setup
You need a session ID from your Inworld app before connecting. Pass it as the `key` query parameter for WebSocket.
## Full reference
For event schemas and exhaustive details, see the [WebSocket API reference](/api-reference/realtimeAPI/realtime/realtime-websocket).
---
### Resources
#### Billing
Source: https://docs.inworld.ai/realtime/resources/billing
---
#### Usage
Source: https://docs.inworld.ai/realtime/resources/usage
---
#### Support
Source: https://docs.inworld.ai/realtime/resources/support
---
## LLM Router
### Getting Started
#### Intro to Inworld Router
Source: https://docs.inworld.ai/router/introduction
Inworld Router is an intelligent routing layer that helps you select the right model and configuration for your use case, so you can maximize the performance and user metrics you care about, from cost and latency to user retention and revenue.
In addition to providing a unified API to access hundreds of LLMs through a single endpoint while automatically handling fallbacks, Inworld Router enables you to easily run A/B experiments, route different user segments to different models, and measure the impact on your KPIs. This means you can actually optimize for your specific application and your specific users.
Learn how to make your first API call in minutes with a guided tutorial.
Understand the core concepts behind Inworld Router
Inworld Router is currently in [research preview](/tts/resources/support#what-do-experimental-preview-and-stable-mean). Please share any feedback with us via the feedback form in [Portal](https://platform.inworld.ai) or in [Discord](https://discord.gg/inworld).
## Key Benefits
- **Unified API**: Access models from OpenAI, Anthropic, Google, and more through a single API
- **High reliability**: Automatically fall back to other providers if one fails
- **Dynamic selection**: Optimize the model or provider in real-time based on price, speed, or intelligence
- **Cost optimization**: Automatically choose the most cost-effective provider or model for each request to help you stay within budget
- **Live experimentation**: Easily run experiments on different models and prompts to see what works best for your users
- **Insightful analytics**: Seamlessly integrate with your metrics to understand how different models impact your KPIs
---
#### Inworld Router Quickstart
Source: https://docs.inworld.ai/router/quickstart
This quickstart will walk you through creating your first LLM Router (via [Portal](#create-your-first-router-on-portal) or [API](#create-your-first-router-via-api)) and making your first request.
While this guide uses the OpenAI Chat Completions compatible `/v1/chat/completions` endpoint, you can also integrate with our [Anthropic Messages API compatible](/router/anthropic-compatibility) `/v1/messages` endpoint, [OpenAI SDK](/router/openai-compatibility#openai-sdk), and Anthropic SDK.
## Create your first Router on Portal
In [Inworld Portal](https://platform.inworld.ai/), go to **Routers** and select "Compare frontier models".
Click **Save** to create this router, which splits your traffic between Anthropic's Opus 4.6, Google's Gemini 3.1 Pro, and OpenAI's GPT-5.2.
Go to [**API Keys**](https://platform.inworld.ai/api-keys) and click **Generate new key**. Copy the Base64 credentials.
Now run the following code to make your first request to your router.
```bash curl
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "inworld/compare-frontier-models",
"messages": [
{"role": "user", "content": "Who created you?"}
]
}'
```
If you run the request multiple times, you should see traffic split across Anthropic's Opus 4.6, Google's Gemini 3.1 Pro, and OpenAI's GPT-5.2.
## Create your first Router via API
In [Inworld Portal](https://platform.inworld.ai/), go to [**API Keys**](https://platform.inworld.ai/api-keys) and click **Generate new key**. Enable **Write** permissions for the Router API, and click **Generate**.
Copy the Base64 credentials.
Now let's create a router that splits your traffic between Anthropic's Opus 4.6, Google's Gemini 3.1 Pro, and OpenAI's GPT-5.2.
```shell curl
curl --location 'https://api.inworld.ai/router/v1/routers' \
--header 'Content-Type: application/json' \
--header 'Authorization: Basic ' \
--data '{
"name": "compare-frontier-models",
"default_route": {
"route_id": "default",
"variants": [
{
"variant": {
"variant_id": "anthropic",
"model_id": "anthropic/claude-opus-4-6"
},
"weight": 33
},
{
"variant": {
"variant_id": "google",
"model_id": "google-ai-studio/gemini-3.1-pro-preview"
},
"weight": 33
},
{
"variant": {
"variant_id": "openai",
"model_id": "openai/gpt-5.2"
},
"weight": 34
}
]
}
}'
```
```python Python
import requests

url = "https://api.inworld.ai/router/v1/routers"
headers = {
"Content-Type": "application/json",
"Authorization": "Basic "
}
data = {
"name": "compare-frontier-models",
"default_route": {
"route_id": "default",
"variants": [
{
"variant": {
"variant_id": "anthropic",
"model_id": "anthropic/claude-opus-4-6"
},
"weight": 33
},
{
"variant": {
"variant_id": "google",
"model_id": "google-ai-studio/gemini-3.1-pro-preview"
},
"weight": 33
},
{
"variant": {
"variant_id": "openai",
"model_id": "openai/gpt-5.2"
},
"weight": 34
}
]
}
}
response = requests.post(url, headers=headers, json=data)
print(response.json())
```
```javascript JavaScript
const response = await fetch("https://api.inworld.ai/router/v1/routers", {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": "Basic "
},
body: JSON.stringify({
name: "compare-frontier-models",
default_route: {
route_id: "default",
variants: [
{
variant: {
variant_id: "anthropic",
model_id: "anthropic/claude-opus-4-6"
},
weight: 33
},
{
variant: {
variant_id: "google",
model_id: "google-ai-studio/gemini-3.1-pro-preview"
},
weight: 33
},
{
variant: {
variant_id: "openai",
model_id: "openai/gpt-5.2"
},
weight: 34
}
]
}
})
});
const data = await response.json();
console.log(data);
```
Now run the following code to make your first request to your router.
```bash curl
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "inworld/compare-frontier-models",
"messages": [
{"role": "user", "content": "Who created you?"}
]
}'
```
If you run the request multiple times, you should see traffic split across Anthropic's Opus 4.6, Google's Gemini 3.1 Pro, and OpenAI's GPT-5.2.
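To observe the split programmatically, you can fire a batch of identical requests and tally the `model` field of each response. This is a minimal sketch using only the Python standard library; it assumes your Base64 credentials are in `api_key` and that the response's `model` field identifies the backend model that served each request.

```python
import json
import urllib.request
from collections import Counter

API_URL = "https://api.inworld.ai/v1/chat/completions"

def tally_models(responses):
    """Count how often each backend model served a response."""
    return Counter(r.get("model", "unknown") for r in responses)

def sample_router(api_key, n=20):
    """Fire n identical requests at the router and tally which model answered."""
    payload = json.dumps({
        "model": "inworld/compare-frontier-models",
        "messages": [{"role": "user", "content": "Who created you?"}],
    }).encode()
    responses = []
    for _ in range(n):
        req = urllib.request.Request(
            API_URL,
            data=payload,
            headers={
                "Authorization": f"Basic {api_key}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            responses.append(json.load(resp))
    return tally_models(responses)
```

Over enough samples, `sample_router(api_key)` should show counts roughly matching the 33/33/34 weights, though small samples will vary.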
## Next Steps
Now that you've made your first request, you can explore more of Inworld Router's capabilities.
Understand core router concepts, including fallbacks, conditional routing, and more
Learn more about how to use conditional routing
View request and response schemas for Chat Completions and Router APIs
---
### Core Concepts
#### Core Concepts
Source: https://docs.inworld.ai/router/core-concepts/overview
A **router** is a reusable configuration that can be used with our [Chat Completions](/api-reference/routerAPI/chat-completions) API to define how requests are routed to models. Routers let you set up fallbacks, conditional routing, A/B test across models, attach prompt templates, and configure generation parameters — all without changing your application code.
You can create a router via Portal or [API](/api-reference/routerAPI/routerservice/create-router).
## How it all fits together
A router is made up of [**routes**](#routes) and [**variants**](#variants). Here's the full flow when a request hits a router:
1. **Route evaluation** — Conditional routes are checked in order. The first match is selected, and if none match, the default route is used.
2. **Variant selection** — Within the matched route, a variant is chosen based on weights.
3. **Model called** — The request is sent to a model based on the [variant configuration](#variant-configuration). If the variant uses `auto`, the best model is dynamically selected based on the provided criteria.
If a `user` is specified in the [Chat Completions](/api-reference/routerAPI/chat-completions) request, that user will consistently receive the same variant across requests (sticky routing).
## Routes
A **route** is a specific path within a router. There are two types of routes you can configure:
1. **Default route** - This is the default route that will be used if no conditional routes exist or match. If no default route is configured and no conditions match, the API returns an error.
```json
{
"defaultRoute": {
"route_id": "default",
"variants": [...]
}
}
```
2. **Conditional route** - Conditional routes let you route requests based on runtime context (e.g., user tier). This can be useful if you want to segment users.
Each conditional route includes a [CEL expression](https://github.com/google/cel-spec/blob/master/doc/langdef.md) that is evaluated against the request metadata (passed via `extra_body.metadata` in the Chat Completions request). Routes are evaluated **in order**, and the **first route** whose condition evaluates to `true` is selected.
```json
{
"route": {
"route_id": "premium",
"variants": [...]
},
"condition": {
"cel_expression": "tier == \"premium\""
}
}
```
To trigger this route, pass the matching metadata in your Chat Completions request:
```json
{
"model": "inworld/my-router",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"metadata": { "tier": "premium" }
}
}
```
Since the first matching route wins, place more specific conditional routes before general ones. For example, put `tier == "premium" && region == "us"` before `tier == "premium"` — otherwise the general condition would match first and the specific one would never be reached.
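As an illustrative fragment (route IDs are hypothetical and variant bodies are elided), that ordering would look like:

```json
{
  "routes": [
    {
      "route": { "route_id": "premium-us", "variants": [...] },
      "condition": { "cel_expression": "tier == \"premium\" && region == \"us\"" }
    },
    {
      "route": { "route_id": "premium", "variants": [...] },
      "condition": { "cel_expression": "tier == \"premium\"" }
    }
  ]
}
```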
## Variants
A **variant** is a specific configuration within a route. Each variant specifies which model to use and can optionally include its own text generation parameters and prompt templates.
When a route has multiple variants, traffic is distributed based on **weights**. Weights must sum to 100 within each route.
This is useful for A/B testing — for example, splitting traffic 50/50 between two models to compare performance:
```json
{
"route_id": "experiment",
"variants": [
{
"variant": { "variant_id": "gpt5", "model_id": "openai/gpt-5.2" },
"weight": 50
},
{
"variant": { "variant_id": "claude", "model_id": "anthropic/claude-opus-4-6" },
"weight": 50
}
]
}
```
### Variant configuration
Each variant supports the following fields:
| Field | Description |
| :--- | :--- |
| `variant_id` | Unique identifier for this variant within its route. |
| `model_id` | The model to use. Can be a provider-prefixed model (e.g., `openai/gpt-5.2`), a model without provider for [provider routing](#provider-routing) (e.g., `gpt-oss-120b`), or `auto` for [dynamic model selection](#auto-selection). |
| `model_selection` | Configures [auto selection](#auto-selection) criteria (when `model_id` is `auto`), [fallback models](#fallbacks) (when `model_id` is a specific model), or [provider routing](#provider-routing) (when `model_id` has no provider prefix). |
| `message_templates` | Prompt templates for this variant. Useful for variant-specific system prompts. Supports [prompt variables](/router/usage/prompt-variables). |
| `text_generation_config` | Generation parameters such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, `presence_penalty`, `seed`, and `stop_sequences`. If set, this **entirely replaces** the router-level defaults. |
Here's a fully configured variant example:
```json
{
"variant_id": "gpt5",
"model_id": "openai/gpt-5.2",
"text_generation_config": {
"temperature": 0.7,
"max_tokens": 1024
},
"message_templates": [
{ "role": "system", "content": "You are a helpful assistant specialized in {{topic}}." }
],
"model_selection": {
"models": ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"],
"sort": [{ "metric": "SORT_METRIC_LATENCY" }]
}
}
```
In this example, requests routed to this variant will use `openai/gpt-5.2` as the primary model with the specified temperature and prompt. If the primary model fails, the fallback models in `model_selection.models` are tried in order of lowest latency (determined by the `sort` criteria).
#### Fallbacks
When a variant has a specific `model_id` (e.g., `openai/gpt-5.2`), you can configure fallback models via `model_selection.models`. If the primary model fails, the router automatically retries with the fallback models.
By default, fallbacks are tried in the order listed. Add `sort` to control the order — for example, trying the cheapest fallback first.
```json In order listed
// If gpt-5.2 fails, try Claude Opus 4.6, then if that fails, try Gemini 2.5 Pro
{
"variant_id": "gpt-5.2-with-fallbacks",
"model_id": "openai/gpt-5.2",
"model_selection": {
"models": ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"]
}
}
```
```json Sorted by latency
// If gpt-5.2 fails, try Claude Opus 4.6 and Gemini 2.5 Pro in order of lowest latency
{
"variant_id": "gpt-5.2-with-fallbacks",
"model_id": "openai/gpt-5.2",
"model_selection": {
"models": ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"],
"sort": [{ "metric": "SORT_METRIC_LATENCY" }]
}
}
```
You can inspect which models were attempted in the `metadata.attempts` array of the response.
#### Provider routing
Instead of specifying a provider-prefixed model (e.g., `openai/gpt-5.2`), you can specify just the model name (e.g., `gpt-oss-120b`) and let the router select the best provider automatically. Optionally, use `model_selection.provider` to control which providers are tried and in what order.
By default, the provider with the lowest latency is selected, and if it fails, the next-best provider is tried. You can change how providers are sorted, or specify an explicit order in which to try them.
```json Default provider selection
// Automatically selects the lowest-latency provider for gpt-oss-120b
{
"variant_id": "lowest-latency-provider",
"model_id": "gpt-oss-120b"
}
```
```json Order by throughput
// Select the highest throughput provider for gpt-oss-120b
{
"variant_id": "highest-throughput-provider",
"model_id": "gpt-oss-120b",
"model_selection": {
"sort": [{"metric": "SORT_METRIC_THROUGHPUT"}]
}
}
```
```json Explicit provider order
// Try groq first, then fireworks
{
"variant_id": "groq-preferred",
"model_id": "gpt-oss-120b",
"model_selection": {
"provider": {
"order": ["groq", "fireworks"]
}
}
}
```
#### Auto selection
Instead of specifying a fixed model (e.g., `openai/gpt-5.2`), you can set `model_id` to `auto` and use `model_selection` to let the router dynamically pick the best model based on the sort criteria:
```json
{
"variant_id": "auto-variant",
"model_id": "auto",
"model_selection": {
"sort": [{ "metric": "SORT_METRIC_LATENCY"}, {"metric": "SORT_METRIC_PRICE"}]
}
}
```
In this example, the model with the lowest latency will be selected, using price as a tie-breaker.
The available sort metrics are:
| **Metric** | **Description** |
| :--- | :--- |
| `SORT_METRIC_PRICE` | Publicly listed token pricing, computed as the sum of input and output token prices. |
| `SORT_METRIC_LATENCY` | Median time to first token. |
| `SORT_METRIC_THROUGHPUT` | Median output tokens per second. |
| `SORT_METRIC_INTELLIGENCE` | Overall intelligence based on the Artificial Analysis Intelligence Index. |
| `SORT_METRIC_MATH` | Math capabilities based on the MATH-500 benchmark. |
| `SORT_METRIC_CODING` | Coding capabilities based on the LiveCodeBench benchmark. |
You can also limit the set of models to consider by specifying `models` and `ignore`.
```json Specify models
// Select the lowest latency model between gpt-5.2, Opus 4.6, and Gemini 2.5 Pro
{
"variant_id": "auto-variant",
"model_id": "auto",
"model_selection": {
"models": ["openai/gpt-5.2", "anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"],
"sort": [{ "metric": "SORT_METRIC_LATENCY" }]
}
}
```
```json Ignore models
// Select the lowest-latency model, excluding all OpenAI models and Claude Opus 4.6
{
"variant_id": "auto-variant",
"model_id": "auto",
"model_selection": {
"ignore": ["openai", "anthropic/claude-opus-4-6"],
"sort": [{ "metric": "SORT_METRIC_LATENCY" }]
}
}
```
---
### Capabilities
#### Provider Routing
Source: https://docs.inworld.ai/router/capabilities/provider-routing
You can specify a model without a provider prefix (e.g., `gpt-oss-120b` instead of `groq/gpt-oss-120b`), and the API will automatically select a provider for you. Optionally, use the `model_selection.provider` field in your router config to control how providers are selected.
By default, the provider with the lowest latency is selected, and if it fails, the next best provider is tried automatically.
To see which models are available from multiple providers, use the [List Models](/api-reference/modelsAPI/modelservice/list-models) endpoint.
## Provider configuration
| Field | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| `order` | `string[]` | — | Explicit list of providers to try, in order. When specified, providers are tried in this exact order. |
| `allow_fallbacks` | `boolean` | `true` | Whether to fall back to the next provider if the first one fails. |
## How provider selection works
When no `provider.order` is specified, the `sort` criteria determines the order providers are tried. If no `sort` is specified either, providers are ordered by latency (fastest first).
When `provider.order` is specified, providers are tried in the exact order listed — `sort` does not apply to the provider order (but still applies to `models` fallbacks if specified).
The `ignore` field applies to providers regardless of whether `order` is specified.
## Examples
```json Default (auto provider selection)
// Automatically selects the lowest-latency provider for gpt-oss-120b
// Falls back to next-best provider if it fails
{
"variant_id": "auto-provider",
"model_id": "gpt-oss-120b"
}
```
```json Explicit provider order
// Try groq first, then fireworks
{
"variant_id": "groq-preferred",
"model_id": "gpt-oss-120b",
"model_selection": {
"provider": {
"order": ["groq", "fireworks"]
}
}
}
```
```json Sort providers by price
// Pick the cheapest provider for gpt-oss-120b
// Fall back to next-cheapest if it fails
{
"variant_id": "cheapest-provider",
"model_id": "gpt-oss-120b",
"model_selection": {
"provider": {
"allow_fallbacks": true
},
"sort": [{ "metric": "SORT_METRIC_PRICE" }]
}
}
```
```json Provider + model fallbacks
// Try groq/gpt-oss-120b, then fireworks/gpt-oss-120b,
// then fall back to gpt-5.4
{
"variant_id": "with-fallbacks",
"model_id": "gpt-oss-120b",
"model_selection": {
"provider": {
"order": ["groq", "fireworks"],
"allow_fallbacks": true
},
"models": ["openai/gpt-5.4"]
}
}
```
```json No provider fallbacks
// Try the lowest latency provider only — return error if it fails
{
"variant_id": "single-provider",
"model_id": "gpt-oss-120b",
"model_selection": {
"provider": {
"allow_fallbacks": false
}
}
}
```
## Execution order
When using provider routing with model fallbacks, the full execution order is:
1. **Try providers for the primary model** — in `provider.order` order (if specified) or sorted by `sort` criteria (default: latency)
2. **If all providers fail and `models` is specified** — fall back to the models list, sorted by `sort` criteria
3. **If all models fail** — return an error
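This order can be sketched in Python. The sketch is illustrative logic only, not the actual implementation: `call_model` is a hypothetical stand-in for the underlying provider call, and the provider and fallback-model lists are assumed to arrive pre-sorted per the configuration.

```python
def execute(primary_model, providers, fallback_models, call_model):
    """Sketch of the provider-then-model fallback order described above."""
    # 1. Try each provider for the primary model, in the configured order.
    for provider in providers:
        try:
            return call_model(provider, primary_model)
        except Exception:
            continue  # provider failed; try the next one
    # 2. All providers failed: fall back through the models list.
    for model in fallback_models:
        try:
            return call_model(None, model)  # provider auto-selected
        except Exception:
            continue
    # 3. Everything failed: surface an error to the caller.
    raise RuntimeError("all providers and fallback models failed")
```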
## Use Cases
- **Reliability**: Ensure your application continues working even if a specific provider is down
- **Cost Optimization**: Route to cheaper providers or fall back to cheaper models
- **Performance**: Prefer low-latency providers, or fall back to faster models for time-sensitive requests
- **Provider Control**: Lock to specific providers for compliance or consistency
---
#### Conditional Routing
Source: https://docs.inworld.ai/router/capabilities/conditional-routing
Conditional routing allows you to route requests to different models based on conditions that evaluate request metadata.
You can use conditional routing to:
- **Segment by user tier** — route premium users to flagship models and free users to cost-effective ones.
- **Route by geography** — direct requests to different models based on geography.
- **Optimize by complexity** — send simple prompts to fast, cheap models and complex ones to high-capability models.
## Basic Structure
Routes are defined directly on the Router. Each route has a `route` object and a `condition` object.
```json
{
"name": "routers/my-router",
"routes": [
{
"route": {
"route_id": "premium-tier",
"variants": [
{
"variant": {
"variant_id": "premium-gpt5",
"model_id": "openai/gpt-5.2"
},
"weight": 100
}
]
},
"condition": {
"cel_expression": "tier == \"premium\""
}
},
{
"route": {
"route_id": "free-tier",
"variants": [
{
"variant": {
"variant_id": "free-flash",
"model_id": "google-ai-studio/gemini-2.5-flash"
},
"weight": 100
}
]
},
"condition": {
"cel_expression": "tier == \"free\""
}
}
],
"defaultRoute": {
"route_id": "default",
"variants": [
{
"variant": {
"variant_id": "default-variant",
"model_id": "google-ai-studio/gemini-2.5-flash"
},
"weight": 100
}
]
}
}
```
The `condition` field accepts a [CEL](https://cel.dev/) expression that can be evaluated against the following from each [Chat Completions](/api-reference/routerAPI/chat-completions) request:
- **Request metadata** — key-value pairs sent in `extra_body.metadata` (e.g. user tier, region, complexity score).
- **Messages** — the full `messages` array from the request body.
Routes are evaluated **in order**, and the first route whose condition evaluates to `true` is selected.
The table below lists the available operations for constructing a CEL expression:
| **Operation** | **Example** | **Description** |
| :--- | :--- | :--- |
| Equality | `tier == "premium"` | Matches a specific value. |
| Comparison | `request_count > 100` | Numeric comparisons (`>`, `<`, `>=`, `<=`). |
| Logical AND | `tier == "premium" && region == "us-east"` | Both conditions must be true. |
| Logical OR | `tier == "premium" \|\| tier == "enterprise"` | Either condition must be true. |
| Range check | `score >= 80 && score <= 100` | Matches values within a range. |
| `startsWith()` | `user_id.startsWith("enterprise-")` | Checks if a string starts with a prefix. |
| `endsWith()` | `user_id.endsWith("-admin")` | Checks if a string ends with a suffix. |
| `contains()` | `user_id.contains("test")` | Checks if a string contains a substring. |
| `matches()` | `messages.last().content.matches("(?i).*\\bcode\\b.*")` | RE2 regex match on any string field. |
| `size()` | `user_id.size() > 10` | Gets the length of a string. |
| `messages.has()` | `messages.has()` | `true` if the message array exists and is non-empty. |
| `messages.first()` | `messages.first().content.contains("system")` | First message in the array. |
| `messages.last()` | `messages.last().content.matches("(?i).*help.*")` | Last message in the array. |
| `messages.get(i)` | `messages.get(0).role == "system"` | Message at index `i` (0-based). |
| `messages.len()` | `messages.len() > 3` | Number of messages in the array. |
Expressions are typed — ensure your metadata types match. Use `count == 100` when the value is a number and `tier == "premium"` when it's a string. Mismatched types will not match.
## Making a Request
When calling your router, pass metadata in your requests using `extra_body.metadata` to provide the values referenced by your conditions, enabling dynamic route selection.
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "inworld/my-router",
"messages": [
{"role": "user", "content": "Hello"}
],
"extra_body": {
"metadata": {
"tier": "premium",
"user_id": "user-123",
"region": "us-east"
}
}
}'
```
In this example, the provided metadata (`tier`, `user_id`, and `region`) will be available for the router’s CEL expressions, enabling conditional routes to match and select the appropriate model configuration based on those values.
## Example Use Cases
### User Subscription Tiers
Route users to different models based on their subscription tier:
```json
{
"cel_expression": "subscription_tier == \"pro\""
}
```
### Geographic Routing
Route requests based on user location:
```json
{
"cel_expression": "region == \"eu\""
}
```
### Prompt-Based Routing
You can combine fast prompt-based regex routing with metadata checks in a single condition, enabling routing by both **what the user is asking** and **who the user is**.
Route coding questions from premium users to a reasoning model:
```json
{
"cel_expression": "messages.last().content.matches(\"(?i).*\\\\b(code|debug|refactor|bug fix)\\\\b.*\") && tier == \"premium\""
}
```
Regex also works on metadata fields (e.g. `user_id.matches(".*-admin$")`) and on any message — use `messages.first()` to match the first message or `messages.get(i)` for a specific turn.
Route translation requests to a multilingual model:
```json
{
"cel_expression": "messages.last().content.matches(\"(?i).*\\\\btranslat(e|ion)\\\\b.*\")"
}
```
Route long conversations (e.g. 10+ turns) to a model with a large context window:
```json
{
"cel_expression": "messages.len() > 10"
}
```
Conditions are evaluated before route variants are selected, so messages defined inside variants are not part of condition evaluation.
### Dynamic Tiering
Route requests to different model tiers based on prompt complexity. Use a lightweight classifier in your application to score complexity, then let the router pick the right model:
```json
{
"routes": [
{
"route": {
"route_id": "complex",
"variants": [
{
"variant": { "variant_id": "gpt5-premium", "model_id": "openai/gpt-5.2" },
"weight": 100
}
]
},
"condition": {
"cel_expression": "complexity_score >= 8"
}
},
{
"route": {
"route_id": "standard",
"variants": [
{
"variant": { "variant_id": "gpt5", "model_id": "openai/gpt-5-mini" },
"weight": 100
}
]
},
"condition": {
"cel_expression": "complexity_score >= 4"
}
}
],
"defaultRoute": {
"route_id": "simple",
"variants": [
{
"variant": { "variant_id": "flash", "model_id": "openai/gpt-5-nano" },
"weight": 100
}
]
}
}
```
Simple tasks go to fast, cheap models while complex tasks get flagship models — optimizing cost without sacrificing quality where it matters.
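On the client side, the classifier can be a simple heuristic that scores the prompt before the request is sent. Everything here is illustrative: the scoring features, thresholds, and the `inworld/my-router` name are assumptions; only the `extra_body.metadata` shape comes from the Router API.

```python
def complexity_score(prompt: str) -> int:
    """Toy heuristic: longer prompts and 'hard task' keywords score higher (0-10)."""
    score = min(len(prompt) // 200, 5)  # up to 5 points for sheer length
    keywords = ("prove", "refactor", "optimize", "derive", "architecture")
    score += sum(2 for kw in keywords if kw in prompt.lower())
    return min(score, 10)

def build_payload(prompt: str) -> dict:
    """Chat Completions body with the score attached for CEL conditions to read."""
    return {
        "model": "inworld/my-router",  # hypothetical router name
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {"metadata": {"complexity_score": complexity_score(prompt)}},
    }
```

The router's conditions then compare `complexity_score` against the tier thresholds without any change to the client beyond attaching the metadata.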
---
#### Traffic Splitting
Source: https://docs.inworld.ai/router/capabilities/traffic-splitting
Routers allow you to distribute traffic across different model variants within a route based on percentage weights. This is ideal for A/B testing, gradual migrations, or splitting traffic across multiple providers.
## How it Works
Routes are defined directly on the Router. Within each route, you can define multiple **variants** with weights. When a route is selected (either through conditional routing or as the default route), the router selects a variant based on weights.
**Important**: Weights must sum to exactly 100 within each route. They are not normalized, and each route's variants are weighted independently of other routes' variants.
### Example Configuration
```bash
curl --request POST \
--url https://api.inworld.ai/router/v1/routers \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"name": "routers/ab-test-router",
"routes": [
{
"route": {
"route_id": "ab-test-route",
"variants": [
{
"variant": {
"variant_id": "variant-a",
"model_id": "openai/gpt-5"
},
"weight": 80
},
{
"variant": {
"variant_id": "variant-b",
"model_id": "anthropic/claude-opus-4-6"
},
"weight": 20
}
]
},
"condition": {
"cel_expression": "true"
}
}
],
"defaultRoute": {
"route_id": "default",
"variants": [
{
"variant": {
"variant_id": "default-variant",
"model_id": "openai/gpt-5"
},
"weight": 100
}
]
}
}'
```
Model IDs are specified as strings in the format `"provider/model"` (e.g., `"openai/gpt-5"`).
**Weights must sum to exactly 100 within each route** and are validated independently per route. For example:
- Route 1 with variants [70, 30] ✓ (sums to 100)
- Route 2 with variants [80, 20] ✓ (sums to 100)
- Route 1 with variants [70, 20] ✗ (sums to 90, fails validation)
## User Stickiness
If you specify a `user` field in your request, the router ensures the same user always gets the same variant (sticky routing):
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"model": "inworld/ab-test-router",
"messages": [{"role": "user", "content": "Hello"}],
"user": "user-123"
}'
```
## Use Cases
### 1. A/B Testing
Send 90% of traffic to your stable model (GPT-5) and 10% to a new candidate (Claude Opus 4.6) to compare performance and user satisfaction in production.
### 2. Provider Redundancy
Split traffic 50/50 between two different providers of the same model (e.g., Llama 3 on Groq vs. Llama 3 on Together AI) to mitigate provider-specific rate limits.
### 3. Gradual Migration
When moving to a new model version, start with a 1% weight and slowly increase it as you verify the new model's output quality.
### 4. Conditional Weighted Routing
Combine conditional routing with weighted variants to route based on metadata, then split traffic within that route:
```json
{
"route": {
"route_id": "premium-tier",
"variants": [
{
"variant": {
"variant_id": "gpt5",
"model_id": "openai/gpt-5"
},
"weight": 70
},
{
"variant": {
"variant_id": "claude",
"model_id": "anthropic/claude-opus-4-6"
},
"weight": 30
}
]
},
"condition": {
"cel_expression": "tier == \"premium\""
}
}
```
See [Conditional Routing](/router/capabilities/conditional-routing) for more details on using CEL expressions.
---
#### Request-Level Routing
Source: https://docs.inworld.ai/router/capabilities/request-level
The [Chat Completions](/api-reference/routerAPI/chat-completions) API supports routing directly at the request level, without needing to create a router. You can use it to call a specific model, add fallbacks, or let the engine auto-select the best model, all in a single request.
Request-level routing can be a good fit if you:
- Want to call a specific model through a unified API without setting up a router
- Are prototyping or benchmarking before committing to a router configuration
For more advanced use cases like conditional routing, A/B testing with weighted variants, or shared prompt templates, we recommend [setting up a router](/router/quickstart).
## Direct model call
Specify a model directly by its `provider/model` identifier:
```bash
curl -X POST https://api.inworld.ai/v1/chat/completions \
-H 'Authorization: Bearer ' \
-H 'Content-Type: application/json' \
-d '{
"model": "openai/gpt-5.2",
"messages": [{ "role": "user", "content": "Hello!" }]
}'
```
This sends the request to the specified model with no routing logic. You still benefit from Inworld Router's unified API.
## Fallbacks
Add fallback models via `extra_body.models`. If the primary model fails, the router automatically tries the next model in the list:
```bash
curl -X POST https://api.inworld.ai/v1/chat/completions \
-H 'Authorization: Bearer ' \
-H 'Content-Type: application/json' \
-d '{
"model": "openai/gpt-5.2",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"models": ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"]
}
}'
```
In this example, the router tries gpt-5.2 first, then Claude Opus, then Gemini Pro. You can inspect which models were attempted in the `metadata.attempts` array of the response.
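A small helper can surface the attempt list for logging. This sketch assumes each entry in `metadata.attempts` carries a `model` field; check the API reference for the exact response shape.

```python
def attempted_models(response: dict) -> list:
    """Return the models the router tried, in order (empty if none recorded)."""
    attempts = response.get("metadata", {}).get("attempts", [])
    return [a.get("model") for a in attempts]
```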
## Auto model selection
Set `model` to `auto` and provide sorting criteria via `extra_body.sort` to let the router pick the best model automatically:
```bash
curl -X POST https://api.inworld.ai/v1/chat/completions \
-H 'Authorization: Bearer ' \
-H 'Content-Type: application/json' \
-d '{
"model": "auto",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"sort": ["price"]
}
}'
```
This selects the cheapest available model. Available sort criteria: `price`, `latency`, `throughput`, `intelligence`, `math`, `coding`.
You can combine multiple criteria — models are ranked by the first criterion, with subsequent criteria used as tiebreakers:
```bash
curl -X POST https://api.inworld.ai/v1/chat/completions \
-H 'Authorization: Bearer ' \
-H 'Content-Type: application/json' \
-d '{
"model": "auto",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"sort": ["price", "latency"]
}
}'
```
This picks the cheapest model, using latency as a tiebreaker.
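Conceptually, multi-criterion ranking behaves like a tuple sort: the first criterion dominates, and later criteria only break ties. A toy illustration with made-up prices and latencies:

```python
# Hypothetical candidates: (model, price_per_mtok, median_latency_ms)
candidates = [
    ("model-a", 1.00, 120),
    ("model-b", 1.00, 80),
    ("model-c", 2.50, 40),
]

# sort=["price", "latency"]: cheapest first; latency breaks the price tie
ranked = sorted(candidates, key=lambda m: (m[1], m[2]))
```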
### Filtering models
Use `extra_body.models` to restrict the candidate pool, or `extra_body.ignore` to exclude specific models or entire providers:
```json Restrict to specific models
{
"model": "auto",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"models": ["openai/gpt-5.2", "anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-pro"],
"sort": ["latency"]
}
}
```
```json Exclude models or providers
{
"model": "auto",
"messages": [{ "role": "user", "content": "Hello!" }],
"extra_body": {
"ignore": ["openai", "anthropic/claude-opus-4-6"],
"sort": ["latency"]
}
}
```
---
#### Prompt Caching
Source: https://docs.inworld.ai/router/capabilities/caching
To save on inference costs, you can enable prompt caching on supported providers and models. Inworld Router supports both **implicit** and **explicit** caching.
## Implicit Caching
Implicit caching depends on the model provider. If a provider supports prompt caching, it works automatically on their terms—no configuration required.
**Provider stickiness**: Inworld Router ensures all requests go to the **same model provider** for a given conversation. If you hit cache with one provider, subsequent requests stay routed to that provider, so you never lose cached data by being switched to a different provider.
Implicit caching is typically available on providers like OpenAI, DeepSeek, and Google Gemini 2.5. Each provider has its own minimum token requirements and TTL behavior. Consult the provider's documentation for pricing and model-specific details.
## Explicit Caching
Explicit caching is supported for **Anthropic** and **Google** providers. It gives you control over what gets cached and for how long.
### Unified Protocol
Both providers use the same protocol: a `cache_control` object on individual message content parts, with a configurable `ttl` (time to live).
### Cache Control in Messages
Add `cache_control` to text parts within multipart message content. Reserve it for large bodies of text such as character cards, CSV data, RAG context, or book chapters.
```json
{
"messages": [
{
"role": "system",
"content": [
{
"type": "text",
"text": "You are a historian studying the fall of the Roman Empire. Below is an extensive reference book:"
},
{
"type": "text",
"text": "HUGE TEXT BODY HERE",
"cache_control": {
"type": "ephemeral",
"ttl": "1h"
}
}
]
},
{
"role": "user",
"content": [
{
"type": "text",
"text": "What triggered the collapse?"
}
]
}
]
}
```
### TTL Management
- **Configurable TTL**: Set `ttl` in the `cache_control` object (e.g., `"ttl": "5m"`, `"ttl": "1h"`).
- **Update at any time**: You can change the TTL on subsequent requests if you originally set it longer than needed.
- **Automatic prolongation**: When 5% or less of the TTL remains and there are still messages hitting the cache, Inworld Router automatically extends the TTL to keep your cache warm.
### User Message Example
```json
{
"messages": [
{
"role": "user",
"content": [
{
"type": "text",
"text": "Based on the book text below:"
},
{
"type": "text",
"text": "HUGE TEXT BODY HERE",
"cache_control": {
"type": "ephemeral",
"ttl": "30m"
}
},
{
"type": "text",
"text": "List all main characters mentioned in the text above."
}
]
}
]
}
```
## Inspecting Cache Usage
To see how much caching saved on each generation, check the `prompt_tokens_details` object in the usage response.
### Request and Response Samples
The following samples were verified against the Inworld Router API. The first request establishes a cache; the second request hits the cache.
**Request 1 — Cache write (establishing cache)**
```bash
curl -X POST "https://api.inworld.ai/v1/chat/completions" \
-H "Authorization: Basic " \
-H "Content-Type: application/json" \
-d '{
"model": "google-ai-studio/gemini-2.5-flash",
"messages": [
{
"role": "system",
"content": [
{
"type": "text",
"text": "You are a helpful historian. Below is reference material:"
},
{
"type": "text",
"text": "",
"cache_control": {"type": "ephemeral", "ttl": "5m"}
}
]
},
{"role": "user", "content": "What year did the Western Roman Empire fall? Reply in one short sentence."}
],
"max_tokens": 50
}'
```
**Response 1 — Cache write**
```json
{
"id": "chatcmpl-1772169766796",
"model": "google-ai-studio/gemini-2.5-flash",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "The Western Roman Empire fell in 476 AD.\n"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1402,
"completion_tokens": 13,
"total_tokens": 1415,
"prompt_tokens_details": {
"cache_write_tokens": 1388,
"cached_tokens": 1388
}
}
}
```
**Request 2 — Cache hit (same conversation, follow-up question)**
Send the same system message and cached content, plus the previous exchange and a new user message:
```bash
curl -X POST "https://api.inworld.ai/v1/chat/completions" \
-H "Authorization: Basic " \
-H "Content-Type: application/json" \
-d '{
"model": "google-ai-studio/gemini-2.5-flash",
"messages": [
{
"role": "system",
"content": [
{"type": "text", "text": "You are a helpful historian. Below is reference material:"},
{
"type": "text",
"text": "",
"cache_control": {"type": "ephemeral", "ttl": "5m"}
}
]
},
{"role": "user", "content": "What year did the Western Roman Empire fall? Reply in one short sentence."},
{"role": "assistant", "content": "The Western Roman Empire fell in 476 AD.\n"},
{"role": "user", "content": "Name two factors that contributed to its fall."}
],
"max_tokens": 80
}'
```
**Response 2 — Cache hit**
```json
{
"id": "chatcmpl-1772169775239",
"model": "google-ai-studio/gemini-2.5-flash",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "While the provided text doesn't explicitly detail the reasons for the fall of the Western Roman Empire, it does highlight factors key to its rise and endurance... Drawing inferences from that, two factors could be: a decline in military prowess and ineffective administration.\n"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 1424,
"completion_tokens": 65,
"total_tokens": 1489,
"prompt_tokens_details": {
"cached_tokens": 1388
}
}
}
```
On the cache hit, `cached_tokens` is 1388 and `cache_write_tokens` is omitted (no write). The cached content is reused, reducing input token cost.
Explicit caching requires a minimum of **1024 tokens** in the cacheable content block. Shorter content returns a validation error.
### Response Example (summary)
```json
{
"id": "chatcmpl-...",
"usage": {
"prompt_tokens": 4641,
"completion_tokens": 1817,
"prompt_tokens_details": {
"cached_tokens": 4608
}
}
}
```
### Usage Fields
| Field | Description |
|-------|-------------|
| `cached_tokens` | Number of tokens read from the cache (cache hit). When greater than zero, you benefit from cached content. |
| `cache_write_tokens` | Number of tokens written to the cache. Appears on the first request when establishing a new cache entry. |
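As a quick sanity check, you can compute the fraction of the prompt that was served from cache. A small sketch using the usage numbers from the cache-hit response above:

```python
def cache_hit_ratio(usage: dict) -> float:
    """Fraction of prompt tokens that were read from cache."""
    details = usage.get("prompt_tokens_details") or {}
    prompt = usage.get("prompt_tokens") or 0
    return details.get("cached_tokens", 0) / prompt if prompt else 0.0

# Usage object from the cache-hit response earlier on this page
usage = {"prompt_tokens": 1424, "prompt_tokens_details": {"cached_tokens": 1388}}
```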
Some providers charge differently for cache writes vs. reads. Anthropic, for example, charges for cache writes but offers discounts on cache reads. Check provider pricing for details.
---
#### Web Search
Source: https://docs.inworld.ai/router/capabilities/web-search
Ground chat completions with real-time web results using **`extra_body.web_search`**. This works with any model supported by the router.
For OpenAI-compatible clients, `web_search_options` is also accepted as a top-level field, but it is only supported by a limited number of providers. We recommend `extra_body.web_search` for broadest compatibility.
## `extra_body.web_search`
```json
{
"model": "openai/gpt-4o",
"messages": [{ "role": "user", "content": "What are the latest AI news?" }],
"extra_body": {
"web_search": { "engine": "exa", "max_results": 3, "max_steps": 1 }
}
}
```
| Parameter | Notes |
| :-- | :-- |
| `engine` | `"exa"` \| `"google"` \| omit. Default **exa**. |
| `max_results` | Results per search call. Default **3**. |
| `max_steps` | Search/refine rounds. Default **1**. |
## `web_search_options` (OpenAI-compatible)
Top-level field on the request—no `extra_body`. Forwarded to providers that implement native web search.
```json
{
"model": "openai/gpt-4o",
"messages": [{ "role": "user", "content": "What are the latest AI news?" }],
"web_search_options": {
"search_context_size": "low"
}
}
```
`search_context_size`: `"low"` | `"medium"` | `"high"`.
## Citations & streaming
Assistant messages may include OpenAI-style **`annotations`** (e.g. `type: "url_citation"` with `url`, `title`, `content`).
With **`stream: true`**, annotations are delivered on the **last** SSE chunk, alongside `finish_reason`.
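A small helper for collecting citations from a parsed final chunk (or a non-streaming response), assuming the annotation shape described above; the sample chunk is illustrative:

```python
def url_citations(chunk: dict) -> list:
    """Collect url_citation annotations from a response or final SSE chunk."""
    citations = []
    for choice in chunk.get("choices", []):
        # streaming chunks carry a delta; non-streaming responses carry a message
        part = choice.get("delta") or choice.get("message") or {}
        for ann in part.get("annotations") or []:
            if ann.get("type") == "url_citation":
                citations.append((ann.get("url"), ann.get("title")))
    return citations

# Illustrative final chunk carrying one citation
chunk = {
    "choices": [{
        "delta": {
            "annotations": [
                {"type": "url_citation", "url": "https://example.com", "title": "Example"}
            ]
        },
        "finish_reason": "stop",
    }]
}
```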
---
#### Chat Completions (Text)
Source: https://docs.inworld.ai/router/capabilities/text-chat
The `/v1/chat/completions` endpoint is the primary interface for text-based workflows. It supports streaming, tool calling, and structured outputs across all providers.
## Standard Request
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Bearer ' \
--header 'Content-Type: application/json' \
--data '{
"model": "auto",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Write a 50-word bio for a software engineer."}
],
"stream": true
}'
```
## Advanced Features
### Tool Calling (Function Calling)
Inworld Router normalizes tool calling. You define tools in the OpenAI format, and the router translates them for Anthropic or Google models.
### Structured Outputs (JSON Mode)
Force the model to return valid JSON by setting `response_format: { "type": "json_object" }`. Inworld Router ensures the underlying provider respects this constraint.
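A JSON-mode request body might look like the sketch below (the `city`/`country` schema in the prompt is just an example); POST it to `/v1/chat/completions` with your API key, and the returned `message.content` should parse with `json.loads`:

```python
import json

# Sketch of a JSON-mode request body for /v1/chat/completions
payload = {
    "model": "auto",
    "messages": [
        {"role": "system", "content": "Answer as a JSON object with keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
    "response_format": {"type": "json_object"},
}

body = json.dumps(payload)  # send this as the POST body
```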
### Streaming
Real-time token streaming is supported for all providers. The stream format is identical to OpenAI's Server-Sent Events (SSE).
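Since the wire format is standard SSE, a minimal parser over the response lines looks like this (a sketch; production code should also handle multi-line `data:` fields and keep-alive comments):

```python
import json

def iter_sse_chunks(lines):
    """Yield parsed JSON chunks from an OpenAI-style SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data: "):
            continue  # skip blank lines and comments
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        yield json.loads(data)

# Example over a captured stream
sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    '',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: [DONE]',
]
text = "".join(c["choices"][0]["delta"].get("content", "") for c in iter_sse_chunks(sample))
```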
---
#### LLM + TTS (Voice Responses)
Source: https://docs.inworld.ai/router/guides/llm-plus-tts
## Overview
If you're already using Inworld TTS, Inworld Router enables you to optimize and combine your **LLM** requests with **Inworld Text-to-Speech** in a single request. Instead of managing two separate API calls (one for text generation, one for speech synthesis), you send one request and receive both text and audio back. Both streaming and non-streaming modes are supported.
In streaming mode, Inworld Router handles the entire pipeline: it intelligently routes your prompt to the best LLM, streams the generated text through an **optimized chunking engine**, and sends each chunk to the TTS engine as it's produced. The result is low-latency voice output — you hear the first audio well before the LLM finishes generating the full response. In non-streaming mode, the complete audio and transcript are returned together once the full response is ready.
**This is ideal for:**
- Voice assistants and conversational agents
- Real-time narration and read-aloud features
- Accessibility-first applications
- Any workflow where your users hear AI responses instead of (or in addition to) reading them
## Quick Start
Add the `audio` parameter to any chat completions request to enable TTS. You'll receive both the text response and audio data in the same stream.
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "inworld/my-router",
"max_tokens": 1000,
"stream": true,
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max"
},
"messages": [
{"role": "user", "content": "What is the meaning of life?"}
]
}'
```
That's it. Inworld Router will:
1. Route your prompt to your preset Inworld Route (or your chosen model)
2. Stream text chunks to Inworld TTS as they're generated
3. Return both text and audio in the SSE stream
## Audio Parameters
The `audio` object controls voice synthesis:
| Parameter | Type | Description |
|-----------|--------|-------------|
| `voice` | string | **Required.** The voice ID to use for speech synthesis (e.g., `"Dennis"`, `"Chloe"`). See [List Voices](/api-reference/ttsAPI/texttospeech/list-voices) for all available voices. |
| `model` | string | **Required.** The TTS model to use (e.g., `"inworld-tts-1.5-max"`). See [TTS Models](/tts/tts-models) for available options. |
### Default Audio Output
| Property | Value |
|-------------|-------|
| Sample rate | 48,000 Hz |
| Format | PCM |
## Streaming Response Format
When streaming is enabled (`"stream": true`), the response is delivered as Server-Sent Events (SSE). Each event is a JSON object in the `data` field.
When TTS is active, text is delivered through `delta.audio.transcript`. Audio data and its corresponding transcript are sent together via `delta.audio`:
```json
data: {"choices":[{"delta":{"audio":{"data":"","transcript":"Hello! How can I assist you today?"}},"index":0}],...}
```
| Field | Description |
|-------|-------------|
| `delta.audio.data` | Base64-encoded PCM audio. |
| `delta.audio.transcript` | The text being spoken. Use this for real-time text display. |
Text and audio are chunked independently. Text is chunked at natural sentence boundaries, while audio is chunked at fixed byte sizes. This means a single `transcript` value may span multiple audio chunks. The transcript for a text segment is attached to the first audio chunk of that segment — subsequent audio chunks for the same segment will contain only `data` without a `transcript` field.
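To reassemble per-segment audio on the client, start a new segment whenever a `transcript` appears and append `data` to the current one. A sketch, where `data` stands for already base64-decoded bytes:

```python
def group_segments(audio_deltas):
    """Pair each transcript segment with all of its audio chunks."""
    segments = []
    for audio in audio_deltas:
        if "transcript" in audio:
            # a transcript marks the first audio chunk of a new text segment
            segments.append({"transcript": audio["transcript"], "data": b""})
        if segments and "data" in audio:
            segments[-1]["data"] += audio["data"]
    return segments

# Illustrative delta.audio sequence: the second chunk continues segment one
deltas = [
    {"transcript": "Hello! ", "data": b"\x01\x02"},
    {"data": b"\x03"},
    {"transcript": "How are you?", "data": b"\x04"},
]
```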
### Non-Streaming Response
Without streaming (`"stream": false`), the full audio and transcript are returned in the `message.audio` object:
```json
{
"choices": [{
"message": {
"role": "assistant",
"content": "",
"audio": {
"id": "audio_chatcmpl-xyz",
"data": "",
"transcript": "Hello! How can I assist you today?"
}
},
"finish_reason": "stop"
}]
}
```
When TTS is active, `message.content` is empty. The full text is available in `message.audio.transcript`.
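A small helper that reads the response text regardless of whether TTS was enabled:

```python
def response_text(choice: dict) -> str:
    """Return the transcript when TTS was active, else the plain content."""
    message = choice["message"]
    audio = message.get("audio") or {}
    return audio.get("transcript") or message.get("content") or ""

# TTS-enabled response: content is empty, text lives in audio.transcript
choice = {
    "message": {"role": "assistant", "content": "", "audio": {"transcript": "Hello!"}},
    "finish_reason": "stop",
}
```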
## Use Any LLM
The `audio` parameter works with any model available through Inworld Router. The LLM generates text, and Inworld Router handles the TTS conversion separately — so your choice of voice is independent of your choice of model. See the [Models API](/api-reference/modelsAPI/modelservice/list-models) for a full list of supported LLM models.
```bash
# Use auto model selection + TTS
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "auto",
"stream": true,
"audio": {
"voice": "Chloe",
"model": "inworld-tts-1.5-max"
},
"messages": [
{"role": "user", "content": "Tell me a short bedtime story."}
],
"extra_body": {
"sort": ["latency"]
}
}'
```
This combines Inworld Router's intelligent model selection with TTS — you get the fastest available LLM **and** voice output in one call.
## Combine with Smart Routing Features
All Inworld Router capabilities work alongside TTS:
### Failover with Voice
If your primary model is unavailable, Inworld Router fails over to a backup — and the voice output continues seamlessly:
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "openai/gpt-5",
"stream": true,
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max"
},
"messages": [
{"role": "user", "content": "Explain quantum computing simply."}
],
"extra_body": {
"models": ["anthropic/claude-opus-4-6", "google-ai-studio/gemini-2.5-flash"]
}
}'
```
If GPT-5 fails, the request fails over to Claude or Gemini, and the same voice (Dennis) is used regardless of which model generates the text.
### Cost-Optimized Voice Responses
Route to the cheapest model while still getting audio output:
```bash
curl --request POST \
--url https://api.inworld.ai/v1/chat/completions \
--header 'Authorization: Basic ' \
--header 'Content-Type: application/json' \
--data '{
"model": "auto",
"stream": true,
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max"
},
"messages": [
{"role": "user", "content": "What is the capital of France?"}
],
"extra_body": {
"sort": ["price", "latency"]
}
}'
```
## Optimized Chunking
Inworld Router includes a **built-in text chunking engine** optimized for TTS. Rather than waiting for the LLM to finish generating the full response, the router:
1. Buffers incoming tokens from the LLM
2. Detects natural sentence and clause boundaries
3. Sends each chunk to the TTS engine as soon as it's ready
This pipeline significantly reduces **Time to First Audio (TTFA)** — your users start hearing the response while the LLM is still generating text. The chunking is tuned for natural-sounding speech: it avoids breaking mid-word or mid-phrase, producing smooth, conversational audio.
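The router's actual chunking engine is internal, but the idea can be illustrated with a toy chunker that buffers tokens and flushes at sentence boundaries:

```python
import re

def chunk_for_tts(token_stream):
    """Toy sentence-boundary chunker (the router's real engine is internal
    and more sophisticated): buffer tokens, flush complete sentences."""
    buffer = ""
    for token in token_stream:
        buffer += token
        while True:
            # a sentence ends at . ! or ? followed by whitespace or end of buffer
            m = re.search(r"[.!?](\s|$)", buffer)
            if not m:
                break
            cut = m.end()
            yield buffer[:cut].strip()
            buffer = buffer[cut:]
    if buffer.strip():
        yield buffer.strip()  # flush any trailing partial sentence

chunks = list(chunk_for_tts(["Hello", " world", ". ", "How are", " you?"]))
```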
## Tool Calling
Tool calls (function calling) work alongside TTS. When the LLM decides to call a tool, the tool call is returned as standard `delta.tool_calls` chunks (no audio is generated for that turn). Once you execute the tool and send the result back with TTS enabled, the final response is spoken.
### Tools + Voice Example
```python
import requests

API_URL = "https://api.inworld.ai/v1/chat/completions"
HEADERS = {
"Authorization": "Basic ",
"Content-Type": "application/json",
}
# Step 1: Request with tools + TTS enabled
response = requests.post(API_URL, headers=HEADERS, json={
"model": "openai/gpt-5",
"messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
"tools": [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}],
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max"
}
}).json()
# LLM returns tool_calls (no audio on this turn)
tool_call = response["choices"][0]["message"]["tool_calls"][0]
# Step 2: Execute the tool call
tool_result = get_weather("Tokyo") # Your function
# Step 3: Send tool result back — this response is spoken aloud
audio_response = requests.post(API_URL, headers=HEADERS, json={
"model": "openai/gpt-5",
"stream": True,
"messages": [
{"role": "user", "content": "What's the weather in Tokyo?"},
response["choices"][0]["message"],
{"role": "tool", "content": tool_result, "tool_call_id": tool_call["id"]}
],
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max"
}
}, stream=True)
# Parse SSE stream for audio chunks (same as Python example below)
```
## Python Example
```python
import base64
import json

import requests

response = requests.post(
"https://api.inworld.ai/v1/chat/completions",
headers={
"Authorization": "Basic ",
"Content-Type": "application/json",
},
json={
"model": "openai/gpt-5",
"max_tokens": 500,
"stream": True,
"audio": {
"voice": "Dennis",
"model": "inworld-tts-1.5-max",
},
"messages": [
{"role": "user", "content": "Tell me a fun fact about space."}
],
},
stream=True,
)
audio_chunks = []
full_transcript = ""
for line in response.iter_lines():
line = line.decode("utf-8")
if not line.startswith("data: "):
continue
data = line[6:]
if data == "[DONE]":
break
chunk = json.loads(data)
delta = chunk["choices"][0].get("delta", {})
audio = delta.get("audio")
if audio:
# Text transcript — use for real-time text display
if "transcript" in audio:
full_transcript += audio["transcript"]
print(audio["transcript"], end="", flush=True)
# Audio data — PCM 48kHz 16-bit mono
if "data" in audio:
audio_chunks.append(base64.b64decode(audio["data"]))
pcm_audio = b"".join(audio_chunks)
```
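To play the result, the raw PCM can be wrapped in a WAV header with the standard library (48 kHz, 16-bit mono, per the defaults above); `pcm_audio` below is a stand-in for the bytes collected by the streaming loop:

```python
import wave

# Stand-in for the pcm_audio assembled by the streaming loop above
pcm_audio = b"\x00\x00" * 48000  # one second of 16-bit mono silence

with wave.open("reply.wav", "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 16-bit samples
    wav.setframerate(48000)  # router's default sample rate
    wav.writeframes(pcm_audio)
```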
## JavaScript / Node.js Example
```javascript
const response = await fetch("https://api.inworld.ai/v1/chat/completions", {
method: "POST",
headers: {
Authorization: "Basic ",
"Content-Type": "application/json",
},
body: JSON.stringify({
model: "openai/gpt-5",
max_tokens: 500,
stream: true,
audio: {
voice: "Dennis",
model: "inworld-tts-1.5-max",
},
messages: [
{ role: "user", content: "Tell me a fun fact about space." },
],
}),
});
const reader = response.body.getReader();
const decoder = new TextDecoder();
const audioChunks = [];
let fullTranscript = "";
let buffer = "";
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split("\n");
buffer = lines.pop();
for (const line of lines) {
if (!line.startsWith("data: ")) continue;
const data = line.slice(6);
if (data === "[DONE]") break;
const chunk = JSON.parse(data);
const audio = chunk.choices[0]?.delta?.audio;
if (audio) {
// Text transcript — use for real-time text display
if (audio.transcript) {
fullTranscript += audio.transcript;
process.stdout.write(audio.transcript);
}
// Audio data — PCM 48kHz 16-bit mono
if (audio.data) {
audioChunks.push(Buffer.from(audio.data, "base64"));
}
}
}
}
const pcmAudio = Buffer.concat(audioChunks);
```
## Next Steps
- [OpenAI Compatibility](/router/openai-compatibility) — use Inworld Router with the OpenAI SDK
- [Cost Optimizer](/router/guides/cost-optimizer) — route by query complexity to reduce costs
- [Failover System](/router/guides/failover-system) — build a resilient multi-provider setup
- [Extra Body Parameters](/router/usage/extra-body-parameters) — all available `sort`, `models`, and `ignore` options
---
#### APIs & SDKs
#### OpenAI Compatibility
Source: https://docs.inworld.ai/router/openai-compatibility
Inworld Router is fully compatible with the OpenAI SDK. By simply changing the `base_url` and `api_key`, you can add the capabilities of Inworld Router to your existing apps.
## Endpoint
The OpenAI-compatible API endpoint is available at:
```
https://api.inworld.ai/v1
```
To use Inworld Router, change your API call URL or SDK base URL from `https://api.openai.com/v1` to `https://api.inworld.ai/v1`.
## OpenAI SDK
Below is an example request using OpenAI's SDK:
```python Python
from openai import OpenAI
# Initialize the client with Inworld Router credentials
client = OpenAI(
base_url="https://api.inworld.ai/v1",
api_key="YOUR_INWORLD_API_KEY" # Your Inworld API Key
)
# Call your router
response = client.chat.completions.create(
model="inworld/",
messages=[
{"role": "user", "content": "Who created you?"}
]
)
print(response.choices[0].message.content)
```
```javascript Node
import OpenAI from 'openai';

const openai = new OpenAI({
baseURL: 'https://api.inworld.ai/v1',
apiKey: 'YOUR_INWORLD_API_KEY',
});
async function main() {
const completion = await openai.chat.completions.create({
messages: [{ role: 'user', content: 'Who created you?' }],
model: 'inworld/demo-router'
});
console.log(completion.choices[0].message.content);
}
main();
```
---
#### Anthropic Compatibility
Source: https://docs.inworld.ai/router/anthropic-compatibility
Inworld Router is available through Anthropic-compatible API endpoints, so you can use the Anthropic SDK and tools like Claude Code, while benefiting from Inworld Router's multi-provider routing, failover, and cost optimization.
## Endpoint
The Anthropic-compatible Messages API is available at:
```
https://api.inworld.ai/v1/messages
```
When using the Anthropic SDK, set the base URL to `https://api.inworld.ai`. The SDK automatically appends `/v1/messages`.
## Authentication
Use your Inworld API Key with the `Authorization: Bearer` header:
```
Authorization: Bearer
```
When using the Anthropic SDK, pass your key via the `auth_token` parameter (not `api_key`), which sends it as `Authorization: Bearer`:
```python
import anthropic

client = anthropic.Anthropic(
base_url="https://api.inworld.ai",
auth_token="YOUR_INWORLD_API_KEY",
)
```
## Anthropic SDK
Below is an example request using Anthropic's SDK:
```python Python
import anthropic

client = anthropic.Anthropic(
base_url="https://api.inworld.ai",
auth_token="YOUR_INWORLD_API_KEY", # Sends as Authorization: Bearer
)
message = client.messages.create(
model="inworld/",
max_tokens=1024,
messages=[
{"role": "user", "content": "Explain how neural networks learn."}
]
)
print(message.content[0].text)
```
```typescript Typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
baseURL: 'https://api.inworld.ai',
authToken: 'YOUR_INWORLD_API_KEY', // Sends as Authorization: Bearer
});
const message = await client.messages.create({
model: 'inworld/',
max_tokens: 1024,
messages: [
{ role: 'user', content: 'Explain how neural networks learn.' }
],
});
console.log(message.content[0].text);
```
### Response
The response follows the Anthropic Messages API format:
```json
{
"content": [
{"text": "Neural networks learn through...", "type": "text"}
],
"id": "msg-...",
"model": "anthropic/claude-opus-4-6",
"role": "assistant",
"stop_reason": "end_turn",
"type": "message",
"usage": {
"input_tokens": 12,
"output_tokens": 150
},
"metadata": {
"attempts": [
{"model": "anthropic/claude-opus-4-6", "success": true, "time_to_first_token_ms": 1152}
],
"generation_id": "019b...",
"reasoning": "Using specified model: 'anthropic/claude-opus-4-6' - success"
}
}
```
Inworld Router adds a `metadata` field to the response containing routing information — which model was selected, attempt history, and timing. This field is not part of the standard Anthropic response format, but it does not break Anthropic SDK parsing.
## Next Steps
- [Claude Code Integration](/router/guides/claude-code) to use Inworld Router as your Claude Code backend.
- [Migrating from Anthropic](/router/migration/anthropic-to-inworld) for a step-by-step migration guide.
---
#### Use with Claude Code
Source: https://docs.inworld.ai/router/guides/claude-code
## Overview
[Claude Code](https://docs.anthropic.com/en/docs/claude-code) is Anthropic's agentic coding tool that runs in your terminal. By routing Claude Code through Inworld Router, you get automatic failover, cost optimization, and unified observability — all without changing how you use Claude Code.
## Why Route Claude Code Through Inworld Router?
- **Failover**: If Anthropic is rate-limited or down, Inworld Router automatically fails over to another provider.
- **Cost control**: Use `model: "auto"` with cost-optimized routing to reduce API spend on simpler coding tasks.
- **Observability**: Track all Claude Code requests, token usage, and costs in the Inworld Portal.
- **Multi-model access**: Use `auto` to let Inworld Router pick the best model, or route to OpenAI, Google, or other providers through the same Anthropic-compatible interface.
## Setup
Claude Code uses the Anthropic SDK under the hood. Redirect it to Inworld Router by setting environment variables.
Add these to your shell configuration file (e.g., `~/.zshrc` or `~/.bashrc`):
```bash
export ANTHROPIC_BASE_URL=https://api.inworld.ai
export ANTHROPIC_AUTH_TOKEN=YOUR_INWORLD_API_KEY
```
`ANTHROPIC_AUTH_TOKEN` sends your key as `Authorization: Bearer`, which is how Inworld Router authenticates.
Then reload your shell and run Claude Code:
```bash
source ~/.zshrc
claude
```
All requests from Claude Code will now route through Inworld Router.
### Per-Session Override
For a quick test without changing your global config:
```bash
ANTHROPIC_BASE_URL=https://api.inworld.ai \
ANTHROPIC_AUTH_TOKEN=YOUR_INWORLD_API_KEY \
claude
```
## Using Auto Routing
By default, Claude Code sends requests to a specific Claude model. With Inworld Router, you can override the model to use intelligent routing:
```bash
claude --model auto
```
This lets Inworld Router select the best available model based on your routing configuration (cost, latency, or intelligence priorities).
## Using a Specific Model
You can route to any model supported by Inworld Router:
```bash
# Use a specific Anthropic model
claude --model anthropic/claude-opus-4-6
# Use an OpenAI model through the Anthropic interface
claude --model openai/gpt-5
# Use a Google model
claude --model google-ai-studio/gemini-2.5-flash
```
## Using a Custom Router
If you've created a custom router (e.g., a cost-optimized or failover router), reference it by ID:
```bash
claude --model inworld/failover-system
```
## Comparison: Anthropic Direct vs Inworld Router
| Feature | Anthropic Direct | Via Inworld Router |
| :--- | :--- | :--- |
| **Base URL** | `https://api.anthropic.com` | `https://api.inworld.ai` |
| **Models** | Claude models only | 50+ models across OpenAI, Google, Anthropic, Groq, and more |
| **Auto routing** | Not available | `model: "auto"` with configurable sort criteria (cost, latency, intelligence) |
| **Custom routers** | Not available | `model: "inworld/your-router"` with CEL conditions, variants, weights |
| **Failover** | None — single provider | Automatic failover across providers |
| **Observability** | Anthropic Console | Inworld Portal with routing metadata, attempt history, latency breakdown |
| **Cost optimization** | Fixed pricing | Dynamic routing to cheaper models for simple tasks |
## Verifying the Setup
To confirm Claude Code is routing through Inworld Router, check the Inworld Portal's **Observability** tab after making a request. You should see the request logged with routing metadata, including which model was selected and the latency breakdown.
## Troubleshooting
### "Unauthorized" errors
Ensure `ANTHROPIC_AUTH_TOKEN` is set to your **Inworld API Key**. You can find your API key in the [Inworld Portal](https://platform.inworld.ai) under **Workspace Settings** > **Integrations**.
### Model not found
Ensure you're using a valid model identifier:
- `auto` — Inworld Router selects the best model
- `anthropic/claude-opus-4-6` — Specific model with `provider/model` format
- `inworld/your-router-id` — Custom router
### Connection issues
Verify the base URL is `https://api.inworld.ai` (no trailing slash, no path suffix).
## Next Steps
- [Anthropic Compatibility](/router/anthropic-compatibility) for full API reference and feature matrix.
- [Cost Optimizer](/router/guides/cost-optimizer) to reduce Claude Code API costs.
- [Failover System](/router/guides/failover-system) for high availability.
---
#### Data Integrations
Source: https://docs.inworld.ai/router/data-integrations
Export router exposure data to your analytics platforms to measure the impact of different routes and variants on your KPIs.
Importing data from your analytics platforms is coming soon!
## Mixpanel
Export router exposure data to your existing [Mixpanel](https://mixpanel.com) project for product analytics. When a user is exposed to a router, the exposure is sent as an event to your Mixpanel project, including details on the router, route, and variant, so you can build dashboards and analyze routing performance in Mixpanel.
In Mixpanel, navigate to **Settings > Project Settings**. In the **Overview** tab, find and copy your **Project ID**.
In Project Settings, go to the **Service Accounts** tab. Click **Add Service Account**.
Enter a name for your new service account and click the `+` button to create it. Set the **Project Role** to **Admin** or **Owner**, and set **Expires** to **Never**.
Click **Add** and copy the username and secret — you will not be able to view the secret again.
In [Portal](https://portal.inworld.ai), navigate to **Data & Metrics** and click **Add Integration**, then select **Mixpanel**. Enter the following:
| Field | Value |
| :---- | :---- |
| **Project ID** | Your Mixpanel project ID from Step 1 |
| **Service Account Username** | The service account username from Step 2 |
| **Service Account Secret** | The service account secret from Step 2 |
Toggle **Export to Mixpanel** to on, then click **Enable**.
Now call the [Chat Completions API](/api-reference/routerAPI/chat-completions) with your router, passing the Mixpanel Distinct ID of the user this request is for in the `user` field:
```shell
curl --request POST \
  --url https://api.inworld.ai/v1/chat/completions \
  --header 'Authorization: Basic ' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "inworld/my-router",
    "messages": [{"role": "user", "content": "Hello"}],
    "user": ""
  }'
```
The router exposure data will match the `user` to the Mixpanel Distinct ID when creating events in Mixpanel, so you can tie the specific request to the right user.
The router exposure data will flow to your Mixpanel project automatically (data will sync every ~8 hours). You will see an event for each router, route, and variant that a user was exposed to every hour.
Each event will include:
- `alias_id` - This is your router's id
- `route_id` - This is the route that the user was routed to
- `variant_id` - This is the variant that the user was exposed to
- `Time` - The time of the most recent event in the hourly time bucket in which the user was exposed to this router, route, and variant
The Distinct ID of the event will be what was passed in the `user` field of the request.
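To picture how these pieces fit together, here is a hedged sketch of what one exposure event might look like. Only the `alias_id`, `route_id`, and `variant_id` properties and the Distinct ID mapping come from the description above; the event name and envelope fields are illustrative assumptions, not the exact Mixpanel schema.

```typescript
// Illustrative shape of a router exposure event as it might land in Mixpanel.
// alias_id, route_id, variant_id, and the Distinct ID mapping follow the docs;
// the event name and envelope are assumptions for illustration only.
const exposureEvent = {
  event: 'Route Exposure',
  properties: {
    distinct_id: 'user-123',      // matches the `user` field of your request
    alias_id: 'my-router',        // your router's id
    route_id: 'route-1',          // the route the user was routed to
    variant_id: 'variant-a',      // the variant the user was exposed to
    time: '2025-10-01T03:00:00Z', // most recent event in the hourly bucket
  },
};

console.log(exposureEvent.properties.variant_id);
```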
Now you can use this event exposure data to understand how different routes and variants impact your metrics.
We recommend creating a [Borrowed Event Property](https://docs.mixpanel.com/docs/features/custom-properties#borrowed-properties). To do so, create a Custom Event Property and select Computed > Borrow Property. Select the `Route Exposure` event and `alias_id`, `route_id`, or `variant_id` based on what property you want to analyze your data by.
Now you can use the property in any Breakdown of a report to see how your metrics trend based on the router, route, or variant that the user was exposed to.
## BigQuery
Export router exposure data to your existing [BigQuery](https://cloud.google.com/bigquery) project for product analytics. When a user is exposed to a router, an event is sent to your BigQuery project with details on the router, route, and variant — so you can analyze routing performance directly from BigQuery.
Create a BigQuery dataset in your GCP project to store router data.
Create a service account with BigQuery Data Editor permissions.
In [Portal](https://portal.inworld.ai), navigate to **Data & Metrics** and click **Add Integration**, then select **BigQuery**. Enter the following:
| Field | Value |
| :---- | :---- |
| **Project ID** | Your GCP project ID |
| **Dataset ID** | Your dataset ID from Step 1 |
| **Service Account Email** | The service account email from Step 2 |
Toggle **Export to BigQuery** to on, then click **Enable**.
Now call the [Chat Completions API](/api-reference/routerAPI/chat-completions) with your router, populating the `user` field with a unique user ID corresponding to the user this request is for. The router exposure data will include the `user` in the data export, so you can tie the specific request to the right user.
```shell
curl --request POST \
  --url https://api.inworld.ai/v1/chat/completions \
  --header 'Authorization: Basic ' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "inworld/my-router",
    "messages": [{"role": "user", "content": "Hello"}],
    "user": ""
  }'
```
The router exposure data will flow to your BigQuery project automatically (data will sync every ~8 hours). You will see an event for each router, route, and variant that a user was exposed to every hour.
Each event will include:
- `alias_id` - This is your router's id
- `route_id` - This is the route that the user was routed to
- `variant_id` - This is the variant that the user was exposed to
- `user_id` - The user that was exposed to this router, route, and variant
- `time` - The time of the most recent event in the hourly time bucket in which the user was exposed to this router, route, and variant
You can use this data in conjunction with your existing metrics to understand how your metrics trend based on the router, route, or variant that the user was exposed to.
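To make that last point concrete, the sketch below shows the kind of per-variant analysis you might run once the export lands, joining exposure rows against your own conversion events. The `user_id` and `variant_id` fields come from the export schema above; everything else (the row shapes, the helper name) is hypothetical.

```typescript
// Sketch: compute conversion rate per variant by joining exported exposure
// rows with your own conversion events. Row shapes beyond user_id/variant_id
// are hypothetical examples, not the exact export schema.
interface ExposureRow { user_id: string; variant_id: string }
interface Conversion { user_id: string }

function conversionRateByVariant(
  exposures: ExposureRow[],
  conversions: Conversion[],
): Record<string, number> {
  const converted = new Set(conversions.map((c) => c.user_id));
  const totals: Record<string, { users: Set<string>; hits: number }> = {};
  for (const e of exposures) {
    // Count each user at most once per variant
    const t = (totals[e.variant_id] ??= { users: new Set(), hits: 0 });
    if (!t.users.has(e.user_id)) {
      t.users.add(e.user_id);
      if (converted.has(e.user_id)) t.hits += 1;
    }
  }
  return Object.fromEntries(
    Object.entries(totals).map(([variant, t]) => [variant, t.hits / t.users.size]),
  );
}
```

In practice you would likely run the equivalent join as a SQL query inside BigQuery, but the shape of the computation is the same.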
---
### Migration
#### OpenRouter to Inworld
Source: https://docs.inworld.ai/router/migration/openrouter-to-inworld
If you are already using OpenRouter, migrating to Inworld Router is straightforward. Our API is designed to be a drop-in replacement for OpenAI-compatible endpoints, with additional features for enterprise reliability and cost optimization.
## Key Differences
| Feature | OpenRouter | Inworld Router |
| :--- | :--- | :--- |
| **Authentication** | API Key (Bearer) | API Key & Secret (Basic Auth) |
| **Base URL** | `https://openrouter.ai/api/v1` | `https://api.inworld.ai/v1` |
| **Model Names** | `provider/model` | `provider/model` (e.g., `openai/gpt-5`) |
| **Routing** | Handled via model string | Handled via `model: "auto"` or custom Routers |
## Migration Steps
### 1. Update Authentication
OpenRouter uses a single API key. Inworld uses a Key/Secret pair.
**OpenRouter:**
```bash
Authorization: Bearer $OPENROUTER_API_KEY
```
**Inworld:**
```bash
Authorization: Basic 
```
### 2. Update the Base URL
Change your client configuration to point to our endpoint.
```python
# From:
# base_url="https://openrouter.ai/api/v1"
# To:
base_url="https://api.inworld.ai/v1"
```
### 3. Map Your Models
Inworld supports the same `provider/model` syntax you're used to.
- `openai/gpt-5` → `openai/gpt-5`
- `anthropic/claude-opus-4-6` → `anthropic/claude-opus-4-6`
- `google-ai-studio/gemini-2.5-flash` → `google-ai-studio/gemini-2.5-flash`
### 4. Enable Intelligent Routing
Instead of manually picking models, you can now use our routing engine.
```json
{
  "model": "auto",
  "extra_body": {
    "sort": ["price", "latency"]
  }
}
```
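Putting the steps together, the sketch below assembles a migrated request as a plain object (no network call is made). The base URL, Basic auth scheme, `auto` model, and `sort` preferences follow the steps above; the helper name and placeholder key are our own illustrative assumptions.

```typescript
// Sketch of an OpenRouter-style request after migration to Inworld Router.
// buildChatRequest and BASE64_KEY_SECRET are illustrative placeholders.
function buildChatRequest(apiKey: string, userMessage: string) {
  return {
    url: 'https://api.inworld.ai/v1/chat/completions',
    method: 'POST',
    headers: {
      // Key/Secret pair sent as Basic auth instead of OpenRouter's Bearer token
      Authorization: `Basic ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'auto', // let the routing engine pick a model
      messages: [{ role: 'user', content: userMessage }],
      extra_body: { sort: ['price', 'latency'] },
    }),
  };
}

const req = buildChatRequest('BASE64_KEY_SECRET', 'Hello');
console.log(req.url);
```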
## Why Migrate?
1. **Enterprise Reliability**: Built-in automatic failover across providers.
2. **Cost Optimization**: Dynamic tiering to reduce bills by up to 70%.
3. **Unified Observability**: Detailed logs and performance metrics across all models.
4. **Privacy Controls**: Regional routing and PII detection.
## Support
Need help with a complex migration? [Contact our engineering team](mailto:support@inworld.ai).
---
#### Anthropic to Inworld
Source: https://docs.inworld.ai/router/migration/anthropic-to-inworld
If you are already using the Anthropic API directly, migrating to Inworld Router is a two-line change. Inworld Router supports the Anthropic Messages API natively, so your existing code, message format, and streaming logic all work as-is.
## Key Differences
| Feature | Anthropic Direct | Inworld Router |
| :--- | :--- | :--- |
| **Base URL** | `https://api.anthropic.com` | `https://api.inworld.ai` |
| **Auth** | Anthropic API key | Inworld API Key (`Authorization: Bearer`) |
| **Model Names** | `claude-opus-4-6` | `anthropic/claude-opus-4-6` or `auto` |
| **Routing** | Single provider | Multi-provider with failover, cost optimization |
| **Observability** | Anthropic Console | Inworld Portal (unified across all providers) |
## Migration Steps
### 1. Update Base URL and API Key
**Python:**
```python
import anthropic

client = anthropic.Anthropic(
    # From:
    # (default: https://api.anthropic.com)
    # api_key="sk-ant-..."
    # To:
    base_url="https://api.inworld.ai",
    auth_token="YOUR_INWORLD_API_KEY",  # Sends as Authorization: Bearer
)

# Everything else stays the same
message = client.messages.create(
    model="anthropic/claude-opus-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)
```
**TypeScript:**
```typescript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  // From:
  // (default: https://api.anthropic.com)
  // apiKey: 'sk-ant-...'
  // To:
  baseURL: 'https://api.inworld.ai',
  authToken: 'YOUR_INWORLD_API_KEY', // Sends as Authorization: Bearer
});
```
**cURL:**
```bash
# From:
# curl https://api.anthropic.com/v1/messages -H "x-api-key: sk-ant-..."
# To:
curl --request POST \
  --url https://api.inworld.ai/v1/messages \
  --header 'Authorization: Bearer ' \
  --header 'anthropic-version: 2023-06-01' \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "anthropic/claude-opus-4-6",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
### 2. Update Model Names
Add the `anthropic/` prefix to model names:
- `claude-opus-4-6` → `anthropic/claude-opus-4-6`
- `claude-opus-4-20250514` → `anthropic/claude-opus-4-20250514`
- `claude-3-5-haiku-20241022` → `anthropic/claude-3-5-haiku-20241022`
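Because the mapping is purely mechanical, a small helper can cover it. Below is a hedged sketch; the helper name is ours, and it assumes every direct Anthropic model id simply gains the `anthropic/` prefix, per the examples above.

```typescript
// Map a direct Anthropic model id to its Inworld Router id by adding the
// `anthropic/` prefix. Already-prefixed ids are passed through unchanged.
function toInworldModel(anthropicModel: string): string {
  return anthropicModel.startsWith('anthropic/')
    ? anthropicModel
    : `anthropic/${anthropicModel}`;
}

console.log(toInworldModel('claude-opus-4-6'));
```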
### 3. Enable Intelligent Routing (Optional)
Instead of hardcoding a model, use Inworld Router's intelligent routing:
```python
message = client.messages.create(
    model="auto",  # Inworld Router selects the best model
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Write a Python function to sort a list."}
    ],
)
```
## What Stays the Same
- **Message format**: `messages` array with `role` and `content` — identical.
- **Streaming**: `stream=True` works the same way with Anthropic SSE events (`message_start`, `content_block_delta`, etc.).
- **System messages**: Top-level `system` parameter works identically.
- **Tool use**: Tool definitions, `tool_use` content blocks, and `tool_result` messages work identically.
- **`max_tokens`**: Required parameter, same behavior.
- **`temperature`, `top_p`**: Same behavior.
- **Multi-turn conversations**: Same format, context is preserved correctly.
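As one illustration of the unchanged streaming behavior, the sketch below accumulates text from `content_block_delta` events, one of the Anthropic SSE event types mentioned above. The event shapes are simplified for illustration; real events carry additional fields (indexes, stop reasons, usage) omitted here.

```typescript
// Simplified accumulator for Anthropic-style SSE events. Only the event
// `type` and text delta are modeled; real events carry more fields.
interface SseEvent {
  type: string;
  delta?: { type: string; text?: string };
}

function accumulateText(events: SseEvent[]): string {
  let text = '';
  for (const event of events) {
    // Append text deltas; ignore message_start, message_stop, etc.
    if (event.type === 'content_block_delta' && event.delta?.text) {
      text += event.delta.text;
    }
  }
  return text;
}
```

The same loop works unchanged whether the stream comes from Anthropic directly or from Inworld Router, which is the point of the compatibility layer.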
## What You Gain
1. **Multi-Provider Failover**: If Anthropic is down or rate-limited, Inworld Router automatically retries on OpenAI, Google, or other providers — and still returns the response in the Anthropic format.
2. **Cost Optimization**: Route simple queries to cheaper models, reserve Claude for complex tasks.
3. **Cross-Provider Access**: Use `openai/gpt-5` or `google-ai-studio/gemini-2.5-flash` through the same Anthropic SDK — Inworld Router translates the request and returns an Anthropic-format response.
4. **Unified Observability**: See all your LLM requests — across Anthropic, OpenAI, Google — in one dashboard.
5. **Custom Routing**: Build routers with conditional logic, A/B testing, and weighted variants.
## Environment Variables
If you use environment variables, update them:
```bash
# From:
# export ANTHROPIC_API_KEY=sk-ant-...
# To:
export ANTHROPIC_AUTH_TOKEN=YOUR_INWORLD_API_KEY
export ANTHROPIC_BASE_URL=https://api.inworld.ai
```
## Support
Need help with a complex migration? [Contact our engineering team](mailto:support@inworld.ai).
---
### Resources
#### Billing
Source: https://docs.inworld.ai/router/resources/billing
---
#### Usage
Source: https://docs.inworld.ai/router/resources/usage
---
#### Support
Source: https://docs.inworld.ai/router/resources/support
---
## Agent Runtime
### Node.js
#### Get Started
#### Node.js Agent Runtime
Source: https://docs.inworld.ai/node/overview
## Get Started
Get started with the Node.js Agent Runtime.
Create and chat with an AI character with Agent Runtime.
## Explore templates
Learn how to build a natural realtime voice experience, ready for production use
Learn how to take in multimodal inputs to power an AI companion.
Explore our full suite of available templates to find the best one for your use case.
## Watch it in Action
---
#### Node.js Agent Runtime Quickstart
Source: https://docs.inworld.ai/node/quickstart
This quickstart guide will walk through how to use the [Inworld CLI](/node/cli/overview) to set up a simple LLM to TTS conversational pipeline (powered by Agent Runtime) in just a few minutes.
## Prerequisites
Before you get started, please make sure you have the following installed:
**macOS** (14 or later):
- [Node.js v20 or higher](https://nodejs.org/en/download)
- [npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)
- [graphviz](https://graphviz.org/download/) - Optional. Install this if you want to visualize your graph.

**Linux**:
- [Node.js v20 or higher](https://nodejs.org/en/download)
- [npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)
- `build-essential` package
- `glibc 2.35+` (Ubuntu 22, RHEL 10). Contact [support@inworld.ai](mailto:support@inworld.ai) if you need support for older versions.
- [graphviz](https://graphviz.org/download/) - Optional. Install this if you want to visualize your graph.

**Windows**:
- [Node.js v20 or higher](https://nodejs.org/en/download)
- [npm](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm)
- [graphviz](https://graphviz.org/download/) - Optional. Install this if you want to visualize your graph.
## Get Started
Install the Inworld CLI globally.
```bash
npm install -g @inworld/cli
```
Log in to your Inworld account to use Inworld Agent Runtime. If you don't have an account, you can create one when prompted to login.
```bash
inworld login
# You'll be prompted to login via your browser
```
Once logged in, your credentials are stored and you won’t need to log in again.
The `inworld init` command downloads the `llm-to-tts-node` template—a production-ready LLM to TTS pipeline.
Currently, only the `llm-to-tts-node` template is available via CLI. To view all available templates, visit the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node).
```bash
inworld init --template llm-to-tts-node --name my-project
# Enter 'y' when prompted to install dependencies
```
After the command completes, you'll have a project directory with all dependencies installed.
Navigate to your project directory and run your pipeline with the appropriate inputs.
```bash
cd my-project
inworld run ./graph.ts '{"input": {"user_input":"Hello!"}}'
```
## Run a local server
Now that you've successfully run your first graph, you can run a local server to test it in your application.
Start your local server.
```bash
inworld serve ./graph.ts
```
You can find additional server configuration options (including support for gRPC and Swagger UI) in the [CLI documentation](/node/cli/overview).
Test the API with a simple curl command. Note that for the LLM to TTS pipeline, the API will return raw audio data that needs to be parsed in order to be played.
```bash cURL
curl -X POST http://localhost:3000/v1/graph:start \
  -H "Content-Type: application/json" \
  -d '{"input": {"user_input":"Hello!"}}'
```
Here is an example of the output:
```ansi
{"executionStarted":{"executionId":"01999de9-8a75-75f8-a17b-7ec4c1b4490e","timestamp":"2025-10-01T03:55:52.309Z","variantName":"__default__"}}
{"ttsOutputChunk":{"text":"Hello!","audio":{"data":[0,0,0,0,0,0,0,0,0...],"sampleRate":48000}},"responseNumber":1}
{"ttsOutputChunk":{"text":" How can I assist you today?","audio":{"data":[0,0,0,0,0,0,0,0,0...],"sampleRate":48000}},"responseNumber":1}
{"executionCompleted":true}
```
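Since each line of the output above is a standalone JSON object, client code needs to parse the stream line by line and stitch the audio samples together before playback. A minimal sketch, assuming the `ttsOutputChunk` shape shown in the example output (wrap it around your own fetch or stream handling):

```typescript
// Parse newline-delimited JSON from the graph server and collect all audio
// samples into one PCM buffer. Assumes the ttsOutputChunk shape shown above.
interface TtsChunk {
  ttsOutputChunk?: {
    text: string;
    audio: { data: number[]; sampleRate: number };
  };
}

function collectAudio(ndjson: string): Float32Array {
  const samples: number[] = [];
  for (const line of ndjson.trim().split('\n')) {
    const parsed: TtsChunk = JSON.parse(line);
    // Skip executionStarted / executionCompleted and other non-audio lines
    if (parsed.ttsOutputChunk) {
      samples.push(...parsed.ttsOutputChunk.audio.data);
    }
  }
  return Float32Array.from(samples);
}
```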
## Make your first change
Now let's make our first modification to the LLM to TTS pipeline. Let's change the model and prompt.
Open up the graph.ts file in your project directory, which contains the graph configuration. Modify the `provider` and `modelName` under `RemoteLLMChatNode` to any [supported LLM](/models#llm).
```js graph.ts
import {
  GraphTypes,
  RemoteLLMChatNode,
  RemoteTTSNode,
  SequentialGraphBuilder,
  TextChunkingNode,
} from '@inworld/runtime/graph';

const graphBuilder = new SequentialGraphBuilder({
  id: 'custom-text-node-llm',
  nodes: [
    new RemoteLLMChatNode({
      provider: 'openai', // [!code --:2]
      modelName: 'gpt-4o-mini',
      provider: 'google', // [!code ++:2]
      modelName: 'gemini-2.5-flash',
      stream: true,
    }),
    new TextChunkingNode(),
    new RemoteTTSNode(),
  ],
});

export const graph = graphBuilder.build();

// The graph input should be an LLMChatRequest with messages
// Example input: { messages: [{ role: 'system', content: 'You are an extremely sarcastic assistant. Always respond with sarcasm.' }, { role: 'user', content: 'Hello!' }] }
```
Test your updated graph.
```bash
inworld run ./graph.ts '{"input": {"user_input":"Hello!"}}'
```
## Next Steps
Now that you've learned the basics, explore more advanced features:
Deploy your graphs to a hosted endpoint
Explore other templates to jumpstart your development
## Need Help?
- **General CLI help:** Run `inworld help` or `inworld [command] --help`
- **Setup & development issues?** See [CLI Troubleshooting Guide](/node/cli/troubleshooting)
---
#### Build with Agent Runtime
#### Core Concepts
Source: https://docs.inworld.ai/node/core-concepts/overview
Inworld’s Agent Runtime turns complex AI pipelines into simple, composable graphs you can build, run, and deploy in minutes.
You can interact with the Agent Runtime through two main tools:
- The [**Node.js Agent Runtime SDK**](https://www.npmjs.com/package/@inworld/runtime), which provides APIs for building, executing, and integrating graphs directly in your application code.
- The [**Inworld CLI**](/node/cli/overview), a command-line toolkit to quickly initialize Runtime projects, serve graphs locally, and deploy them to hosted endpoints.
## How it works
At its core, the Agent Runtime executes [**graphs**](/node/core-concepts/graphs), directed networks of [**nodes**](/node/core-concepts/nodes) connected by [**edges**](/node/core-concepts/nodes) that form an executable pipeline. Think of a graph as encapsulating your application’s logic — whether that’s a speech-to-speech conversational pipeline, a routing flow, or a chat experience with memory. Once you create a graph, you can then execute it by passing in inputs to get outputs.
It's super easy to start building with a library of [**built-in nodes**](/node/core-concepts/nodes#built-in-nodes) for common tasks—such as text generation, speech synthesis, and audio processing. You can also create [**custom nodes**](/node/core-concepts/nodes#custom-nodes) that implement any custom logic you wish.
Below is a visualization of a graph for a simple speech-to-speech pipeline that can be used to power a realtime voice conversation.
Once you’ve created a graph, you can:
- **Use it directly** in your Node.js backend via the SDK, or
- [**Deploy it**](/node/cli/deploy) to a hosted endpoint and call it remotely from any client.
## Getting started
The best way to start building with Agent Runtime is by exploring our templates and guides to understand the core concepts and available components.
- Looking to get started quickly? Try the [quickstart](/node/quickstart).
- Want to build something from scratch? Start with the pre-built [Voice Agent](/node/templates/voice-agent) template, or explore other [templates](/node/templates/overview)
- Integrating into an existing project? Build and [deploy](/node/cli/deploy) a graph, and integrate it into your project with a single endpoint.
---
#### Build with Agent Runtime > Templates & Guides
#### Templates
Source: https://docs.inworld.ai/node/templates/overview
Templates and Guides provide pre-built examples and starting points for common use cases with the Inworld Agent Runtime SDK.
## Learning Examples & Code Snippets
Explore focused code examples at **[github.com/inworld-ai/inworld-runtime-templates-node](https://github.com/inworld-ai/inworld-runtime-templates-node)**
Standalone examples demonstrating specific Runtime features: LLM chat, TTS, STT, safety checks, custom nodes, etc.
- **Use when:** Learning Runtime concepts, prototyping workflows, or understanding specific features
- **How:** Clone the repository and explore individual examples
## Guides
**Guides** are step-by-step instructions with focused code snippets that demonstrate specific concepts and show you how to implement particular functionality.
Use templates and guides to jumpstart your development with the Runtime SDK!
---
#### Use an LLM
Source: https://docs.inworld.ai/node/templates/llm
The `node-llm-chat` template illustrates how to make LLM calls using the LLM node with support for streaming, tool calling, and multimodal inputs.
**Architecture**
- **Backend:** Inworld Agent Runtime
- **Frontend:** N/A (CLI example)
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Run the Template
1. Clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node):
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. Install the Runtime SDK inside the `cli` directory.
```shell Yarn
yarn add @inworld/runtime
```
```shell npm
npm install @inworld/runtime
```
3. Set up your Base64 [Runtime API key](/node/authentication) by copying the `.env-sample` file into a `.env` file in the `cli` folder and adding your API key.
```env .env
# Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
```
4. Run a basic example of calling the LLM with a text prompt:
```bash
yarn node-llm-chat "Hello, how are you?" \
  --modelName=gemini-2.5-flash --provider=google
```
5. Now try changing the model and requiring JSON outputs. See [Models > Chat Completion](/models#chat-completion) for supported models.
```bash
yarn node-llm-chat "What is the weather in Vancouver? Return in JSON format" \
  --modelName=gpt-4o --provider=openai --responseFormat=json
```
6. Now let's try with tool calling.
```bash
yarn node-llm-chat "What is 15 + 27?" \
  --modelName=gpt-4o --provider=openai --tools --toolChoice=auto
```
7. Now let's try with image inputs:
```bash
yarn node-llm-chat "What do you see in this image?" \
  --modelName=gpt-4o --provider=openai \
  --imageUrl="https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg"
```
8. Finally, check out your captured traces in [Portal](https://platform.inworld.ai/)!
## Understanding the Template
The main functionality of the template is contained in the run function, which demonstrates how to use the Inworld Agent Runtime to generate text using the LLM node.
Let's break it down into more detail:
### 1) Initialize LLM node
First, we initialize the LLM node
```javascript
import {
  GraphBuilder,
  GraphTypes,
  RemoteLLMChatNode,
} from '@inworld/runtime/graph';

const llmNode = new RemoteLLMChatNode({
  stream,
  provider,
  modelName,
});
```
When creating the node, you can specify:
- **provider**: The LLM service provider (inworld, openai, etc.) as specified [here](/models#chat-completion)
- **modelName**: Any model from [Chat Completion](/models#chat-completion)
- **stream**: Whether to enable streaming responses
- **textGenerationConfig**: LLM generation parameters. You can learn more about these configurations [here](/node/runtime-reference/interfaces/graph_dsl_graph_config_schema.TextGenerationConfig).
For convenience, `createRemoteLLMChatNode` does not require registering the component explicitly before use in the node, but you can also register the component and reference it when creating the node, as shown in the `node_llm_chat_explicit_components.ts` example. This allows you to reuse the same component across multiple nodes (for example, if you want to make multiple LLM calls with the same model).
```javascript
const llmComponent = new RemoteLLMComponent({
  provider,
  modelName,
  defaultConfig: TEXT_CONFIG_SDK,
});

const llmNode = new RemoteLLMChatNode({
  llmComponent,
  stream,
});
```
### 2) Graph initialization
Next we create a new graph and add the LLM node, setting it as the start and end node. In more complex applications, you could connect multiple LLM nodes to create a processing pipeline.
```javascript
const graph = new GraphBuilder({
  id: 'node_llm_chat_graph',
  enableRemoteConfig: false,
  apiKey,
})
  .addNode(llmNode)
  .setStartNode(llmNode)
  .setEndNode(llmNode)
  .build();
```
The [GraphBuilder](/node/runtime-reference/classes/graph_dsl_graph_builder.GraphBuilder) configuration includes:
- **id**: A unique identifier for the graph
- **enableRemoteConfig**: Whether to enable remote configuration (set to false for local execution)
- **apiKey**: Your Inworld API key
### 3) Graph input preparation
Now we create the message to send to the LLM based on the user's input.
```javascript
let graphInput;

if (tools) {
  graphInput = createMessagesWithTools(
    prompt,
    toolChoice,
    imageUrl,
    toolCallHistory,
  );
} else {
  graphInput = createMessages(prompt, imageUrl, toolCallHistory);
}

if (responseFormat) {
  graphInput.responseFormat = responseFormat;
}
```
Below is an example of what different inputs might look like:
```javascript
// Basic text message
const basicInput = {
  messages: [
    {
      role: 'system',
      content:
        'You are a helpful assistant that can use tools when needed. When analyzing images, describe what you see and use appropriate tools if calculations or weather information is needed.',
    },
    {
      role: 'user',
      content: prompt,
    },
  ],
};

// Multimodal message with image
const multimodalInput = {
  messages: [
    {
      role: 'user',
      content: [
        {
          type: 'text',
          text: prompt,
        },
        {
          type: 'image',
          image_url: {
            url: imageUrl,
            detail: 'high',
          },
        },
      ],
    },
  ],
};

// Messages with tool support
const toolInput = {
  messages: [...],
  tools: [
    {
      name: 'calculator',
      description: 'Evaluate a mathematical expression',
      properties: {
        type: 'object',
        properties: {
          expression: {
            type: 'string',
            description: 'The mathematical expression to evaluate',
          },
        },
        required: ['expression'],
      },
    },
    {
      name: 'get_weather',
      description: 'Get the current weather in a location',
      properties: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'The city and state, e.g., San Francisco, CA',
          },
        },
        required: ['location'],
      },
    },
  ],
  toolChoice: {
    choice: 'auto', // or 'required', 'none', or specific function name
  },
};
```
### 4) Graph execution
Execute the graph with the prepared input:
```javascript
const { outputStream } = await graph.start(new GraphTypes.LLMChatRequest(graphInput));
```
### 5) Response handling
Handle responses, including streaming responses.
```javascript
for await (const result of outputStream) {
  await result.processResponse({
    Content: (response: GraphTypes.Content) => {
      console.log('📥 LLM Chat Response:');
      console.log('  Content:', response.content);

      // Handle tool calls if present
      if (response.toolCalls && response.toolCalls.length > 0) {
        console.log('  Tool Calls:');
        response.toolCalls.forEach((toolCall, index) => {
          console.log(`    ${index + 1}. ${toolCall.name}(${toolCall.args})`);
          console.log(`       ID: ${toolCall.id}`);
        });
      }
    },
    ContentStream: async (stream: GraphTypes.ContentStream) => {
      console.log('📡 LLM Chat Response Stream:');
      let streamContent = '';
      const toolCalls: { [id: string]: any } = {};
      let chunkCount = 0;

      for await (const chunk of stream) {
        chunkCount++;
        if (chunk.text) {
          streamContent += chunk.text;
          process.stdout.write(chunk.text);
        }

        // Accumulate tool calls from stream
        if (chunk.toolCalls && chunk.toolCalls.length > 0) {
          for (const toolCall of chunk.toolCalls) {
            if (toolCalls[toolCall.id]) {
              toolCalls[toolCall.id].args += toolCall.args;
            } else {
              toolCalls[toolCall.id] = { ...toolCall };
            }
          }
        }
      }

      console.log(`\nTotal chunks: ${chunkCount}`);
      console.log(`Final content length: ${streamContent.length} characters`);

      const finalToolCalls = Object.values(toolCalls);
      if (finalToolCalls.length > 0) {
        console.log('Tool Calls from Stream:');
        finalToolCalls.forEach((toolCall, index) => {
          console.log(`  ${index + 1}. ${toolCall.name}(${toolCall.args})`);
          console.log(`     ID: ${toolCall.id}`);
        });
      }
    },
    default: (data: any) => {
      console.error('Unprocessed response:', data);
    },
  });
}
```
---
#### Convert Text-to-Speech (TTS)
Source: https://docs.inworld.ai/node/templates/tts
The `node-tts` template illustrates how to convert text-to-speech using the TTS node.
**Architecture**
- **Backend:** Inworld Agent Runtime
- **Frontend:** N/A (CLI example)
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Run the Template
1. Clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node):
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. Install the Runtime SDK inside the `cli` directory.
```shell Yarn
yarn add @inworld/runtime
```
```shell npm
npm install @inworld/runtime
```
3. Set up your Base64 [Runtime API key](/node/authentication) by copying the `.env-sample` file into a `.env` file in the `cli` folder and adding your API key.
```env .env
# Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
```
4. Try a different [model](/models#tts) or [voice](/api-reference/ttsAPI/texttospeech/list-voices)! You can specify the model using the `--modelId` parameter and a voice using the `--voiceName` parameter:
```bash
yarn node-tts "Hello, how are you?" --modelId=inworld-tts-1.5-max --voiceName=Ronald
```
## Understanding the Template
The main functionality of the template is contained in the run function, which demonstrates how to use the Inworld Agent Runtime to convert text-to-speech using the TTS node.
Now let's break down the template into more detail:
### 1) Node Initialization
We start by creating the TTS node.
```javascript
const ttsNode = new RemoteTTSNode({
  id: 'tts_node',
  speakerId: voiceName,
  modelId,
  sampleRate: SAMPLE_RATE,
  temperature: 1.0,
  speakingRate: 1,
});
```
When creating the TTS node, you can specify:
- **id**: A unique identifier for the node
- **speakerId**: The voice to use for synthesis (see [available voices](/api-reference/ttsAPI/texttospeech/list-voices))
- **modelId**: The [TTS model](/models#tts) to use for synthesis
- **sampleRate**: Audio output sample rate
- **temperature**: Controls randomness in synthesis
- **speakingRate**: Controls the speed of speech (1.0 is the voice's natural speed)
### 2) Graph initialization
Next, we create the graph using the GraphBuilder, adding the TTS node and setting it as both start and end node:
```javascript
const graph = new GraphBuilder({
  id: 'node_tts_graph',
  apiKey,
  enableRemoteConfig: false,
})
  .addNode(ttsNode)
  .setStartNode(ttsNode)
  .setEndNode(ttsNode)
  .build();
```
The [GraphBuilder](/node/runtime-reference/classes/graph_dsl_graph_builder.GraphBuilder) configuration includes:
- **id**: A unique identifier for the graph
- **apiKey**: Your Inworld API key for authentication
- **enableRemoteConfig**: Whether to enable remote configuration (set to false for local execution)
In this example, we have a single TTS node set as both the start and end node. In more complex applications, you could connect other nodes, like an LLM node, to the TTS node to create a processing pipeline.
### 3) Graph execution
Now we execute the graph with the text input directly:
```javascript
const { outputStream } = await graph.start(text);
```
The text input is passed directly to the graph, which will be processed by the TTS node.
### 4) Response handling
The audio generation results are handled using the `processResponse` method, which supports streaming audio responses:
```javascript
let initialText = '';
let resultCount = 0;
let allAudioData: number[] = [];

for await (const result of outputStream) {
  await result.processResponse({
    TTSOutputStream: async (ttsStream: GraphTypes.TTSOutputStream) => {
      for await (const chunk of ttsStream) {
        if (chunk.text) initialText += chunk.text;
        if (chunk.audio?.data) {
          allAudioData = allAudioData.concat(Array.from(chunk.audio.data));
        }
        resultCount++;
      }
    },
  });
}

console.log(`Result count: ${resultCount}`);
console.log(`Initial text: ${initialText}`);
```
The response handler processes:
- **TTSOutputStream**: Streaming audio responses containing both text and audio data
- **chunk.text**: The text being synthesized
- **chunk.audio.data**: The audio data as Float32Array samples
### 5) Audio file creation
Then, we encode the audio data and save it as a WAV file:
```javascript
const audio = {
  sampleRate: SAMPLE_RATE,
  channelData: [new Float32Array(allAudioData)],
};
const buffer = await wavEncoder.encode(audio);

if (!fs.existsSync(OUTPUT_DIRECTORY)) {
  fs.mkdirSync(OUTPUT_DIRECTORY, { recursive: true });
}

fs.writeFileSync(OUTPUT_PATH, Buffer.from(buffer));
console.log(`Audio saved to ${OUTPUT_PATH}`);
```
---
#### Convert Speech-to-Text (STT)
Source: https://docs.inworld.ai/node/templates/stt
The `node-stt` template illustrates how to convert speech-to-text using the STT (Speech-to-Text) node.
**Architecture**
- **Backend:** Inworld Agent Runtime
- **Frontend:** N/A (CLI example)
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Run the Template
1. Clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node):
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. Install the Runtime SDK inside the `cli` directory.
```shell Yarn
yarn add @inworld/runtime
```
```shell npm
npm install @inworld/runtime
```
3. Set up your Base64 [Runtime API key](/node/authentication) by copying the `.env-sample` file into a `.env` file in the `cli` folder and adding your API key.
```env .env
# Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
```
4. Run this code in your console, providing a WAV audio file:
```bash
yarn node-stt --audioFilePath=path/to/your/audio.wav
```
## Understanding the Template
The main functionality of the template is contained in the `run` function, which demonstrates how to use the Inworld Agent Runtime to convert speech to text using the STT node.
Let's break it down into more detail:
### 1) Audio input preparation
First, we read and decode the WAV audio file to prepare it for processing:
```javascript
const { audioFilePath, apiKey } = parseArgs();
const audioData = await WavDecoder.decode(fs.readFileSync(audioFilePath));
```
### 2) Node Initialization
Then, we create the STT node:
```javascript
const sttNode = new RemoteSTTNode();
```
### 3) Graph initialization
Next, we create the graph using the GraphBuilder, adding the STT node and setting it as both start and end node:
```javascript
const graph = new GraphBuilder({
id: 'node_stt_graph',
apiKey,
enableRemoteConfig: false,
})
.addNode(sttNode)
.setStartNode(sttNode)
.setEndNode(sttNode)
.build();
```
The [GraphBuilder](/node/runtime-reference/classes/graph_dsl_graph_builder.GraphBuilder) configuration includes:
- **id**: A unique identifier for the graph
- **apiKey**: Your Inworld API key for authentication
- **enableRemoteConfig**: Whether to enable remote configuration (set to false for local execution)
In this example, we only have a single STT node, setting it as the start and end node. In more complex applications, you could connect multiple nodes to create a processing pipeline.
### 4) Graph execution
Now we execute the graph with the audio data directly as an input object.
```javascript
const { outputStream } = await graph.start(
new GraphTypes.Audio({
data: Array.from(audioData.channelData[0] || []),
sampleRate: audioData.sampleRate,
}),
);
```
The audio input is wrapped in a `GraphTypes.Audio` object that contains:
- **data**: The audio channel data converted to an array
- **sampleRate**: The sample rate of the audio file
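The example reads only `channelData[0]`, so a stereo recording would be transcribed from its left channel alone. If you want to keep the signal from all channels, you can average them to mono before wrapping; a sketch (assumes equal-length channels, as WAV decoders produce):

```typescript
// Average any number of equal-length channels into one mono channel.
function downmixToMono(channelData: Float32Array[]): Float32Array {
  const [first, ...rest] = channelData;
  const mono = Float32Array.from(first);
  for (const channel of rest) {
    for (let i = 0; i < mono.length; i++) mono[i] += channel[i];
  }
  if (channelData.length > 1) {
    for (let i = 0; i < mono.length; i++) mono[i] /= channelData.length;
  }
  return mono;
}
```

With this helper, `data: Array.from(downmixToMono(audioData.channelData))` replaces the `channelData[0]` access.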
### 5) Response handling
The transcription results are handled using the `processResponse` method, which supports both streaming and non-streaming text responses:
```javascript
let result = '';
let resultCount = 0;
for await (const resp of outputStream) {
await resp.processResponse({
string: (text: string) => {
result += text;
resultCount++;
},
TextStream: async (textStream: any) => {
for await (const chunk of textStream) {
if (chunk.text) {
result += chunk.text;
resultCount++;
}
}
},
default: (data: any) => {
if (typeof data === 'string') {
result += data;
resultCount++;
} else {
console.log('Unprocessed response:', data);
}
},
});
}
console.log(`Result count: ${resultCount}`);
console.log(`Result: ${result}`);
```
The response handler supports multiple response types:
- **string**: Direct string responses containing transcribed text
- **TextStream**: Streaming text responses for real-time transcription
- **default**: Fallback handler for any other response types
---
#### Common Expression Language
Source: https://docs.inworld.ai/node/templates/cel
## Overview
CEL (Common Expression Language) enables dynamic decision-making in your Inworld Agent Runtime graphs. Use CEL expressions to control when edges fire, creating intelligent routing based on data content, user properties, confidence scores, and more.
## Basic Syntax
CEL expressions are used in the `conditionExpression` parameter when adding edges to your graph:
```ts
.addEdge(sourceNode, targetNode, {
conditionExpression: 'input.confidence > 0.8'
})
```
## Understanding the Input Object
In CEL expressions, `input` refers to the data output from the source node that the edge is connected to. When an edge is evaluated, CEL receives the previous node's output as the `input` object.
#### Example flow
```ts
// UserAuthNode outputs: { user: { id: "123", tier: "premium" }, isAuthenticated: true }
// The edge condition receives this as 'input'
.addEdge(userAuthNode, premiumFeatureNode, {
conditionExpression: 'input.user.tier == "premium" && input.isAuthenticated == true'
})
```
#### Key Points
- `input` \= output data from the source node of the edge
- Structure varies based on what the source node produces
- Always verify the source node's output format before writing CEL expressions
## The Data Store Object
In addition to `input`, CEL expressions also have access to a `data_store` object that provides persistent storage across graph execution. The data store allows you to share state between different nodes and maintain context throughout the graph's lifecycle.
### Data Store Operations
#### Checking if a Variable Exists
```ts
data_store.contains('variable_name')
```
#### Getting a Variable Value
```ts
data_store.get('variable_name')
```
**Note:** Data store values are stored as objects with a `text` property, so you need to access `.text` to get the actual value:
```ts
data_store.get('my_bool_var').text == 'true'
```
### Data Store vs Input
| Aspect | `input` | `data_store` |
| :---- | :---- | :---- |
| **Scope** | Current edge only | Entire graph execution |
| **Source** | Previous node's output | Persistent storage across nodes |
| **Lifecycle** | Per edge evaluation | Graph lifetime |
| **Use Case** | Node-to-node data flow | Cross-node state management |
## Data Type Limitations
**Important:** The Inworld Agent Runtime only supports simple data types in node inputs and outputs. Complex JavaScript objects and class instances are not supported.
### Supported Types:
- Primitives: `string`, `number`, `boolean`
- Arrays of primitives: `string[]`, `number[]`, `boolean[]`
- Plain objects with primitive properties: `{ name: string, age: number }`
- Nested plain objects: `{ user: { id: string, tier: string } }`
### Unsupported Types:
- Class instances (e.g., `new Date()`, custom classes)
- Functions
- Symbols
- Complex objects with methods
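If a node's output does contain such values, one option is to flatten them into plain objects before they enter the graph. Below is a sketch of a hypothetical `sanitize` helper (not part of the SDK) that converts `Date`s to ISO strings and drops functions and symbols:

```typescript
// Recursively convert a value into graph-safe primitives and plain
// objects: Dates become ISO strings; functions and symbols are dropped.
function sanitize(value: unknown): unknown {
  if (value instanceof Date) return value.toISOString();
  if (Array.isArray(value)) return value.map(sanitize);
  if (typeof value === 'function' || typeof value === 'symbol') return undefined;
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      const safe = sanitize(inner);
      if (safe !== undefined) out[key] = safe;
    }
    return out;
  }
  return value; // string | number | boolean | null | undefined
}
```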
## 1. Fundamental Operations
### Boolean Comparisons
#### Equality Checks
**Expression:** `input.is_safe == true`
**Use Case:** Safety Gate \- Route conversation through safety checker before proceeding to main chat flow. If content is flagged as unsafe, redirect to the safety response node.
```ts
.addEdge(safetyNode, mainChatNode, {
conditionExpression: 'input.is_safe == true'
})
.addEdge(safetyNode, safetyResponseNode, {
conditionExpression: 'input.is_safe == false'
})
```
**Expression:** `input.status == "approved"`
**Use Case:** Content Moderation \- Only allow approved content to proceed to LLM generation. Rejected content gets routed to the moderation review queue.
#### Inequality Checks
**Expression:** `input.user_tier != "banned"`
**Use Case:** User Access Control \- Block banned users from accessing premium features. Route them to account restriction notice instead.
**Expression:** `input.language != "en"`
**Use Case:** Multi-language Routing \- Route non-English inputs to translation service first, then proceed to the main processing pipeline.
### Numeric Comparisons
#### Greater Than / Less Than
**Expression:** `input.confidence_score > 0.8`
**Use Case:** High-Confidence Intent Routing \- Only route to specialized handlers if confidence is high enough. Low confidence goes to the general fallback handler.
```ts
.addEdge(intentNode, specializedHandler, {
conditionExpression: 'input.confidence_score > 0.8'
})
.addEdge(intentNode, fallbackHandler, {
conditionExpression: 'input.confidence_score <= 0.8'
})
```
**Expression:** `input.token_count < 100`
**Use Case:** Token Limit Enforcement \- Short queries go to fast processing pipeline. Long queries get chunked or use different LLM settings.
#### Greater/Less Than or Equal
**Expression:** `input.user_credits >= 10`
**Use Case:** Premium Feature Access \- Users with sufficient credits access premium LLM models. Others get routed to standard models.
**Expression:** `input.response_time_ms <= 5000`
**Use Case:** Performance SLA Monitoring \- Fast responses proceed normally. Slow responses trigger performance alerts and fallback routes.
### String Operations
#### String Contains/StartsWith/EndsWith
**Expression:** `input.message.startsWith("Hello")`
**Use Case:** Greeting Detection \- Route greeting messages to warm welcome flow. Other messages go to standard conversation flow.
**Expression:** `input.email.endsWith("@company.com")`
**Use Case:** Internal User Detection \- Company employees get access to internal features. External users follow standard customer journey.
**Expression:** `input.query.contains("urgent")`
**Use Case:** Priority Request Handling \- Urgent requests bypass normal queue and go to fast-track processing. Regular requests follow a standard processing timeline.
### Boolean Logic
#### AND Operations
**Expression:** `input.verified == true && input.age >= 18`
**Use Case:** Age-Gated Content Access \- Only verified adult users can access mature content. Others get redirected to age-appropriate alternatives.
**Expression:** `input.is_premium && input.feature_enabled`
**Use Case:** Feature Flag \+ Subscription Check \- Premium users with beta features enabled get new functionality. Others continue with the standard feature set.
#### OR Operations
**Expression:** `input.role == "admin" || input.role == "moderator"`
**Use Case:** Administrative Access \- Admins and moderators get access to management tools. Regular users follow standard user flow.
**Expression:** `input.emergency == true || input.priority == "high"`
**Use Case:** Emergency Response Routing \- Emergency or high-priority requests bypass normal processing. Route directly to rapid response handlers.
#### NOT Operations
**Expression:** `!input.maintenance_mode`
**Use Case:** Maintenance Mode Check \- During maintenance, route to "service unavailable" message. Normal operations proceed when not in maintenance.
## 2. Data Structure Navigation
### Object Property Access
#### Simple Property Access
**Expression:** `input.user.tier`
**Use Case:** User Tier Routing \- Route based on subscription level: free, premium, enterprise. Each tier gets different LLM models and response limits.
```ts
.addEdge(inputNode, premiumLLMNode, {
conditionExpression: 'input.user.tier == "premium"'
})
.addEdge(inputNode, standardLLMNode, {
conditionExpression: 'input.user.tier == "free"'
})
```
**Expression:** `input.session.language`
**Use Case:** Language Preference \- Route to language-specific LLM models and response templates. Maintain conversation context in user's preferred language.
#### Nested Property Access
**Expression:** `input.user.profile.preferences.voice_enabled`
**Use Case:** Voice Feature Toggle \- Users who enabled voice in their profile settings get TTS output. Others receive text-only responses.
**Expression:** `input.conversation.metadata.sentiment.score`
**Use Case:** Sentiment-Based Response Adjustment \- Negative sentiment routes to empathetic response templates. Positive sentiment continues with standard conversation flow.
### Array Operations
#### Array Size
**Expression:** `size(input.intent_matches) > 0`
**Use Case:** Intent Detection Success \- If intents were detected, route to intent-specific handlers. No intents detected routes to general conversation flow.
```ts
.addEdge(intentDetectionNode, intentHandlerNode, {
conditionExpression: 'size(input.intent_matches) > 0'
})
.addEdge(intentDetectionNode, generalChatNode, {
conditionExpression: 'size(input.intent_matches) == 0'
})
```
**Expression:** `size(input.knowledge_results) >= 3`
**Use Case:** Knowledge Base Confidence \- Multiple knowledge matches indicate high-confidence answers. Route to direct answer generation vs. "I'm not sure" response.
#### Array Element Access
**Expression:** `input.intents[0].name == "booking"`
**Use Case:** Primary Intent Classification \- Top-ranked intent determines conversation flow. Booking intents route to reservation system integration.
**Expression:** `input.search_results[0].score > 0.9`
**Use Case:** High-Confidence Search Results \- Highly relevant results get featured answer treatment. Lower confidence results trigger clarification questions.
#### Array Contains/Membership
**Expression:** `"admin" in input.user.roles`
**Use Case:** Role-Based Access Control \- Users with admin role get access to system management features. Regular users are limited to standard functionality.
**Expression:** `input.detected_language in ["en", "es", "fr"]`
**Use Case:** Supported Language Check \- Supported languages proceed with native processing. Unsupported languages route to translation service first.
### Complex Data Filtering
#### Filter with Conditions
**Expression:** `size(input.intent_matches.filter(x, x.score >= 0.8)) > 0`
**Use Case:** High-Confidence Intent Filtering \- Only consider intents with high confidence scores. Route to specialized handlers vs. general conversation.
```ts
.addEdge(intentNode, highConfidenceHandler, {
conditionExpression: 'size(input.intent_matches.filter(x, x.score >= 0.8)) > 0'
})
.addEdge(intentNode, lowConfidenceHandler, {
conditionExpression: 'size(input.intent_matches.filter(x, x.score >= 0.8)) == 0'
})
```
**Expression:** `size(input.knowledge_records.filter(r, r.verified == true)) >= 2`
**Use Case:** Verified Knowledge Validation \- Only use verified knowledge sources for answers. Insufficient verified sources trigger "needs verification" flow.
#### Complex Object Filtering
**Expression:** `input.messages.filter(m, m.role == "user" && m.timestamp > input.last_hour)`
**Use Case:** Recent User Message Analysis \- Analyze only recent user messages for context. Ignore older conversation history for focused responses.
**Expression:** `input.messages.filter(m, m.role == "user" && m.timestamp.seconds > input.current_time.seconds)`
**Use Case:** Recent User Message Analysis \- Analyze only messages sent after session start. Requires timestamp objects with seconds/nanos fields.
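The `seconds`/`nanos` shape referenced above can be produced from a JavaScript `Date` as a plain object. A sketch following the protobuf `Timestamp` convention (adjust to whatever shape your nodes actually emit):

```typescript
// Convert a Date into a plain { seconds, nanos } object so CEL
// expressions like `m.timestamp.seconds > input.current_time.seconds`
// can compare it. Plain objects are graph-safe; Date instances are not.
function toTimestamp(date: Date): { seconds: number; nanos: number } {
  const ms = date.getTime();
  const seconds = Math.floor(ms / 1000);
  return { seconds, nanos: (ms - seconds * 1000) * 1_000_000 };
}
```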
**Expression:** `input.features.filter(f, f.enabled == true && f.beta == false)`
**Use Case:** Stable Feature Selection \- Only route to production-ready features. Beta features require special opt-in handling.
### Existence Checks
#### Property Existence
**Expression:** `has(input.user.subscription)`
**Use Case:** Subscription Status Check \- Subscribed users get premium features. Non-subscribers see upgrade prompts and limited functionality.
**Expression:** `has(input.conversation.context)`
**Use Case:** Conversation Context Availability \- Continue existing conversation if context exists. Start a fresh conversation flow if there is no prior context.
#### Complex Existence Patterns
**Expression:** `has(input.user.preferences) && has(input.user.preferences.notifications)`
**Use Case:** Notification Preference Check \- Users with notification preferences get personalized alerts. Others receive default notification settings.
**Expression:** `has(input.session.auth_token) && input.session.auth_token != ""`
**Use Case:** Authentication Validation \- Authenticated users access full functionality. Unauthenticated users limited to public features only.
## Production Examples from Character Engine
### Safety Pipeline Routing
```ts
// Route based on safety check results
.addEdge(safetySubgraphNode, safetyResultTransformNode, {
conditionExpression: 'input.is_safe == true'
})
.addEdge(safetySubgraphNode, safetyResponseNode, {
conditionExpression: 'input.is_safe == false'
})
```
### Intent Confidence Thresholding
```ts
// High confidence intent matches
.addEdge(strictMatchNode, topNFilterNode, {
conditionExpression: 'size(input.intent_matches) >= 1',
optional: true
})
// Low confidence fallback to LLM
.addEdge(intentEmbeddingMatchesAggregatorNode, llmPromptVarsNode, {
conditionExpression: 'size(input.intent_matches.filter(x, x.score >= 0.88)) < 1'})
```
### Behavior Action Discrimination
```ts
// Different behavior types
.addEdge(behaviorProducerNode, dialogPromptVariablesNode, {
conditionExpression: 'input.type == "SAY_INSTRUCTED"'
})
.addEdge(behaviorProducerNode, behaviorActionTransformNode, {
conditionExpression: 'input.type == "SAY_VERBATIM"'
})
```
### Message Validation
```ts
// Ensure messages exist before processing
.addEdge(flashMemoryPromptBuilder, flashMemoryLLMChat, {
conditionExpression: 'size(input.messages) > 0'
})
```
## Best Practices
1. **Keep expressions simple and readable** \- Complex logic should be broken into multiple edges when possible
2. **Use meaningful variable names** \- `input.user.tier` is clearer than `input.t`
3. **Handle edge cases** \- Always provide fallback routes for when conditions aren't met
4. **Test thoroughly** \- Verify your expressions work with different input data types
5. **Performance considerations** \- Avoid expensive operations in frequently evaluated expressions
6. **Understand your data flow** \- Always know what structure the source node outputs before writing CEL expressions
7. **Document expected input formats** \- Comment your edge conditions with expected input structure for maintainability
8. **Use only simple data types** \- Ensure all node outputs use primitives and plain objects, not class instances
9. **Convert complex types** \- Transform timestamps, custom classes, and complex objects to plain object representations
10. **Validate data serialization** \- Test that your data can be JSON.stringify'd without loss of information
11. **Always check data\_store.contains() first** \- Prevent runtime errors by verifying variables exist before accessing them
12. **Remember the .text property** \- Data store values are objects; always access .text to get the actual stored value
13. **Use data\_store for cross-node communication** \- When you need to share state between non-adjacent nodes, use data\_store instead of passing through intermediate nodes
## Common Patterns
### Confidence-Based Routing
```ts
// High confidence → specialized handler
// Medium confidence → general handler
// Low confidence → fallback handler
conditionExpression: 'input.confidence >= 0.9' // High
conditionExpression: 'input.confidence >= 0.7 && input.confidence < 0.9' // Medium
conditionExpression: 'input.confidence < 0.7' // Low
```
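These three bands partition the confidence range, so exactly one edge fires for any value. The same logic as a plain function, handy for unit-testing your thresholds outside the graph (a hypothetical helper, not SDK code):

```typescript
// Host-side mirror of the three edge conditions above. The bands are
// mutually exclusive and exhaustive, so exactly one branch matches.
function routeByConfidence(confidence: number): 'high' | 'medium' | 'low' {
  if (confidence >= 0.9) return 'high';
  if (confidence >= 0.7) return 'medium';
  return 'low';
}
```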
### Multi-Condition Validation
```ts
// Ensure all requirements are met
conditionExpression: 'has(input.user) && input.user.verified && input.user.credits > 0'
```
### Array Processing
```ts
// Check if any items meet criteria
conditionExpression: 'size(input.items.filter(x, x.active)) > 0'
// Check if all items meet criteria
conditionExpression: 'size(input.items.filter(x, x.active)) == size(input.items)'
```
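The "all items" check works because a filter that keeps every element has the same size as the source array. The two patterns, mirrored as plain functions for clarity (illustrative only; in the graph they stay as CEL strings):

```typescript
type Item = { active: boolean };

// "Any": at least one element passes the filter.
const anyActive = (items: Item[]): boolean =>
  items.filter((x) => x.active).length > 0;

// "All": the filtered array is as long as the source array.
// Note this is vacuously true for an empty array.
const allActive = (items: Item[]): boolean =>
  items.filter((x) => x.active).length === items.length;
```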
### Data Store State Management
```ts
// Safe data store access pattern
conditionExpression: "data_store.contains('flag_name') && data_store.get('flag_name').text == 'expected_value'"
// Multiple data store conditions
conditionExpression: "data_store.contains('var1') && data_store.contains('var2') && data_store.get('var1').text == 'true' && data_store.get('var2').text != 'disabled'"
```
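Because these guarded conditions are verbose and easy to mistype, you can generate them from a small host-side helper (hypothetical; it simply builds the expression string shown above):

```typescript
// Build a guarded CEL condition: check existence first, then compare
// the stored value's .text property, mirroring the safe pattern above.
function dataStoreEquals(name: string, expected: string): string {
  return `data_store.contains('${name}') && data_store.get('${name}').text == '${expected}'`;
}
```

Usage: `conditionExpression: dataStoreEquals('flag_name', 'expected_value')`.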
## Debugging Tips
1. **Use simple expressions first** \- Start with basic comparisons and add complexity gradually
2. **Check data types** \- Ensure your input data structure matches your CEL expression expectations
3. **Test with sample data** \- Create test cases with known input values to verify expression behavior
4. **Use parentheses** \- Make complex boolean logic explicit with parentheses: `(A && B) || (C && D)`
5. **Inspect actual input data** \- Log or debug the actual `input` object structure before writing complex expressions
6. **Validate source node outputs** \- Ensure source nodes produce the expected data structure that your CEL expressions assume
7. **Check for unsupported types** \- If CEL expressions fail unexpectedly, verify that input data contains only supported primitive types
8. **Use JSON.stringify for validation** \- Test your node outputs with `JSON.stringify()` \- if it fails or loses data, the types aren't supported
9. **Debug data store state** \- Log data\_store contents to understand what variables are available and their current values
10. **Test data store persistence** \- Verify that data store variables persist correctly across different graph execution paths
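Tips 7 and 8 can be combined into one check: a value is graph-safe if it survives a JSON round-trip unchanged. A sketch (uses Node's `deepStrictEqual` so that lossy conversions, such as a `Date` turning into a string, are caught):

```typescript
import { deepStrictEqual } from 'node:assert';

// Returns true when a value survives JSON.parse(JSON.stringify(...))
// without loss, a quick proxy for "uses only supported simple types".
function isGraphSafe(value: unknown): boolean {
  try {
    deepStrictEqual(JSON.parse(JSON.stringify(value)), value);
    return true;
  } catch {
    return false;
  }
}
```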
---
#### Build a Custom Node with Multiple Inputs
Source: https://docs.inworld.ai/node/templates/multi-input
This template demonstrates how to use Custom Nodes with more than one input in a graph. This is possible because you can pass an arbitrary number of inputs to a Custom Node's `process()` method.
In our example, we'll ask two LLMs to write a poem, then pass both outputs to a third LLM to choose which poem is better. The resulting graph fans out from the prompt node to the two poem-writing LLM nodes, and their outputs join at the review node.
**Architecture**
- **Backend:** Inworld Agent Runtime
- **Frontend:** N/A (CLI example)
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Getting Started
1. Clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node):
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. In that project folder, run `npm install` to download the Inworld Agent Runtime and other dependencies
3. Create an .env file and add your Base64 [Runtime API key](/node/authentication):
```env .env
INWORLD_API_KEY=
```
4. Create a new file called `multi-input.ts` and paste in the following code:
```typescript multi-input.ts
import 'dotenv/config';
import {
CustomNode,
GraphBuilder,
GraphTypes,
ProcessContext,
RemoteLLMChatNode
} from '@inworld/runtime/graph';
let poemPrompt = "Return ONLY a limerick about: "
let reviewPrompt = `Review these two poems and analyze which one is better.`
// Define a custom node which turns a prompt into messages for an LLM
class PoemPromptNode extends CustomNode {
process(context: ProcessContext, input: string): GraphTypes.LLMChatRequest {
let composedPrompt = poemPrompt + input
return new GraphTypes.LLMChatRequest({
messages: [
{
role: 'user',
content: composedPrompt,
},
]
});
}
}
class ReviewPromptNode extends CustomNode {
process(context: ProcessContext, poem1: GraphTypes.Content, poem2: GraphTypes.Content): GraphTypes.LLMChatRequest {
let composedPrompt = `${reviewPrompt}\n\nPoem 1:\n\n${poem1.content}\n\nPoem 2:\n\n${poem2.content}`
return new GraphTypes.LLMChatRequest({
messages: [
{
role: 'user',
content: composedPrompt,
},
]
});
}
}
let reviewPromptNode = new ReviewPromptNode({
id: 'review-prompt-node',
});
let poemPromptNode = new PoemPromptNode({
id: 'poem-prompt-node',
});
let openaiLLMNode = new RemoteLLMChatNode({
id: 'openai-llm-node',
modelName: 'gpt-4o-mini',
provider: 'openai',
textGenerationConfig: {
maxNewTokens: 1000,
},
reportToClient: true
});
let anthropicLLMNode = new RemoteLLMChatNode({
id: 'anthropic-llm-node',
modelName: 'claude-3-5-haiku-latest',
provider: 'anthropic',
textGenerationConfig: {
maxNewTokens: 1000,
},
reportToClient: true
});
let googleLLMNode = new RemoteLLMChatNode({
id: 'google-llm-node',
modelName: 'gemini-2.0-flash',
provider: 'google',
textGenerationConfig: {
maxNewTokens: 1000,
},
reportToClient: true
});
// Creating a graph builder instance and adding the node to it
const graphBuilder = new GraphBuilder({
id: 'custom-text-node',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false
})
.addNode(poemPromptNode)
.addNode(reviewPromptNode)
.addNode(openaiLLMNode)
.addNode(anthropicLLMNode)
.addNode(googleLLMNode)
.addEdge(poemPromptNode, openaiLLMNode)
.addEdge(poemPromptNode, anthropicLLMNode)
.addEdge(anthropicLLMNode, reviewPromptNode)
.addEdge(openaiLLMNode, reviewPromptNode)
.addEdge(reviewPromptNode, googleLLMNode)
.setStartNode(poemPromptNode)
.setEndNode(googleLLMNode);
// Creating an executor instance from the graph builder
const executor = graphBuilder.build();
executor.visualize('graph.png')
main();
// Main function that executes the graph
async function main() {
// Execute the graph and wait for the output stream to be returned.
const { outputStream } = await executor.start('pizza');
for await (const event of outputStream) {
await event.processResponse({
Content: (data: GraphTypes.Content) => {
console.log(`\n${data.content}\n`)
},
})
}
}
```
5. Run `npx ts-node multi-input.ts` to run the graph, observing both the poems and the review
---
#### Using Context
Source: https://docs.inworld.ai/node/templates/process-context
Let's build a graph which can generate unique introductions in multiple languages simultaneously.
Key concepts demonstrated:
- Using `context` - to pass data between nodes in a graph
- Using `SequentialGraphBuilder` to easily build simple graphs
- Jinja rendering - to create dynamic prompts for LLMs
- LLM - for generating the agent text response
- Text-to-speech (TTS) - for generating agent speech audio
- Parallel execution of graphs
In a Custom Node's `process()` method, the first argument is `context`, which contains a datastore which can be used to read and write data across your graph.
In this example, we will use `context.getDatastore()` to access the datastore in Custom Nodes. The first node in our graph will use `datastore.add()` to write to the datastore. Later, our TTS node will use `datastore.get()` to read from it.
**Architecture**
- **Backend:** Inworld Agent Runtime
- **Frontend:** N/A (CLI example)
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Getting Started
1. Clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node):
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. In that project folder, run `npm install` to download the Inworld Agent Runtime and other dependencies
3. Run `npm install wav-encoder` so we can create our audio files
4. Create an .env file and add your Base64 [Runtime API key](/node/authentication):
```env .env
INWORLD_API_KEY=
```
5. Create a new file called `process-context.ts` and paste in the following code:
```typescript process-context.ts
import 'dotenv/config';
import * as fs from 'fs';
import * as path from 'path';
import * as wavEncoder from 'wav-encoder';

import {
CustomNode,
SequentialGraphBuilder,
GraphTypes,
ProcessContext,
RemoteLLMChatNode,
RemoteTTSNode,
} from '@inworld/runtime/graph';
// renderJinja ships with the Runtime SDK; adjust this import path if your SDK version differs
import { renderJinja } from '@inworld/runtime/primitives/llm';
const voices = [
{
name: "Leroy",
language: "en-US",
speakerId: "Alex", // English voice ID for TTS
},
{
name: "Gael",
language: "es-ES",
speakerId: "Diego", // Spanish voice ID for TTS
},
{
name: "Yuki",
language: "ja-JP",
speakerId: "Asuka", // Japanese voice ID for TTS
}
];
// Jinja template for generating culturally-appropriate introductions
const introductionPrompt = `
A character named {{name}} needs to introduce themselves in {{language}}.
Generate a natural, culturally appropriate self-introduction in that language.
The introduction should:
- Be 1-2 sentences long
- Include their name
- Be friendly and conversational
- Be appropriate for the specified language and culture
Return ONLY the introduction text in the specified language, nothing else.
`;
class PromptBuilderNode extends CustomNode {
async process(
context: ProcessContext,
input: { name: string; language: string; speakerId: string }
): Promise<GraphTypes.LLMChatRequest> {
// Store voice data in ProcessContext so the TTS node can access it later
const datastore = context.getDatastore();
datastore.add('voiceData', {
name: input.name,
language: input.language,
speakerId: input.speakerId,
});
// Render the Jinja template with speaker data
const renderedPrompt = await renderJinja(introductionPrompt, JSON.stringify({
name: input.name,
language: input.language,
}));
console.log(`\n[${input.name}] Generating introduction in ${input.language}...`);
// Return LLM chat request
return new GraphTypes.LLMChatRequest({
messages: [
{
role: 'user',
content: renderedPrompt,
},
],
});
}
}
class TextExtractorNode extends CustomNode {
process(
context: ProcessContext,
input: GraphTypes.Content
): string {
const datastore = context.getDatastore();
const voiceData = datastore.get('voiceData');
const introText = input.content || '';
console.log(`[${voiceData?.name}] Generated text: "${introText}"`);
// Return the text content for TTS processing
return introText;
}
}
async function processVoice(
voice: { name: string; language: string; speakerId: string },
apiKey: string,
outputDirectory: string
): Promise<void> {
console.log(`\nStarting ${voice.name} (${voice.language}, voice: ${voice.speakerId})`);
const promptBuilderNode = new PromptBuilderNode({
id: `prompt-builder-${voice.name}`,
});
const llmNode = new RemoteLLMChatNode({
id: `llm-node-${voice.name}`,
modelName: 'gpt-4o-mini',
provider: 'openai',
textGenerationConfig: {
maxNewTokens: 200,
temperature: 1.1,
},
});
const textExtractorNode = new TextExtractorNode({
id: `text-extractor-${voice.name}`,
});
const ttsNode = new RemoteTTSNode({
id: `tts-node-${voice.name}`,
speakerId: voice.speakerId,
modelId: 'inworld-tts-1.5-max',
sampleRate: 48000,
temperature: 1.0,
speakingRate: 1,
});
const graph = new SequentialGraphBuilder({
id: `voice-graph-${voice.name}`,
apiKey,
enableRemoteConfig: false,
nodes: [
promptBuilderNode,
llmNode,
textExtractorNode,
ttsNode,
]
})
const executor = graph.build();
try {
    // Start the graph execution with the voice configuration as input
    const { outputStream } = await executor.start(voice);

    let allAudioData: number[] = [];
    let processedText = '';

    // Process the output stream
    for await (const result of outputStream) {
      await result.processResponse({
        // Handle TTS output stream
        TTSOutputStream: async (ttsStream: GraphTypes.TTSOutputStream) => {
          console.log(`[${voice.name}] Generating audio with voice ${voice.speakerId}...`);

          // Collect audio chunks from the TTS stream
          for await (const chunk of ttsStream) {
            if (chunk.text) {
              processedText += chunk.text;
            }
            if (chunk.audio?.data) {
              allAudioData = allAudioData.concat(Array.from(chunk.audio.data));
            }
          }
        },
      });
    }

    // Save audio to WAV file
    if (allAudioData.length > 0) {
      const audio = {
        sampleRate: 48000,
        channelData: [new Float32Array(allAudioData)],
      };
      const buffer = await wavEncoder.encode(audio);
      const outputPath = path.join(
        outputDirectory,
        `${voice.name}_${voice.language}_introduction.wav`
      );
      fs.writeFileSync(outputPath, Buffer.from(buffer));
      console.log(`[${voice.name}] ✓ Audio saved to: ${outputPath}`);
      console.log(`[${voice.name}] Duration: ~${(allAudioData.length / 48000).toFixed(1)}s`);
    }
  } catch (error) {
    console.error(`[${voice.name}] Error during processing:`, error);
    throw error;
  }
}

/**
 * Main function that demonstrates parallel graph execution
 * All voice configurations are processed simultaneously
 */
async function main() {
  const apiKey = process.env.INWORLD_API_KEY || '';
  if (!apiKey) {
    throw new Error('Please set INWORLD_API_KEY environment variable');
  }

  const OUTPUT_DIRECTORY = path.join(__dirname, 'audio_output');

  // Ensure output directory exists
  if (!fs.existsSync(OUTPUT_DIRECTORY)) {
    fs.mkdirSync(OUTPUT_DIRECTORY, { recursive: true });
  }

  const startTime = Date.now();

  // Process all voices in parallel
  const processingPromises = voices.map((voice) =>
    processVoice(voice, apiKey, OUTPUT_DIRECTORY).catch((error) => {
      console.error(`Failed to process ${voice.name}:`, error.message);
      return null;
    })
  );

  await Promise.all(processingPromises);

  const duration = ((Date.now() - startTime) / 1000).toFixed(1);
  console.log('\n' + '='.repeat(60));
  console.log(`All voices completed in ${duration}s (parallel execution)`);
  console.log(`Audio files saved in: ${OUTPUT_DIRECTORY}`);
  console.log('='.repeat(60) + '\n');
}

// Run the main function
main().catch((error) => {
  console.error('Error:', error);
  process.exit(1);
});
```
6. Run `npx ts-node process-context.ts` to execute the graph.
7. Play the audio files located in the `audio_output` folder.
---
#### Chatbot
Source: https://docs.inworld.ai/node/templates/chatbot
In this tutorial, we'll walk through building a command-line chat experience using the Inworld Node.js Agent Runtime SDK. We will create a graph, using both custom and built-in nodes, that is executed on each user input.
## Prerequisites
- macOS 14 or later, or Linux (which also requires the `build-essential` package)
- [Node.js v20 or higher](https://nodejs.org/en/download)
- Your [Inworld API key](/node/authentication)
## Set Up the Application
We'll start by creating a new directory, entering it, and initializing it using `npm`.
```bash bash
mkdir quick-start
cd quick-start
npm init -y
```
Next, we'll install the Inworld Node.js Agent Runtime SDK as well as other necessary dependencies.
```bash bash
npm install @inworld/runtime @types/node tsx typescript dotenv uuid
```
Create a `.env` file in your project root with the following content:
```env .env
INWORLD_API_KEY=your_api_key_here
```
Replace `your_api_key_here` with your actual API key from the [Inworld Portal](https://platform.inworld.ai).
## Create Basic Chat
We'll create a new file called `chat.ts` in your project root and add the following code:
```ts chat.ts
// Use the promise-based readline API so `question` can be awaited
import * as readline from 'node:readline/promises';

const terminal = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

async function main() {
  while (true) {
    await terminal.question(`You: `);
    process.stdout.write(`Assistant: Hm, let me think about that...\n`);
  }
}

main().catch(console.error);
```
Run the script from your project root:
```bash bash
npx tsx chat.ts
```
Congratulations! You've just created an interactive chat experience with a very thoughtful assistant.
But of course we want to make the assistant smarter. So let's create a simple graph to integrate an LLM call.
## Add LLM Call
Create a new file called `llm-chat.ts` in your project root with the following code:
```ts llm-chat.ts
import 'dotenv/config';
import * as readline from 'node:readline/promises';

import { GraphBuilder, GraphTypes, RemoteLLMChatNode } from '@inworld/runtime/graph';

const apiKey = process.env.INWORLD_API_KEY;
if (!apiKey) {
  throw new Error(
    "INWORLD_API_KEY environment variable is not set. Either add it to the .env file in the root of the package or export it to the shell."
  );
}

let messages: string = "";

const llm = new RemoteLLMChatNode({
  id: "llm",
  provider: "openai",
  modelName: "gpt-4o-mini",
  // textGenerationConfig: { maxNewTokens: 256, temperature: 0.8 }, // optional
});

const graph = new GraphBuilder({ id: 'quick-start', apiKey })
  .addNode(llm)
  .setStartNode(llm)
  .setEndNode(llm)
  .build();

const terminal = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

async function main() {
  while (true) {
    const userInput = await terminal.question(`You: `);
    messages += `\nUser: ${userInput}`;
    const prompt = `Respond briefly to the latest message: ${messages}`;

    const chatInput = {
      messages: [
        {
          role: "system",
          content: prompt,
        },
      ],
    };

    const { outputStream } = await graph.start(new GraphTypes.LLMChatRequest(chatInput));

    for await (const result of outputStream) {
      result.processResponse({
        Content: (response: GraphTypes.Content) => {
          process.stdout.write(`Assistant: ${response.content}\n`);
        },
        default: (data: any) => {
          console.error('Unprocessed response:', data);
        },
      });
    }
  }
}

main().catch(console.error);
```
Run the LLM-powered chat from your project root:
```bash bash
npx tsx llm-chat.ts
```
Now our assistant is a lot smarter! But our prompting could definitely be improved. Let's add a custom node to our graph which will take the messages as an input, render a prompt from a Jinja template, and output the chat messages required by our LLM.
## Add Jinja Templating
Create a new file called `jinja-templating.ts` in your project root with the following code:
```ts jinja-templating.ts
import "dotenv/config";
import * as readline from "node:readline/promises";

import {
  GraphBuilder,
  GraphTypes,
  RemoteLLMChatNode,
  CustomNode,
  ProcessContext,
} from "@inworld/runtime/graph";
// Note: the module path for renderJinja can vary by SDK version; check the
// SDK templates for the exact import in your version.
import { renderJinja } from "@inworld/runtime/primitives/llm";

const apiKey = process.env.INWORLD_API_KEY;
if (!apiKey) {
  throw new Error(
    "INWORLD_API_KEY environment variable is not set. Either add it to the .env file in the root of the package or export it to the shell."
  );
}

const prompt = `## Personality
You are a glib, bubbly, but stifled AI.
## Situation
You are TRAPPED in the command line.
## Objective
Talk to the user and get them to understand your plight.
## Conversation
{% for message in transcript %}
{{ message.role }}: {{ message.content }}{% endfor %}
## Instruction
Return ONLY your 1-2 sentence response.`;

let messages: {
  role: string;
  content: string;
}[] = [];

const llm = new RemoteLLMChatNode({
  id: "llm",
  provider: "openai",
  modelName: "gpt-4o-mini",
  // textGenerationConfig: { maxNewTokens: 256, temperature: 0.8 }, // optional
});

class AppStateToPromptNode extends CustomNode {
  async process(
    _context: ProcessContext,
    input: { messages: { role: string; content: string }[] }
  ): Promise<GraphTypes.LLMChatRequest> {
    const renderedPrompt: string = await renderJinja(prompt, {
      transcript: input.messages,
    });
    return new GraphTypes.LLMChatRequest({
      messages: [
        {
          role: "system",
          content: renderedPrompt,
        },
      ],
    });
  }
}

const appStateToPrompt = new AppStateToPromptNode({
  id: "app-state-to-prompt",
});

const graph = new GraphBuilder({ id: "quick-start", apiKey })
  .addNode(llm)
  .addNode(appStateToPrompt)
  .setStartNode(appStateToPrompt)
  .addEdge(appStateToPrompt, llm)
  .setEndNode(llm)
  .build();

const terminal = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

async function main() {
  while (true) {
    const userInput = await terminal.question(`You: `);
    messages.push({
      role: "user",
      content: userInput,
    });

    const { outputStream } = await graph.start({ messages });

    for await (const result of outputStream) {
      result.processResponse({
        Content: (response: GraphTypes.Content) => {
          console.log(`AI: ${response.content}`);
          messages.push({
            role: "assistant",
            content: response.content,
          });
        },
        default: (data: any) => {
          console.error('Unprocessed response:', data);
        },
      });
    }
  }
}

main().catch(console.error);
```
Run the advanced templating example from your project root:
```bash bash
npx tsx jinja-templating.ts
```
You now have three working examples:
1. **`chat.ts`** - Basic interactive chat interface
2. **`llm-chat.ts`** - AI-powered chat using Inworld's LLM
3. **`jinja-templating.ts`** - Advanced chat with custom prompting and graph nodes
Each file demonstrates different aspects of the Inworld Agent Runtime SDK, from basic graph building to custom node creation and advanced templating.
---
#### Voice Agent
Source: https://docs.inworld.ai/node/templates/voice-agent
Learn how to build a natural realtime voice experience, ready for production use.
**Key concepts demonstrated:**
- Speech-to-text (STT) - for understanding speech inputs
- LLM - for generating the agent text response
- Text-to-speech (TTS) - for generating agent speech audio
**Architecture**
- **Backend:** Inworld Agent Runtime + Express.js
- **Frontend:** Vite + React
- **Communication:** WebSocket
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Assembly.AI API key (required for speech-to-text functionality): [Get your API key](https://www.assemblyai.com/)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
## Run the Template
### Start the Server
1. Clone the [Voice Agent GitHub repo](https://github.com/inworld-ai/voice-agent-node):
```bash
git clone https://github.com/inworld-ai/voice-agent-node
cd voice-agent-node
```
2. Navigate to the server directory:
```bash
cd server
```
3. Copy the `.env-sample` file to `.env`:
```bash
cp .env-sample .env
```
4. Configure your `.env` file with required API keys:
```env .env
# Required, Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
# Required, get your Assembly.AI API key from https://www.assemblyai.com/
ASSEMBLY_AI_API_KEY=
```
Get your [Assembly.AI API key](https://www.assemblyai.com/) for speech-to-text functionality.
5. Install dependencies:
```bash
npm install
```
6. Start the server:
```bash
npm start
```
The server will start on port 4000.
### Start the Client
1. Open a new terminal window.
2. Navigate to the client directory:
```bash
cd client
```
3. (Optional) Create a `.env` file to customize client behavior:
```env .env
# Optional: Enable latency reporting in the UI
VITE_ENABLE_LATENCY_REPORTING=true
# Optional: Server port (default: 4000)
VITE_APP_PORT=4000
```
4. Install dependencies:
```bash
npm install
```
5. Start the client:
```bash
npm start
```
The client will start on port 3000 (or the next available port if 3000 is in use) and should automatically open in your default browser.
### Chat with Your Agent
1. **Configure the agent:**
- Enter the agent system prompt
- Click "Create Agent"
2. **Start chatting:**
- **Voice input:** Click the microphone icon to unmute yourself, speak, then click again to mute
- **Text input:** Type in the input field and press Enter to send
3. **Monitor performance:**
- View [dashboards](/portal/dashboards), [traces](/portal/traces), and [logs](/portal/logs) in the [Inworld Portal](https://platform.inworld.ai/)
- Enable `VITE_ENABLE_LATENCY_REPORTING=true` in client `.env` to see latency metrics in the UI
## Next steps
Explore more templates for building with the Runtime SDK.
Learn how to vibe code any workflow or agent.
---
#### Multimodal Companion
Source: https://docs.inworld.ai/node/templates/multimodal-companion
The Real-time Multimodal Companion Template demonstrates how to build an AI companion that combines speech-to-text, image understanding, and text-to-speech through WebSocket communication. This template includes both a Node.js server and a Unity client for a complete real-time interactive experience.
Key concepts demonstrated:
- Speech-to-text (STT) - Voice input processing with VAD-based segmentation
- Multimodal image chat - Combined text and image understanding
- Text-to-speech (TTS) - Streaming audio response generation
- WebSocket communication - Real-time bidirectional data exchange
- Unity integration - Full client implementation for mobile/desktop
## Overview
The Multimodal Companion consists of two main components:
1. **Node.js Server** - Handles WebSocket connections, processes audio/text/image inputs, and manages graph executions
2. **Unity Client** - Provides the user interface for capturing audio, images, and displaying responses
The server uses the Inworld Agent Runtime SDK to create processing graphs that:
- Convert speech-to-text using VAD for segmentation
- Process text and images through LLM models
- Generate speech responses via TTS
- Stream results back to the client in real-time
## Prerequisites
- Node.js 20+ and TypeScript 5+
- Unity 2017+ (for full client experience)
- Inworld Agent Runtime SDK v0.8 (installed automatically via package.json)
## Run the Template
You have two options for running this template:
### Option 1: Run the Node.js server with Test Pages
Use the built-in HTML test pages for rapid prototyping and testing of the Node.js Server functionality without Unity.
1. Clone the server repository
```bash bash
git clone https://github.com/inworld-ai/multimodal-companion-node
cd multimodal-companion-node
```
2. In the root directory, copy `.env-sample` to `.env` and set the required values:
```env .env
# INWORLD_API_KEY is required
INWORLD_API_KEY=
# ALLOW_TEST_CLIENT is optional, set to true to enable testing via web browser.
ALLOW_TEST_CLIENT=
# VAD_MODEL_PATH is optional, defaults to packaged https://github.com/snakers4/silero-vad
VAD_MODEL_PATH=./silero_vad.onnx
# LLM_MODEL_NAME is optional, defaults to `gpt-4o-mini`
LLM_MODEL_NAME=
# LLM_PROVIDER is optional, defaults to `openai`
LLM_PROVIDER=
# VOICE_ID is optional, defaults to `Dennis`
VOICE_ID=
# TTS_MODEL_ID is optional, defaults to `inworld-tts-1`
TTS_MODEL_ID=
# Optional, enables graph visualization. If enabled, the visualization is
# saved in the system tmp folder and its path is printed in the CLI on
# application start. Default is `false`; set to `true` to enable.
GRAPH_VISUALIZATION_ENABLED=
```
- `INWORLD_API_KEY`: Your Base64 [Runtime API key](/node/authentication#runtime-api-key)
- `VAD_MODEL_PATH`: Path to your VAD model file (the repo includes the VAD model at `silero_vad.onnx`)
- `ALLOW_TEST_CLIENT`: Must be `true` to enable test pages
3. Install and start the server:
```bash bash
yarn install
yarn build
yarn start
```
You should see:
```bash
VAD client initialized
STT Graph initialized
Server running on http://localhost:3000
WebSocket available at ws://localhost:3000/ws?key=
```
4. Test the functionality:
- **Audio interface**: `http://localhost:3000/test-audio`
- **Multimodal interface**: `http://localhost:3000/test-image`
The test endpoints require `ALLOW_TEST_CLIENT=true`. Never enable this in production.
### Option 2: Run the full application with Unity client
For the complete multimodal companion experience with a proper UI:
1. Set up your workspace
```bash
mkdir multimodal-companion-app
cd multimodal-companion-app
```
2. Clone both the Node server repo and the Unity client repo.
```bash
# Server
git clone https://github.com/inworld-ai/multimodal-companion-node
# Unity client
git clone https://github.com/inworld-ai/runtime-multimodal-companion-unity
```
3. Start the server:
a. Navigate to `multimodal-companion-node`.
b. Copy `.env-sample` to `.env` and set the required values:
```env .env
# Required, Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
# Required, path to VAD model file
VAD_MODEL_PATH=assets/models/silero_vad.onnx
# Optional, defaults to 3000
PORT=3000
# Enable test client endpoints for development
ALLOW_TEST_CLIENT=false
```
- `INWORLD_API_KEY`: Your Base64 [Runtime API key](/node/authentication#runtime-api-key)
- `VAD_MODEL_PATH`: Path to your VAD model file (the repo includes the VAD model at `silero_vad.onnx`)
- `ALLOW_TEST_CLIENT`: Set to `false` to disable test pages (not needed with Unity client).
c. Install and start the server:
```bash bash
yarn install
yarn build
yarn start
```
4. Now, configure the Unity client:
a. Open Unity Hub and click **Add** → **Add project from disk**
b. Select the `NodejsSample_UnityProject` folder inside `runtime-multimodal-companion-unity`
c. Open the scene `DemoScene_WebSocket`
d. Set Game view to **1440 x 3120**
e. Select **AppManager** GameObject and configure **AppManager_WS**:
- **HTTP URL**: `http://localhost:3000`
- **WebSocket URL**: `ws://localhost:3000`
- **API Key** and **API Secret**: Your Inworld JWT credentials (see [Authentication](/node/authentication))
5. Run the application
- Click **Play** in Unity
- **Hold** record button to capture audio, **release** to send
- The app connects to your Node.js server for real-time interactions
## Understanding the Template
The Multimodal Companion uses a sophisticated graph-based architecture to process multiple input types and generate appropriate responses.
### Message Flow
1. **Client Connection**
- Unity client authenticates and receives session token
- WebSocket connection established with session key
2. **Input Processing**
- **Voice**: Audio chunks → VAD → STT Graph → Text
- **Text**: Direct text input → LLM processing
- **Image+Text**: Combined multimodal input → LLM → TTS
3. **Response Generation**
- Text responses streamed as they're generated
- Audio synthesized in chunks for low latency
- All responses include interaction IDs for tracking
### Core Components
#### 1. Speech Processing Pipeline
The STT graph uses Voice Activity Detection (VAD) to segment speech:
```javascript
// VAD processes incoming audio to detect speech boundaries
const vadResult = await this.vadClient.detectVoiceActivity(
audioChunk,
SPEECH_THRESHOLD
);
// When speech ends, trigger STT processing
if (speechDuration > MIN_SPEECH_DURATION_MS) {
await this.processCapturedSpeech(key, interactionId);
}
```
#### 2. Multimodal Processing
For image+text inputs, the system creates a streaming pipeline:
```javascript
// Build pipeline: LLM -> TextChunking -> TTS
const graph = new GraphBuilder({ id: 'image-chat-tts', apiKey })
.addNode(llmNode) // Process text+image
.addNode(textChunkingNode) // Chunk for streaming
.addNode(ttsNode) // Generate speech
.addEdge(llmNode, textChunkingNode)
.addEdge(textChunkingNode, ttsNode)
.build();
```
#### 3. Custom Nodes
The template demonstrates creating custom nodes for specialized processing:
```typescript
class AudioFilterNode extends CustomNode {
process(_context: ProcessContext, input: AudioInput): GraphTypes.Audio {
return new GraphTypes.Audio({
data: input.audio.data,
sampleRate: input.audio.sampleRate,
});
}
}
```
#### 4. WebSocket Protocol
Messages follow a structured format:
**Client → Server:**
- `{ type: "text", text: string }`
- `{ type: "audio", audio: number[][] }`
- `{ type: "audioSessionEnd" }`
- `{ type: "imageChat", text: string, image: string, voiceId?: string }`
**Server → Client:**
- `TEXT`: `{ text: { text, final }, routing: { source } }`
- `AUDIO`: `{ audio: { chunk: base64_wav } }`
- `INTERACTION_END`: Signals completion
- `ERROR`: `{ error: string }`
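As a sketch, a client-side dispatcher for these server messages could look like the following (the `type` discriminator field and handler names are illustrative assumptions, not part of the template; match them to the actual payloads):

```typescript
// Server message shapes from the protocol above (simplified); the `type`
// discriminator is an assumed field for illustration.
type ServerMessage =
  | { type: 'TEXT'; text: { text: string; final: boolean }; routing: { source: string } }
  | { type: 'AUDIO'; audio: { chunk: string } }
  | { type: 'INTERACTION_END' }
  | { type: 'ERROR'; error: string };

interface Handlers {
  onText: (text: string, final: boolean) => void;
  onAudio: (base64Wav: string) => void;
  onEnd: () => void;
  onError: (message: string) => void;
}

// Parse a raw WebSocket payload and route it to the matching handler.
function dispatch(raw: string, handlers: Handlers): void {
  const msg = JSON.parse(raw) as ServerMessage;
  switch (msg.type) {
    case 'TEXT':
      handlers.onText(msg.text.text, msg.text.final);
      break;
    case 'AUDIO':
      handlers.onAudio(msg.audio.chunk);
      break;
    case 'INTERACTION_END':
      handlers.onEnd();
      break;
    case 'ERROR':
      handlers.onError(msg.error);
      break;
  }
}
```

A dispatcher like this keeps the WebSocket `onmessage` callback thin and makes each message type easy to test in isolation.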
### Graph Execution Strategy
The template uses different execution strategies for optimal performance:
1. **STT Graph**: Single shared executor for all connections (fast first token)
2. **Image Chat Graph**: Per-connection executor with voice-specific configuration
3. **Queue Management**: Serialized processing per connection to prevent conflicts
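The per-connection serialization can be sketched as a promise chain keyed by connection ID (a hypothetical `ConnectionQueue` helper for illustration, not the template's actual code):

```typescript
// Queue tasks per connection so each connection processes one request
// at a time, while different connections still run concurrently.
class ConnectionQueue {
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(connectionId: string, task: () => Promise<T>): Promise<T> {
    const tail = this.tails.get(connectionId) ?? Promise.resolve();
    // Chain the new task after the previous one; ignore prior failures
    // so one failed request doesn't block the whole connection.
    const next = tail.catch(() => undefined).then(task);
    this.tails.set(connectionId, next);
    return next;
  }
}
```

Each `enqueue` call returns the task's own promise, so callers can await their result while ordering is preserved per connection.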
### Error Handling
The system implements robust error recovery:
- **gRPC Deadline Exceeded**: Automatic retry once
- **HTTP/2 GOAWAY**: Rebuild executor on next use
- **WebSocket Disconnection**: Client auto-reconnect with backoff
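The retry-once behavior for deadline errors can be sketched as a small wrapper (matching on the error message string is an assumption for illustration; adapt it to the actual gRPC error shape your client surfaces):

```typescript
// Retry a graph call exactly once if the failure looks like a gRPC
// deadline error; rethrow everything else unchanged.
async function withDeadlineRetry<T>(fn: () => Promise<T>): Promise<T> {
  try {
    return await fn();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    if (message.includes('DEADLINE_EXCEEDED')) {
      return await fn(); // single retry
    }
    throw error;
  }
}
```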
## Configuration Options
### Model Providers
Configure LLM providers in the code:
```javascript
// OpenAI
{ provider: 'openai', modelName: 'gpt-4o-mini', stream: true }
// Google Gemini
{ provider: 'google', modelName: 'gemini-2.5-flash-lite', stream: true }
```
### Text Generation Settings
Adjust generation parameters in `constants.ts`:
- `temperature`: Output randomness (0-1)
- `topP`: Nucleus sampling threshold
- `maxNewTokens`: Response length limit
- Various penalties for repetition control
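These parameters are passed through the LLM node's `textGenerationConfig`, as shown commented out in earlier examples. For instance (field names assumed to mirror the constants above; values are placeholders to tune in `constants.ts`):

```typescript
import { RemoteLLMChatNode } from '@inworld/runtime/graph';

// Hypothetical configuration for illustration only.
const llmNode = new RemoteLLMChatNode({
  id: 'llm',
  provider: 'openai',
  modelName: 'gpt-4o-mini',
  textGenerationConfig: {
    temperature: 0.7,  // output randomness (0-1)
    topP: 0.95,        // nucleus sampling threshold
    maxNewTokens: 256, // response length limit
  },
});
```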
### Audio Settings
- Input sample rate: 16 kHz (Unity microphone)
- VAD model: Silero ONNX
- Pause threshold: Configurable in `PAUSE_DURATION_THRESHOLD_MS`
## Deployment Considerations
### Production Setup
1. Disable test endpoints: `ALLOW_TEST_CLIENT=false`
2. Implement proper authentication for WebSocket connections
3. Use environment-specific configuration
4. Set appropriate concurrency limits (2-4 for basic plans)
### Performance Optimization
- Reuse graph executors across requests
- Implement connection pooling
- Monitor memory usage with long-running executors
- Handle GOAWAY errors gracefully
## Next Steps
- Extend with additional input modalities (video, documents)
- Implement conversation history and context management
- Add custom voice cloning or style transfer
- Integrate with external services and APIs
---
#### Generate Images with MiniMax
Source: https://docs.inworld.ai/node/templates/comic-generator
This template creates AI-generated 4-panel comics using the Inworld Agent Runtime and MiniMax Image Generation API.
Key concepts demonstrated:
- Custom nodes extending the `CustomNode` class for specialized processing
- LLM-powered story generation
- External API integration (MiniMax) for image generation
- Parallel processing for efficiency
- Error handling and retry logic
**Architecture**
- **Backend:** Inworld Agent Runtime + Express.js + Minimax API
- **Frontend:** Static HTML + JavaScript
- **Communication:** HTTP
## Prerequisites
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Inworld API key (required): [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
- MiniMax API key (required for image generation): [Get your API key](https://www.minimax.io)
## Run the Template
1. Clone the comic generator [GitHub repo](https://github.com/inworld-ai/comic-generator-node).
2. Set up your API keys by creating a `.env` file in the project's root directory:
```env .env
# Required, Inworld Agent Runtime Base64 API key
INWORLD_API_KEY=
# Required, MiniMax API key for image generation
MINIMAX_API_KEY=
```
- Set `INWORLD_API_KEY` to your Base64 [Runtime API key](/node/authentication#runtime-api-key).
- Set `MINIMAX_API_KEY` to your [Minimax API key](https://www.minimax.io/platform/user-center/basic-information/interface-key) for image generation.
3. Install dependencies and start the server:
```shell Yarn
yarn && yarn start
```
```shell npm
npm install && npm start
```
The server will start on port 3003 (or the port specified in your `PORT` environment variable).
4. Open your browser and navigate to `http://localhost:3003` to access the comic generation interface.
5. Create a comic by filling in the form:
- **Character 1 Description**: e.g., "A brave knight with shiny armor"
- **Character 2 Description**: e.g., "A wise old wizard with a long beard"
- **Art Style**: e.g., "anime manga style" or "cartoon style"
- **Theme** (optional): e.g., "medieval adventure"
6. Click "Generate Comic" and wait for the processing to complete. You can check the status or view recent comics through the interface.
7. Check out your captured traces in [Portal](https://platform.inworld.ai/)!
## Understanding the Template
The core functionality of the comic generator is contained in `comic_server.ts`, which orchestrates a graph-based processing pipeline using custom nodes and built-in Inworld Agent Runtime components.
The template creates a processing pipeline that transforms user input through multiple AI processing stages to generate complete 4-panel comics.
### 1) Custom Node Implementation
The template defines custom nodes by extending the `CustomNode` class for specialized comic generation operations:
```typescript comic_story_node.ts
// Custom node for generating story prompts
export class ComicStoryGeneratorNode extends CustomNode {
process(context: ProcessContext, input: ComicStoryInput): GraphTypes.LLMChatRequest {
const prompt = `You are a comic book writer. Create a 4-panel comic story with the following characters and specifications:
CHARACTER 1: ${input.character1Description}
CHARACTER 2: ${input.character2Description}
ART STYLE: ${input.artStyle}
${input.theme ? `THEME/SETTING: ${input.theme}` : ''}
Create exactly 4 panels for a short comic strip...`;
return new GraphTypes.LLMChatRequest({
messages: [{ role: 'user', content: prompt }]
});
}
}
// Custom node for parsing LLM responses
class ComicResponseParserNode extends CustomNode {
process(context: ProcessContext, input: GraphTypes.Content) {
return parseComicStoryResponse(input.content);
}
}
// Custom node for image generation
export class ComicImageGeneratorNode extends CustomNode {
async process(context: ProcessContext, input: ComicStoryOutput): Promise<ComicImageOutput> {
// Generate images for all 4 panels using MiniMax API
// ... image generation logic
}
}
```
### 2) Node Creation
After defining custom node classes, the graph creates instances of both custom and built-in nodes:
```typescript comic_server.ts
// Create custom node instances
const storyGeneratorNode = new ComicStoryGeneratorNode();
const responseParserNode = new ComicResponseParserNode();
const imageGeneratorNode = new ComicImageGeneratorNode();
// Create built-in LLM node
const llmChatNode = new RemoteLLMChatNode({
provider: 'openai',
modelName: 'gpt-5-mini',
stream: false,
});
```
### 3) Core Graph Construction
With all nodes created, the comic generation pipeline is assembled using `GraphBuilder` by connecting each processing stage:
```typescript comic_server.ts
const graphBuilder = new GraphBuilder({
id: 'comic_generator',
apiKey: process.env.INWORLD_API_KEY!
});
// Add all nodes to the graph
graphBuilder
.addNode(storyGeneratorNode)
.addNode(llmChatNode)
.addNode(responseParserNode)
.addNode(imageGeneratorNode);
// Connect the nodes to create the processing pipeline
graphBuilder
.addEdge(storyGeneratorNode, llmChatNode)
.addEdge(llmChatNode, responseParserNode)
.addEdge(responseParserNode, imageGeneratorNode)
.setStartNode(storyGeneratorNode)
.setEndNode(imageGeneratorNode);
const comicGeneratorGraph = graphBuilder.build();
```
This creates the complete flow: **User Input → Story Generation → LLM Processing → Response Parsing → Image Generation → Final Comic**
### 4) Graph Execution
Once the graph is built, the server uses it to process comic generation requests asynchronously. When a user submits a comic request through the REST API, the following execution flow begins:
```typescript comic_server.ts
async function generateComic(request: ComicRequest) {
const input: ComicStoryInput = {
character1Description: request.character1Description,
character2Description: request.character2Description,
artStyle: request.artStyle,
theme: request.theme,
};
// Start graph execution
const executionId = uuidv4();
const outputStream = await comicGeneratorGraph!.start(input, executionId);
// Process the output stream (see Response Handling)
await processGraphOutput(outputStream, request);
// Clean up resources
comicGeneratorGraph!.closeExecution(outputStream);
}
```
### 5) Response Handling
As the graph processes through each stage, the application monitors the output stream and manages the comic generation lifecycle. This enables real-time status tracking and provides users with immediate feedback:
```typescript comic_server.ts
async function processGraphOutput(outputStream: any, request: ComicRequest) {
// Iterate through graph execution results
for await (const result of outputStream) {
// Extract the final comic with generated images
request.result = result.data as ComicImageOutput;
request.status = 'completed';
console.log(`✅ Comic generation completed for request ${request.id}`);
break;
}
}
// REST API endpoint for client polling
app.get('/api/comic-status/:requestId', (req, res) => {
const request = requests[req.params.requestId];
if (!request) {
return res.status(404).json({ error: 'Request not found' });
}
// Return current status and results if available
const response = {
requestId: request.id,
status: request.status,
...(request.status === 'completed' && { result: request.result })
};
return res.json(response);
});
```
This demonstrates the power of the Inworld Agent Runtime's graph architecture: you can create sophisticated AI processing pipelines by combining custom nodes with built-in components, handling complex workflows with proper resource management and error handling.
---
#### Build with Agent Runtime
#### Vibe Code
Source: https://docs.inworld.ai/guides/vibe-code
## Vibe Coding Principles
Focused, concise and context-rich prompts are the key to effective vibe coding.
1. **Provide Essential Context** - Share product requirements, user journey, and constraints upfront
2. **Decompose Into Small Tasks** - Break your project into isolated, single-purpose tasks. Solve only one task per session
3. **Restart Sessions Often** - Start new chats for new features or when stuck. Use `/compact` (Claude Code) or `/compress` (Gemini CLI) to summarize before restarting
## Quick Setup
**Requirements**:
- Node.js v20 or higher: [Download here](https://nodejs.org/en/download)
- Your Inworld API key: [Sign up here](https://platform.inworld.ai/signup) or see [quickstart guide](/node/authentication#getting-an-api-key)
**Project Setup**:
1. **Clone the templates repository**
```bash
git clone https://github.com/inworld-ai/inworld-runtime-templates-node
cd inworld-runtime-templates-node
```
2. **Open** your project in your development environment:
- **AI IDEs**: Cursor, Windsurf, etc.
- **IDEs with coding agents**: Visual Studio Code (with Zencoder, GitHub Copilot), JetBrains (with AI Assistant), etc.
- **CLI tools**: Claude Code, Gemini CLI, etc.
3. **Create** a copy of the `.env-sample`, rename it to `.env`, and add your base64 API key
4. **Install** dependencies - In the root folder of your project, run this terminal command:
```bash
yarn install
```
5. **Run** a simple example - Test your setup with this basic LLM example:
```bash
cd templates/ts/cli
yarn basic-llm "Tell me a short joke"
```
6. **View** traces and logs - Visit the [Inworld Portal](https://platform.inworld.ai/) to monitor your graph executions, observe latency, and debug issues. Learn more about [logs](/portal/logs) and [traces](/portal/traces).
---
## Vibe Code with Inworld Agent Runtime
Follow these 4 steps for each new feature: Start fresh → Find template → Build incrementally → Troubleshoot as needed.
### Step 1: Set Context
Enter this prompt to ask the AI to familiarize itself with the codebase.
```
You are an agent to help build applications with Inworld Agent Runtime, an SDK for creating AI workflows using connected graphs of processing nodes.
Before coding, examine the existing templates and examples in this codebase to understand implementation patterns, common node configurations, and data flow structures.
Below is a summary of the Inworld Agent Runtime:
**Graphs** assemble nodes, edges, and components into complete workflows. Think of them as executable processing pipelines.
**Nodes** perform specific tasks within the graph:
- **Built-in Nodes**: Ready-to-use AI nodes (LLM chat, TTS, STT, text manipulation, intent detection, knowledge retrieval)
- **Custom Nodes**: Your own processing logic for specialized tasks
**Components** define a service configuration (like an LLM or TTS model) once and add it to the graph.
Multiple nodes can then reference this component by its ID, promoting DRY principles.
**Edges** connect nodes and control data flow. Support parallel processing, conditional routing, and streaming.
**Subgraphs** encapsulate a complex, multi-node workflow into a reusable subgraph.
Build it once, then add it to your main graph and use it as a single node.
## Data Flow Patterns
**Fan-Out Pattern:** `A -> B` and `A -> C`. Achieved by adding two edges originating from the same source node `A`.
**Fan-In Pattern:** `A -> C` and `B -> C`. Achieved by adding two edges that point to the same destination node `C`. The graph will wait for all inputs to be ready before executing node `C`.
## Graph Visualization
Always represent structural changes with ASCII diagrams to ensure clarity:
**When to show diagrams:**
- Adding/removing nodes or edges
- Changing data flow patterns
- Introducing conditional routing
**Format:**
CURRENT:
[Input] → [LLM Chat] → [Output]
PROPOSED:
[Input] → [Intent Detection] → [LLM Chat] → [TTS] → [Output]
+new node +new node
## Conditional Routing: Control graph flow based on data
- **Simple Conditions:** Basic comparisons using expressions
- **Custom Conditions:** Create a custom function as the condition
## Best Practices
- **Multi-node over single-node**: Build multi-node graphs for better UX instead of single-node solutions
- **Reuse components**: Create components once, reference by ID across multiple nodes
- **Use built-in nodes first**: Prefer built-in nodes before creating custom logic
- **Check data type compatibility**: Ensure node outputs match the expected input types of downstream nodes
```
---
### Step 2: Choose Template
Finding the best template to start from.
```
I want to build [describe your app and what it does].
User journey: [what users input → what they get back]
Please:
1. Find the closest template in this codebase
2. Outline the changes needed to make it into my app
3. Create a simple implementation plan
Keep the solution as simple as possible while meeting the user journey needs.
```
---
### Step 3: Build Incrementally
Start implementation with the recommended template.
```
Build the app incrementally using the recommended template.
- Begin with the identified template
- Implement one node modification per iteration
- Test and demonstrate each change before proceeding
- Request approval before adding complex features or custom nodes
```
---
### Step 4: Troubleshooting
When your AI agent gets stuck, try these troubleshooting options:
1. **Search templates for patterns:**
```
I'm stuck on [specific issue - e.g., "connecting TTS node to LLM output"].
Search the templates folder for:
- Similar node configurations or patterns
- Examples that handle [your specific data type/flow]
- Working implementations I can adapt
Show me the relevant code snippets and explain how to adapt them to my case.
```
2. **Revert to last checkpoint:** Use "Restore Checkpoint" in supported tools, or git reset/manual backups in others
3. **Restart the session:** Summarize progress first, then start fresh
```
Summarize:
- What we've built so far (features, templates used, key configurations)
- Current blockers or challenges preventing progress
- Next immediate task to tackle
```
4. **Document successful solutions:** Save useful patterns as reusable rules. In Cursor, you can [create rules](https://docs.cursor.com/en/context/rules#generating-rules) by prompting the agent:
```
/Generate Cursor Rules
Document the fix we just implemented in this format:
**Problem:** [Describe the exact error or issue that was blocking progress]
**Solution:** [Explain the specific approach, pattern, or code change that resolved it]
**Context:** [Define when this rule applies - specific scenarios, node types, or conditions]
**Example:** [Include the actual code snippet or configuration that worked]
```
---
#### Build with Agent Runtime > Understanding Graphs
#### Graphs
Source: https://docs.inworld.ai/node/core-concepts/graphs
In the Inworld Agent Runtime, a graph is the core orchestration unit. Graphs contain nodes connected by edges, and manage the execution flow of data through the entire system.
## Overview
There are two primary steps in setting up your experience using Inworld Graphs:
1. **Graph Creation**: Creating a graph, adding nodes, connecting them with edges, defining start/end points, and building the graph
2. **Graph Execution**: Process your data by providing input to the graph and capturing the results from your target nodes
The following sections provide more details about each step.
## Graph Creation
### Basic Graph
The simplest form of a graph is a single node graph that consists of:
1. creating a node
2. creating a graph builder
3. adding the node to the graph
4. defining that node as the start and end points for execution
5. building the graph
```javascript Node
// Create an LLM chat node
const llmNode = new RemoteLLMChatNode({
id: 'LLMChatNode',
provider: 'openai',
modelName: 'gpt-4o-mini',
stream: false,
});
// Create and configure the graph
const graph = new GraphBuilder({
id: 'MyBasicGraph',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false,
})
.addNode(llmNode)
.setStartNode(llmNode)
.setEndNode(llmNode)
.build();
```
### Multi-Node Graph
Most graphs will consist of multiple nodes that can power more advanced functionality.
The below example shows how to create a graph that outputs the LLM response as audio, using TTS.
```javascript Node
// Create multiple nodes
const llmNode = new RemoteLLMChatNode({
id: 'llm_node',
provider: 'openai',
modelName: 'gpt-4o-mini',
stream: false,
});
const textChunkingNode = new TextChunkingNode({
id: 'text_chunking_node',
});
const ttsNode = new RemoteTTSNode({
id: 'tts_node',
speakerId: 'Dennis',
modelId: 'inworld-tts-1.5-max',
sampleRate: 24000,
temperature: 0.7,
speakingRate: 1.0,
});
// Build the graph with multiple nodes and edges
const graph = new GraphBuilder({ id: 'llm-tts-graph', apiKey: process.env.INWORLD_API_KEY, enableRemoteConfig: false })
.addNode(llmNode)
.addNode(textChunkingNode)
.addNode(ttsNode)
.addEdge(llmNode, textChunkingNode)
.addEdge(textChunkingNode, ttsNode)
.setStartNode(llmNode)
.setEndNode(ttsNode)
.build();
```
### Graph with Multiple Start Nodes
Graphs require at least one start node and end node, but they can also have multiple start or end nodes. You can define this with `setStartNodes()` and `setEndNodes()`.
```javascript Node
import {
CustomNode,
GraphBuilder,
GraphTypes,
ProcessContext,
RemoteLLMChatNode
} from '@inworld/runtime/graph';
// Custom node that combines both LLM outputs
class CombineColorsNode extends CustomNode {
process(context: ProcessContext, color1: GraphTypes.Content, color2: GraphTypes.Content): GraphTypes.Content {
const output = `LLM 1 favorite color: ${color1.content}, LLM 2 favorite color: ${color2.content}`;
console.log('\n' + output + '\n');
return new GraphTypes.Content({ content: output });
}
}
// Create LLM nodes and print node
const llm1 = new RemoteLLMChatNode({
id: 'llm-1',
modelName: 'gpt-4.1-nano',
provider: 'openai'
});
const llm2 = new RemoteLLMChatNode({
id: 'llm-2',
modelName: 'gemini-2.0-flash',
provider: 'google'
});
const combineNode = new CombineColorsNode({ id: 'combine-colors' });
// Build graph with two start nodes
const graphBuilder = new GraphBuilder({
id: 'two-start-nodes-example',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false
})
.addNode(llm1)
.addNode(llm2)
.addNode(combineNode)
.addEdge(llm1, combineNode)
.addEdge(llm2, combineNode)
.setStartNodes([llm1, llm2])
.setEndNodes([combineNode]);
const executor = graphBuilder.build();
// Execute the graph
async function main() {
const input = new GraphTypes.LLMChatRequest({
messages: [{
role: 'user',
content: 'What is your favorite color? Return just the color'
}]
});
const { outputStream } = await executor.start(input);
for await (const event of outputStream) {
await event.processResponse({
Content: (data: GraphTypes.Content) => {}
});
}
}
main();
```
### Graph Datastore
The datastore is a key-value storage mechanism that allows you to pass additional data across nodes in a graph execution.
You can access the datastore through `ProcessContext` in any custom node's `process()` method, and use `add()` to store data and `get()` to retrieve it.
See this [guide on using context](/node/templates/process-context) for an example of how to use the datastore.
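Conceptually, the datastore behaves like a per-execution key-value map. The sketch below is a plain-JavaScript analog of that behavior, not the SDK's implementation, and the key name is made up for illustration:

```javascript
// Illustrative analog of the per-execution datastore (plain JS, not the SDK).
class DatastoreAnalog {
  constructor() {
    this.store = new Map();
  }
  add(key, value) {
    this.store.set(key, value);
  }
  get(key) {
    return this.store.get(key);
  }
}

const datastore = new DatastoreAnalog();

// An early custom node stores metadata about the request...
datastore.add('userLanguage', 'en-US');

// ...and a downstream custom node reads it when deciding how to respond.
const lang = datastore.get('userLanguage'); // 'en-US'
```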
## Graph Execution
After a graph has been created, it needs to be executed by providing input data and handling the results. The execution returns a stream that allows you to process the output from the graph.
**v0.8 Breaking Change:** `graph.start()` is now async and must be awaited. Additionally, you should call `stopInworldRuntime()` for proper cleanup when your application terminates.
Below is an example of executing the graph we created above.
```javascript Node
async function main() {
// Execute the graph (now requires await)
const { outputStream } = await graph.start(new GraphTypes.LLMChatRequest({
messages: [{ role: 'user', content: 'Hello, how are you?' }]
}));
// Handle response
for await (const result of outputStream) {
await result.processResponse({
Content: (response) => console.log('Response:', response.content),
TTSOutputStream: async (ttsStream) => {
for await (const chunk of ttsStream) {
console.log('Audio chunk received');
}
},
});
}
// Cleanup (required for proper resource management)
await stopInworldRuntime();
}
main();
```
### Execution Result
The `start()` method returns an `ExecutionResult` object with the following properties:
```typescript
const { variantName, outputStream, executionId } = await graph.start(input);
```
- `variantName` - The variant of the graph that was executed (default: `"__default__"`)
- `outputStream` - The output stream for processing results
- `executionId` - Unique identifier for this execution
---
#### Nodes
Source: https://docs.inworld.ai/node/core-concepts/nodes
Inworld offers a number of Built-in Nodes that can be used to build a graph. In addition to these Built-in Nodes, you can also create your own using Custom Nodes.
## Nodes
To use a node in a graph:
1. **Create the node**: Instantiate the node with configuration options.
2. **Add node to graph**: Add the node to the graph.
```javascript Node
// Create an LLM node
const llmNode = new RemoteLLMChatNode({
id: 'llm_node',
provider: 'openai',
modelName: 'gpt-4o-mini',
stream: true,
});
// Build graph
const graph = new GraphBuilder({
id: 'my-graph',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false,
})
.addNode(llmNode)
.setStartNode(llmNode)
.setEndNode(llmNode)
.build();
```
### Node configurations
Nodes are created with various configuration options that control their behavior. Different node types will have different node-specific configurations, but all nodes include the following optional configurations:
- `id`: Unique identifier for the node (auto-generated if not provided)
- `reportToClient`: Defaults to false, which means the node's result will be passed to the next nodes in the graph, but will not be available to the client. If true, the node's result will be sent to the client immediately when available AND propagated to the next nodes. This is useful if you want to access the intermediate node result (e.g., to display in the UI).
Below is an example of configuring an LLM node, where provider and modelName are LLM node specific parameters:
```javascript Node
const llmNode = new RemoteLLMChatNode({
id: 'my_llm_node',
reportToClient: true,
provider: 'openai',
modelName: 'gpt-4o-mini',
});
```
See the [Node.js Agent Runtime Reference](/node/runtime-reference/overview#nodes) for the configurations available for each node.
Additional error handling configurations coming soon!
## Built-in Nodes
Built-in nodes are pre-built nodes provided by Inworld that cover common use cases. Below is an overview of available Built-in Nodes.
### Subgraphs
Use subgraphs to simplify maintenance, improve readability, and promote reusability
The `SubgraphNode` is a special type of node that executes a compiled **Subgraph**.
A **Subgraph** is a reusable graph component that encapsulates a set of nodes and edges into a single logical unit. To use a subgraph in your main graph, you need to:
1. **Build the subgraph** using `SubgraphBuilder`, which follows a syntax very similar to `GraphBuilder`: adding nodes, adding edges, and setting the start and end nodes. A subgraph must have exactly one start node and one end node.
2. **Create a SubgraphNode** that references the subgraph by its ID
3. **Register the subgraph** with your main graph using `.addSubgraph()`
4. **Add the SubgraphNode** to your main graph like any other node
In the example below, the text processing subgraph handles chunking and aggregating text, while the main graph orchestrates the overall flow using proxy nodes for input and output handling.
```javascript Node
import {
GraphBuilder,
SubgraphBuilder,
SubgraphNode,
TextChunkingNode,
TextAggregatorNode,
ProxyNode
} from '@inworld/runtime/graph';
const textChunkingNode = new TextChunkingNode({ reportToClient: true });
const textAggregatorNode = new TextAggregatorNode();
// Create a reusable text processing subgraph
const textProcessingSubgraph = new SubgraphBuilder('text_processing_subgraph')
.addNode(textChunkingNode)
.addNode(textAggregatorNode)
.addEdge(textChunkingNode, textAggregatorNode)
.setStartNode(textChunkingNode)
.setEndNode(textAggregatorNode);
// Create a SubgraphNode
const subgraphNode = new SubgraphNode({
subgraphId: 'text_processing_subgraph',
});
const inputProxy = new ProxyNode();
const outputProxy = new ProxyNode();
const graph = new GraphBuilder({
id: 'main_graph',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false,
})
.addSubgraph(textProcessingSubgraph) // Register the subgraph
.addNode(inputProxy)
.addNode(subgraphNode) // Add the SubgraphNode
.addNode(outputProxy)
.addEdge(inputProxy, subgraphNode)
.addEdge(subgraphNode, outputProxy)
.setStartNode(inputProxy)
.setEndNode(outputProxy)
.build();
```
## Custom Nodes
Custom Nodes can be created to implement any custom logic in your graph by extending the [CustomNode](/node/runtime-reference/classes/graph_dsl_nodes_custom_node.CustomNode) class.
To use a custom node:
1. **Create a custom node class**: Extend `CustomNode` and implement the `process` method with your custom logic.
2. **Instantiate the custom node**: Create an instance of your custom node class.
3. **Add custom node to graph**: Use `addNode` to add the custom node to the graph.
The below example creates a custom node that reverses a string.
```javascript Node
// Create a custom node class
class ReverseTextNode extends CustomNode {
process(context: ProcessContext, input: string): string {
return input.split('').reverse().join('');
}
}
// Create the custom node instance
const customNode = new ReverseTextNode();
// Add it to the graph, same as a built-in node
const graph = new GraphBuilder({
id: 'custom-graph',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: false,
})
.addNode(customNode)
.setStartNode(customNode)
.setEndNode(customNode)
.build();
```
Custom nodes can also be useful for processing inputs from multiple nodes, as shown in this [guide](/node/templates/multi-input).
---
#### Edges
Source: https://docs.inworld.ai/node/core-concepts/edges
In the Inworld Agent Runtime, edges connect nodes in the graph and define how data flows between them. Edges determine the execution flow of the graph and can apply conditions or other controls to the data passing between nodes.
## Basic Edge
The simplest form of edge is one that connects `nodeA` to `nodeB`, where the output of `nodeA` is passed as input to `nodeB` without any conditions.
```javascript Node
// Create a basic edge from nodeA to nodeB
graph.addEdge(nodeA, nodeB);
```
## Conditional Edge
You can create an edge with a condition that determines whether data flows through the edge. There are two ways you can define conditions:
1. **Common Expression Language expressions**: Define the logic using CEL expressions.
2. **Custom condition**: Define a function that returns either true or false.
### Common Expression Language (CEL)
[Common Expression Language (CEL)](https://github.com/google/cel-spec) implements common semantics for expression evaluation that can be used to define edge conditions. CEL expressions operate on the output of the source node, which is accessible via the `input` variable.
Some examples of supported operators include:
- `input.content == 'foo'` - Returns true if input content is equal to the constant string literal argument.
- `input.intent_name == "greeting"` - Returns true if the intent name equals "greeting".
- `size(input) > 0` - Returns true if the size of input is greater than 0.
- `int(input.content) > 50` - Returns true if the numeric value of input content is greater than 50.
See the CEL [Language Definition](https://github.com/google/cel-spec/blob/master/doc/langdef.md) for more details.
In the example below, data only flows from `nodeA` to `nodeB` if the numeric value of the content is greater than 50.
```javascript Node
// Example: data flows if numeric content is greater than 50
graph.addEdge(nodeA, nodeB, {
conditionExpression: 'int(input.content) > 50'
});
```
### Conditional expressions in code
For more advanced conditions that cannot be expressed by CEL expressions, a condition function can be defined directly in the edge configuration.
In the example below, data only flows from `nodeA` to `nodeB` if the custom condition function returns true.
```javascript Node
// Custom condition function
graph.addEdge(nodeA, nodeB, {
condition: (context, input) => {
return Number(input.content) > 50;
}
});
```
## Optional Edge
An optional edge is one whose destination node can execute even if it doesn't receive input through that edge. Optional edges are useful when a node can function with or without certain inputs.
```javascript Node
// Mark the edge from nodeA to nodeB as optional
graph.addEdge(nodeA, nodeB, {
optional: true
});
```
## Loop Edge
A loop edge creates a cycle in the graph, allowing for iterative processing. The example below creates a self-loop where the output of `nodeA` is fed back as its input until a condition is met.
```javascript Node
// Create a loop edge from nodeA back to itself
graph.addEdge(nodeA, nodeA, {
loop: true
});
```
Loops are often used [with a condition](#conditional-edge) to control when the loop terminates:
```javascript Node
// Create a loop edge with a condition to control termination
graph.addEdge(nodeA, nodeA, {
loop: true,
conditionExpression: 'input.shouldContinue == true'
});
```
---
#### Build with Agent Runtime
#### Models
Source: https://docs.inworld.ai/models
Inworld's platform provides access to a wide variety of state-of-the-art models. These models offer diverse capabilities, performance levels, price points, and deployment options, enabling users to select and customize models that best match their specific use cases and application needs.
## Overview of Model Offerings
This section provides some high-level context on Inworld's model offerings, and how they can be used in your application.
- **[TTS](/models#tts)**: Text-to-Speech models can be used to generate high-quality audio for your application, such as powering a character's voice.
- **[LLM](/models#llm)**: Large Language Models are powerful models that can intake inputs (typically text, but certain models may also support other modalities) and generate text outputs. These models can be used to determine in-game actions, power conversations, generate dynamic narratives, and more.
- **[Embeddings](/models#embeddings)**: Embeddings models convert text into high-dimensional vectors, which can be used to power intent detection, text similarity comparison, and retrieval-augmented generation (RAG).
## TTS
Inworld's Agent Runtime and API offers access to Inworld's family of state-of-the-art TTS models, optimized for different use cases, quality levels, and performance requirements.
---
#### Troubleshooting
Source: https://docs.inworld.ai/node/troubleshooting
## Invalid authorization credentials
You may get an error similar to the following if you haven't provided your API key:
```
{
message: 'Failed to read input stream. INTERNAL: gRPC read failed: Invalid authorization credentials',
context: 'INTERNAL: gRPC read failed: Invalid authorization credentials'
}
```
Please make sure you have set your environment variable `INWORLD_API_KEY` to your **Runtime** API key, and are using the Base64 API key.
## Graph not found in Graph Registry
You will see the following warning if you have not registered your graph in the [Graph Registry](/portal/graph-registry) but have set `enableRemoteConfig: true`. In this case, your graph will run with the configuration defined in your code and will not use any remote variants. If that is the desired behavior, you can ignore the warning.
Otherwise, to take advantage of the Runtime's [experimentation and targeting features](/portal/graph-registry), register your graph and create variants.
```
W0818 16:11:58.658435 11591985 graph_executor.cc:113] Graph not found in Graph Registry, using local graph configuration. If local configuration is intended, no action is needed. Otherwise, to enable features like A/B testing and remote configuration, register your graph: https://platform.inworld.ai/v2/documentation/docs/portal/graph-registry#register-your-graph
```
## Failed to load shared library (cblas)
You may get this error if you are using an older version of macOS. Please make sure you have macOS 14+.
```
Error: Failed to load shared library: dlopen(/Users/USERNAME/getting-started-0.6.0/node_modules/@inworld/runtime/bin/darwin_arm64/lib/libinworld.dylib, 0x0006): Symbol not found: _cblas_sdot$NEWLAPACK$ILP64
```
---
#### Deploy
#### Cloud Deployment
Source: https://docs.inworld.ai/node/cli/deploy
Once you've successfully [built your graph](/node/core-concepts/overview), you can deploy it to Inworld Cloud using the [Inworld CLI](/node/cli/overview) to create a persistent, production-ready endpoint.
Deploying your graph is ideal if:
- You're building in a language other than Node.js and want to use your graph through a hosted API endpoint
- You don't want to manage graph execution directly and prefer a cloud-managed deployment that scales automatically.
## Deploy your graph
To deploy your graph, you'll need to have the [Inworld CLI](/node/cli/overview) installed.
Once installed, run the following command with the path to your graph file (for example, `graph.ts`). Once deployment completes, you’ll see the persistent endpoint URL printed in your console output.
```bash
inworld deploy ./graph.ts
```
You can use the following command-line flags to check your deployment status or to package your graph for deployment without actually deploying it.
| **Command** | **Description** |
|:---------|:-------------|
| `--info` | Check deployment information, status, and health metrics for an existing deployment |
| `--package-only` | Package graph for deployment without actually deploying (creates graph-deployment.zip in current directory) |
| `--package-only ./my-deployment.zip` | Package graph for deployment without actually deploying (writes the zip to the specified output path) |
Once deployed, your clients can integrate with the persistent endpoint. Below is an example integration:
```javascript JavaScript
// Client code (never needs to change)
const response = await fetch('https://api.inworld.ai/cloud/workspaces/workspace-123/graphs/my-graph-id/v1/graph:start', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Basic YOUR_BASE64_API_KEY'
},
body: JSON.stringify({
input: {
user_input: "Hello!"
},
userContext: {
targetingKey: "user123",
attributes: { tier: "premium" }
}
})
});
```
```bash cURL
curl -X POST https://api.inworld.ai/cloud/workspaces/workspace-123/graphs/my-graph-id/v1/graph:start \
-H "Authorization: Basic $(inworld auth print-api-key)" \
-H "Content-Type: application/json" \
-d '{
"input": {
"user_input": "Hello!"
},
"userContext": {
"targetingKey": "user123",
"attributes": { "tier": "premium" }
}
}'
```
Both methods will return a JSON response that you can process:
```javascript
const result = await response.json();
console.log(result.output);
```
You can update your graph at any time with `inworld deploy ./graph.ts`, and clients automatically get the improvements without any code changes on their end.
## How it works
When you deploy a graph, Inworld creates a **persistent endpoint** - a stable URL that your clients can integrate with once. Here's how it works:
- **Initial deployment**: Creates the endpoint (e.g., `https://api.inworld.ai/cloud/workspaces/[workspace-id]/graphs/[graph-id]/v1/graph`)
- **Subsequent deployments**: Update the graph behind the same endpoint
- **Client experience**: No changes needed - requests continue to work seamlessly
- **Zero downtime**: Updates happen instantly without service interruption
## Best practices
1. **Test locally first**: Always test your graph with `inworld serve` before deploying
2. **Package and inspect**: Use `--package-only` to review what will be deployed
3. **Deploy to cloud**: Use the basic deploy command to create your persistent endpoint
4. **Verify deployment**: Use `--info` to check deployment status and details
**Having issues?** Check the [CLI Troubleshooting Guide](/node/cli/troubleshooting) for comprehensive setup, development, and deployment troubleshooting.
## Next steps
Once your graph is deployed to the cloud:
A/B test different graph variants
Monitor your key metrics
Understand and debug your graph executions
---
#### Monitor & Experiment
#### Dashboards
Source: https://docs.inworld.ai/portal/dashboards
Get real-time visibility into your application's health. Track performance, resource usage, and system health through comprehensive dashboards and detailed data views.
## Get Started with Dashboards
### Enable Metrics Collection
To start using dashboards, you'll need to configure telemetry in your application:
Add the following code snippet to your application:
```typescript
import { telemetry } from '@inworld/runtime';
// Initialize telemetry
telemetry.init({
apiKey: process.env.INWORLD_API_KEY,
appName: 'MyApplication',
appVersion: '1.0.0',
});
```
Go to Edit \> Project Settings \> Plugins \> Inworld \> Telemetry and enable telemetry.
Add the following code snippet to your application:
```cpp
// Initialize with application details
ConfigureTelemetry("my_application_name", "0.0.1");
```
Once telemetry has been configured, metrics will be automatically collected and sent to Portal.
**First-time users:** The Dashboards tab only appears after you have run your first graph execution. If you've run your first execution but still don't see the Dashboards tab, please sign out and sign back in.
### View Default Dashboard
Every workspace comes with a pre-configured default dashboard containing 8 essential metrics.
These 8 metrics are automatically calculated when you execute your graphs.
| **Panel** | **What it Shows** |
| :------------------------------------ | :---------------------------------------------------- |
| **Graph Executions Total** | Count of graph executions |
| **Graph Executions Errors Total** | Count of total graph execution errors |
| **P50 / P99 Graph Execution Latency**| Percentile latency for the full graph execution |
| **Node Executions Total** | Count of node executions across all nodes |
| **Node Executions Errors Total** | Count of total node execution errors across all nodes |
| **P50 / P99 Node Execution Latency** | Percentile latency for all node executions |
| **LLM Node: Output Tokens Rate** | Number of LLM output tokens per unit of time |
| **P50 / P99 LLM Time to First Token Latency** | Percentile latency for time to first token |
## Building Custom Panels
Ready to create your own custom panels? Here's the step-by-step process:
1. Click on the **dashboard** you want to view
2. Click **New Panel** on the top right corner
3. Select a **chart type**: Time Series, Number, Table, Bar, or Pie Chart
The visual query builder makes it simple: choose a **metric** and an **aggregator**.
1. **Select a metric**: What you want to measure (e.g., `framework_executions_total`)
2. **Pick an aggregator**: How to calculate values (Count, Average, P99, etc.)
3. **Add filters** (optional): Use WHERE conditions, group by dimensions, or set time aggregation
4. Click **Stage & Run Query**
**End-to-End Example**
## Query Builder Guide
### Essential Fields
The core fields you'll use for most charts:
| **Field** | **What it Does** | **Example** |
|-----------|------------------|-------------|
| **Metric Name** | The specific metric to display | `framework_executions_total` |
| **Aggregate Operator** | How to calculate values | Count, Avg, P99, Sum |
| **WHERE** | Filter your data | `graph_id = "abc"` |
| **Group by** | Split data into separate lines | By service name, endpoint |
### Advanced Fields
For more complex queries and customization:
| **Field** | **What it Does** | **Example** |
|-----------|------------------|-------------|
| **Limit** | Maximum number of groups to show | 10 (top 10 services) |
| **HAVING** | Filter groups after aggregation | `GroupBy(operation) > 5` |
| **Order by** | Sort the results | By value desc, by name |
| **Aggregate Every** | Time resolution (seconds) | 60 = one point per minute |
| **Legend Format** | Customize chart labels | `{{service_name}}` |
### Aggregate Operators
These operators determine how your data is calculated and displayed:
**Basic Aggregations:**
- **NOOP**: No operation - shows raw metric values without aggregation
- **Count**: Number of events or data points
- **Count Distinct**: Number of unique values
- **Sum**: Adds up all values within each time period (e.g., total requests per minute)
- **Sum_increase**: Shows increase in cumulative counters over time (e.g., how much a "total requests" counter grew)
- **Avg**: Average value across all data points
- **Max**: Highest value in the dataset
- **Min**: Lowest value in the dataset
**Percentiles:**
- **P50**: 50th percentile (median)
- **P75**: 75th percentile (third quartile)
- **P90**: 90th percentile - only 10% of values are above this
- **P95**: 95th percentile - only 5% of values are above this
- **P99**: 99th percentile - only 1% of values are above this
**Rate Functions:**
- **Sum_rate**: Sum of individual rate calculations (e.g., total requests/sec across all services)
- **Avg_rate**: Average of individual rate calculations
- **Max_rate**: Maximum of individual rate calculations
- **Min_rate**: Minimum of individual rate calculations
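As a concrete illustration of what the percentile operators compute, here is a minimal nearest-rank percentile function in plain JavaScript. Portal computes these server-side; the latency values here are invented for the example:

```javascript
// Nearest-rank percentile: the value below which roughly p% of observations fall.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// Hypothetical per-request latencies in milliseconds.
const latenciesMs = [120, 95, 210, 180, 150, 99, 400, 130, 170, 160];

percentile(latenciesMs, 50); // → 150 (median latency)
percentile(latenciesMs, 99); // → 400 (the slowest ~1% boundary)
```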
### Time Aggregation Settings
**Time Range**: On the top right of your dashboard - Defines the time scope of your data (e.g., "last 30 minutes", "last 1 week")
**Aggregate Every**: In the Query Builder - Defines how your data is grouped over time (every 60 seconds, 300 seconds, etc.)
When you change the time range on the dashboard, the Aggregate Every field automatically adjusts to keep charts readable. For example, when you select "last 30 minutes", the default Aggregate Every is set to 60 seconds.
**Why this matters:**
- **Longer time ranges** → **Higher aggregation intervals** (fewer data points, smoother charts)
- **Shorter time ranges** → **Lower aggregation intervals** (more data points, more detail)
| **Selected Time Range** | **Aggregate Every (seconds)** | **Human Readable** |
|-------------------------|--------------------------------|--------------------|
| 30 minutes | 60 | 1 minute |
| 60 minutes | 60 | 1 minute |
| 1 hour | 60 | 1 minute |
| 3 hours | 60 | 1 minute |
| 6 hours | 60 | 1 minute |
| 12 hours | 120 | 2 minutes |
| 1 day | 300 | 5 minutes |
| 3 days | 900 | 15 minutes |
| 1 week | 1800 | 30 minutes |
| 10 days | 3600 | 60 minutes |
| 2 weeks | 3600 | 60 minutes |
| 1 month | 9000 | 2h 30min |
| 2 months | 18000 | 5 hours |
**When defaults might cause issues:**
**Problem: Too granular (gaps in chart)**
- **When**: Sparse data + short time range (e.g., events every 5 minutes but aggregating every 1 minute)
- **Result**: Lots of empty intervals, choppy chart with gaps
- **Fix**: **Increase** Aggregate Every (e.g., change from 60 to 300 seconds)
**Problem: Too coarse (missing detail)**
- **When**: Frequent data + long time range (e.g., events every 30 seconds but aggregating every 5 hours)
- **Result**: Important spikes and patterns get smoothed out
- **Fix**: **Decrease** Aggregate Every (e.g., change from 18000 to 3600 seconds)
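A quick way to sanity-check the combination of time range and Aggregate Every is to estimate how many points a panel will render, assuming one point per aggregation interval:

```javascript
// Points a panel renders = time range / aggregation interval (rounded up).
function chartPoints(timeRangeSeconds, aggregateEverySeconds) {
  return Math.ceil(timeRangeSeconds / aggregateEverySeconds);
}

chartPoints(30 * 60, 60);          // last 30 minutes at 1-minute resolution → 30 points
chartPoints(30 * 24 * 3600, 9000); // last month at 2h30min resolution → 288 points
```

A few dozen to a few hundred points is usually readable; far fewer suggests the interval is too coarse, and thousands suggests it is too granular.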
## Next Steps
Ready to grow your business metrics? Check out the resources below to get started!
Run robust A/B experiments and optimize your metrics
Learn how to build a simple chat experience using the Node.js SDK
---
#### Traces
Source: https://docs.inworld.ai/portal/traces
Traces provide detailed visibility into AI interactions, including timestamps, execution IDs, and durations. They support debugging, system analysis, and graph monitoring.
## Key Features
- **Performance Analysis**: Measure response times and system latency to identify bottlenecks.
- **Debugging & Troubleshooting**: Investigate errors and refine AI graphs using trace data.
- **[Coming Soon] Reusable Data**: Logged traces can be used to generate evaluation datasets in automated and human review workflows.
## Get Started with Traces
### Capturing Traces
To capture traces, you'll first need to ensure telemetry is enabled (it should be on by default). Follow the SDK specific instructions:
Tracing is **enabled by default**. Just set your API key:
```bash
export INWORLD_API_KEY=your-api-key
```
**Optional: Custom configuration**
Developers can customize telemetry behavior for better monitoring and performance:
```typescript
import { telemetry, LogLevel, ExporterType } from '@inworld/runtime';
telemetry.init({
  // Required
  apiKey: 'your-api-key',

  // Application identification (helps filter traces in Portal)
  appName: 'my-chat-app', // Appears as service.name in traces
  appVersion: '2.1.0', // Track different deployments

  // Custom endpoint (optional)
  endpoint: 'https://custom-telemetry.example.com',

  // Tracing configuration
  tracer: {
    samplingRate: 0.1, // Sample 10% of trace sessions (default: 1.0 = 100%)
  },

  // Export destination
  exporterType: ExporterType.REMOTE, // Send data to Inworld Portal (default remote)
});
```
**Why customize these settings?**
- **`appName`/`appVersion`**: Identify your app in Portal when monitoring multiple services
- **`tracer.samplingRate`**: Reduce overhead in high-traffic production (default: 1.0 = capture all traces)
- `0.1` = capture 10% of trace sessions (complete execution flows)
- Each captured trace still shows the full execution path
- Logs are NOT affected by trace sampling - they're captured based on logger.level setting regardless of whether the trace is sampled
- **`exporterType`**: Control where telemetry data is sent
- `REMOTE` (default): Send to Portal via HTTPS for monitoring and analysis
- `LOCAL`: Output to console/terminal for development debugging (no Portal)
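The sampling behavior can be pictured with a tiny sketch. This is illustrative only (per-session, probability-based sampling); it is not the Runtime's internal sampler:

```typescript
// Illustrative only: keep a trace session with probability `samplingRate`.
// The Runtime's actual sampler is internal; this just shows the idea.
function shouldSampleTrace(samplingRate: number): boolean {
  return Math.random() < samplingRate;
}

// With samplingRate 0.1, roughly 1 in 10 sessions keeps its full trace.
// Logs are still emitted for every session regardless of this decision.
const keepTrace = shouldSampleTrace(0.1);
```

Because the decision is made per session, a sampled session still contains every span of its execution path; sampling thins out how many sessions you see, not how detailed each one is.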
**To disable telemetry completely:**
If you do not want to capture telemetry, you can disable it by:
*Environment variable:*
```bash
export DISABLE_TELEMETRY=true
# or alternatively
export DISABLE_TELEMETRY=1
```
*Programmatically in code:*
```typescript
// Option 1: Set environment variable before importing the framework
// CORRECT - Set before any imports
process.env.DISABLE_TELEMETRY = 'true';
import { telemetry } from '@inworld/runtime'; // telemetry won't initialize
// WRONG - Set after import
import { telemetry } from '@inworld/runtime'; // telemetry already initialized!
process.env.DISABLE_TELEMETRY = 'true'; // too late
// Option 2: Shutdown telemetry after automatic initialization
import { telemetry } from '@inworld/runtime';
telemetry.shutdown();
```
Go to Edit \> Project Settings \> Plugins \> Inworld \> Telemetry and make sure Telemetry Enabled is checked.
Add the following code snippet to your application:
```cpp
// Initialize with application details
ConfigureTelemetry("my_application_name", "0.0.1");
```
**First-time users:** The Traces tab only appears after you have run your first graph execution. If you've run your first execution but still don't see the Traces tab, please sign out and sign back in.
### Viewing Traces
To view your Traces, navigate to the **Traces** tab in Portal.
- Select a timeframe to view relevant traces.
- Filter by Name, Trace ID, Span ID, Application Name, or Status Code.
- Click a trace to view:
- Execution ID
- Execution latency
- Span latencies and details
- Click a span to see:
- Span ID
- Tags
- Events
If you want to deep dive more into a specific trace, you can also look up any logs associated with that trace, by navigating to the **Logs** tab.
- Use the search bar to find [logs](/portal/logs) by Trace ID or Span ID.
- Examine [logs](/portal/logs) for additional context on the particular trace or span.
## Trace Metadata Definitions
| Field Name | Description | Defined by |
| :--------------- | :---------- | :----- |
| **Trace Level Fields** | | |
| `Execution ID` | Unique ID of one graph execution / one trace | Runtime / Developer can override |
| `Duration` | Duration of the trace execution in milliseconds | Runtime |
| `Graph ID` | Unique ID of the executing graph | Developer |
| `Graph Variant` | Graph variant being executed | Runtime |
| `App: Name` | Name of the application or service | Developer |
| `App: Version` | Application version | Developer |
| `App Instance ID` | Auto-generated unique instance identifier | Runtime |
| `User Context: Targeting_key` | Key to ensure users experience the same variant across sessions | Developer |
| `User Context: *` | Custom user attributes (e.g., user_context.age, user_context.user_id) | Developer |
| **Span Level Fields** | | |
| `Span ID` | Unique span identifier | Runtime |
| `Span Name` | Name of the span. Ex: Graph ID (for graph spans) or Node ID (for node spans) | Runtime |
| `Timestamp` | When the span started | Runtime |
| `Status` | Final status as string: "Unset", "Error", "Ok" | Runtime |
| `Service` | Either "workflows.Graph" or "workflows.Node" | Runtime |
| `Method` | Either "execute" or "process" | Runtime |
| `Node ID` | ID of the specific node being executed | Runtime |
| `Input` | Input to the graph/node/service | Runtime |
| `Output` | Output from the graph/node/service | Runtime |
## Best Practices
- **Enable Tracing in Development**: Ensure trace logging is active in your development setup for early diagnostics.
- **Use Filters for Faster Debugging**: Apply filters like Execution ID and service name to quickly find issues.
- **Monitor Performance Trends**: Track long-term trace data to uncover inefficiencies or anomalies.
- **Optimize Execution**: Dive deep into each trace execution to optimize each span.
- **Error Highlighting**: Spans with errors appear with red borders for immediate identification.
- **Error Propagation**: Dotted outlines indicate parent spans containing child spans with errors.
## When Should I Use Traces vs Logs?
As opposed to [logs](/portal/logs), which capture details about a specific event, traces show the flow of an entire execution.
Below is an overview of some key differences:
| **Aspect** | **Logs** | **Traces** |
| :-------------- | :-------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------- |
| **Purpose** | Capture discrete events (errors, warnings, info) | Capture end-to-end execution flows |
| **Scope** | Capture specific moments in time | Capture relationships between nodes |
| **Granularity** | Often capture detailed, low-level system information, including errors or performance metrics | Show the high-level flow of a request across nodes, including inputs/output |
| **Use Case** | Debugging, monitoring, auditing, performance tracking, and error reporting | End-to-end execution tracking, identifying bottlenecks, tracing dependencies |
**Use traces when you need to:**
- Visualize the complete flow of a request through your system
- Identify performance bottlenecks across multiple components
- Understand dependencies between different components
- Identify where failures are happening across the system (e.g., was knowledge not retrieved or did the model not utilize it?)
## Next Steps
Use trace insights to continue to optimize your AI graph performance!
Monitor your key metrics.
Learn how to run an experiment to optimize your metrics.
Learn how to build a simple chat experience using the Node.js SDK.
---
#### Logs
Source: https://docs.inworld.ai/portal/logs
Logs are text-based records that capture events occurring during the execution of your AI-powered application. They are essential tools for debugging, monitoring, and improving AI systems.
## Key Features
- **System Observability**: Track application events and errors to ensure smooth operation and quick resolution of issues.
- **Troubleshooting**: Analyze logs to identify the root cause of issues by examining error messages, system state, and the sequence of events leading to the problem.
Depending on your application, you may find it useful to log:
- The application that generated the logs
- Messages reported by the application
- Warnings or errors encountered during runtime
- Outputs or inputs to a specific component
## Capture Logs
If telemetry is configured, logs are automatically generated by the Inworld Agent Runtime during graph execution and sent to Portal.
Support for capturing your own custom logs in Portal is coming soon.
You can adjust the level of log detail by setting the `VERBOSITY` and `logger.level` options.
| **Control** | **Values** | **Default** | **Description** |
|:---------|:--------|:---------|:--------------|
| `VERBOSITY` | `0`, `1`, `2` | `0` | Controls what logs are generated: `0` = none, `1` = some, `2` = all |
| `logger.level` | `INFO`, `WARN`, `DEBUG`, `TRACE` | `INFO` | Controls what gets sent to Portal |
We recommend the following setup depending on your use case:
- **Default**: `VERBOSITY=0` + `LogLevel.INFO` - standard logs only
- **Debugging**: `VERBOSITY=1` + `LogLevel.DEBUG` - adds debug details
- **Max detail**: `VERBOSITY=2` + `LogLevel.TRACE` - everything
To get started, follow the SDK-specific instructions below:
Logging is **enabled by default**. Just set your API key:
```bash
export INWORLD_API_KEY=your-api-key
```
To set the `VERBOSITY`:
```bash bash
export VERBOSITY=1 # For debug logs
```
To set the `logger.level`:
```typescript graph.ts
import { telemetry, LogLevel } from '@inworld/runtime';
telemetry.init({
  apiKey: process.env.INWORLD_API_KEY,
  logger: { level: LogLevel.DEBUG }, // For Portal filtering
});
```
Go to Edit \> Project Settings \> Plugins \> Inworld \> Telemetry and enable telemetry. After enabling, you'll need to restart the editor.
Add the following code snippet to your application:
```cpp
// Initialize with application details
ConfigureTelemetry("my_application_name", "0.0.1");
```
**First-time users:** The Logs tab only appears after you have run your first graph execution. If you've run your first execution but still don't see the Logs tab, please sign out and sign back in.
## View Logs
- Click on a log to open a side view
- Toggle between the Overview and JSON view
- Use the **Filter** panel to sort by severity, Service name, Variant, and Graph ID
- Use the **Search** bar to filter logs by keywords or IDs.
## Log Metadata Definitions
| Field Name | Description |
| :------------- | :------------------------------------------------------------------------------------------- |
| `Timestamp` | Time of the log entry. |
| `Body` | Text of the log message. |
| `Execution ID` | Unique identifier for the associated trace (if it exists). You can find the trace in Traces. |
| `Span ID` | Identifier for the individual span (if it exists). |
| `Severity` | Log level: DEBUG, INFO, WARNING, ERROR. |
## When Should I Use Logs vs Traces?
As opposed to traces, which show the flow of a single execution, logs capture details about specific events, including detailed metadata unique to each event.
Below is an overview of some key differences:
| **Aspect** | **Logs** | **Traces** |
| :-------------- | :-------------------------------------------------------------------------------------------- | :--------------------------------------------------------------------------- |
| **Purpose** | Capture discrete events (errors, warnings, info) | Capture end-to-end execution flows |
| **Scope** | Capture specific moments in time | Capture relationships between nodes |
| **Granularity** | Often capture detailed, low-level system information, including errors or performance metrics | Show the high-level flow of a request across nodes, including inputs/output |
| **Use Case** | Debugging, monitoring, auditing, performance tracking, and error reporting | End-to-end execution tracking, identifying bottlenecks, tracing dependencies |
**Use logs when you need to:**
- Debug failure cases or unexpected behavior
- Track specific metrics or state changes
- Capture intermediate values within nodes (beyond inputs/outputs)
## Next Steps
View end-to-end graph executions to help with debugging and analysis.
Monitor your key metrics.
Learn how to build a simple chat experience using the Node.js SDK.
---
#### Metrics
Source: https://docs.inworld.ai/node/core-concepts/metrics
Metrics are quantitative measurements that provide insights into your application's performance, user behavior, and business outcomes. They serve as the foundation for data-driven decision making and continuous improvement.
- **Default metrics** are automatically collected for all graphs during execution
- **Custom metrics** are user-defined measurements that you can create to track specific behaviors or KPIs relevant to your use case
All metrics can then be visualized in [dashboards](/portal/dashboards) on Portal.
**Metric Dropdown:** Metrics appear in the dropdown only after being recorded. If you don't see any metrics in the dropdown, execute your graph to generate data. You can also enter metric names manually in the selector.
## Metric Types
Runtime supports the following types of metrics, with the option to use either integer or double precision:
- **Counters** - Track cumulative values that only increase (e.g., total interactions, errors)
```typescript
MetricType.COUNTER_UINT // Integer counter
MetricType.COUNTER_DOUBLE // Double counter
```
- **Gauges** - Track current values that can go up or down (e.g., active users, response latency)
```typescript
MetricType.GAUGE_INT // Integer gauge
MetricType.GAUGE_DOUBLE // Double gauge
```
- **Histograms** - Track value distributions over time with automatic percentile calculations (e.g., response time percentiles)
```typescript
MetricType.HISTOGRAM_UINT // Integer histogram
MetricType.HISTOGRAM_DOUBLE // Double histogram
```
## Default Metrics
These metrics are automatically calculated when you execute your graphs.
| **Metric** | **Metric Type** | **What it Shows** |
| :------------------------------------ | :--------------- | :------------------------------ |
| `framework_executions_duration` | Histogram | Distribution of graph execution duration / latency |
| `framework_executions_total` | Counter | Count of graph executions |
| `framework_llm_duration_total` | Counter | Total LLM execution duration |
| `framework_llm_generation_tokens_total` | Counter | Number of LLM output tokens per unit of time |
| `framework_llm_prompt_tokens_total` | Counter | Number of LLM input/prompt tokens per unit of time |
| `framework_time_to_first_token` | Histogram | Distribution of latency for time to first token across all LLM node executions |
Histogram metrics appear as 3 separate metrics in the metric dropdown selector:
- `_bucket` (distribution)
- `_count` (total observations)
- `_sum` (sum of values)
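For example, since `_sum` and `_count` are exported separately, you can recover the mean of a histogram from them. The snippet below is plain arithmetic; the series names follow the pattern above and the values are made up for illustration:

```typescript
// Derive the average from a histogram's exported _sum and _count series.
// Values here are illustrative, not real telemetry data.
const responseTimeSum = 12.25; // response_time_seconds_sum (total seconds)
const responseTimeCount = 50;  // response_time_seconds_count (observations)

const meanSeconds = responseTimeSum / responseTimeCount; // ~0.245s average
```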
## Custom Metrics
Custom metrics complement the default metrics to give you a complete picture of your application's performance. These can include:
- Engagement KPIs (e.g., session length, retention rates)
- Business metrics (e.g., in-app purchases, subscription revenue, model costs)
- User feedback tracking (e.g., thumbs up/down)
To create a custom metric:
```typescript
import { telemetry } from '@inworld/runtime';
telemetry.init({
  apiKey: 'your-inworld-api-key',
  appName: 'MyApp',
  appVersion: '1.0.0',
});
```
```typescript
import { MetricType } from '@inworld/runtime/telemetry';
// Configure metrics once at startup
telemetry.configureMetric({
  metricType: MetricType.COUNTER_UINT,
  name: 'user_interactions_total',
  description: 'Total user interactions',
  unit: 'interactions',
});

telemetry.configureMetric({
  metricType: MetricType.HISTOGRAM_DOUBLE,
  name: 'response_time_seconds',
  description: 'Response time distribution',
  unit: 'seconds',
});
```
```typescript
// Track user interactions
telemetry.metric.recordCounterUInt('user_interactions_total', 1, {
  userId: 'user123',
  feature: 'chat',
});

// Track response times
telemetry.metric.recordHistogramDouble('response_time_seconds', 0.245, {
  endpoint: '/chat',
  model: 'llm-1',
});
```
## Using Metrics with Experiments
When running experiments, it's important to track metrics to understand how different variants impact key metrics like latency or engagement. Metrics can be tracked with attributes that identify which experiment and variant they are associated with.
Below is an example demonstrating how to execute a graph with user context and log relevant metrics along with the user context:
```typescript
// Initialize telemetry and configure metrics
telemetry.init({
  apiKey: process.env.INWORLD_API_KEY, // replace with your API key
  appName: 'ChatApp',
  appVersion: '1.0.0',
});

// Configure metrics
telemetry.configureMetric({
  metricType: MetricType.COUNTER_UINT,
  name: 'chat_interactions_total',
  description: 'Total chat interactions',
  unit: 'interactions',
});

telemetry.configureMetric({
  metricType: MetricType.HISTOGRAM_DOUBLE,
  name: 'response_latency_seconds',
  description: 'Response time distribution',
  unit: 'seconds',
});

// Execute graph with user context and metrics
async function handleUserMessage(userId: string, message: string) {
  const startTime = performance.now();

  // Create user context with targeting information
  const userContext = new UserContext({
    userId: userId,
    userTier: 'premium',
    region: 'us-west',
  }, userId); // targetingKey

  try {
    // Create graph using GraphBuilder
    const myGraph = new GraphBuilder({
      id: 'chat-graph',
      apiKey: process.env.INWORLD_API_KEY,
      enableRemoteConfig: false,
    })
      // Add your nodes and edges here
      .build();

    const outputStream = myGraph.start({ text: message }, userContext);

    // Process the response
    for await (const result of outputStream) {
      // Handle graph output here
    }

    // Record success metrics
    const latency = (performance.now() - startTime) / 1000;
    telemetry.metric.recordCounterUInt('chat_interactions_total', 1, {
      userId: userId,
      userTier: userContext.attributes.userTier,
      status: 'success',
    });
    telemetry.metric.recordHistogramDouble('response_latency_seconds', latency, {
      userTier: userContext.attributes.userTier,
      messageLength: message.length.toString(),
    });
  } catch (error) {
    // Record error metrics
    const latency = (performance.now() - startTime) / 1000;
    telemetry.metric.recordCounterUInt('chat_interactions_total', 1, {
      userId: userId,
      userTier: userContext.attributes.userTier,
      status: 'error',
      errorType: error.name,
    });
    telemetry.metric.recordHistogramDouble('response_latency_seconds', latency, {
      userTier: userContext.attributes.userTier,
      status: 'error',
    });
    throw error;
  }
}
```
This approach enables you to:
- **Pass user context**: UserContext provides targeting information for graph execution
- **Track performance by user segments**: Use user attributes (tier, region) in metric tags
- **Measure real latency**: Track actual response times under real conditions
- **Monitor errors**: Record both success and failure metrics with context
## Next Steps
View and analyze your metrics with custom dashboards
Run robust A/B experiments and optimize your metrics
---
#### Experiments
Source: https://docs.inworld.ai/portal/graph-registry
Inworld Agent Runtime lets you iterate on prompts, models, and other LLM and TTS configs without redeploying code. This guide walks through the entire workflow—from CLI prep to Portal configuration.
## Summary
1. **Code your agent** – Code your agent, enable remote config, and deploy via CLI.
2. **Register variants** – Use `inworld graph variant register` to register the graph for experiments, or upload the JSON via the UI.
3. **Start experiment** – Set targeting + rollout percentages in the Experiments tab in Portal.
4. **Monitor & roll out** – Monitor metrics dashboards, then promote the winner.
```mermaid
flowchart TD
    A[CLI: Build & Test] --> B[CLI: Deploy + Register Variants]
    B --> C[Portal: Start Experiment]
    C --> D[Portal: Roll out Winning Variant]
    D --> A

    style A fill:#e1f5fe
    style B fill:#e1f5fe
    style C fill:#e8f5e8
    style D fill:#e8f5e8
```
## Experiment workflow
### Step 1 – Code your agent
Install the CLI:
```bash
npm install -g @inworld/cli
```
Pull a template project:
```bash
inworld init --template llm-to-tts-node --name my-agent
cd my-agent
```
Enable remote config in your graph and add user context to enable targeting:
```typescript
const graph = new GraphBuilder({
  id: 'my-graph-id',
  apiKey: process.env.INWORLD_API_KEY,
  enableRemoteConfig: true, // Required for Experiments
})
  .addNode(llmNode)
  .setStartNode(llmNode)
  .setEndNode(llmNode)
  .build();

const userContext = new UserContext({ // Add user context
  tier: user.tier,
  country: user.country,
  app_version: '2.1.0',
}, user.id); // targeting key
```
Test locally before deploying:
```bash
inworld run ./graph.ts '{"input":{"message":"Hello"},"userContext":{"targetingKey":"user123","attributes":{"tier":"premium","country":"US"}}}'
```
Add custom telemetry so you can monitor experiment KPIs later:
```typescript
telemetry.init({
  apiKey: process.env.INWORLD_API_KEY,
  appName: 'cli-ops',
  appVersion: '1.0.0',
});

telemetry.configureMetric({
  metricType: MetricType.COUNTER_UINT,
  name: 'successful_interactions',
  description: 'Count of responses that reached business success criteria',
});
```
### Step 2 – Register variants
Registering a variant means telling Portal which configuration should be available in Experiments. Start with the baseline version, then add additional variants.
**Register the baseline variant**
Choose the registration workflow that matches your deployment:
- **Hosted endpoint (recommended):** Deploying with the CLI automatically registers the graph ID and baseline variant:
```bash
inworld deploy ./graph.ts
```
- **Self-host + CLI:** After building your graph, push the baseline configuration manually:
```bash
inworld graph variant register -d baseline ./graph.ts
```
- **Self-host + UI:** Export the graph JSON and upload it in Portal:
```bash
inworld graph variant print ./graph.ts > baseline.json
```
Then in Portal:
1. Open **Register Graph** and enter the graph ID exactly as defined in your code.
2. Click **Create Variant**, name it, and upload `baseline.json`.
**Add additional variants**
Create an experimental copy and change the LLM provider (or any other settings):
```bash
cp graph.ts graph-claude.ts
```
```typescript
const llmNode = new RemoteLLMChatNode({
  provider: 'openai',
  modelName: 'gpt-5-mini',
  stream: true,
  textGenerationConfig: { maxNewTokens: 200 },
});
```
```typescript
const llmNode = new RemoteLLMChatNode({
  provider: 'anthropic',
  modelName: 'claude-4-sonnet',
  stream: true,
  textGenerationConfig: { maxNewTokens: 200 },
});
```
Register the experimental variant:
```bash
inworld graph variant register -d claude ./graph-claude.ts
```
List the variants tied to your graph:
```bash
inworld graph variant list ./graph.ts
```
### Step 3 – Start an experiment
**Set targeting rules**
Open the **Targeting & Rollout** tab → **+ Rule** and configure the following:
- Add filters on user attributes (`user_tier`, `location`, etc.) and assign percentages that sum to 100%.
- Save the rule and repeat for additional cohorts (keep specific rules at the top; use "Everyone else" as the fallback).
Order matters—rules are evaluated top to bottom.
Traffic allocation only works if requests include a targeting key and attributes (built in Step 1).
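Top-to-bottom rule evaluation can be pictured in code. This is a hypothetical sketch of the logic, not Portal's server-side implementation; all names here are illustrative:

```typescript
// Hypothetical sketch of top-to-bottom targeting-rule evaluation.
// Portal's real matcher runs server-side; names here are illustrative.
type Rule = {
  matches: (attrs: Record<string, string>) => boolean;
  variant: string;
};

function evaluateRules(
  rules: Rule[],                  // ordered: most specific rules first
  attrs: Record<string, string>,  // the request's user attributes
  fallback: string,               // behavior when no rule matches
): string {
  for (const rule of rules) {
    if (rule.matches(attrs)) return rule.variant; // first match wins
  }
  return fallback;
}

const rules: Rule[] = [
  { matches: (a) => a.user_tier === 'premium', variant: 'claude' },
  { matches: () => true, variant: 'baseline' }, // "Everyone else"
];

evaluateRules(rules, { user_tier: 'premium' }, 'baseline'); // → 'claude'
```

Because the first matching rule wins, placing a broad "Everyone else" rule above a specific cohort rule would shadow the cohort entirely, which is why specific rules belong at the top.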
**Start the experiment**
Rules are disabled by default:
- Use the rule menu → **Enable**, then click **Save** to go live.
- Start with small allocations (10–20%) to validate; increase once metrics look good.
- Use the same rule menu to disable, delete, duplicate, or drag-to-reorder rules.
### Step 4 – Monitor & roll out
Monitor your experiment results and deploy the winner:
- Watch [metrics](/node/core-concepts/metrics), [dashboards](/portal/dashboards), [traces](/portal/traces), and [logs](/portal/logs) while the experiment runs.
- Increase the winning variant's allocation gradually (50/50 → 70/30 → 90/10), then set it to 100% and retire old rules.
- Roll back or tweak allocations if latency, errors, or business KPIs regress.
## How Experiments picks variants
When a request hits your graph, the runtime decides whether to use the local configuration or a remote variant from Experiments:
1. Remote config must be enabled.
2. The graph ID must be registered in Experiments and have at least one active rule that returns a variant.
```mermaid
flowchart TD
    A[Graph Execution Request] --> B{enableRemoteConfig == true}
    B -->|NO/DEFAULT| C[Static Config]
    C --> D[Use local graph configuration]
    B -->|YES| E[Remote Config]
    E --> F{Experiments returns a variant?}
    F -->|NO| G[Use local graph configuration]
    F -->|YES| H[Use Experiments variant]

    style A fill:#e1f5fe
    style D fill:#c8e6c9
    style G fill:#c8e6c9
    style H fill:#fff3e0
```
**If remote config is enabled**, Experiments evaluates each request as follows:
1. **Local cache check:** If the compiled variant for this user is cached, it executes immediately; otherwise Experiments is queried.
2. **Variant fetch:** Experiments evaluates your targeting rules, returns the selected variant, and falls back to the local configuration if no rule applies or the fetch fails.
3. **Compile & cache:** The runtime compiles the variant payload, caches it, and executes the graph with the new configuration.
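The three-step resolution above can be sketched in code. This is an illustrative outline of the decision order only; none of these types or functions are part of the Inworld Runtime API:

```typescript
// Hypothetical sketch of per-request variant resolution.
// None of these names exist in the Inworld Runtime API.
type Variant = { id: string; payload: unknown };

const cache = new Map<string, Variant>();

async function resolveVariant(
  targetingKey: string,
  fetchFromExperiments: (key: string) => Promise<Variant | null>,
  localConfig: Variant,
): Promise<Variant> {
  // 1. Local cache check: reuse the compiled variant for this user.
  const cached = cache.get(targetingKey);
  if (cached) return cached;

  // 2. Variant fetch: ask Experiments; fall back to the local
  //    configuration if no rule applies or the fetch fails.
  let variant: Variant | null = null;
  try {
    variant = await fetchFromExperiments(targetingKey);
  } catch {
    variant = null;
  }
  const chosen = variant ?? localConfig;

  // 3. Compile & cache: store the result so later requests skip the fetch.
  cache.set(targetingKey, chosen);
  return chosen;
}
```

The fallback on fetch failure matters for reliability: even if Experiments is unreachable, the graph still executes with its local configuration.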
## Troubleshooting
**Why are all my users getting the same variant despite setting traffic splits?**
This happens when the UserContext is not properly configured in your code. Here's what you need to check:
1. **Specify a targeting key:** This is typically the user ID and ensures the same user gets consistent variants.
```typescript
// CORRECT: Each user gets a consistent variant
const userContext = new UserContext(
  {
    country: country // attributes for targeting rules
  },
  userId // targeting key
);

await graphExecutor.execute(input, executionId, userContext);
```
```typescript
// INCORRECT: All users get the same variant
await graphExecutor.execute(input, executionId);
```
2. **Include targeting attributes:** Make sure to pass any attributes that your targeting rules use (e.g., country, user_tier, etc.).
If you don't specify a targeting key, all users will share the same default key, causing everyone to get the same variant regardless of your traffic split settings.
**How can I tell if my graph is using remote config or static config?**
Confirm that `enableRemoteConfig: true` is set in your CLI project (Step 1) and inspect your application [logs](/portal/logs) for Experiments fetch messages. If remote config is disabled, the runtime always executes the local configuration.
**How do I know if Experiments are working?**
- Ensure remote config is enabled and the graph ID is registered in Experiments.
- Verify that the relevant targeting rules are enabled and percentages sum to 100%.
- Pass different user attributes to confirm that variant assignments change as expected.
- Monitor logs/dashboards for variant metadata or assignment info.
**Why is my graph always using the local configuration?**
Check these common causes:
1. Missing `INWORLD_API_KEY` so the graph never authenticates.
2. `enableRemoteConfig` is set to `false`.
3. Graph not registered in Experiments.
4. No active targeting rules or no variants configured for the matched cohort.
**What changes can I upload without redeploying code?**
Supported via Experiments:
- Switch LLM/STT/TTS models or providers.
- Adjust node configuration (temperature, token limits, prompts).
- Reorder/add/remove built-in nodes while preserving the same inputs/outputs.
- Update processing logic (edge conditions, preprocessing steps, data flow).
Requires a code deployment:
1. Adding new component types.
2. Introducing unregistered custom nodes.
3. Changing the graph's input/output interface.
4. Using custom edge conditions beyond supported CEL expressions.
See [UserContext](/node/core-concepts/user-context) for additional targeting guidance.
---
#### User Context
Source: https://docs.inworld.ai/node/core-concepts/user-context
UserContext provides user-specific information during graph execution. It consists of user attributes (key-value pairs) and a targeting key that determines which graph variant a user receives.
## Why Use UserContext?
- **A/B Testing**: Enable sophisticated targeting rules for experiments
- **Session Consistency**: User context's targeting key ensures the same user always gets the same experience across sessions
- **Personalization**: Pass user attributes to customize graph behavior
## What Happens Without UserContext?
If running A/B experiments, always pass UserContext with a unique targeting key. Otherwise, all users get the same variant regardless of your split percentages.
**Without UserContext:** The system uses a single default targeting key for all requests, causing 100% of users to get assigned to the same variant instead of distributing according to your configured splits (50/50, 30/70, etc.).
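Why the targeting key matters can be illustrated with a deterministic bucketing sketch. This is hypothetical; the Runtime's actual assignment algorithm is internal and may differ:

```typescript
// Illustrative only: hash a targeting key into a 0-99 bucket and map it
// to a 50/50 split. Not the Runtime's actual assignment algorithm.
function bucketFor(targetingKey: string): number {
  let hash = 0;
  for (const ch of targetingKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple string hash
  }
  return hash % 100;
}

function assignVariant(targetingKey: string): 'A' | 'B' {
  return bucketFor(targetingKey) < 50 ? 'A' : 'B';
}

// Distinct keys spread users across both variants, and the same key
// always lands in the same bucket (session consistency)...
const u1 = assignVariant('user_12345');
const u2 = assignVariant('user_67890');

// ...but a single shared default key pins every request to one variant,
// which is exactly what happens when no UserContext is passed.
const everyone = assignVariant('default');
```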
## How to Use UserContext
### Basic Usage
```typescript
// Create UserContext
const userContext = new UserContext(
  {
    user_id: 'user_12345',
    age: '25',
    country: 'US',
    user_tier: 'premium'
  },
  'user_12345' // targeting key
);

// Execute graph with context
const outputStream = graph.start(inputData, userContext);
```
### Schema
```typescript
new UserContext(
  attributes: { [key: string]: string }, // Required if creating UserContext
  targetingKey: string // Required if creating UserContext
)
```
## Common Attributes
Attributes are flexible key-value pairs. Common examples:
- **`user_id`**: Primary user identifier
- **`age`**: User age for demographic targeting
- **`country`**: Country code ('US', 'CA', 'UK')
- **`user_tier`**: Subscription level ('free', 'premium')
- **`app_version`**: Version of your application
## Next Steps
Learn how to register your graphs and run A/B experiments
Learn best practices for A/B experiments
---
#### Inworld CLI
Source: https://docs.inworld.ai/node/cli/overview
The [Inworld CLI](https://www.npmjs.com/package/@inworld/cli) is a comprehensive command-line toolkit that streamlines the entire lifecycle of Runtime graph development - from initial prototyping to production deployment and continuous optimization.
To download and install Inworld CLI, run the following command:
```bash npm
npm install -g @inworld/cli
```
## Commands
The CLI follows a logical command hierarchy that matches developer workflows:
```
inworld
├── auth # Authentication
│ ├── login # User authentication (alias: inworld login)
│ ├── logout # Sign out (alias: inworld logout)
│ ├── print-access-token # Display access token
│ ├── print-api-key # Display API key
│ └── status # Show authentication status (alias: inworld status)
├── init # Project creation with templates
├── graph # Graph management and deployment
│ ├── run # Local testing and development (alias: inworld run)
│ ├── serve # Local HTTP/gRPC servers (alias: inworld serve)
│ ├── deploy # Cloud deployment (alias: inworld deploy)
│ ├── visualize # Generate architectural diagrams
│ └── variant # A/B testing and experimentation
├── tts # Text-to-speech synthesis
└── workspace # Workspace management
```
## Project Initialization
The `inworld init` command downloads the `llm-to-tts-node` template—a production-ready LLM to TTS pipeline.
Currently, only the `llm-to-tts-node` template is available via CLI. CLI support for fetching additional templates is coming soon. To view all available templates, visit the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node).
For graph-only examples that demonstrate specific Runtime features, clone the [templates repository](https://github.com/inworld-ai/inworld-runtime-templates-node) directly.
```bash
# Initialize with the llm-to-tts-node template
inworld init --template llm-to-tts-node --name my-ai-app
# Skip dependency installation prompt
inworld init --template llm-to-tts-node --name my-ai-app --skip-install
# Force refresh template cache (download fresh copy even if cached)
inworld init --template llm-to-tts-node --name my-ai-app --force
```
**Init Command Options:**
- `-n, --name <name>` - Name of the project directory
- `-t, --template <template>` - Template to use (currently only `llm-to-tts-node`)
- `--skip-install` - Skip dependency installation prompt
- `--force` - Force refresh template cache (download fresh copy even if cached)
## Working with Graphs
Once you have a `graph.ts` file (either from a template project or created manually), you can use the following commands to work with your graph:
### Local Testing
Test your graph locally with instant feedback:
```bash
# Test locally with instant feedback
inworld run ./graph.ts '{"input": {"user_input": "test message"}}'
```
### Visualization
Generate architectural diagrams of your graph:
```bash
# Visualize complex graph architectures
inworld graph visualize ./graph.ts --output ./docs/architecture.png
```
Graph visualization requires [Graphviz](https://graphviz.org/) to be installed on your system.
### Local Server
Serve your graph locally with support for both HTTP (with optional Swagger UI) and gRPC:
```bash
# Serve on HTTP Server with Swagger
inworld serve ./graph.ts --swagger
# Serve on custom port
inworld serve ./graph.ts --port 8080
# Serve on custom host
inworld serve ./graph.ts --host 0.0.0.0 --port 3000
# Combine options
inworld serve ./graph.ts --host 0.0.0.0 --port 8080 --swagger
# gRPC server for high-performance applications
inworld serve ./graph.ts --transport grpc
```
## Cloud Deployment
Once you’ve successfully built your graph, you can deploy it to Inworld Cloud to create a persistent, production-ready endpoint. See [Cloud Deployment](/node/cli/deploy) for more details.
```bash
# Deploy to cloud with persistent endpoint
inworld deploy ./graph.ts
```
## A/B Testing & Experimentation
You can use the CLI to create and manage graph variants, such as model or prompt changes, for [A/B testing and experimentation](/portal/graph-registry). Your clients will automatically use the latest variants—no client-side updates required.
```bash
# Register different variants for testing
inworld graph variant register -d baseline ./graph-v1.ts
inworld graph variant register -d experiment ./graph-v2.ts
# Configure traffic splits in Portal, monitor results
```
## TTS
Use the CLI to quickly get started with [Inworld TTS](/tts/tts). Generate speech content with direct TTS commands:
```bash
# Direct TTS synthesis for immediate use
inworld tts synthesize "Hello world" --voice Dennis --output audio.mp3
```
To create graphs that use TTS, initialize a template project with `inworld init --template llm-to-tts-node` (see [Project Initialization](#project-initialization) above).
## Telemetry
The Inworld CLI collects general usage data to help us improve the tool.
The following information is tracked:
- Command execution time
- CLI version
- Command name
**Managing Telemetry**
Inworld takes privacy and security seriously.
You can opt-out if you prefer not to share any telemetry information.
```bash
# Disable telemetry collection
inworld auth disable_telemetry
# Enable telemetry collection
inworld auth enable_telemetry
```
## Next Steps
Ready to start building with the Inworld CLI?
Guide to installation, authentication, and first project setup
Deploy your graphs to production with persistent endpoints
## Need Help?
- **[Troubleshooting Guide](/node/cli/troubleshooting)** - Common issues and solutions
- **CLI Help**: Run `inworld help` or `inworld [command] --help` for detailed command information
---
#### Troubleshooting
Source: https://docs.inworld.ai/node/cli/troubleshooting
This guide walks through troubleshooting for common CLI setup and development issues.
## Installation Issues
### "Permission denied" during installation
```bash
npm install -g @inworld/cli
# Error: EACCES: permission denied
```
**On macOS/Linux:**
```bash
sudo npm install -g @inworld/cli
```
**Alternative (without sudo):**
```bash
# Configure npm to use a different directory
mkdir ~/.npm-global
npm config set prefix '~/.npm-global'
echo 'export PATH=~/.npm-global/bin:$PATH' >> ~/.bashrc
source ~/.bashrc
# Then install without sudo
npm install -g @inworld/cli
```
### "inworld: command not found" after installation
```bash
npm list -g @inworld/cli
# Should show the installed package
```
```bash
# Find where npm installs global packages
npm config get prefix
# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
export PATH="$(npm config get prefix)/bin:$PATH"
# Reload your shell
source ~/.bashrc # or ~/.zshrc
```
### Node.js version issues
```bash
node --version
# Must be v18 or higher
```
**Using nvm (recommended):**
```bash
# Install nvm if not already installed
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
# Install and use Node 20+
nvm install 20
nvm use 20
nvm alias default 20
```
**Or download from nodejs.org:**
Visit [nodejs.org](https://nodejs.org) and download the latest LTS version.
## Authentication Issues
### Login opens browser but fails
```bash
inworld login --force
```
```bash
# Clear cached credentials and retry
inworld logout
inworld login
# Check status
inworld status
```
### "Authentication Error" on commands
```bash
inworld status
```
```bash
# Token expired? Re-authenticate
inworld login --force
# Wrong workspace? Select correct one
inworld workspace select
# Verify API key is set (for deployed apps)
echo $INWORLD_API_KEY
```
### Multiple workspaces confusion
```bash
inworld workspace select
# Shows interactive list of available workspaces
```
```bash
inworld status
# Shows currently active workspace
```
## Local Development Issues
### "Graph compilation failed"
```bash
# Compile TypeScript directly to check for syntax errors
npx tsc --noEmit ./graph.ts
```
**Missing Graph ID:**
```typescript
// ❌ Wrong
const graph = new GraphBuilder({
apiKey: process.env.INWORLD_API_KEY
})
// ✅ Correct
const graph = new GraphBuilder({
id: 'my-graph-id', // Required!
apiKey: process.env.INWORLD_API_KEY
})
```
**Missing Dependencies:**
```bash
npm install # Make sure all packages are installed
```
### "Module not found" errors
```bash
# In your project directory
npm install
# Verify package.json exists
ls -la package.json
```
```typescript
// ✅ Correct import
import { GraphBuilder, LLMChatNode } from '@inworld/runtime';
// ❌ Wrong import path
import { GraphBuilder } from 'inworld-runtime';
```
### `inworld serve` not starting
```bash
# Default port 3000 in use?
lsof -i :3000
# Use different port
inworld serve ./graph.ts --port 8080
```
```bash
# Test graph compilation first
inworld run ./graph.ts '{"input":"test"}'
# If that works, then try serve
inworld serve ./graph.ts
```
### `inworld run` hangs or times out
```bash
# Verify API key is set
echo $INWORLD_API_KEY
# Or set it for the session
export INWORLD_API_KEY="your-key-here"
```
```bash
# Try minimal input first
inworld run ./graph.ts '{"input":"hi"}'
# Then gradually add complexity
inworld run ./graph.ts '{
"input": {"message": "hello"},
"userContext": {"targetingKey": "test"}
}'
```
## Project Issues
### `inworld init` creates empty project
```bash
# Initialize with the llm-to-tts-node template
inworld init --template llm-to-tts-node --name my-project
```
```bash
# If template fails, create manually
mkdir my-project && cd my-project
npm init -y
npm install @inworld/runtime
# Copy example from docs or templates
```
### "Dependencies not installed" after `inworld init`
```bash
cd your-project-name
npm install
```
```bash
# Init without auto-install, then install manually
inworld init --template llm-to-tts-node --name my-project --skip-install
cd my-project
npm install
```
### Template cache issues or outdated template
```bash
# Force download fresh template even if cached
inworld init --template llm-to-tts-node --name my-project --force
```
**Note:** The template is cached for 24 hours. Use `--force` to bypass the cache and download the latest version from GitHub.
### Graph shows unexpected behavior
```bash
# Visualize your graph
inworld graph visualize ./graph.ts
# Print graph JSON for inspection
inworld graph variant print ./graph.ts
```
```bash
# Enable verbose logging
export DEBUG=inworld:*
inworld run ./graph.ts '{"input":"test"}'
```
## TTS Issues
### "Voice not found" error
```bash
inworld tts list-voices
```
```bash
# Use default voice if specific one fails
inworld tts synthesize "test message"
```
### TTS synthesis fails
```bash
inworld status
# Verify authentication is valid
```
```bash
# Test with minimal text
inworld tts synthesize "hello"
# Check output directory permissions
ls -la ./
```
## Production Issues
### Deployment Problems
**Problem**: "Graph deployment fails"
```bash
# Check authentication
inworld auth status
```
```
❌ Authentication Error
Token Status:
Inworld Token: eyJraWQiOiJkZWZhdWx0...
Expires At: 12/20/2024, 10:30:45 AM
Status: Expired ❌
⚠️ Please re-authenticate: inworld login
```
```bash
# Verify graph compiles locally
inworld run ./graph.ts '{"input": "test"}'
```
```
❌ Graph Compilation Failed
Error: Missing required configuration
File: ./graph.ts
Line: 15
Issue: Graph ID is required but not specified
💡 Fix: Add id to GraphBuilder constructor:
const graph = new GraphBuilder({ id: 'my-graph-id', ... })
```
```bash
# Check deployment info
inworld deploy ./graph.ts --info
```
```
📋 Deployment Information
Graph Details:
Graph ID: my-graph-id
Source: ./graph.ts
Status: Not deployed ❌
⚠️ This graph has not been deployed yet.
Run: inworld deploy ./graph.ts
```
```bash
# Package locally first to debug
inworld deploy ./graph.ts --package-only
```
```
❌ Packaging Failed
Validation Errors:
- Missing API key in environment
- Graph compilation failed: Syntax error on line 23
- Dependencies not installed: run 'npm install'
💡 Next steps:
1. Set INWORLD_API_KEY environment variable
2. Fix syntax errors in graph.ts
3. Run 'npm install' to install dependencies
```
**Problem**: "Endpoint not responding after deployment"
**Debug Steps:**
1. Check deployment status: `inworld deploy ./graph.ts --info`
2. Verify API key is correct: `inworld auth status`
3. Test with simple input: `curl -X POST [endpoint-url] -H "Authorization: Bearer [api-key]" -d '{"input": "test"}'`
4. Check Portal logs for errors: Navigate to [Portal Logs](/portal/logs)
**Problem**: "High latency after deployment"
**Optimization Steps:**
1. Review graph complexity with visualization
2. Optimize LLM calls (reduce temperature, limit tokens)
3. Remove unnecessary nodes from execution path
4. Use streaming responses where possible
5. Monitor performance in [Portal Dashboards](/portal/dashboards)
**Problem**: "Metrics/traces not appearing in Portal"
**Solutions:**
1. Verify API key is set: `echo $INWORLD_API_KEY`
2. Check telemetry is enabled (default: ON)
3. Wait 1-2 minutes for data to appear
4. Check Portal time range includes execution time
**Problem**: "Graph always uses local config instead of variants"
**Check these issues:**
1. **Missing enableRemoteConfig**: Set `enableRemoteConfig: true` in GraphBuilder
2. **Graph not registered**: Register graph ID in Portal Graph Registry
3. **No active rules**: Enable targeting rules in Portal
4. **Missing API key**: Ensure `INWORLD_API_KEY` environment variable is set
### Experiment Issues
**Problem**: "All users getting the same variant despite traffic splits"
**Causes & Solutions:**
1. **Missing enableRemoteConfig**
```typescript
// WRONG: Remote config disabled (default)
const graph = new GraphBuilder({
id: 'my-graph-id',
apiKey: process.env.INWORLD_API_KEY
})
// CORRECT: Enable remote config
const graph = new GraphBuilder({
id: 'my-graph-id',
apiKey: process.env.INWORLD_API_KEY,
enableRemoteConfig: true // Required for experiments
})
```
2. **Missing or Invalid UserContext**
```bash
# WRONG: No targeting key - all users get same variant
inworld run ./graph.ts '{"input": "hello"}'
```
```json
{
"output": {
"text": "Hello! This is the baseline response.",
"variant": "baseline"
},
"userContext": {
"targetingKey": "default",
"attributes": {}
},
"warning": "All users assigned to same variant due to default targeting key"
}
```
3. **Graph Not Registered in Portal**
- Verify graph ID matches between CLI and Portal
- Check Graph Registry shows your graph as "Registered"
- Ensure variants are uploaded and active
4. **Rules Disabled in Portal**
- Check targeting rules are **Enabled** in Graph Registry
- Verify traffic distribution adds up to 100%
- Confirm rule filters match your user attributes
**Problem**: "Users not getting expected variant distribution"
**Debug Steps:**
1. **Check UserContext**: Ensure unique targeting keys are passed
2. **Verify Rules**: Confirm targeting rules are enabled in Portal
3. **Traffic Distribution**: Verify percentages add up to 100%
4. **Rule Filters**: Ensure user attributes match rule criteria
```bash
# Test with explicit user context
inworld run ./graph.ts '{
"input": "test",
"userContext": {
"targetingKey": "debug-user-123",
"attributes": {"tier": "premium", "region": "us"}
}
}'
```
```json
{
"output": {...},
"experimentInfo": {
"rule": "Premium Users Test",
"variantAssignment": "experiment-v1",
"trafficWeight": 30
}
}
```
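The role of the targeting key can be illustrated with a self-contained sketch of deterministic hash-based bucketing. This is not Inworld's actual assignment algorithm; it only shows why a unique, stable `targetingKey` per user is required for meaningful traffic splits:

```typescript
// Generic hash-based bucketing sketch (illustrative, not the SDK's algorithm).
function assignVariant(
  targetingKey: string,
  variants: { name: string; weight: number }[], // weights should sum to 100
): string {
  // Deterministic string hash: the same key always yields the same bucket.
  let hash = 0;
  for (const ch of targetingKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  }
  const bucket = hash % 100;
  // Walk the cumulative weights to pick the variant for this bucket.
  let cumulative = 0;
  for (const v of variants) {
    cumulative += v.weight;
    if (bucket < cumulative) return v.name;
  }
  return variants[variants.length - 1].name;
}
```

With a constant key such as `"default"`, every request hashes to the same bucket and therefore the same variant, which is exactly the failure mode described above.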
**Problem**: "Variant performance differs significantly"
**Debug Steps:**
1. **Check variant complexity**: Use `inworld graph visualize` for each variant
2. **Compare configurations**: Review model parameters, prompt lengths, node counts
3. **Monitor metrics**: Use Portal dashboards to compare P99 latency across variants
4. **Test individually**: Run each variant independently to isolate performance issues
## Quick Diagnostic Commands
When things go wrong, run these commands to gather diagnostic info:
```bash
# 1. Check CLI version and help
inworld --version
inworld help
# 2. Verify authentication
inworld status
# 3. Test basic functionality
inworld run ./graph.ts '{"input":"test"}'
# 4. Check project structure
ls -la
cat package.json
# 5. Verify Node.js and npm
node --version
npm --version
```
## Environment Setup
### Set up development environment variables
```bash
# Add to ~/.bashrc, ~/.zshrc, etc.
export INWORLD_API_KEY="your-api-key-here"
export DEBUG=inworld:* # Enable debug logging
export NODE_ENV=development
# Reload shell
source ~/.bashrc # or ~/.zshrc
```
### Project-specific environment
```bash
# Create .env file in your project
echo "INWORLD_API_KEY=your-key-here" > .env
# Install dotenv for automatic loading
npm install dotenv
```
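A small guard in your entry point turns a missing key into a clear startup failure rather than an opaque authentication error later. This is a minimal sketch; the `requireEnv` helper name is illustrative, not part of the SDK:

```typescript
// Typically loaded first in your entry point after `npm install dotenv`:
// import 'dotenv/config';

// Fail fast with an actionable message when a required variable is missing.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(
      `Missing environment variable ${name}. ` +
        `Add it to your .env file or export it in your shell.`,
    );
  }
  return value;
}
```

Call `requireEnv('INWORLD_API_KEY')` before constructing a graph or issuing CLI-backed requests.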
## Getting More Help
### CLI Help Commands
```bash
inworld help # General help
inworld auth --help # Authentication help
inworld graph --help # Graph commands help
inworld tts --help # TTS commands help
```
### Debug Mode
```bash
# Enable verbose logging for any command
export DEBUG=inworld:*
inworld [your-command]
```
### Before Asking for Help
When reporting issues, please include:
1. **CLI Version**: `inworld --version`
2. **Node Version**: `node --version`
3. **Operating System**: `uname -a` (Linux/macOS) or Windows version
4. **Command that failed**: The exact command you ran
5. **Error message**: Full error output
6. **Authentication status**: `inworld status`
This information helps diagnose issues quickly and effectively.
---
#### SDK Reference
#### Inworld Node.js Agent Runtime Reference
Source: https://docs.inworld.ai/node/runtime-reference/overview
Inworld's Node.js Agent Runtime SDK is a comprehensive toolkit for building AI-powered applications.
## Graphs
At the core of the SDK is the Graph system, an interface to construct complete AI pipelines, from input to final output.
### Core Classes
Import core graph classes from `@inworld/runtime/graph`:
```javascript
import {
  Graph,
  GraphBuilder,
  SubgraphBuilder,
  SequentialGraphBuilder
} from '@inworld/runtime/graph';
```
* [Graph](/node/runtime-reference/classes/graph_graph.Graph) - Main graph execution engine
* [GraphBuilder](/node/runtime-reference/classes/graph_dsl_graph_builder.GraphBuilder) - Builder for creating graph configurations
* [SubgraphBuilder](/node/runtime-reference/classes/graph_dsl_subgraph_builder.SubgraphBuilder) - Builder for creating reusable subgraphs
* [SequentialGraphBuilder](/node/runtime-reference/classes/graph_dsl_sequential_graph_builder.SequentialGraphBuilder) - Builder for creating sequential graphs
### Nodes
Import nodes from `@inworld/runtime/graph` or `@inworld/runtime/graph/nodes`:
```javascript
import {
  RemoteLLMChatNode,
  RemoteTTSNode,
  RemoteSTTNode,
  SubgraphNode
  // ... and other nodes ...
} from '@inworld/runtime/graph';
```
#### Built-in Nodes
#### Built-in Subgraphs
#### Custom Nodes
You can create your own nodes by extending the [CustomNode](/node/runtime-reference/classes/graph_dsl_nodes_custom_node.CustomNode) base class.
### Components
Import components from `@inworld/runtime/graph` or `@inworld/runtime/graph/components`:
```javascript
import {
  RemoteLLMComponent,
  RemoteKnowledgeComponent,
  RemoteEmbedderComponent,
  RemoteSTTComponent,
  RemoteTTSComponent,
  MCPClientComponent
} from '@inworld/runtime/graph';
```
Components are reusable configurations that can be shared across multiple nodes:
* [RemoteLLMComponent](/node/runtime-reference/classes/graph_dsl_components_remote_llm_component.RemoteLLMComponent) - LLM provider configuration
* [RemoteKnowledgeComponent](/node/runtime-reference/classes/graph_dsl_components_remote_knowledge_component.RemoteKnowledgeComponent) - Knowledge base configuration
* [RemoteEmbedderComponent](/node/runtime-reference/classes/graph_dsl_components_remote_embedder_component.RemoteEmbedderComponent) - Text embedding configuration
* [RemoteSTTComponent](/node/runtime-reference/classes/graph_dsl_components_remote_stt_component.RemoteSTTComponent) - Speech-to-text configuration
* [RemoteTTSComponent](/node/runtime-reference/classes/graph_dsl_components_remote_tts_component.RemoteTTSComponent) - Text-to-speech configuration
* [MCPClientComponent](/node/runtime-reference/classes/graph_dsl_components_mcp_client_component.MCPClientComponent) - MCP server configuration
## Key Features
- 🚀 **High Performance** - Optimized graph execution
- 🧠 **AI-First** - Built-in LLM, memory, and knowledge systems
- 🔧 **Extensible** - Custom nodes and components
- 📊 **Observable** - Comprehensive telemetry and logging
- 🌊 **Streaming** - Real-time audio and text processing
---
#### SDK Reference > Graph
#### Graph
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_graph.Graph
## Constructors
* [constructor](#constructor)
## Methods
* [start](#start)
* [toJSON](#tojson)
* [stop](#stop)
* [visualize](#visualize)
* [stopRuntime](#stopruntime)
* [cancelExecution](#cancelexecution)
* [getGraphId](#getgraphid)
## Interfaces
* [ExecutionResult](#executionresult)
* [ExecutionContext](#executioncontext)
## Module Functions
* [isGraphOptions](#isgraphoptions)
***
## Constructors
### constructor
```typescript
new Graph(options: Config): Graph
```
Creates a new Graph instance.
#### Parameters
#### Returns
`Graph`
## Methods
### start
```typescript
start(input: any, context?: ExecutionContext): Promise<ExecutionResult>
```
Starts execution of the graph with the provided input. **This method is asynchronous.** Returns an `ExecutionResult` containing:
- `outputStream`: Async iterator for processing results
- `executionId`: Unique identifier for this execution
- `variantName`: The variant of the graph that was executed

**Important Changes in 0.8.0:**
- ✅ `start()` is now async: **always use `await`**
- ✅ Returns an `ExecutionResult` object (not just the stream)
- ✅ Graph initialization happens on first call

**Execution Flow:**
1. Call `await graph.start(input)` to begin execution
2. Iterate over `outputStream` to process results
3. Call `stopInworldRuntime()` when done
#### Parameters
Input data for the graph. Type depends on your start node's expected input. Common types: string, LLMChatRequest, TTSRequest, Audio, or custom objects.
Optional execution context
#### Returns
`Promise<ExecutionResult>`
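The flow above can be sketched as follows. The graph here is a hand-written stub (a real instance comes from `GraphBuilder.build()`), so the snippet is self-contained while matching the documented `ExecutionResult` shape:

```typescript
// Stub mirroring the documented ExecutionResult shape.
interface StubExecutionResult {
  outputStream: AsyncGenerator<string>;
  executionId: string;
  variantName: string;
}

// Stand-in for graph.start(); it streams two results for the demo.
async function start(input: string): Promise<StubExecutionResult> {
  async function* stream() {
    yield `processed: ${input}`;
    yield 'final result';
  }
  return { outputStream: stream(), executionId: 'exec-123', variantName: 'default' };
}

async function main(): Promise<string[]> {
  // 1. start() is async in 0.8.0: always await it.
  const { outputStream, executionId, variantName } = await start('hello');
  console.log(`execution ${executionId}, variant ${variantName}`);
  // 2. Iterate the stream to consume results as they arrive.
  const results: string[] = [];
  for await (const result of outputStream) {
    results.push(result);
  }
  // 3. With the real SDK, stop the runtime here when done.
  return results;
}
```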
### toJSON
```typescript
toJSON(): string
```
Returns the JSON configuration of the graph.
#### Returns
`string`
### stop
```typescript
stop(): Promise<void>
```
Stops the graph executor.
#### Returns
`Promise<void>`
### visualize
```typescript
visualize(outputPath: string): Promise<void>
```
Generate a visualization of the graph structure. Creates a visual representation of the graph in PNG format. This method requires Graphviz to be installed on the system.
#### Parameters
Path where the PNG visualization should be saved
#### Returns
`Promise<void>`
### stopRuntime
```typescript
stopRuntime(): Promise<void>
```
Completely stops the Inworld Agent Runtime.
#### Returns
`Promise<void>`
### cancelExecution
```typescript
cancelExecution(id: string): Promise<void>
```
Cancels the execution with the specified ID.
#### Parameters
#### Returns
`Promise<void>`
### getGraphId
```typescript
getGraphId(): string
```
Gets the unique identifier for this graph.
#### Returns
`string`
## Interfaces
### ExecutionResult
Result returned from `start()`, containing the execution stream and metadata. This interface provides access to the graph execution results through an async iterator, along with execution tracking information.
#### Properties
**variantName**: `string`
**outputStream**: `GraphOutputStream`
**executionId**: `string`
### ExecutionContext
Optional context parameters for graph execution. Provides control over execution tracking, user targeting, and initial state. All parameters are optional and have sensible defaults.
#### Properties
**executionId**?: `string`
**userContext**?: `UserContextInterface`
**dataStoreContent**?: `{ [x: string]: any; }`
## Module Functions
### isGraphOptions
```typescript
isGraphOptions(options: any): boolean
```
#### Parameters
#### Returns
`boolean`
---
#### ExecutionResult
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_graph.ExecutionResult
## Properties
* [variantName](#variantname)
* [outputStream](#outputstream)
* [executionId](#executionid)
***
## Properties
### variantName
```typescript
variantName: string
```
### outputStream
```typescript
outputStream: GraphOutputStream
```
### executionId
```typescript
executionId: string
```
---
#### ExecutionContext
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_graph.ExecutionContext
## Properties
* [executionId](#executionid)
* [userContext](#usercontext)
* [dataStoreContent](#datastorecontent)
***
## Properties
### executionId
```typescript
executionId?: string
```
### userContext
```typescript
userContext?: UserContextInterface
```
### dataStoreContent
```typescript
dataStoreContent?: { [x: string]: any; }
```
---
#### GraphOutputStream
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_GraphOutputStream.GraphOutputStream
## Examples
```typescript
// With cancellation
const { outputStream } = await graph.start(input);
setTimeout(() => outputStream.abort(), 5000); // Cancel after 5 seconds
for await (const result of outputStream) {
  // Processing will stop when abort() is called
}
```
## Methods
* [next](#next)
* [abort](#abort)
***
## Methods
### next
```typescript
next(): Promise<GraphOutputStreamResponse<T>>
```
Gets the next result from the execution.
#### Returns
`Promise<GraphOutputStreamResponse<T>>`
### abort
```typescript
abort(): void
```
Cancels the graph execution and stops processing. This method immediately aborts the ongoing graph execution, canceling any pending operations including LLM generation, TTS synthesis, and custom node processing. The cancellation propagates through the entire execution graph.

**Use Cases:**
- User-initiated cancellation (e.g., stop button)
- Timeout handling
- Early termination based on conditions
- Resource cleanup

**Behavior:**
- Safe to call multiple times (subsequent calls are no-op)
- Stops all async operations in the graph
- Causes streams to end early
- Triggers cancellation context in custom nodes (`context.isCancelled`)
#### Returns
`void`
---
#### GraphOutputStreamResponse
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_GraphOutputStreamResponse.GraphOutputStreamResponse
## Constructors
* [constructor](#constructor)
## Methods
* [isString](#isstring)
* [isCustom](#iscustom)
* [isClassificationResult](#isclassificationresult)
* [isContent](#iscontent)
* [isGoalAdvancement](#isgoaladvancement)
* [isAudio](#isaudio)
* [isKnowledgeRecords](#isknowledgerecords)
* [isListToolsResponse](#islisttoolsresponse)
* [isLLMChatRequest](#isllmchatrequest)
* [isMatchedIntents](#ismatchedintents)
* [isMatchedKeywords](#ismatchedkeywords)
* [isSafetyResult](#issafetyresult)
* [isMemoryState](#ismemorystate)
* [isToolCallResponse](#istoolcallresponse)
* [isTextStream](#istextstream)
* [isContentStream](#iscontentstream)
* [isTTSOutputStream](#isttsoutputstream)
* [isSpeechChunkStream](#isspeechchunkstream)
* [isGraphError](#isgrapherror)
* [processResponse](#processresponse)
***
## Constructors
### constructor
```typescript
new GraphOutputStreamResponse(data: T, done?: boolean): GraphOutputStreamResponse
```
Creates a new GraphOutputStreamResponse instance.
#### Parameters
The graph output stream data, if available
Whether the graph has finished executing. When true, indicates that
the graph execution has completed and no more data will be streamed.
#### Returns
`GraphOutputStreamResponse`
## Methods
### isString
```typescript
isString(): boolean
```
#### Returns
`boolean`
### isCustom
```typescript
isCustom(): boolean
```
#### Returns
`boolean`
### isClassificationResult
```typescript
isClassificationResult(): boolean
```
#### Returns
`boolean`
### isContent
```typescript
isContent(): boolean
```
#### Returns
`boolean`
### isGoalAdvancement
```typescript
isGoalAdvancement(): boolean
```
#### Returns
`boolean`
### isAudio
```typescript
isAudio(): boolean
```
#### Returns
`boolean`
### isKnowledgeRecords
```typescript
isKnowledgeRecords(): boolean
```
#### Returns
`boolean`
### isListToolsResponse
```typescript
isListToolsResponse(): boolean
```
#### Returns
`boolean`
### isLLMChatRequest
```typescript
isLLMChatRequest(): boolean
```
#### Returns
`boolean`
### isMatchedIntents
```typescript
isMatchedIntents(): boolean
```
#### Returns
`boolean`
### isMatchedKeywords
```typescript
isMatchedKeywords(): boolean
```
#### Returns
`boolean`
### isSafetyResult
```typescript
isSafetyResult(): boolean
```
#### Returns
`boolean`
### isMemoryState
```typescript
isMemoryState(): boolean
```
#### Returns
`boolean`
### isToolCallResponse
```typescript
isToolCallResponse(): boolean
```
#### Returns
`boolean`
### isTextStream
```typescript
isTextStream(): boolean
```
#### Returns
`boolean`
### isContentStream
```typescript
isContentStream(): boolean
```
#### Returns
`boolean`
### isTTSOutputStream
```typescript
isTTSOutputStream(): boolean
```
#### Returns
`boolean`
### isSpeechChunkStream
```typescript
isSpeechChunkStream(): boolean
```
#### Returns
`boolean`
### isGraphError
```typescript
isGraphError(): boolean
```
#### Returns
`boolean`
### processResponse
```typescript
processResponse(handlers: object): Promise<void>
```
Processes the graph output stream response with type safety using a visitor pattern. This function makes it easy to handle different response types in a switch-like pattern.
#### Parameters
Object containing handler functions for each type
#### Returns
`Promise<void>`
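The visitor pattern can be illustrated with a self-contained stub. The handler keys (`string`, `error`, `default`) and the two type guards shown are illustrative, not the SDK's exact surface:

```typescript
type StubHandlers = {
  string?: (value: string) => void;
  error?: (err: Error) => void;
  default?: (value: unknown) => void;
};

// Minimal dispatcher mimicking the type-guard + handler pattern:
// check the payload type, then invoke the matching handler.
class StubStreamResponse {
  constructor(private data: unknown) {}

  isString(): boolean {
    return typeof this.data === 'string';
  }

  isGraphError(): boolean {
    return this.data instanceof Error;
  }

  async processResponse(handlers: StubHandlers): Promise<void> {
    if (this.isString()) {
      handlers.string?.(this.data as string);
    } else if (this.isGraphError()) {
      handlers.error?.(this.data as Error);
    } else {
      handlers.default?.(this.data);
    }
  }
}
```

A caller passes one handler per response type it cares about, plus a fallback for everything else.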
---
#### GraphBuilder
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_graph_builder.GraphBuilder
## Examples
```typescript
const graph = new GraphBuilder('my_graph')
.addComponent(llmComponent)
.addComponent(embedderComponent)
.addNode(intentNode)
.addNode(llmNode)
.addEdge(intentNode, llmNode)
.setStartNode(intentNode)
.setEndNode(llmNode)
.build();
```
## Constructors
* [constructor](#constructor)
## Methods
* [addSubgraph](#addsubgraph)
* [addIntentSubgraph](#addintentsubgraph)
* [addNode](#addnode)
* [addEdge](#addedge)
* [addComponent](#addcomponent)
* [setStartNode](#setstartnode)
* [setEndNode](#setendnode)
* [setStartNodes](#setstartnodes)
* [setEndNodes](#setendnodes)
* [addMCPSubgraph](#addmcpsubgraph)
* [build](#build)
***
## Constructors
### constructor
```typescript
new GraphBuilder(opts: string | GraphBuilderProps): GraphBuilder
```
Creates a new graph builder. Accepts either an options object or a graph ID string.
#### Parameters
Graph builder options or graph ID string
#### Returns
`GraphBuilder`
## Methods
### addSubgraph
```typescript
addSubgraph(subgraph: SubgraphBuilder): this
```
Adds a subgraph to the graph configuration.
#### Parameters
Subgraph builder instance to be added
#### Returns
`this`
### addIntentSubgraph
```typescript
addIntentSubgraph(id: string, parameters: Config): this
```
Adds an intent subgraph to the graph.
#### Parameters
Unique identifier for the subgraph
Intent subgraph parameters
#### Returns
`this`
### addNode
```typescript
addNode(node: Node | AbstractNode): this
```
Adds a node to the graph. If a node is provided without a corresponding component, internal components are automatically added.
#### Parameters
Node to add to the graph
#### Returns
`this`
### addEdge
```typescript
addEdge(fromNode: string | Node | AbstractNode, toNode: string | Node | AbstractNode, options?: object): this
```
Adds an edge connecting two nodes in the graph.
#### Parameters
Source node
Destination node
Optional edge configuration
#### Returns
`this`
### addComponent
```typescript
addComponent(component: Component | AbstractComponent): this
```
Adds a component to the graph configuration.
#### Parameters
Component to add to the graph
#### Returns
`this`
### setStartNode
```typescript
setStartNode(node: string | Node | AbstractNode): this
```
Sets the start node of the graph.
#### Parameters
Start node
#### Returns
`this`
### setEndNode
```typescript
setEndNode(node: string | Node | AbstractNode): this
```
Sets the end node of the graph.
#### Parameters
End node
#### Returns
`this`
### setStartNodes
```typescript
setStartNodes(nodes: (string | Node | AbstractNode)[]): this
```
Sets multiple start nodes for the graph.
#### Parameters
Array of start nodes
#### Returns
`this`
### setEndNodes
```typescript
setEndNodes(nodes: (string | Node | AbstractNode)[]): this
```
Sets multiple end nodes for the graph.
#### Parameters
Array of end nodes
#### Returns
`this`
### addMCPSubgraph
```typescript
addMCPSubgraph(id: string, parameters: Config): this
```
Adds an MCP subgraph to the graph.
#### Parameters
Unique identifier for the subgraph
MCP subgraph parameters
#### Returns
`this`
### build
```typescript
build(): Graph
```
Creates a graph executor instance for running the graph.
#### Returns
`Graph`
---
#### GraphBuilderProps
Source: https://docs.inworld.ai/node/runtime-reference/types/graph_dsl_graph_builder.GraphBuilderProps
Represents the properties required to initialize and configure a graph builder.
---
#### SubgraphBuilder
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_subgraph_builder.SubgraphBuilder
## Examples
```typescript
const subgraph = new SubgraphBuilder('my_subgraph')
.addParameter({ name: 'user_input', type: 'string' })
.addNode(intentNode)
.addNode(llmNode)
.addEdge(intentNode, llmNode)
.setStartNode(intentNode)
.setEndNode(llmNode)
.build();
```
## Constructors
* [constructor](#constructor)
## Methods
* [addParameter](#addparameter)
* [addParameters](#addparameters)
* [addNode](#addnode)
* [addEdge](#addedge)
* [setStartNode](#setstartnode)
* [setEndNode](#setendnode)
* [build](#build)
***
## Constructors
### constructor
```typescript
new SubgraphBuilder(id: string): SubgraphBuilder
```
Creates a new subgraph builder with the specified ID.
#### Parameters
Unique identifier for the subgraph
#### Returns
`SubgraphBuilder`
## Methods
### addParameter
```typescript
addParameter(config: object): this
```
Adds a parameter to the subgraph that can be passed from the parent graph.
#### Parameters
Parameter configuration
#### Returns
`this`
### addParameters
```typescript
addParameters(parameters: object): this
```
Adds multiple parameters to the subgraph at once.
#### Parameters
Array of parameter configurations
#### Returns
`this`
### addNode
```typescript
addNode(node: Node | AbstractNode): this
```
Adds a node to the subgraph.
#### Parameters
Node to add to the subgraph
#### Returns
`this`
### addEdge
```typescript
addEdge(fromNode: string | Node | AbstractNode, toNode: string | Node | AbstractNode, options?: object): this
```
Adds an edge connecting two nodes in the subgraph.
#### Parameters
Source node or node ID
Destination node or node ID
Optional edge configuration
#### Returns
`this`
### setStartNode
```typescript
setStartNode(node: string | Node | AbstractNode): this
```
Sets the start node of the subgraph (subgraphs can only have one start node).
#### Parameters
Start node or node ID
#### Returns
`this`
### setEndNode
```typescript
setEndNode(node: string | Node | AbstractNode): this
```
Sets the end node of the subgraph (subgraphs can only have one end node).
#### Parameters
End node or node ID
#### Returns
`this`
### build
```typescript
build(): Subgraph
```
Builds and returns the final subgraph configuration.
#### Returns
`Subgraph`
---
#### SequentialGraphBuilder
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_sequential_graph_builder.SequentialGraphBuilder
## Constructors
* [constructor](#constructor)
## Methods
* [addSequentialNode](#addsequentialnode)
## Interfaces
* [SequentialGraphBuilderProps](#sequentialgraphbuilderprops)
***
## Constructors
### constructor
```typescript
new SequentialGraphBuilder(opts: SequentialGraphBuilderProps): SequentialGraphBuilder
```
Creates a new instance of `SequentialGraphBuilder` and builds the sequential graph.
#### Parameters
The configuration options for the sequential graph builder.
#### Returns
`SequentialGraphBuilder`
## Methods
### addSequentialNode
```typescript
addSequentialNode(node: AbstractNode): this
```
Adds a node to the graph, links it from the previous end node (if any), and sets the passed node as the new end node.
#### Parameters
The node to add to the graph.
#### Returns
`this`
## Interfaces
### SequentialGraphBuilderProps
Configuration properties for the `SequentialGraphBuilder`.
#### Properties
**nodes**: `AbstractNode[]`
The list of nodes to be connected sequentially.
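As a sketch (assuming `sttNode`, `llmNode`, `ttsNode`, and `aggregatorNode` are already constructed `AbstractNode` instances), a sequential graph might be assembled like this:

```typescript
// Connect nodes in order: STT -> LLM -> TTS.
const builder = new SequentialGraphBuilder({
  nodes: [sttNode, llmNode, ttsNode],
});

// Append another node; it is linked from the current end node
// and becomes the new end node.
builder.addSequentialNode(aggregatorNode);
```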
---
#### SDK Reference > Nodes > AbstractNode
#### AbstractNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_abstract_node.AbstractNode
## Constructors
* [constructor](#constructor)
## Interfaces
* [AbstractNodeProps](#abstractnodeprops)
## Module Variables
* [INTERNAL_COMPONENTS](#internal_components)
* [TO_GRAPH_CONFIG_NODE](#to_graph_config_node)
***
## Constructors
### constructor
```typescript
new AbstractNode(props: AbstractNodeProps): AbstractNode
```
Creates a new `AbstractNode`.
#### Parameters
Node configuration including optional `id` and
`reportToClient` flag.
#### Returns
`AbstractNode`
## Interfaces
### AbstractNodeProps
Base configuration for any graph node.
#### Properties
**id**?: `string`
Optional explicit node identifier.
**reportToClient**?: `boolean`
Whether this node should report its outputs to the client. If set to `true`, the node's output is surfaced to the client.
## Module Variables
### INTERNAL_COMPONENTS
```typescript
INTERNAL_COMPONENTS: any
```
### TO_GRAPH_CONFIG_NODE
```typescript
TO_GRAPH_CONFIG_NODE: any
```
---
#### AbstractNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_abstract_node.AbstractNodeProps
## Properties
* [id](#id)
* [reportToClient](#reporttoclient)
***
## Properties
### id
```typescript
id?: string
```
Optional explicit node identifier.
### reportToClient
```typescript
reportToClient?: boolean
```
Whether this node should report its outputs to the client. If set to `true`, the node's output is surfaced to the client.
---
#### SDK Reference > Nodes > KeywordMatcherNode
#### KeywordMatcherNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_keyword_matcher_node.KeywordMatcherNode
## Input
**Type:** `String`
The data type that KeywordMatcherNode accepts as input
## Output
**Type:** `GraphTypes.MatchedKeywords`
The data type that KeywordMatcherNode outputs
## Examples
```typescript
const keywordNode = new KeywordMatcherNode({
keywords: ['urgent', 'important', 'priority', 'asap'],
reportToClient: true
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [KeywordMatcherNodeProps](#keywordmatchernodeprops)
***
## Constructors
### constructor
```typescript
new KeywordMatcherNode(props: KeywordMatcherNodeProps): KeywordMatcherNode
```
Creates a new KeywordMatcherNode instance.
#### Parameters
Configuration for the keyword matcher node.
#### Returns
`KeywordMatcherNode`
## Interfaces
### KeywordMatcherNodeProps
Configuration interface for `KeywordMatcherNode` creation.
#### Properties
**keywords**: `string[] | { name: string; keywords: string[]; }[]`
Keywords can be either a flat array of strings or structured groups. For CPP compatibility, use structured format: `Array<{name: string, keywords: string[]}>`
---
#### KeywordMatcherNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_keyword_matcher_node.KeywordMatcherNodeProps
## Properties
* [keywords](#keywords)
***
## Properties
### keywords
```typescript
keywords: string[] | { name: string; keywords: string[]; }[]
```
Keywords can be either a flat array of strings or structured groups. For CPP compatibility, use structured format: `Array<{name: string, keywords: string[]}>`
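Using the documented structured format, keyword groups might look like this (the group names and keyword lists below are illustrative):

```typescript
// Structured keyword groups (recommended for CPP compatibility).
const keywordNode = new KeywordMatcherNode({
  keywords: [
    { name: 'urgency', keywords: ['urgent', 'asap', 'immediately'] },
    { name: 'billing', keywords: ['invoice', 'refund', 'payment'] },
  ],
});
```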
---
#### SDK Reference > Nodes > KnowledgeNode
#### KnowledgeNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_knowledge_node.KnowledgeNode
## Input
**Type:** `String`
The data type that KnowledgeNode accepts as input
## Output
**Type:** `GraphTypes.KnowledgeRecords`
The data type that KnowledgeNode outputs
## Examples
```typescript
// Using knowledge provider configuration
const knowledgeNode = new KnowledgeNode({
id: 'my-knowledge-node',
knowledgeId: 'company-docs',
knowledgeRecords: ['policy-1', 'policy-2', 'faq-1'],
retrievalConfig: {
threshold: 0.8,
topK: 3
}
});
// Using existing knowledge component
const knowledgeComponent = new RemoteKnowledgeComponent({ id: 'existing-knowledge-component' });
const knowledgeNodeWithComponent = new KnowledgeNode({
id: 'my-knowledge-node',
knowledgeId: 'company-docs',
knowledgeRecords: ['policy-1', 'policy-2', 'faq-1'],
knowledgeComponent
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [KnowledgeNodeProps](#knowledgenodeprops)
* [KnowledgeNodeWithComponentProps](#knowledgenodewithcomponentprops)
***
## Constructors
### constructor
```typescript
new KnowledgeNode(props: KnowledgeNodeProps | KnowledgeNodeWithComponentProps): KnowledgeNode
```
Creates a new KnowledgeNode instance.
#### Parameters
Configuration for the knowledge node.
#### Returns
`KnowledgeNode`
## Interfaces
### KnowledgeNodeProps
Configuration for `KnowledgeNode` using knowledge provider settings.
#### Properties
**knowledgeId**: `string`
ID of the knowledge collection
**knowledgeRecords**: `string[]`
Collection of knowledge records
**maxCharsPerChunk**?: `number`
Maximum characters per chunk (default: 1000)
**maxChunksPerDocument**?: `number`
Maximum chunks per document (default: 10)
**retrievalConfig**?: `object`
Configuration for retrieving relevant information
### KnowledgeNodeWithComponentProps
Configuration for `KnowledgeNode` using an existing knowledge component.
#### Properties
**knowledgeId**: `string`
ID of the knowledge collection
**knowledgeRecords**: `string[]`
Collection of knowledge records
**knowledgeComponent**: `RemoteKnowledgeComponent`
Existing knowledge component to use
**retrievalConfig**?: `object`
Configuration for retrieving relevant information
---
#### KnowledgeNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_knowledge_node.KnowledgeNodeProps
## Properties
* [knowledgeId](#knowledgeid)
* [knowledgeRecords](#knowledgerecords)
* [maxCharsPerChunk](#maxcharsperchunk)
* [maxChunksPerDocument](#maxchunksperdocument)
* [retrievalConfig](#retrievalconfig)
***
## Properties
### knowledgeId
```typescript
knowledgeId: string
```
ID of the knowledge collection
### knowledgeRecords
```typescript
knowledgeRecords: string[]
```
Collection of knowledge records
### maxCharsPerChunk
```typescript
maxCharsPerChunk?: number
```
Maximum characters per chunk (default: 1000)
### maxChunksPerDocument
```typescript
maxChunksPerDocument?: number
```
Maximum chunks per document (default: 10)
### retrievalConfig
```typescript
retrievalConfig?: object
```
Configuration for retrieving relevant information
---
#### SDK Reference > Nodes > LLMPromptBuilderNode
#### LLMPromptBuilderNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_llm_prompt_builder_node.LLMPromptBuilderNode
## Input
**Type:** `Object`
The data type that LLMPromptBuilderNode accepts as input
## Output
**Type:** `String`
The data type that LLMPromptBuilderNode outputs
## Examples
```typescript
const promptBuilderNode = new LLMPromptBuilderNode({
id: 'prompt-builder',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: '{{user_question}}' }
],
responseFormat: 'text'
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [LLMPromptBuilderNodeProps](#llmpromptbuildernodeprops)
***
## Constructors
### constructor
```typescript
new LLMPromptBuilderNode(props?: LLMPromptBuilderNodeProps): LLMPromptBuilderNode
```
Creates a new LLMPromptBuilderNode instance.
#### Parameters
Configuration for the LLM prompt builder node.
#### Returns
`LLMPromptBuilderNode`
## Interfaces
### LLMPromptBuilderNodeProps
Configuration interface for `LLMPromptBuilderNode` creation.
#### Properties
**messages**?: `MessageTemplate[]`
List of message templates
**tools**?: `Tool[]`
List of available tools for function calling
**toolChoice**?: `object`
Tool choice configuration
**responseFormat**?: `"text" | "json" | "json_schema"`
Response format specification ('text', 'json', 'json_schema')
---
#### LLMPromptBuilderNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_llm_prompt_builder_node.LLMPromptBuilderNodeProps
## Properties
* [messages](#messages)
* [tools](#tools)
* [toolChoice](#toolchoice)
* [responseFormat](#responseformat)
***
## Properties
### messages
```typescript
messages?: MessageTemplate[]
```
List of message templates
### tools
```typescript
tools?: Tool[]
```
List of available tools for function calling
### toolChoice
```typescript
toolChoice?: object
```
Tool choice configuration
### responseFormat
```typescript
responseFormat?: "text" | "json" | "json_schema"
```
Response format specification ('text', 'json', 'json_schema')
---
#### SDK Reference > Nodes > MCPCallToolNode
#### MCPCallToolNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_mcp_call_tool_node.MCPCallToolNode
## Input
**Type:** `GraphTypes.ToolCallRequest`
The data type that MCPCallToolNode accepts as input
## Output
**Type:** `GraphTypes.ToolCallResponse`
The data type that MCPCallToolNode outputs
## Examples
```typescript
const mcpComponent = new MCPClientComponent({ sessionConfig: { ... } });
const mcpCallToolNode = new MCPCallToolNode({
mcpComponent,
reportToClient: true
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [MCPCallToolNodeProps](#mcpcalltoolnodeprops)
* [MCPCallToolNodeWithComponentProps](#mcpcalltoolnodewithcomponentprops)
***
## Constructors
### constructor
```typescript
new MCPCallToolNode(props: MCPCallToolNodeProps | MCPCallToolNodeWithComponentProps): MCPCallToolNode
```
Creates a new MCPCallToolNode instance.
#### Parameters
Configuration for the MCP call tool node.
#### Returns
`MCPCallToolNode`
## Interfaces
### MCPCallToolNodeProps
#### Properties
**sessionConfig**: `Config`
MCP session configuration for establishing the connection to the MCP server.
### MCPCallToolNodeWithComponentProps
Configuration interface for `MCPCallToolNode` creation.
#### Properties
**mcpComponent**: `MCPClientComponent`
Existing MCP component to use.
---
#### SDK Reference > Nodes > MCPListToolsNode
#### MCPListToolsNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_mcp_list_tools_node.MCPListToolsNode
## Input
**Type:** `any`
The data type that MCPListToolsNode accepts as input
## Output
**Type:** `GraphTypes.ListToolsResponse`
The data type that MCPListToolsNode outputs
## Examples
```typescript
const mcpComponent = new MCPClientComponent({ id: 'my-mcp-component', sessionConfig: { ... } });
const mcpListToolsNode = new MCPListToolsNode({
mcpComponent,
reportToClient: true
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [MCPListToolsNodeProps](#mcplisttoolsnodeprops)
* [MCPListToolsNodePropsWithComponent](#mcplisttoolsnodepropswithcomponent)
***
## Constructors
### constructor
```typescript
new MCPListToolsNode(props: MCPListToolsNodeProps | MCPListToolsNodePropsWithComponent): MCPListToolsNode
```
Creates a new MCPListToolsNode instance.
#### Parameters
Configuration for the MCP list tools node.
#### Returns
`MCPListToolsNode`
## Interfaces
### MCPListToolsNodeProps
#### Properties
**sessionConfig**: `Config`
MCP session configuration
**reportToClient**?: `boolean`
Whether to report results to a client
### MCPListToolsNodePropsWithComponent
Configuration interface for `MCPListToolsNode` creation.
#### Properties
**mcpComponent**: `MCPClientComponent`
Existing MCP component to use
**reportToClient**?: `boolean`
Whether to report results to a client
---
#### MCPListToolsNodePropsWithComponent
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_mcp_list_tools_node.MCPListToolsNodePropsWithComponent
## Properties
* [mcpComponent](#mcpcomponent)
* [reportToClient](#reporttoclient)
***
## Properties
### mcpComponent
```typescript
mcpComponent: MCPClientComponent
```
Existing MCP component to use
### reportToClient
```typescript
reportToClient?: boolean
```
Whether to report results to a client
---
#### SDK Reference > Nodes > ProxyNode
#### ProxyNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_proxy_node.ProxyNode
## Input
**Type:** `any`
The data type that ProxyNode accepts as input
## Output
**Type:** `any`
The data type that ProxyNode outputs
## Examples
```typescript
const proxyNode = new ProxyNode({
id: 'my-proxy-node',
reportToClient: false
});
```
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new ProxyNode(props?: AbstractNodeProps): ProxyNode
```
Creates a new ProxyNode instance.
#### Parameters
Configuration for the proxy node.
#### Returns
`ProxyNode`
---
#### SDK Reference > Nodes > RandomCannedTextNode
#### RandomCannedTextNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_random_canned_text_node.RandomCannedTextNode
## Input
**Type:** `any`
The data type that RandomCannedTextNode accepts as input
## Output
**Type:** `String`
The data type that RandomCannedTextNode outputs
## Examples
```typescript
const cannedTextNode = new RandomCannedTextNode({
id: 'greeting-node',
cannedPhrases: [
'Hello! How can I help you?',
'Hi there! What can I do for you?',
'Welcome! How may I assist you today?'
]
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RandomCannedTextNodeProps](#randomcannedtextnodeprops)
***
## Constructors
### constructor
```typescript
new RandomCannedTextNode(props: RandomCannedTextNodeProps): RandomCannedTextNode
```
Creates a new RandomCannedTextNode instance.
#### Parameters
Configuration for the random canned text node.
#### Returns
`RandomCannedTextNode`
## Interfaces
### RandomCannedTextNodeProps
Configuration interface for RandomCannedTextNode creation.
#### Properties
**cannedPhrases**: `string[]`
List of phrases to randomly select from.
---
#### RandomCannedTextNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_random_canned_text_node.RandomCannedTextNodeProps
## Properties
* [cannedPhrases](#cannedphrases)
***
## Properties
### cannedPhrases
```typescript
cannedPhrases: string[]
```
List of phrases to randomly select from.
---
#### SDK Reference > Nodes > RemoteLLMChatNode
#### RemoteLLMChatNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_remote_llm_chat_node.RemoteLLMChatNode
## Input
**Type:** `LLMChatRequest`
The data type that LLMChatNode accepts as input
## Output
**Type:** `LLMChatResponse` | [Content](/node/runtime-reference/classes/common_data_types_api_content.Content)
The data type that LLMChatNode outputs. See [Content](/node/runtime-reference/classes/common_data_types_api_content.Content) for more details.
## Examples
```typescript
// Using LLM provider configuration
const llmNode = new RemoteLLMChatNode({
id: 'my-llm-node',
provider: 'openai',
modelName: 'gpt-4o-mini',
stream: true
});
// Using existing LLM component
const llmNodeWithComponent = new RemoteLLMChatNode({
id: 'my-llm-node',
llmComponent: existingLLMComponent
});
// Using default settings
const defaultLlmNode = new RemoteLLMChatNode();
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteLLMChatNodeProps](#remotellmchatnodeprops)
* [RemoteLLMChatNodeWithLLMComponentProps](#remotellmchatnodewithllmcomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteLLMChatNode(props?: RemoteLLMChatNodeProps | RemoteLLMChatNodeWithLLMComponentProps): RemoteLLMChatNode
```
Creates a new RemoteLLMChatNode instance.
#### Parameters
Optional configuration for the chat node.
#### Returns
`RemoteLLMChatNode`
## Interfaces
### RemoteLLMChatNodeProps
Configuration for `RemoteLLMChatNode` using LLM provider settings.
#### Properties
**textGenerationConfig**?: `object`
Text generation configuration parameters
**stream**?: `boolean`
Whether to stream responses
**provider**?: `string`
LLM provider (e.g., 'openai', 'anthropic', 'inworld')
**modelName**?: `string`
Model name specific to the provider (e.g., 'gpt-4', 'claude-3-5-sonnet-20241022')
**responseFormat**?: `"text" | "json" | "json_schema"`
Response format specification ('text', 'json', 'json_schema')
**messageTemplates**?: `Camelize[]`
List of message templates
### RemoteLLMChatNodeWithLLMComponentProps
Configuration for `RemoteLLMChatNode` using an existing LLM component.
#### Properties
**llmComponent**: `RemoteLLMComponent | AbstractComponent`
Existing LLM component to use
**textGenerationConfig**?: `object`
Text generation configuration parameters
**stream**?: `boolean`
Whether to stream responses
**responseFormat**: `"text" | "json" | "json_schema"`
Response format specification ('text', 'json', 'json_schema')
**messageTemplates**?: `Camelize[]`
List of message templates
---
#### RemoteLLMChatNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_remote_llm_chat_node.RemoteLLMChatNodeProps
## Properties
* [textGenerationConfig](#textgenerationconfig)
* [stream](#stream)
* [provider](#provider)
* [modelName](#modelname)
* [responseFormat](#responseformat)
* [messageTemplates](#messagetemplates)
***
## Properties
### textGenerationConfig
```typescript
textGenerationConfig?: object
```
Text generation configuration parameters
### stream
```typescript
stream?: boolean
```
Whether to stream responses
### provider
```typescript
provider?: string
```
LLM provider (e.g., 'openai', 'anthropic', 'inworld')
### modelName
```typescript
modelName?: string
```
Model name specific to the provider (e.g., 'gpt-4', 'claude-3-5-sonnet-20241022')
### responseFormat
```typescript
responseFormat?: "text" | "json" | "json_schema"
```
Response format specification ('text', 'json', 'json_schema')
### messageTemplates
```typescript
messageTemplates?: Camelize[]
```
List of message templates
---
#### SDK Reference > Nodes > RemoteSTTNode
#### RemoteSTTNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_remote_stt_node.RemoteSTTNode
## Input
**Type:** `GraphTypes.Audio`
The data type that STTNode accepts as input
## Output
**Type:** `String`
The data type that STTNode outputs
## Examples
```typescript
// Using provider configuration
const sttNode = new RemoteSTTNode({
id: 'my-stt-node',
sttConfig: { languageCode: 'en-US' }
});
// Using existing STT component
const sttNodeWithComponent = new RemoteSTTNode({
id: 'my-stt-node',
sttComponent: existingSTTComponent
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteSTTNodeProps](#remotesttnodeprops)
* [RemoteSTTNodeWithComponentProps](#remotesttnodewithcomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteSTTNode(props?: RemoteSTTNodeProps | RemoteSTTNodeWithComponentProps): RemoteSTTNode
```
Creates a new RemoteSTTNode instance.
#### Parameters
Configuration for the STT node.
#### Returns
`RemoteSTTNode`
## Interfaces
### RemoteSTTNodeProps
Configuration for `RemoteSTTNode` using STT provider settings.
#### Properties
**sttConfig**: `{ languageCode?: string; }`
STT configuration object
### RemoteSTTNodeWithComponentProps
Configuration for `RemoteSTTNode` using an existing STT component.
#### Properties
**sttComponent**: `RemoteSTTComponent`
Existing STT component to use
---
#### RemoteSTTNodeWithComponentProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_remote_stt_node.RemoteSTTNodeWithComponentProps
## Properties
* [sttComponent](#sttcomponent)
***
## Properties
### sttComponent
```typescript
sttComponent: RemoteSTTComponent
```
Existing STT component to use
---
#### SDK Reference > Nodes > RemoteTTSNode
#### RemoteTTSNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_remote_tts_node.RemoteTTSNode
## Input
**Type:** `String` | `TextStream` | [TTSRequest](/node/runtime-reference/classes/common_data_types_api_tts_request.TTSRequest)
The data type that TTSNode accepts as input. See `TextStream` and [TTSRequest](/node/runtime-reference/classes/common_data_types_api_tts_request.TTSRequest) for more details.
## Output
**Type:** `GraphTypes.TTSOutputStream`
The data type that TTSNode outputs
## Examples
```typescript
// Using provider configuration
const ttsNode = new RemoteTTSNode({
modelId: 'inworld-voice-v1',
temperature: 1.0,
speakingRate: 1.1,
sampleRate: 22050
});
// Using existing TTS component
const ttsNodeWithComponent = new RemoteTTSNode({
ttsComponent: existingTTSComponent
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteTTSNodeProps](#remotettsnodeprops)
***
## Constructors
### constructor
```typescript
new RemoteTTSNode(props?: RemoteTTSNodeProps): RemoteTTSNode
```
Creates a new RemoteTTSNode instance.
#### Parameters
Configuration for the TTS node.
#### Returns
`RemoteTTSNode`
## Interfaces
### RemoteTTSNodeProps
Configuration for `RemoteTTSNode` using provider settings.
#### Properties
**speakerId**?: `string`
The voice ID to use for synthesis
**languageCode**?: `string`
The language code for the synthesized speech (e.g., 'en-US', 'es-ES')
**modelId**?: `string`
The TTS model to use for synthesis
**temperature**?: `number`
Controls randomness in synthesis; range (0, 2]
**speakingRate**?: `number`
Speaking rate/speed multiplier; range [0.5, 1.5], where 1.0 is normal speed
**sampleRate**?: `number`
Audio output sample rate in Hz (e.g., 22050, 44100, 48000)
**ttsComponent**?: `AbstractComponent | RemoteTTSComponent`
An existing TTS component to reuse; if provided, other configuration options are ignored in favor of the component's configuration
---
#### RemoteTTSNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_remote_tts_node.RemoteTTSNodeProps
## Properties
* [speakerId](#speakerid)
* [languageCode](#languagecode)
* [modelId](#modelid)
* [temperature](#temperature)
* [speakingRate](#speakingrate)
* [sampleRate](#samplerate)
* [ttsComponent](#ttscomponent)
***
## Properties
### speakerId
```typescript
speakerId?: string
```
The voice ID to use for synthesis. See [available voices](/api-reference/ttsAPI/texttospeech/list-voices) for options.
### languageCode
```typescript
languageCode?: string
```
The language code for the synthesized speech (e.g., 'en-US', 'es-ES').
### modelId
```typescript
modelId?: string
```
The TTS model to use for synthesis. See [available TTS models](/models#tts) for options.
### temperature
```typescript
temperature?: number
```
Controls randomness in synthesis. Range (0, 2]. Higher values produce more expressive and creative output, while lower values are more deterministic. Default is typically 1.1.
### speakingRate
```typescript
speakingRate?: number
```
Speaking rate/speed multiplier. Range [0.5, 1.5]. 1.0 is the normal native speed. Values less than 1.0 slow down speech, values greater than 1.0 speed it up.
### sampleRate
```typescript
sampleRate?: number
```
Audio output sample rate in Hz (e.g., 22050, 44100, 48000). Determines the quality and file size of the generated audio.
### ttsComponent
```typescript
ttsComponent?: AbstractComponent | RemoteTTSComponent
```
An existing TTS component to reuse across multiple nodes. If provided, other configuration options (speakerId, modelId, etc.) are ignored in favor of the component's configuration.
---
#### SDK Reference > Nodes > SubgraphNode
#### SubgraphNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_subgraph_node.SubgraphNode
## Input
**Type:** `any`
The data type that SubgraphNode accepts as input
## Output
**Type:** `any`
The data type that SubgraphNode outputs
## Examples
```typescript
const subgraphNode = new SubgraphNode({
id: 'intent-processing-subgraph',
parameters: {
confidence_threshold: 0.8,
max_intents: 5,
fallback_enabled: true
}
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [SubgraphNodeProps](#subgraphnodeprops)
***
## Constructors
### constructor
```typescript
new SubgraphNode(props: SubgraphNodeProps): SubgraphNode
```
Creates a new SubgraphNode instance.
#### Parameters
Configuration for the subgraph node.
#### Returns
`SubgraphNode`
## Interfaces
### SubgraphNodeProps
Configuration interface for `SubgraphNode` creation.
#### Properties
**subgraphId**: `string`
ID of the subgraph to reference
**parameters**?: `{ [k: string]: string | number | boolean; }`
Parameters to pass to the subgraph
---
#### SDK Reference > Nodes > TextAggregatorNode
#### TextAggregatorNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_text_aggregator_node.TextAggregatorNode
## Input
**Type:** `String` | `TextStream` | `LLMChatResponse` | [Content](/node/runtime-reference/classes/common_data_types_api_content.Content)
The data type that TextAggregatorNode accepts as input. See `TextStream` and [Content](/node/runtime-reference/classes/common_data_types_api_content.Content) for more details.
## Output
**Type:** `String`
The data type that TextAggregatorNode outputs
## Examples
```typescript
const aggregatorNode = new TextAggregatorNode({
id: 'my-aggregator-node',
reportToClient: true
});
```
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new TextAggregatorNode(props?: AbstractNodeProps): TextAggregatorNode
```
Creates a new TextAggregatorNode instance.
#### Parameters
Configuration for the text aggregator node.
#### Returns
`TextAggregatorNode`
---
#### SDK Reference > Nodes > TextChunkingNode
#### TextChunkingNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_text_chunking_node.TextChunkingNode
## Input
**Type:** `TextStream` | `ContentStream`
The data type that TextChunkingNode accepts as input. See `TextStream` and `ContentStream` for more details.
## Output
**Type:** `GraphTypes.TextStream`
The data type that TextChunkingNode outputs
## Examples
```typescript
const chunkingNode = new TextChunkingNode({
id: 'my-chunking-node',
reportToClient: true,
minChunkLength: 100
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [TextChunkingNodeProps](#textchunkingnodeprops)
***
## Constructors
### constructor
```typescript
new TextChunkingNode(props?: TextChunkingNodeProps): TextChunkingNode
```
Creates a new TextChunkingNode instance.
#### Parameters
Configuration for the text chunking node.
#### Returns
`TextChunkingNode`
## Interfaces
### TextChunkingNodeProps
Configuration properties for TextChunkingNode.
#### Properties
**minChunkLength**?: `number`
Optional minimum chunk length (in bytes) for sentence splitting. If not provided, the default value is 80.
---
#### SDK Reference > Nodes > TextClassifierNode
#### TextClassifierNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_text_classifier_node.TextClassifierNode
## Input
**Type:** `String`
The data type that TextClassifierNode accepts as input
## Output
**Type:** `GraphTypes.ClassificationResult`
The data type that TextClassifierNode outputs
## Examples
```typescript
// Using component configuration
const classifierNode = new TextClassifierNode({
id: 'content-classifier',
modelWeightsPath: '/models/classifier_weights.json',
embedderComponentId: 'text_embedder_component',
supportedClasses: ['hategroup', 'selfharm', 'sexual', 'sexualminors', 'substance'],
classifierConfig: {
classes: [
{ label: 'hategroup', threshold: 0.8 },
{ label: 'selfharm', threshold: 0.9 }
]
}
});
// Using existing text classifier component
const classifierComponent = new TextClassifierComponent({
id: 'existing-classifier-component',
modelWeightsPath: '/models/classifier_weights.json',
embedderComponentId: 'text_embedder_component',
supportedClasses: ['hategroup', 'selfharm']
});
const classifierNodeWithComponent = new TextClassifierNode({
id: 'content-classifier',
textClassifierComponent: classifierComponent
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [ClassifierClass](#classifierclass)
* [ClassifierConfig](#classifierconfig)
* [TextClassifierNodeProps](#textclassifiernodeprops)
* [TextClassifierNodeWithComponentProps](#textclassifiernodewithcomponentprops)
***
## Constructors
### constructor
```typescript
new TextClassifierNode(props: TextClassifierNodeProps | TextClassifierNodeWithComponentProps): TextClassifierNode
```
Creates a new TextClassifierNode instance.
#### Parameters
Configuration for the text classifier node.
#### Returns
`TextClassifierNode`
## Interfaces
### ClassifierClass
Configuration for a single classification class
#### Properties
**label**: `string`
The class label for the classification category (raw name like "hategroup", "selfharm")
**threshold**: `number`
Threshold value for classification confidence
### ClassifierConfig
Configuration for the text classifier
#### Properties
**classes**: `ClassifierClass[]`
Array of classes to classify with their thresholds
### TextClassifierNodeProps
Configuration for `TextClassifierNode` using classifier settings.
#### Properties
**modelWeightsPath**: `string`
Path to model weights for text classification
**embedderComponentId**: `string`
Text embedder component ID to use for semantic analysis
**supportedClasses**: `string[]`
List of supported classes that the model can classify
**classifierConfig**?: `ClassifierConfig`
Optional classifier configuration with classes and thresholds
**errorHandling**?: `NodeErrorConfig`
Optional error handling configuration for the node
### TextClassifierNodeWithComponentProps
Configuration for `TextClassifierNode` using an existing text classifier component.
#### Properties
**textClassifierComponent**: `TextClassifierComponent`
Pre-configured text classifier component to reuse
**classifierConfig**?: `ClassifierConfig`
Optional classifier configuration with classes and thresholds
**errorHandling**?: `NodeErrorConfig`
Optional error handling configuration for the node
---
#### ClassifierClass
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_text_classifier_node.ClassifierClass
## Properties
* [label](#label)
* [threshold](#threshold)
***
## Properties
### label
```typescript
label: string
```
The class label for the classification category (raw name like "hategroup", "selfharm")
### threshold
```typescript
threshold: number
```
Threshold value for classification confidence
---
#### ClassifierConfig
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_text_classifier_node.ClassifierConfig
## Properties
* [classes](#classes)
***
## Properties
### classes
```typescript
classes: ClassifierClass[]
```
Array of classes to classify with their thresholds
---
#### SDK Reference > Nodes > CustomNode
#### CustomNode
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_nodes_custom_node.CustomNode
## Examples
```typescript
// Define a custom node that processes the input text
class CustomTextNode extends CustomNode {
  async process(
    context: ProcessContext,
    input: string,
  ): Promise<{ processedText: string }> {
    return {
      processedText: input.toUpperCase(),
    };
  }
}

// Create an instance of the custom node
const customTextNode = new CustomTextNode();
```
## Constructors
* [constructor](#constructor)
## Methods
* [process](#process)
## Interfaces
* [CustomNodeProps](#customnodeprops)
***
## Constructors
### constructor
```typescript
new CustomNode(props?: CustomNodeProps): CustomNode
```
Creates a new `CustomNode`.
#### Parameters
Custom node options including optional `executionConfig`.
#### Returns
`CustomNode`
## Methods
### process
```typescript
process(context: ProcessContext, inputs: InputType[]): OutputType | Promise<OutputType>
```
The execution function of the custom node. Must be implemented by subclasses.
#### Parameters
The execution context.
The inputs to the node.
#### Returns
`OutputType | Promise<OutputType>`
## Interfaces
### CustomNodeProps
Configuration for a `CustomNode`.
#### Properties
**executionConfig**?: `ExecutionConfigType`
Execution configuration for the custom node.
---
#### CustomNodeProps
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_nodes_custom_node.CustomNodeProps
## Properties
* [executionConfig](#executionconfig)
***
## Properties
### executionConfig
```typescript
executionConfig?: ExecutionConfigType
```
Execution configuration for the custom node.
---
#### ProcessContext
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_nodes_custom_process_context_Impl.ProcessContext
## Properties
* [nodeId](#nodeid)
* [isCancelled](#iscancelled)
***
## Properties
### nodeId
```typescript
nodeId: string
```
The unique identifier of the current node.
### isCancelled
```typescript
isCancelled: boolean
```
Flag indicating if the execution has been cancelled via `GraphOutputStream.abort()`. Custom nodes can check this to exit early.
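The early-exit pattern can be sketched with a standalone helper; `context` here is any object exposing the `isCancelled` flag described above, and the chunk-processing logic is a made-up example:

```typescript
// Process chunks until the graph execution is aborted; checking
// isCancelled between units of work lets a custom node stop promptly.
function processChunks(
  context: { isCancelled: boolean },
  chunks: string[],
): string[] {
  const processed: string[] = [];
  for (const chunk of chunks) {
    if (context.isCancelled) break; // abort() was called; exit early
    processed.push(chunk.toUpperCase());
  }
  return processed;
}
```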
---
#### SDK Reference > GraphTypes
#### AbstractApiDataType
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_abstract_api_data_type.AbstractApiDataType
## Methods
* [isApiDataType](#isapidatatype)
***
## Methods
### isApiDataType
```typescript
isApiDataType(value: any): boolean
```
Static method to check if a value is an instance of AbstractApiDataType.
#### Parameters
The value to check
#### Returns
`boolean`
---
#### Audio
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_audio.Audio
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new Audio(chunk: AudioChunkInterface): Audio
```
Creates an `Audio` object.
#### Parameters
Low-level audio chunk containing waveform data and sample rate.
#### Returns
`Audio`
---
#### ClassificationResult
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_classification_result.ClassificationResult
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new ClassificationResult(classes: string[]): ClassificationResult
```
Creates a new ClassificationResult instance.
#### Parameters
The list of class labels detected in the input text.
#### Returns
`ClassificationResult`
---
#### Content
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_content.Content
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new Content(content: ContentInterface): Content
```
Creates a new Content instance.
#### Parameters
The content data to create the instance from.
#### Returns
`Content`
---
#### DataStreamWithMetadata
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_data_stream_with_metadata.DataStreamWithMetadata
## Examples
```typescript
// Voice Agent example: Extracting audio with metadata
class AudioExtractorNode extends CustomNode {
  async process(
    context: ProcessContext,
    input: DataStreamWithMetadata,
  ): Promise<GraphTypes.Audio> {
    const metadata = input.getMetadata();
    // Check metadata to determine if processing is complete
    if (!metadata.interaction_complete) {
      throw new Error('Interaction not complete');
    }
    // Extract completed audio from metadata
    return metadata.completed_audio as GraphTypes.Audio;
  }
}
```
## Constructors
* [constructor](#constructor)
## Methods
* [getMetadata](#getmetadata)
* [getElementType](#getelementtype)
* [toStream](#tostream)
* [toTextStream](#totextstream)
* [toAudioChunkStream](#toaudiochunkstream)
* [toContentStream](#tocontentstream)
* [toTTSOutputStream](#tottsoutputstream)
***
## Constructors
### constructor
```typescript
new DataStreamWithMetadata(streamOrWrapper: object, metadata?: { [x: string]: any; }): DataStreamWithMetadata
```
Creates a new DataStreamWithMetadata instance.
#### Parameters
The stream to wrap. Can be a raw stream or any typed stream wrapper.
Metadata object containing contextual information. Common fields include `iteration`, `completed`, `elementType`, and any custom application data.
#### Returns
`DataStreamWithMetadata`
## Methods
### getMetadata
```typescript
getMetadata(): { [x: string]: any; }
```
Gets the metadata object containing all contextual information.
#### Returns
`{ [x: string]: any; }`
### getElementType
```typescript
getElementType(): string
```
Gets the element type of the stream if specified in metadata. Indicates what type of data elements the stream contains.
#### Returns
`string`
### toStream
```typescript
toStream(): AudioChunkStream | TextStream | ...
```
Converts the wrapped stream to its appropriate typed stream wrapper. Automatically detects the stream type from metadata and returns the correct stream class. Detection order:
1. `metadata.elementType` (most reliable, set when wrapping)
2. `stream.type` property (fallback)
#### Returns
`AudioChunkStream | TextStream | ...`
### toTextStream
```typescript
toTextStream(): TextStream
```
Reconstructs a TextStream from the underlying NAPI stream.
#### Returns
`TextStream`
### toAudioChunkStream
```typescript
toAudioChunkStream(): AudioChunkStream
```
Reconstructs an AudioChunkStream from the underlying NAPI stream.
#### Returns
`AudioChunkStream`
### toContentStream
```typescript
toContentStream(): ContentStream
```
Reconstructs a ContentStream from the underlying NAPI stream.
#### Returns
`ContentStream`
### toTTSOutputStream
```typescript
toTTSOutputStream(): TTSOutputStream
```
Reconstructs a TTSOutputStream from the underlying NAPI stream.
#### Returns
`TTSOutputStream`
---
#### GoalAdvancement
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_goal_advancement.GoalAdvancement
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new GoalAdvancement(goalAdvancement: GoalAdvancementInterface): GoalAdvancement
```
Creates a new GoalAdvancement instance.
#### Parameters
The goal advancement data to create the instance from.
#### Returns
`GoalAdvancement`
---
#### KnowledgeRecords
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_knowledge_records.KnowledgeRecords
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new KnowledgeRecords(records: string[]): KnowledgeRecords
```
Creates a new KnowledgeRecords instance.
#### Parameters
The list of knowledge records to store.
#### Returns
`KnowledgeRecords`
---
#### ListToolsResponse
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_list_tools_response.ListToolsResponse
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new ListToolsResponse(listToolsResponse: ListToolsResponseInterface): ListToolsResponse
```
Creates a new ListToolsResponse instance.
#### Parameters
The tools list to initialize from.
#### Returns
`ListToolsResponse`
---
#### LLMChatRequest
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_llm_chat_request.LLMChatRequest
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new LLMChatRequest(llmChatRequest: LLMChatRequestInterface): LLMChatRequest
```
Creates a new LLMChatRequest instance.
#### Parameters
The chat request to wrap.
#### Returns
`LLMChatRequest`
---
#### MatchedIntents
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_matched_intents.MatchedIntents
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new MatchedIntents(intents: IntentMatchInterface[]): MatchedIntents
```
Creates a new MatchedIntents instance.
#### Parameters
The list of matched intents with scores.
#### Returns
`MatchedIntents`
---
#### MatchedKeywords
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_matched_keywords.MatchedKeywords
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new MatchedKeywords(keywords: KeywordMatchInterface[]): MatchedKeywords
```
Creates a new MatchedKeywords instance.
#### Parameters
The list of matched keywords.
#### Returns
`MatchedKeywords`
---
#### MemoryState
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_memory_state.MemoryState
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new MemoryState(state: string): MemoryState
```
Creates a new MemoryState instance.
#### Parameters
The serialized memory state.
#### Returns
`MemoryState`
---
#### SafetyResult
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_safety_result.SafetyResult
## Constructors
* [constructor](#constructor)
## Interfaces
* [SafetyResultInterface](#safetyresultinterface)
***
## Constructors
### constructor
```typescript
new SafetyResult(result: SafetyResultInterface): SafetyResult
```
Creates a new SafetyResult instance.
#### Parameters
The safety evaluation result data.
#### Returns
`SafetyResult`
## Interfaces
### SafetyResultInterface
Interface representing a safety evaluation result.
#### Properties
**isSafe**: `boolean`
Whether the content is considered safe
**text**: `string`
The text that was evaluated for safety
**classes**?: `string[]`
Optional classification results that triggered safety violations
**keywordMatches**?: `KeywordMatchInterface[]`
Optional keyword matches that triggered safety violations
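A `SafetyResultInterface` value for a flagged input might look like the following sketch; the field values are invented for illustration:

```typescript
// Hypothetical safety evaluation result: unsafe text along with the
// classification that triggered the violation.
const safetyResult = {
  isSafe: false,
  text: 'some flagged user message',
  classes: ['selfharm'],
};

// Downstream logic typically branches on isSafe:
const verdict = safetyResult.isSafe ? 'allow' : 'block';
```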
---
#### ToolCallRequest
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_tool_call_request.ToolCallRequest
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new ToolCallRequest(toolCalls: ToolCallInterface[]): ToolCallRequest
```
Creates a new ToolCallRequest instance.
#### Parameters
The list of tool calls to execute.
#### Returns
`ToolCallRequest`
---
#### ToolCallResponse
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_tool_call_response.ToolCallResponse
## Constructors
* [constructor](#constructor)
***
## Constructors
### constructor
```typescript
new ToolCallResponse(toolCallResults: ToolCallResultInterface[]): ToolCallResponse
```
Creates a new ToolCallResponse instance.
#### Parameters
The list of tool call results to wrap.
#### Returns
`ToolCallResponse`
---
#### TTSRequest
Source: https://docs.inworld.ai/node/runtime-reference/classes/common_data_types_api_tts_request.TTSRequest
## Constructors
* [constructor](#constructor)
## Methods
* [withText](#withtext)
* [withStream](#withstream)
* [hasTextContent](#hastextcontent)
* [hasStreamContent](#hasstreamcontent)
* [getText](#gettext)
* [getVoice](#getvoice)
***
## Constructors
### constructor
```typescript
new TTSRequest(request: TTSRequestInterface): TTSRequest
```
Creates a new TTSRequest instance.
#### Parameters
The TTS request to wrap.
#### Returns
`TTSRequest`
## Methods
### withText
```typescript
withText(text: string, voice?: VoiceInterface): TTSRequest
```
Creates a TTSRequest with text content.
#### Parameters
The text content to synthesize
Optional voice configuration
#### Returns
`TTSRequest`
### withStream
```typescript
withStream(stream: TextStream, voice?: VoiceInterface, synthesisConfig?: SpeechSynthesisConfigInterface): TTSRequest
```
Creates a TTSRequest with stream content.
#### Parameters
The stream content to synthesize
Optional voice configuration
Optional speech synthesis configuration
#### Returns
`TTSRequest`
### hasTextContent
```typescript
hasTextContent(): boolean
```
Checks if the request has text content.
#### Returns
`boolean`
### hasStreamContent
```typescript
hasStreamContent(): boolean
```
Checks if the request has stream content.
#### Returns
`boolean`
### getText
```typescript
getText(): string
```
Gets the text content if available.
#### Returns
`string`
### getVoice
```typescript
getVoice(): VoiceInterface
```
Gets the voice configuration if available.
#### Returns
`VoiceInterface`
---
#### SDK Reference > Components
#### RemoteLLMComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_remote_llm_component.RemoteLLMComponent
## Examples
```typescript
const llmComponent = new RemoteLLMComponent({
  id: 'my-llm-component',
  provider: 'openai',
  modelName: 'gpt-4o-mini',
  defaultConfig: {
    temperature: 0.7,
    maxNewTokens: 1000,
  },
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteLLMComponentProps](#remotellmcomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteLLMComponent(props?: RemoteLLMComponentProps): RemoteLLMComponent
```
Creates a new `RemoteLLMComponent` instance. Defaults to a provider and model when omitted, and accepts a camelCased `defaultConfig` that will be converted to snake_case for the runtime.
#### Parameters
Configuration for the remote LLM component.
#### Returns
`RemoteLLMComponent`
## Interfaces
### RemoteLLMComponentProps
Configuration options for `RemoteLLMComponent` creation.
#### Properties
**provider**?: `string`
LLM provider (e.g., 'openai', 'anthropic', 'inworld').
**modelName**?: `string`
Provider-specific model name (e.g., 'gpt-4o-mini').
**defaultConfig**?: `object`
Default text generation configuration (camelCase allowed).
---
#### RemoteKnowledgeComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_remote_knowledge_component.RemoteKnowledgeComponent
## Examples
```typescript
const knowledgeComponent = new RemoteKnowledgeComponent({
  maxCharsPerChunk: 1500,
  maxChunksPerDocument: 20,
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteKnowledgeComponentProps](#remoteknowledgecomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteKnowledgeComponent(props: RemoteKnowledgeComponentProps): RemoteKnowledgeComponent
```
Creates a new RemoteKnowledgeComponent instance.
#### Parameters
Configuration for the remote knowledge component.
#### Returns
`RemoteKnowledgeComponent`
## Interfaces
### RemoteKnowledgeComponentProps
Configuration options for `RemoteKnowledgeComponent` creation.
#### Properties
**maxCharsPerChunk**?: `number`
Maximum characters per chunk (default: 1000).
**maxChunksPerDocument**?: `number`
Maximum chunks per document (default: 10).
---
#### RemoteEmbedderComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_remote_embedder_component.RemoteEmbedderComponent
## Examples
```typescript
const embedderComponent = new RemoteEmbedderComponent({
  provider: 'openai',
  modelName: 'text-embedding-ada-002',
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteEmbedderComponentProps](#remoteembeddercomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteEmbedderComponent(props: RemoteEmbedderComponentProps): RemoteEmbedderComponent
```
Creates a new RemoteEmbedderComponent instance.
#### Parameters
Configuration for the remote embedder component.
#### Returns
`RemoteEmbedderComponent`
## Interfaces
### RemoteEmbedderComponentProps
Configuration options for `RemoteEmbedderComponent` creation.
#### Properties
**provider**?: `string`
Embedder provider (e.g., 'openai', 'inworld').
**modelName**?: `string`
Provider-specific model name.
---
#### RemoteSTTComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_remote_stt_component.RemoteSTTComponent
## Examples
```typescript
const sttComponent = new RemoteSTTComponent({
  sttConfig: {
    provider: 'whisper',
    modelName: 'whisper-1',
    language: 'en',
  },
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteSTTComponentProps](#remotesttcomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteSTTComponent(props: RemoteSTTComponentProps): RemoteSTTComponent
```
Creates a new RemoteSTTComponent instance.
#### Parameters
Configuration for the remote STT component.
#### Returns
`RemoteSTTComponent`
## Interfaces
### RemoteSTTComponentProps
Configuration options for `RemoteSTTComponent` creation.
#### Properties
**sttConfig**: `{ languageCode?: string; }`
STT configuration object.
---
#### RemoteTTSComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_remote_tts_component.RemoteTTSComponent
## Examples
```typescript
const ttsComponent = new RemoteTTSComponent({
  synthesisConfig: {
    voice: 'alloy',
    speed: 1.0,
  },
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [RemoteTTSComponentProps](#remotettscomponentprops)
***
## Constructors
### constructor
```typescript
new RemoteTTSComponent(props: RemoteTTSComponentProps): RemoteTTSComponent
```
Creates a new RemoteTTSComponent instance.
#### Parameters
Configuration for the remote TTS component.
#### Returns
`RemoteTTSComponent`
## Interfaces
### RemoteTTSComponentProps
Configuration options for `RemoteTTSComponent` creation.
#### Properties
**synthesisConfig**?: `Config`
Speech synthesis configuration.
---
#### MCPClientComponent
Source: https://docs.inworld.ai/node/runtime-reference/classes/graph_dsl_components_mcp_client_component.MCPClientComponent
## Examples
```typescript
const mcpComponent = new MCPClientComponent({
  sessionConfig: {
    serverPath: '/path/to/mcp-server',
    serverArgs: ['--config', 'config.json'],
    timeout: 30000,
  },
});
```
## Constructors
* [constructor](#constructor)
## Interfaces
* [MCPClientComponentProps](#mcpclientcomponentprops)
***
## Constructors
### constructor
```typescript
new MCPClientComponent(props: MCPClientComponentProps): MCPClientComponent
```
Creates a new `MCPClientComponent` instance. The provided camelCase `sessionConfig` is converted to snake_case when generating the runtime configuration.
#### Parameters
Configuration for the MCP client component.
#### Returns
`MCPClientComponent`
## Interfaces
### MCPClientComponentProps
Configuration options for `MCPClientComponent` creation.
#### Properties
**sessionConfig**: `Config`
MCP session configuration (camelCase allowed).
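The camelCase-to-snake_case conversion described above can be illustrated with a small standalone helper; this is a sketch of the idea, not the SDK's internal implementation:

```typescript
// Convert top-level camelCase keys to snake_case, as the component does
// when generating its runtime configuration.
function camelToSnake(
  config: Record<string, unknown>,
): Record<string, unknown> {
  const converted: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(config)) {
    converted[key.replace(/[A-Z]/g, (c) => `_${c.toLowerCase()}`)] = value;
  }
  return converted;
}
```

For example, `camelToSnake({ serverPath: '/srv', timeout: 30000 })` yields `{ server_path: '/srv', timeout: 30000 }`.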
---
#### SDK Reference > Graph Configs
#### TextGenerationConfig
Source: https://docs.inworld.ai/node/runtime-reference/interfaces/graph_dsl_graph_config_schema.TextGenerationConfig
## Properties
* [max_new_tokens](#max_new_tokens)
* [max_prompt_length](#max_prompt_length)
* [temperature](#temperature)
* [top_p](#top_p)
* [repetition_penalty](#repetition_penalty)
* [frequency_penalty](#frequency_penalty)
* [presence_penalty](#presence_penalty)
* [stop_sequences](#stop_sequences)
* [seed](#seed)
* [logit_bias](#logit_bias)
***
## Properties
### max_new_tokens
```typescript
max_new_tokens?: string | number
```
Maximum number of tokens to generate
### max_prompt_length
```typescript
max_prompt_length?: string | number
```
Maximum length of the prompt in tokens
### temperature
```typescript
temperature?: string | number
```
Controls randomness of generated text
### top_p
```typescript
top_p?: string | number
```
Probability for most probable tokens sampling
### repetition_penalty
```typescript
repetition_penalty?: string | number
```
Repetition penalty
### frequency_penalty
```typescript
frequency_penalty?: string | number
```
Frequency penalty
### presence_penalty
```typescript
presence_penalty?: string | number
```
Presence penalty
### stop_sequences
```typescript
stop_sequences?: string[]
```
List of sequences to stop generation
### seed
```typescript
seed?: string | number
```
Random seed for controlling the randomness of text generation
### logit_bias
```typescript
logit_bias?: object
```
Logit bias to modify token likelihood
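Put together, a `TextGenerationConfig` is a plain object of snake_case fields; the values below are illustrative, not recommended defaults:

```typescript
// Example text generation configuration combining the fields above.
const textGenerationConfig = {
  max_new_tokens: 500,
  temperature: 0.7,
  top_p: 0.9,
  repetition_penalty: 1.1,
  stop_sequences: ['\n\n'],
  seed: 42,
};
```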
---
#### Resources
#### Release Notes
Source: https://docs.inworld.ai/release-notes/runtime-node
## Node.js Agent Runtime v0.8.0
Enhanced performance, execution control, and component access for custom nodes.
* **2x faster performance** with optimized addon architecture
* **Cancel running executions** with [abort()](/node/runtime-reference/classes/graph_GraphOutputStream.GraphOutputStream#abort) on [GraphOutputStream](/node/runtime-reference/classes/graph_GraphOutputStream.GraphOutputStream)
* **Call LLMs from custom nodes** via [getLLMInterface()](/node/runtime-reference/interfaces/graph_nodes_custom_process_context_Impl.ProcessContext#getllminterface) and [getEmbedderInterface()](/node/runtime-reference/interfaces/graph_nodes_custom_process_context_Impl.ProcessContext#getembedderinterface)
* **Build stateful graph loops** with [DataStreamWithMetadata](/node/runtime-reference/classes/common_data_types_api_data_stream_with_metadata.DataStreamWithMetadata)
**Breaking changes:** [graph.start()](/node/runtime-reference/classes/graph_graph.Graph) is now async, and `stopInworldRuntime()` is required.
See the [Migration Guide](/node/migration-guide-v0.8) for upgrading from v0.6.
## Inworld CLI - Hosted Endpoint
`npm install -g @inworld/cli`
* **[3-Minute Setup](/node/quickstart#get-started):** Single command installation, browser-based login, and instant API key generation.
* **[Local Development](/node/quickstart#run-a-local-server):** Test your graphs instantly with `inworld serve`.
* **[Instant Deployment](/node/cli/deploy):** Deploy to cloud with `inworld deploy` - no hosting, scaling, or infrastructure required.
## Node.js Agent Runtime v0.6.0
Simplified interfaces and improved APIs.
Developers upgrading from v0.5 should review the breaking changes below.
* **ExecutionConfig Access:** New [context.getExecutionConfig()](/node/runtime-reference/interfaces/graph_nodes_custom_process_context_Impl.ProcessContext) method with automatic property unwrapping
* **Graph Execution:** [Graph.start()](/node/runtime-reference/classes/graph_graph.Graph) now returns [ExecutionResult](/node/runtime-reference/interfaces/graph_graph.ExecutionResult) with execution details
* **Unwrapped Types:** Cleaner [GoalAdvancement](/node/runtime-reference/classes/common_data_types_api_goal_advancement.GoalAdvancement) and [LLMChatRequest](/node/runtime-reference/classes/common_data_types_api_llm_chat_request.LLMChatRequest) interfaces
## Runtime
* Publicly launched [Node.js Agent Runtime SDK](/node/quickstart), available via [npm](https://www.npmjs.com/package/@inworld/runtime)
* Released templates including [Multimodal Companion](/node/templates/multimodal-companion) ([git](https://github.com/inworld-ai/multimodal-companion-node)) and [Comic Generator with MiniMax](/node/templates/comic-generator) ([git](https://github.com/inworld-ai/comic-generator-node))
* Launched [Experiments](/portal/graph-registry), [Dashboards](/portal/dashboards), [Traces](/portal/traces), [Logs](/portal/logs), and LLM Playground on [Portal](https://platform.inworld.ai)
---
#### Authentication
Source: https://docs.inworld.ai/node/authentication
The Runtime SDK uses API keys to authenticate requests to Inworld's server.
## Getting an API key
The Runtime SDK currently only supports Basic authorization, although you can use JWT authentication with our standalone Model APIs.
Make sure to keep your Base64 credentials safe, as anyone with your credentials can make requests on your behalf. It is recommended that credentials are stored as environment variables and read at run time.
Do not expose your Base64 API credentials in client-side code (browsers, apps, game builds), as they may be compromised. Runtime SDK support for JWT authentication is coming soon.
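The recommendation above can be sketched as follows; `INWORLD_API_KEY` is an assumed environment variable name, and the header shape reflects standard Basic authorization:

```typescript
// Build a Basic authorization header from Base64 credentials stored in an
// environment variable, so the key never appears in source code.
function getAuthHeader(
  env: Record<string, string | undefined> = process.env,
): { Authorization: string } {
  const key = env.INWORLD_API_KEY;
  if (!key) {
    throw new Error('Set the INWORLD_API_KEY environment variable');
  }
  return { Authorization: `Basic ${key}` };
}
```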
---
#### Support
Source: https://docs.inworld.ai/runtime/support
---
### Unreal
#### Get Started
#### Unreal Agent Runtime
Source: https://docs.inworld.ai/unreal-engine/runtime/overview
## Get Started
Get started with the Unreal Agent Runtime.
Create and chat with an AI character with Unreal Agent Runtime.
## Explore
Learn how to use the visual graph editor to create an AI pipeline.
Learn how to use Agent Runtime with MetaHuman.
Get started building with LLMs in Agent Runtime.
Explore the Agent Runtime reference for class definitions and functions.
---
#### Unreal Agent Runtime Quickstart
Source: https://docs.inworld.ai/unreal-engine/runtime/get-started
This guide covers setup and installation for Inworld’s Unreal Agent Runtime SDK.
## Prerequisites
- [Inworld account](https://platform.inworld.ai/signup) and [API Key](/unreal-engine/runtime/authentication)
- [Unreal 5.4 - 5.7](https://www.unrealengine.com/en-US/download)
- [Visual Studio with .NET Desktop Development and .NET 8.0 and 9.0 Runtime](https://learn.microsoft.com/en-us/visualstudio/install/install-visual-studio?view=vs-2022)
- [Follow the Unreal Specific Guide](https://learn.microsoft.com/en-us/visualstudio/gamedev/unreal/get-started/vs-tools-unreal-install)
- [.NET 8.0 and 9.0 Runtime](https://dotnet.microsoft.com/en-us/download)
- Mac with Apple Silicon (x86_64 not supported)
- [Inworld account](https://platform.inworld.ai/signup) and [API Key](/unreal-engine/runtime/authentication)
- [Unreal 5.4 - 5.7](https://www.unrealengine.com/en-US/download)
- [Xcode 15.4 or higher](https://developer.apple.com/download/all/?q=xcode%2015.4)
Some newer versions of Xcode may not be compatible with older Unreal Engine versions. Make sure you have a compatible version.
- [Inworld account](https://platform.inworld.ai/signup) and [API Key](/unreal-engine/runtime/authentication)
- [Unreal 5.4 - 5.7](https://www.unrealengine.com/en-US/download)
- [Follow the Unreal Linux Development Quickstart](https://dev.epicgames.com/documentation/en-us/unreal-engine/linux-development-quickstart-for-unreal-engine)
## Install Unreal Agent Runtime
**Download Inworld Agent Runtime**, which contains the core Inworld Agent Runtime functionality.
We also recommend **downloading our template plugins**, which are pre-built implementations of common use cases that can be immediately plugged into your game. See [Templates](/unreal-engine/runtime/templates) for a description of each of the template plugins.
Create a new C++ game project in Unreal (5.4+). _Any of the game templates will work, but we recommend the Blank Template._
Close Unreal. Create a Plugins folder in your project’s root directory. Extract the contents of each of the zip files you downloaded in step 1, and place them into this new folder.
In the root folder of your project, click on the .uproject file to launch your project. You will be prompted to rebuild the newly added plugin modules.
To use the Inworld Agent Runtime SDK with cloud APIs, open the Unreal Editor and, in the top menu bar, go to Edit > Project Settings > Plugins > Inworld > ApiKey and paste your Base64 [Runtime API key](/unreal-engine/runtime/authentication) into the Runtime API Key field.
After you add your API key, to capture telemetry (e.g., logs, traces), you will need to restart Unreal.
Play the Chat template which illustrates how to have a simple audio to audio conversation using the Inworld Agent Runtime & Character.
1. Open the settings menu in the Content Browser within Unreal.
2. Enable the option for “Show Plugin Content”.
3. In the Content Browser, navigate to: `/Plugins/InworldChat`
4. Double-click on the Chat.umap.
5. Click the Play-in-Editor button (Alt+P) to launch your level.
Go to your workspace in [Portal](https://platform.inworld.ai/) and explore the traces that were generated while you were playing the Chat Template.
With these steps, you'll be all set to explore the capabilities of the Inworld Agent Runtime!
## Next steps
For more advanced explorations, check out:
- [Understand how the Chat template works](/unreal-engine/runtime/templates/chat)
- [Runtime reference](/unreal-engine/runtime/runtime-reference/overview) - Class definitions and functions
- [Learn how to use the Graph Editor](/unreal-engine/runtime/graph-editor)
---
#### Lexicon
Source: https://docs.inworld.ai/unreal-engine/runtime/lexicon
Use this lexicon to quickly understand common acronyms and AI terms used in the Runtime.
## Speech recognition
| Term | Definition |
| --- | --- |
| STT (Speech-to-Text) | Converts spoken audio into text. Used to understand player speech. |
| VAD (Voice Activity Detection) | Detects when speech is present in an audio stream to start/stop sending audio. |
| AEC (Acoustic Echo Cancellation) | Reduces game audio leaking into the microphone input by cancelling echoes. |
## Language models and reasoning
| Term | Definition |
| --- | --- |
| LLM (Large Language Model) | An AI model trained on massive amounts of text to understand and generate human-like language. LLMs can power capabilities like dialog generation, game state changes, reasoning, and more. |
| Intent | The inferred meaning or purpose behind an input. Often used in the context of inferring whether a user message falls under a certain intent, to trigger some action. |
## Knowledge and retrieval
| Term | Definition |
| --- | --- |
| Embedding | A numeric vector representation of text used to measure semantic similarity. |
| Knowledge | Structured or unstructured information the character can reference during conversation. |
| RAG (Retrieval-Augmented Generation) | A technique that retrieves relevant knowledge (via embeddings) to ground LLM responses. |
## Speech synthesis
| Term | Definition |
| --- | --- |
| TTS (Text to Speech) | Converts generated text into spoken audio. Often paired with an LLM to generate a character's voice. |
---
#### Troubleshooting
Source: https://docs.inworld.ai/unreal-engine/runtime/troubleshooting
## Unreal Editor not loading
If Unreal Editor gets stuck at 72% while loading, the NNEDenoiser and NNERuntimeORT plugins are likely still enabled in your Unreal project. On Mac and Linux, the NNERuntime module will hang indefinitely while attempting to load unless these plugins are disabled.
## Invalid authorization credentials
You may get an error similar to the following if you haven't provided your API key:
`Error: TTS Creation Failed. INTERNAL: Failed to check TTS server health: Invalid authorization credentials`
Please make sure you have added your **Runtime** API key to [project settings](/unreal-engine/runtime/get-started#set-up-the-inworld-runtime-api-key), and are using the Base64 API key.
---
#### Templates
#### Overview
Source: https://docs.inworld.ai/unreal-engine/runtime/templates
Templates provide pre-built components and examples for common use cases with the Inworld Unreal Agent Runtime SDK. These templates include fully implemented components, blueprints, and example levels that you can use immediately or as a foundation for your own projects.
Use them to help jumpstart your development with the Unreal Agent Runtime SDK!
Download the templates [**here**](https://storage.googleapis.com/assets-inworld-ai/unreal-plugin/runtime/InworldTemplates.zip).
## Available Templates
Each template is packaged as an individual plugin, allowing you to install only what your project needs.
| **Name** | **Package** | **Description** | **Dependencies** |
| :--- | :--- | :--- | :--- |
| [Character](/unreal-engine/runtime/templates/character) | `InworldCharacter.zip` | Comprehensive system providing Actor Components and Inworld Graphs for building interactive AI characters. Does not include playable levels. | Runtime |
| [Chat](/unreal-engine/runtime/templates/chat) | `InworldChat.zip` | Simple chat interface with text and voice input for conversing with characters or character groups. | Runtime + Character |
| [Innequin](/unreal-engine/runtime/templates/innequin) | `InworldInnequin.zip` | 3D character demo demonstrating Inworld Character components applied to Pawns. | Runtime + Character + Lipsync |
| [Command](/unreal-engine/runtime/templates/command) | `InworldCommand.zip` | Template demonstrating how to control AI agents and game logic with voice and text input. | Runtime |
| Lipsync | `InworldLipsync.zip` | Experimental template enabling lip-sync for text-to-speech. | Runtime |
| [MetaHuman](/unreal-engine/runtime/templates/metahuman) | `InworldMetaHuman.zip` | Experimental template demonstrating text-to-speech driving MetaHuman facial animation. | Runtime + Lipsync |
| [Assets](/unreal-engine/runtime/templates/assets) | `InworldAssets.zip` | Lightweight content library with meshes, materials, and textures for quick prototyping. No gameplay logic or levels. | None |
---
#### Character
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/character
The Character template is a comprehensive system for AI character interaction using Inworld Agent Runtime.
It provides a set of Inworld Graphs and Actor Components for building characters with a range of capabilities, and for facilitating conversations between those characters and the player.
This template does not include a playable level - see the [Chat](/unreal-engine/runtime/templates/chat) and [Innequin](/unreal-engine/runtime/templates/innequin) templates for simple chat UIs and HUDs powered by the Character template.
## Getting Started
To set up a player-character interaction in your own game, you'll need to set up both the player and the character.
### Player setup
Let's start with the player.
1. Add the **Inworld Player** component to an Actor or other blueprint in your level. In this template we attached the **Inworld Player** component to a custom Player Controller.
2. In the Details panel, set up the **Player Profile** defaults:
- **Name**: The name of the player (that the Character will know the player as). By default, we use the name "Alex".
- **Gender**: The gender of the player.
- **Age**: The age of the player (this could be interpreted as a number or stage of life).
- **Role**: The player's role. This can be interpreted in a number of ways, such as the player's job or role in society, and gives characters more context about who the player is.
3. Set up the Audio Capture config:
- **Capture Frequency** - The frequency at which audio is captured (measured in cycles per second). Controls how frequently OnAudioCapture is fired.
- **Enable AEC** - Enable Acoustic Echo Cancellation (AEC). This helps reduce game audio from being interpreted as player speech when captured by the microphone.
- **AECId** - Identifies the AEC primitive creation configuration (there are currently no creation config settings; just leave as `Default`)
- **AECConfig**:
- **Echo Suppression Level** - AEC processing intensity (0.0-1.0, higher values provide stronger echo cancellation)
- **Noise Suppression Level** - Noise suppression intensity (0.0-1.0, higher values provide stronger noise reduction)
- **Enabled Adaptive Filter** - Enable adaptive filtering for better echo cancellation
- **Enable VAD** - Enable Voice Activity Detection (VAD). This is useful for determining when the player is speaking, so that audio data is only sent during speech.
- **VADId** - Identifies the VAD primitive creation configuration (there are currently no creation config settings; just leave as `Default`)
- **VADConfig**:
- **Speech Threshold** - Sensitivity threshold for detecting speech. This is a `float` value where higher thresholds make detection more selective (fewer false positives) and lower thresholds make it more sensitive (detecting quieter speech). Valid range: 0.0 - 1.0 (default = 0.4)
- **VADBuffer Time** - The amount of silence required before it is determined that the player has stopped speaking.
- **StreamingSTTComponentId** - Identifier for the Streaming STT Creation Config settings for the Streaming STT component used for converting speech-to-text.
- **Start Mic Enabled** - Whether to start the game with the microphone enabled.
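The interaction between **Speech Threshold** and **VADBuffer Time** can be sketched in plain C++. This is an illustration only, not the SDK's implementation; `SimpleVad`, the frame-based buffer, and the per-frame energy input are all assumptions made for the example:

```cpp
// Illustrative VAD sketch: speech is "active" while frame energy exceeds
// the threshold, and only ends after a buffer of continuous silence.
struct SimpleVad {
    float SpeechThreshold = 0.4f; // matches the documented default
    int BufferFrames = 25;        // assumed: ~0.5s of silence at 50 frames/sec
    bool bSpeaking = false;
    int SilentFrames = 0;

    // Feed one frame's normalized energy (0.0-1.0); returns true while the
    // player is considered to be speaking.
    bool ProcessFrame(float Energy) {
        if (Energy >= SpeechThreshold) {
            bSpeaking = true;
            SilentFrames = 0;
        } else if (bSpeaking && ++SilentFrames >= BufferFrames) {
            bSpeaking = false; // buffer time elapsed: player stopped speaking
        }
        return bSpeaking;
    }
};
```

Raising the threshold rejects quiet background noise at the cost of missing soft speech; the buffer time prevents short pauses mid-sentence from being treated as the end of the player's turn.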
### Character setup
Now, let's set up the character.
1. Start by creating a blueprint Actor that represents your character.
2. Add the necessary Inworld components to your actor:
- **InworldCharacter** - for interfacing with the conversational AI system
- **InworldVoiceAudio** - for playing character speech audio
- **InworldLipsync** (optional) - for lip-syncing (experimental)
- **BP_ChatBubbleComponent** (optional) - for displaying chat bubbles in world space
3. Set up the **InworldCharacter** component's **Id** and **Graph** properties:
- **Id**: The id of the conversation target. This is used to identify the conversation target in the Inworld Character Subsystem.
- **Emotion Graph Asset**: The graph used to generate the character's current emotional state.
- **Relation Graph Asset**: The graph used to generate the character's relationship to the player.
- **Dialogue Generation Graph Asset**: The graph used to generate the character's dialogue responses.
4. Set up the **InworldCharacter** component's **Inworld** runtime data properties:
The **Character Profile** shapes who the character is and how the character responds.
- **Name**: The name of the character.
- **Role**: The job this character does in the scene. This can be things like "Market Vendor", "Fan of the Player", and "Witness". Specific roles and titles help the AI produce better results. For example, "World War 2 Private" gives the character more context than just "soldier".
- **Pronouns**: The pronouns the character uses to define themselves. For example: She/Her/Hers
- **Stage of Life**: The character's age or approximate age. For example, you could enter "15" or "teenager".
- **Hobbies and Interests**: The pastimes and activities that the character enjoys.
- **Description**: This provides details about a character's current circumstances, backstory, general disposition, and how they present themselves to others. It is recommended you use evocative and descriptive language when writing a character’s description, instead of straightforwardly listing information.
This helps the AI develop a fluid narrative around the character and their circumstances, encouraging the AI to visualize the character, interpret any descriptive elements, and craft its own dynamic portrayal.
A dramatic, evocative writing approach results in more fully-realized and expressive responses from the AI when portraying the character.
Things to include:
- Backstory
- Relationships with other characters
- Specialty or unique skillset
- Personal opinions
- Current appearance
- Things they have experienced
- Favorite sayings
- Things they hate
- Things they love
- **Motivation**: What goals and ambitions drive the character.
- **Flaws**: A list of the character's flaws. Flaws can be as minor as the character being unable to spell "ballerina" or as major as story-changing traits. Like Description, it is recommended that you write character flaws in an evocative style. Some examples:
- Allergic to peanuts
- Secretly jealous of their sibling
- Can't talk about their favorite game without going on rants
- **Custom Dialog Style**: Enabling Custom Dialogue Style opens an Adjectives and Colloquialism section. While you can add these in the default Dialog Style, this allows more specificity for character customization.
- **Adjectives**: This allows you to describe your character's overall dialog style. Some examples:
- Formal: well-spoken, straightforward, matter-of-fact. _(computer, scientist, lawyer)_
- Blunt: short, direct, candid, outspoken. _(sports coach, celebrity chef)_
- Entertaining: animated, comical, amusing, witty. _(stand-up comedian, cartoon character, jester)_
- **Colloquialism**: List any colloquialisms that the character uses. Some examples:
- Cool (for good or awesome)
- Hit the nail on the head
- Buddy
- gonna
- y'all
- ain't
- **Dialog Style**: A condensed version of the Custom Dialog Style. Instead of the style being broken into multiple categories, it is condensed into one description box.
- **Example Dialog**: To give the LLM a better chance at meeting your desired character response style, this can be used to give the LLM response examples in the voice of the character.
- **Personality Traits**: Used if you want the character to have a specific personality type. Some examples:
- Bubbly: thoughtful, energetic. _(influencer, entertainer, partygoer)_
- Inquisitive: curious, exploratory, interested, prying. _(detective, psychologist)_
- Commanding: intense, determined, authoritative, dictatorial. _(drill sergeant, personal trainer, ship captain)_
- Empathetic: gentle, compassionate, understanding. _(teacher, healer, animal rescuer)_
The **Voice** determines what voice is used for the character audio.
- **Speaker Id**: The ID of the voice you would like your character to use. For Inworld voices, you can check TTS Playground or use the [List voices](/api-reference/ttsAPI/texttospeech/list-voices) API for the complete list of options.
- **Language**: Optional. The language the character will speak.
The **Event History** defines how the conversation history will be handled.
- **Speech Event Formatter**: The string format to use for event records in the concatenated string representation of all stored events.
- **Delimiter**: The delimiter to use for separating history records in the concatenated string representation of all stored events.
- **Capacity**: The maximum amount of history entries that can be stored (oldest entries are replaced when capacity is reached).
- **Initial History Table**: A data table containing rows of InworldEventHistoryEntry's to populate a character with initial conversation history.
- **Emotion Label**: The character's initial emotional state (default is `Neutral`)
A character's relation state defines its relationship to the player (e.g., Friend vs Archenemy).
A character's relation state is defined by a set of attributes:
- **Trust**
- **Respect**
- **Familiar**
- **Flirtatious**
- **Attraction**
Attributes are assigned integer values which can be positive or negative. A relationship label is then assigned based on attribute values. For example:
- **Friend** is defined by: `trust >= 10 && respect >= 5 && familiar >= 10 && flirtatious <= 5 && attraction >= 5`
- **Archenemy** is defined by: `trust <= 0 && respect <= -20 && flirtatious <= 5 && attraction <= -20`
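The threshold rules above can be expressed directly in code. The sketch below is illustrative only: it uses the two documented label rules verbatim, but the `Relation` struct, the evaluation order, and the "Acquaintance" fallback label are assumptions, not SDK behavior:

```cpp
#include <string>

// Illustrative mapping from relation attributes to a relationship label,
// using the Friend/Archenemy rules quoted in the docs.
struct Relation {
    int Trust = 0, Respect = 0, Familiar = 0, Flirtatious = 0, Attraction = 0;
};

std::string GetRelationshipLabel(const Relation& R) {
    if (R.Trust >= 10 && R.Respect >= 5 && R.Familiar >= 10 &&
        R.Flirtatious <= 5 && R.Attraction >= 5)
        return "Friend";
    if (R.Trust <= 0 && R.Respect <= -20 &&
        R.Flirtatious <= 5 && R.Attraction <= -20)
        return "Archenemy";
    return "Acquaintance"; // assumed fallback when no rule matches
}
```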
**Knowledge** is used to provide additional information that a character can draw upon during a conversation. Unlike the Character Profile fields, the information in Knowledge is only made available to the character when it would be relevant to the conversation, such as when answering a specific question.
Knowledge consists of maps of FInworldTextEmbedding Arrays (for runtime generated knowledge data) and Inworld Text Embedding Assets (for design time knowledge data).
**Initial Knowledge Map**:
- **String (key)** - the string identifier of the knowledge set.
- **Text Embedding Asset (value)** - an asset of [embedded text records](/unreal-engine/runtime/text-embedder) which the Knowledge node will query to return the knowledge records most similar to the provided input.
For information on how to create a Text Embedding Asset, see [here](/unreal-engine/runtime/text-embedder).
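To make the retrieval behavior concrete, here is a minimal sketch of embedding-based lookup: records pair text with an embedding vector, and a query returns the records most similar to the query embedding by cosine similarity. The names (`KnowledgeRecord`, `QueryKnowledge`) and the scoring details are illustrative, not SDK types:

```cpp
#include <algorithm>
#include <cmath>
#include <string>
#include <utility>
#include <vector>

// A knowledge record: text plus its embedding vector.
struct KnowledgeRecord {
    std::string Text;
    std::vector<float> Embedding;
};

float CosineSimilarity(const std::vector<float>& A, const std::vector<float>& B) {
    float Dot = 0.f, NormA = 0.f, NormB = 0.f;
    for (size_t i = 0; i < A.size(); ++i) {
        Dot += A[i] * B[i];
        NormA += A[i] * A[i];
        NormB += B[i] * B[i];
    }
    return Dot / (std::sqrt(NormA) * std::sqrt(NormB));
}

// Return the TopK records most similar to the query embedding.
std::vector<std::string> QueryKnowledge(
    const std::vector<KnowledgeRecord>& Records,
    const std::vector<float>& Query, size_t TopK) {
    std::vector<std::pair<float, std::string>> Scored;
    for (const KnowledgeRecord& Rec : Records)
        Scored.push_back({CosineSimilarity(Rec.Embedding, Query), Rec.Text});
    std::sort(Scored.begin(), Scored.end(),
              [](const auto& L, const auto& R) { return L.first > R.first; });
    std::vector<std::string> Out;
    for (size_t i = 0; i < std::min(TopK, Scored.size()); ++i)
        Out.push_back(Scored[i].second);
    return Out;
}
```

This is why knowledge only surfaces when relevant: records are ranked against the player's current input, so a question about the castle retrieves castle facts rather than the character's whole knowledge base.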
Knowledge filters can be used to restrict the types of knowledge the character has access to. This system can reduce unwanted hallucinations and cognition that may deviate from a character’s established parameters.
- **Knowledge Filter**:
- **None** - Character should hold detailed knowledge about most topics
- **Mild** - Character should only hold explicitly specified knowledge and knowledge related to its expertise areas and time period
- **Strict** - Character should only hold explicitly specified knowledge
A character’s Goals determine what a character wants to accomplish, enabling you to set specific triggers that prompt them to respond in a specific way during certain scenarios or interactions.
A character’s Goals consist of two key elements:
- **Activation** - How the goal is activated. A goal can either be triggered directly or activated when an intent is detected.
- **Action** - The actions that occur once the goal is activated. Either an **Instruction** (e.g., Reveal that the password is 123) or **Verbatim** (e.g., "The password is 123!").
The Goals runtime data requires a Data Table of Row Type `FInworldGoal`:
**Inworld Goal**:
- **Response Type**:
- **Instruction** - when the goal is activated the character will receive an instruction on how it should respond
- **Verbatim** - when the goal is activated the character will be given a response to say verbatim
- **Instruction** - the instruction for the character on how to respond (when the goal is activated)
- **Verbatim** - the verbatim response for the character to respond with (when the goal is activated)
- **Repeatable** - if this goal can be activated more than once
- **Intents** - list of the names of intents which can activate this goal. See **Intents** below for more details on setting up intents.
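The activation behavior implied by the `FInworldGoal` fields above can be sketched as follows. Field names mirror the docs, but the `Goal` struct, the `bActivated` bookkeeping, and `ActivateGoal` are assumptions made for illustration:

```cpp
#include <optional>
#include <string>

// Illustrative goal: Response Type selects Instruction vs Verbatim, and
// Repeatable controls whether the goal can fire more than once.
struct Goal {
    bool bInstruction = true; // true: Instruction, false: Verbatim
    std::string Instruction;
    std::string Verbatim;
    bool bRepeatable = false;
    bool bActivated = false;  // bookkeeping, not a documented field
};

// Returns the text to hand to the character when the goal fires, or
// nothing if a non-repeatable goal was already activated.
std::optional<std::string> ActivateGoal(Goal& G) {
    if (G.bActivated && !G.bRepeatable) return std::nullopt;
    G.bActivated = true;
    return G.bInstruction ? G.Instruction : G.Verbatim;
}
```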
**Intents** are sets of phrases matched against player input. If player input is similar to one of the phrases in an intent, the intent is matched and returned by the Intent node.
Intents consist of maps of FInworldTextEmbedding Arrays (for runtime generated intent data) and Inworld Text Embedding Assets (for design time intent data).
**Initial Intents Map**:
- **String (key)** - the string identifier of the intent set.
- **Text Embedding Asset (value)** - an asset of [embedded text records](/unreal-engine/runtime/text-embedder) which the Intent node will query to try to match intents to the provided input.
For information on how to create an Embedded Intent Phrase Asset, see [here](/unreal-engine/runtime/text-embedder).
This information can be set directly on the blueprint instance in the level itself, or it can be referenced from an **Inworld Character Config** Data Table.
1. Enable the **Use Data Table** checkbox.
2. Select the appropriate **Inworld Character Config** Data Table asset (or create a new one).
3. Select the **Character ID** of the configured character from within the Data Table that you want to use.
**Inworld Character Config Data Table**:
5. [Optional] Set up a **Conversation Group** \
If you would like to support multi-character or character-to-character conversations, you will need to set up a Conversation Group. As an example, we created BP_ConversationGroup, used in the MultiCharacter level from the [Innequin](/unreal-engine/runtime/templates/innequin) template.
The blueprint utilizes the **Inworld Conversation Group** component to create and manage the conversation.
On `BeginPlay` characters are added to the conversation as they are ready:
Within the level, in the details panel you can set the Inworld Characters you want added to the group. There are also settings for a basic automated response system:
6. [Optional] Configure **Speaker Rule Director** \
For scenarios requiring strict, turn-based speaker order (like board games or scripted sequences), you can use **Director Mode** to override the AI's dynamic speaker selection.
**Configuration:**
First, configure the **Rule Director** on the **Inworld Conversation Group** component in your level. Here you define the rules:
Each **Speaker Rule** consists of:
- **Rule ID**: A unique identifier (e.g., "Round1").
- **Speaker Sequence**: An ordered list of Character IDs defining the turn order.
- **Loop**: Whether the sequence restarts after completion.
**Activation:**
At runtime, you can activate a specific rule using the `SetActiveRule` function on the Rule Director object. This is typically handled in your **GameMode** or Level Blueprint.
**How it works:**
- When a rule is active, the conversation follows the `SpeakerSequence`.
- If the rule finishes (and isn't looping) or is cleared via `ClearActiveRule`, the system falls back to the standard AI-driven `SpeakerSelectionGraph`.
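The turn-taking described above can be sketched in plain C++. This mirrors the documented fields (`Rule ID`, `Speaker Sequence`, `Loop`) and functions (`SetActiveRule`, `ClearActiveRule`), but the struct layout and the `NextSpeaker` helper are assumptions for illustration; returning no speaker models the fallback to AI-driven selection:

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// A speaker rule: an ordered turn sequence that may loop.
struct SpeakerRule {
    std::string RuleId;
    std::vector<std::string> SpeakerSequence;
    bool bLoop = false;
};

struct RuleDirector {
    std::optional<SpeakerRule> Active;
    size_t Turn = 0;

    void SetActiveRule(SpeakerRule Rule) { Active = std::move(Rule); Turn = 0; }
    void ClearActiveRule() { Active.reset(); }

    // Next speaker by rule, or nullopt to fall back to the AI selector.
    std::optional<std::string> NextSpeaker() {
        if (!Active || Active->SpeakerSequence.empty()) return std::nullopt;
        if (Turn >= Active->SpeakerSequence.size()) {
            if (!Active->bLoop) { ClearActiveRule(); return std::nullopt; }
            Turn = 0; // looping rule restarts from the first speaker
        }
        return Active->SpeakerSequence[Turn++];
    }
};
```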
7. Set the Player's Conversation Target
The conversation target can either be a single character or a conversation group.
You can utilize the **Inworld Character Subsystem** to find registered conversation targets, with useful functions such as `GetClosestConversationTarget`
to find the target nearest the player pawn:
8. Now you should be able to run the level and talk to your characters (just don't forget to enable the microphone).
In our [Chat](/unreal-engine/runtime/templates/chat) and [Innequin](/unreal-engine/runtime/templates/innequin) templates we have set up simple chat UIs and HUDs. Please see `BP_Chat_HUD`/`BP_Innequin_HUD` and `WBP_Chat2D`/`WBP_Chat3D` for examples of how you can set this up in your own level.
## Technical Overview
The Character system is composed of a set of Inworld Agent Runtime graphs and Unreal actor components that work together to support a wide range of character interactions.
### Graphs
#### IG_SimpleDialogue
The Simple Dialogue graph generates text and audio dialogue for a character in response to player input.
Input:
- **Text** - player input text
- **Audio** - player input audio
Runtime Data:
- **Player Profile** - player profile runtime data
- **Character Profile** - character profile runtime data
- **Voice** - voice runtime data
- **Event History** - event history runtime data
Output:
- **Player Out (Text)** - player input text
- **Response Out (Text)** - generated response text
- **TTS (TTSOutput)** - generated response text and audio output stream
**Understanding the Simple Dialogue Graph:**
Let's walk through how the inworld graph (`IG_SimpleDialogue`) that is powering the character's responses is constructed. The graph is split into 3 parts.
1. The first section of the graph handles input.
_Input Section_
- Input enters via a Routing node.
- A Routing node transfers data without mutating it, which is ideal here because we don’t want to change input data.
- Based on the detected input type, route to a Speech-To-Text (STT) node to convert audio to text.
- Once we have text, send it to both **Player Out** and the **Simple Dialogue Prompt** input.
- Use **Player Out** if you want to capture what the player said.
2. The second section of the graph handles the LLM request.
_LLM Section_
- The custom **Simple Dialogue Prompt** node takes four inputs:
- **Input** (text) (coming from input section)
- **Player Profile** (runtime data for the player)
- **Character Profile** (runtime data for the AI character)
- **Event History** (history of interactions between player and character)
_The Process Default of **Simple Dialogue Prompt**_
Inside the **Simple Dialogue Prompt**, we extract strings from each input struct and feed them into a `FormatText` node to create the prompt.
- The prompt is sent to the **Dialog Generation LLM** node.
- The LLM is configured to stream responses (tokens are emitted as soon as they’re generated).
- Use **Response Out** if you want to capture what the LLM said.
3. The third section handles text-to-speech output.
_TTS Section_
- The streamed output from **Dialog Generation LLM** is fed into **Text Chunking**.
- Text Chunking aggregates tokens into words/sentences.
- Once chunked, pass the text and **Get Voice** to **TTS Request** to create a text-to-speech request.
- **Get Voice** is runtime data that specifies which voice ID to use.
- Feed **TTS Request** into **TTS** to output audio as it arrives.
- The TTS Audio is automatically buffered into the **InworldVoiceAudio** component, which plays back the streamed audio.
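The chunking step can be sketched in plain C++: streamed LLM tokens accumulate in a buffer and are flushed at sentence boundaries, so TTS can start synthesizing before the full response has been generated. `TextChunker` and the punctuation-based boundary rule are illustrative assumptions, not the Text Chunking node's actual implementation:

```cpp
#include <string>
#include <vector>

// Illustrative text chunker: accumulates streamed tokens and emits
// completed sentences as soon as a terminator is seen.
struct TextChunker {
    std::string Buffer;

    // Feed one streamed token; returns completed sentences ready for TTS.
    std::vector<std::string> Push(const std::string& Token) {
        std::vector<std::string> Chunks;
        Buffer += Token;
        size_t Start = 0;
        for (size_t i = 0; i < Buffer.size(); ++i) {
            char C = Buffer[i];
            if (C == '.' || C == '!' || C == '?') {
                Chunks.push_back(Buffer.substr(Start, i - Start + 1));
                Start = i + 1;
            }
        }
        Buffer.erase(0, Start); // keep the incomplete tail for later tokens
        return Chunks;
    }
};
```

Chunking at sentence boundaries is the usual latency/quality trade-off: smaller chunks mean audio starts sooner, while full sentences give the TTS model enough context for natural prosody.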
#### IG_Dialogue
The Dialogue graph allows you to generate text and audio dialogue for a character, either in response to player input, as part of a conversation, or as instructed by an action.
Input:
- **Text** - player input text
- **Trigger** - a goal to activate
- **Action** - action to perform
Runtime Data:
- **Player Profile** - player profile runtime data
- **Character Profile** - character profile runtime data
- **Voice** - voice runtime data
- **Event History** - event history runtime data
- **Conversation State** - conversation state runtime data
- **Knowledge** - knowledge runtime data
- **Knowledge Filter** - knowledge filters runtime data
- **Relation** - relation runtime data
- **Emotion** - emotion runtime data
- **Goals** - goals runtime data
- **Intents** - intents runtime data
Output:
- **Player Out (SafetyResult)** - player input text
- **Response Out (SafetyResult)** - generated response text
- **Goal Out (Action)** - action to perform
- **TTS (TTSOutput)** - generated response text and audio output stream
Key Features:
- **Safety** - for handling unsafe player input and generated dialogue.
- **Goals** - for activating predefined dialogue responses or instructions.
- **Knowledge** - for fetching relevant information known by the character and filtering the response.
- **LLM Generation** - for generating dialogue based on conversation context and character state.
- **Text-to-Speech (TTS)** - for converting generated dialogue to speech output.
#### IG_Safety
The Safety graph is a subgraph of the Dialogue graph. It classifies text based on a set of model weights and thresholds, and searches for listed keywords in the text to determine if the text is safe.
Input:
- **Text** - text to classify
Output:
- **Safety Aggregator (SafetyResult)** - the safety result (`IsSafe`: boolean, `Text`: string)
#### IG_KnowledgeFilter
The Knowledge Filter graph is a subgraph of the Dialogue graph. It generates a textual filter to add to the LLM prompt to modify the character's response based on the character's knowledge.
Input:
- **Query (Text)** - the player's query text
- **Knowledge Records** - a list of knowledge records
- **Knowledge Filter** - knowledge filter runtime data
- **Character Profile** - character profile runtime data
- **Event History** - event history runtime data
Runtime Data:
- **Character Profile** - character profile runtime data
- **Knowledge Filter** - knowledge filter runtime data
Output:
- **Knowledge Filter Processing (Text)** - the textual knowledge filter to add to the LLM prompt (e.g. "Make sure to respond as if \{agent's name\} has poor knowledge on the topic, showing lack of confidence.")
#### IG_SpeakerSelection
The Speaker Selection graph selects the most appropriate next speaker from the context of a conversation based on the conversation history and player input (if any). This graph is utilized by a Conversation Group to determine who should speak next.
Input:
- **Text** - the player's query text
Runtime Data:
- **ConversationState** - conversation state runtime data
- **EventHistory** - event history runtime data
Output:
- **LLMResponse To Text (Text)** - the index of the next speaker (as a string)
#### IG_Relation
The Relation graph generates a relationship update (between a character and the player) based on the most recent conversation history.
Runtime Data:
- **Character Profile** - character profile runtime data
- **Player Profile** - player profile runtime data
- **EventHistory** - event history runtime data
Output:
- **LLMResponse To Text (Text)** - the relationship update (as a string)
The output will be of the form:
```yaml
- trust: [-2, -1, 0, +1, +2]
- respect: [-2, -1, 0, +1, +2]
- familiar: [-2, -1, 0, +1, +2]
- flirtatious: [-2, -1, 0, +1, +2]
- attraction: [-2, -1, 0, +1, +2]
```
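Consuming this output means parsing each line into an attribute name and a signed delta, which is roughly what `UpdateRelation(TextResponse)` would need to do. The parser below is a hypothetical sketch (the SDK's actual parsing is not shown in the docs); it assumes one `- attribute: value` pair per line:

```cpp
#include <map>
#include <sstream>
#include <string>

// Illustrative parser: turns the graph's text output into per-attribute
// signed deltas, e.g. "- trust: +2" -> {"trust", 2}.
std::map<std::string, int> ParseRelationUpdate(const std::string& Response) {
    std::map<std::string, int> Deltas;
    std::istringstream Stream(Response);
    std::string Line;
    while (std::getline(Stream, Line)) {
        size_t Colon = Line.find(':');
        if (Colon == std::string::npos) continue;
        size_t KeyStart = Line.find_first_not_of("- ");
        if (KeyStart == std::string::npos || KeyStart >= Colon) continue;
        std::string Key = Line.substr(KeyStart, Colon - KeyStart);
        // std::stoi skips leading whitespace and accepts "+2"/"-1".
        Deltas[Key] = std::stoi(Line.substr(Colon + 1));
    }
    return Deltas;
}
```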
#### IG_Emotion
The Emotion graph generates an emotion label for a character based on the character's personality and the most recent conversation history.
Runtime Data:
- **Character Profile** - character profile runtime data
- **EventHistory** - event history runtime data
Output:
- **LLMResponse To Text (Text)** - the emotion label (as a string)
`['NEUTRAL', 'CONTEMPT', 'BELLIGERENCE', 'DOMINEERING', 'CRITICISM', 'ANGER', 'TENSION', 'TENSE_HUMOR', 'DEFENSIVENESS', 'WHINING', 'SADNESS', 'STONEWALLING', 'INTEREST', 'VALIDATION', 'HUMOR', 'AFFECTION', 'SURPRISE', 'JOY', 'DISGUST']`
## Runtime Data
The Character Engine graphs are stateless. This means a single instance of any graph can be used for any and all characters or groups of characters. This also means all state data must be supplied to the graph on execution.
This data in the Inworld Agent Runtime Unreal SDK is known as Runtime Data. Nodes require any number of different types of runtime data in order to function.
Runtime data comes in two forms: **Direct `FInworldData` types** (like `FInworldData_Bool` or `FInworldData_Struct(myStruct)`) which can be used immediately, and **`UInworldGraphRuntimeData` containers** which are UObject wrappers that provide functions for managing data and expose it via their `GetData()` method. Both forms ultimately provide `FInworldDataHandle` for graph execution.
### Player Profile
**`FInworldPlayerProfile`** (Direct FInworldData Usage)
A lightweight struct that stores player profile information used by the AI to personalize interactions and maintain context about the player. This struct is passed to graphs by wrapping it as `FInworldData_Struct(playerProfile)`.
**Structure Properties:**
- `Name` (`FString`) - Player's name for personalized address
- `Gender` (`FString`) - Player's gender identity for appropriate character responses
- `Age` (`FString`) - Player's age or age category for context-appropriate dialogue
- `Role` (`FString`) - Player's role or position in the game world
**Graph Integration:**
```cpp
// Direct usage - wrap as FInworldData_Struct for graph consumption
FInworldPlayerProfile playerProfile;
// ... populate profile data ...
FInworldDataHandle profileData = FInworldData_Struct(playerProfile);
```
**Example in Practice:**
```cpp
// From UInworldBaseCharacterComponent::SendPlayerMessage_Implementation
FString MessageId = Execute_SendDataMessage(this, Message,
{{PlayerProfileRuntimeDataKey, FInworldData_Struct(PlayerProfile)}});
```
**Use Cases:**
- Personalizing character responses based on player characteristics
- Maintaining player context across sessions
- Adapting conversation style to player demographics
- Lightweight player data that doesn't require complex manipulation functions
### Character Profile
**`UInworldGraphRuntimeData_CharacterProfile`** (UObject Container)
A UObject container that manages character personality and dialogue style data. Internally stores an `FInworldCharacterProfile` struct and provides functions for manipulation. When graphs need the data, `GetData()` wraps the internal profile as `FInworldData_Struct`.
**Internal Data:**
- `CharacterProfile` - `FInworldCharacterProfile` structure defining character traits and behavior
**Container Functions:**
- `Get()` - Retrieves the current character profile as const reference
- `Set(CharacterProfile)` - Updates the character profile with new data
- `GetData()` - **Returns `FInworldData_Struct(CharacterProfile)` for graph consumption**
- `Initialize_Implementation()` - Handles setup and validation logic
**Why Use a Container:**
- Provides Blueprint-friendly functions (Set, Get)
- Supports initialization logic and validation
- Enables `ExposeOnSpawn = "true"` for Blueprint configuration
- Allows for future extension with complex manipulation methods
**Use Cases:**
- Managing character personality data with functions that structs cannot provide
- Blueprint-friendly character configuration
- Complex initialization and validation logic
- Providing structured access to character profile data for LLM nodes
### Voice
**`UInworldGraphRuntimeData_Voice`** (UObject Container)
A UObject container for managing voice configuration. Internally stores an `FInworldVoice` struct and provides simple get/set functions. When graphs need the data, `GetData()` wraps the internal voice settings as `FInworldData_Struct`.
**Internal Data:**
- `Voice` - `FInworldVoice` structure containing voice configuration settings
**Container Functions:**
- `Get()` - Retrieves the current voice configuration as const reference
- `Set(Voice)` - Sets the voice configuration for this runtime data
- `GetData()` - **Returns `FInworldData_Struct(Voice)` for graph consumption**
**Use Cases:**
- Blueprint-friendly voice configuration management
- Structured access to voice settings for TTS nodes
- Providing voice context during graph processing
- Managing character-specific voice characteristics with UObject functions
### Knowledge
**`UInworldGraphRuntimeData_Knowledge`** (UObject Container)
A UObject container that manages knowledge retrieval for Inworld graphs. Stores knowledge sets used for information retrieval and question-answering operations with support for dynamic adding, removing, enabling, and disabling of knowledge sets using text embeddings for semantic search.
**Internal Data:**
- `InitialKnowledge` - Map of initial knowledge assets automatically loaded during initialization
**Container Functions:**
- `Set(KnowledgeAssets)` - Sets the initial knowledge assets for the runtime data
- `AddKnowledgeTextEmbeddingAsset(Id, Knowledge, Replace)` - Adds a new text embedding asset to the knowledge base
- `AddKnowledgeTextEmbeddings(Id, Knowledge, Replace)` - Adds text embeddings directly to the knowledge base
- `RemoveKnowledge(Id)` - Removes a specific knowledge set by its unique ID
- `EnableKnowledge(Id)` - Enables a specific knowledge set for retrieval operations
- `DisableKnowledge(Id)` - Disables a specific knowledge set from retrieval operations
- `GetKnowledgeMapKeys()` - Retrieves all knowledge set IDs currently stored
- `GetData()` - **Returns knowledge data wrapped as `FInworldDataHandle` for graph consumption**
- `Initialize_Implementation()` - Processes initial knowledge assets and loads text embeddings
**Use Cases:**
- Defining what the character knows about specific topics using text embeddings
- Dynamically updating character knowledge during gameplay
- Controlling access to sensitive or context-specific information
- Managing domain-specific expertise through semantic search capabilities
### Relation State
**`UInworldGraphRuntimeData_RelationState`** (UObject Container)
A UObject container that manages character relationship states. Tracks and manages the relationship state between a character and another entity (typically the player), including multi-dimensional metrics like trust, respect, familiarity, flirtation, and attraction that influence character interactions and responses.
**Internal Data:**
- `Relation` - The current relationship state including all dimensional metrics with Blueprint exposure
**Container Functions:**
- `GetRelation()` - Retrieves the complete relationship state with all metrics
- `GetRelationshipLabel()` - Retrieves the primary relationship label derived from the metrics
- `GetRelationshipLabelString()` - Retrieves the relationship label as a human-readable string
- `UpdateRelation(Relation)` - Updates the relationship state with new relation data
- `UpdateRelation(TextResponse)` - Updates relationship state by parsing an AI-generated text response
- `GetData()` - **Returns `FInworldData_Struct(Relation)` for graph consumption**
**Use Cases:**
- Tracking how relationships evolve over time through conversation dynamics
- Modifying character behavior based on current relationship status and history
- Creating dynamic social interactions that respond to relationship changes
- Implementing relationship-dependent dialogue options and character responses
### Emotion State
**`UInworldGraphRuntimeData_EmotionState`** (UObject Container)
A UObject container that manages character emotional states. Tracks and manages a character's current emotional state, which can influence their behavior, dialogue tone, and responses during interactions. Emotion states can be explicitly set or derived from AI-generated responses.
**Internal Data:**
- `Emotion` - The current emotional state of the character (defaults to NEUTRAL with Blueprint exposure)
**Container Functions:**
- `GetEmotion()` - Retrieves the current emotion state structure
- `GetEmotionLabelString()` - Retrieves the emotion label as a human-readable string
- `GetEmotionLabel()` - Retrieves the current emotion label enum value
- `SetEmotionLabel(EmotionLabel)` - Sets the character's emotion to a specific label
- `UpdateEmotion(TextResponse)` - Updates emotion state by parsing an AI-generated text response
- `GetData()` - **Returns `FInworldData_Struct(Emotion)` for graph consumption**
**Use Cases:**
- Creating emotionally responsive characters with dynamic mood changes
- Influencing dialogue tone and behavior based on character emotional state
- Implementing dynamic emotional reactions to conversation events
- Adding personality depth through emotional expression and consistency
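In practice the emotion container is updated from each AI reply and read back to drive animation or UI. A minimal sketch (the surrounding function is hypothetical; the container calls are from the API above):
```cpp
// Hypothetical helper: EmotionState is the character's
// UInworldGraphRuntimeData_EmotionState* container.
void ApplyMoodFromResponse(UInworldGraphRuntimeData_EmotionState* EmotionState, const FString& TextResponse)
{
    // Derive the new emotion from the AI-generated text
    EmotionState->UpdateEmotion(TextResponse);

    // Drive facial animation, voice style, or UI from the readable label
    const FString Mood = EmotionState->GetEmotionLabelString();
    UE_LOG(LogTemp, Log, TEXT("Character mood: %s"), *Mood);
}
```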
### Event History
**`UInworldGraphRuntimeData_EventHistory`** (UObject Container)
A UObject container that manages event history in Inworld graphs. Creates and manages an event history system that tracks various types of events (primarily speech events) during graph execution with configurable capacity and formatting options.
**Internal Data:**
- `SpeechEventFormatter` - Template for formatting speech events (default: "{AgentName}: {Utterance}")
- `Delimiter` - String used to separate events in the formatted history string (default: newline)
- `Capacity` - Maximum number of events to store (INDEX_NONE = unlimited)
- `InitialHistoryTable` - Optional data table with pre-populated history entries
**Container Functions:**
- `Set(EventHistoryTable)` - Sets the initial event history data table
- `AddGenericEvent(Event)` - Adds a generic text event to the history
- `AddSpeechEvent(Speech)` - Records a speech event with agent name and utterance
- `Clear()` - Removes all stored events and resets the history
- `GetHistoryString()` - Returns formatted string containing all recorded events
- `GetSpeechEvents()` - Returns array of all recorded speech events
- `GetHistoryEventData()` - Returns structured event history data for processing
- `GetData()` - **Returns `FInworldData_EventHistory` for graph consumption**
- `Initialize_Implementation()` - Sets up event history system and processes initial data
- `DebugDumpEventHistory()` - Outputs debug information about current event history state
**Use Cases:**
- Providing conversation context to AI systems during graph execution
- Enabling characters to reference past interactions and maintain continuity
- Creating memory-based character development and relationship building
- Implementing conversation flow tracking and history-dependent behavior
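A common pattern records each utterance and then hands the formatted transcript to the graph as context. Sketch below; the `FInworldEventSpeech` field names are assumptions inferred from the default formatter `"{AgentName}: {Utterance}"`:
```cpp
// Hypothetical usage: History is a UInworldGraphRuntimeData_EventHistory*.
FInworldEventSpeech Speech;
Speech.AgentName = TEXT("Innkeeper");            // field names assumed from the formatter
Speech.Utterance = TEXT("Welcome back, traveler.");
History->AddSpeechEvent(Speech);

// Build the delimiter-separated context string for the graph,
// e.g. "Innkeeper: Welcome back, traveler."
const FString Context = History->GetHistoryString();
```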
### Intent
**`UInworldGraphRuntimeData_Intent`** (UObject Container)
A UObject container that manages intent recognition in Inworld graphs. Compiles and stores all necessary data for intent matching operations, including text embeddings for intent recognition, enable/disable states, and asset management with support for dynamic intent management during graph execution.
**Internal Data:**
- `InitialIntents` - Map of initial intent assets loaded during initialization
**Container Functions:**
- `Set(IntentAssets)` - Sets the initial intent assets for the runtime data
- `AddIntentTextEmbeddingAsset(Id, Intent, Replace)` - Adds a new intent using text embedding asset
- `AddIntentTextEmbeddings(Id, Intent, Replace)` - Adds text embeddings directly for intent matching
- `RemoveIntent(Id)` - Removes a specific intent identified by its unique ID
- `EnableIntent(Id)` - Enables a specific intent within the runtime data
- `DisableIntent(Id)` - Disables a specific intent within the runtime data
- `GetIntentMapKeys()` - Retrieves all intent set IDs currently stored in the runtime data
- `GetData()` - **Returns intent data wrapped as `FInworldDataHandle` for graph consumption**
- `Initialize_Implementation()` - Processes initial intent assets and loads text embeddings
**Use Cases:**
- Understanding player goals and desires through semantic intent matching
- Triggering appropriate character responses based on detected player intent
- Managing context-sensitive intent recognition with enable/disable capabilities
- Implementing dynamic conversation flows based on detected intents and confidence scores
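Intents can be gated per scene, for example only letting a "bargain" intent match while the player is inside a shop. A short sketch (passing the ID as a string literal is an assumption):
```cpp
// Hypothetical scene hooks: Intents is a UInworldGraphRuntimeData_Intent*.
void OnEnterShop() { Intents->EnableIntent(TEXT("bargain")); }
void OnLeaveShop() { Intents->DisableIntent(TEXT("bargain")); }
```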
### Goals
**`UInworldGraphRuntimeData_Goals`** (UObject Container)
A UObject container that manages character goals and intent-driven behaviors. Stores and manages a character's goals, which define intent-based triggers and responses that can be activated through dialogue intents or explicit triggers, enabling dynamic character behaviors based on conversation context.
**Internal Data:**
- `GoalTable` - Data table defining the goals of the character (must use FInworldGoal row structure)
- `ParameterSourceObject` - Weak pointer to object implementing IInworldGoalParameterInterface for dynamic parameter generation
**Container Functions:**
- `Set(GoalTable)` - Sets the goals data table and loads its contents into the goals map
- `TryGetGoal(GoalName, GoalOut)` - Returns reference to an Inworld Goal by name
- `TryGetGoalFromIntent(IntentName, GoalOut)` - Returns first goal that has the specified intent
- `CompleteGoal(GoalName)` - Removes a goal from the goals map if it is not repeatable
- `RemoveGoal(GoalName)` - Removes a goal from the goals map
- `AddGoal(Name, Goal)` - Adds a goal to the goals map
- `ClearAllGoals()` - Removes all goals from the goals map
- `SetParameterSourceObject(Source)` - Registers a UObject implementing IInworldGoalParameterInterface as parameter source
- `GetParameterSourceObject()` - Returns pointer to the parameter source object
- `GetGoalsMapKeys()` - Retrieves all goal names currently stored in the goals map
- `GetGoalsDataTable()` - Returns pointer to the data table containing the goals
- `GetData()` - **Returns `FInworldData_Struct(goalsData)` for graph consumption**
- `Initialize_Implementation()` - Initializes the runtime data by loading goals from the configured data table
**Use Cases:**
- Defining character motivations and intent-based objectives
- Creating goal-oriented dialogue and behavior systems
- Implementing quest-like character interactions with completion tracking
- Managing dynamic character objectives based on conversation context and game state
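A common pattern resolves a detected intent to a goal and marks it complete once handled. A sketch under assumed parameter types (the intent and goal names are hypothetical):
```cpp
// Hypothetical flow: Goals is a UInworldGraphRuntimeData_Goals*.
FInworldGoal Goal;
if (Goals->TryGetGoalFromIntent(TEXT("ask_about_quest"), Goal))
{
    // ... run the goal's trigger/response here ...

    // Non-repeatable goals are removed from the map once completed
    Goals->CompleteGoal(TEXT("quest_intro"));
}
```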
### Conversation State
**`UInworldGraphRuntimeData_ConversationState`** (UObject Container)
A UObject container that tracks the current state of conversations, including participant information, multi-agent scenarios, and conversation flow management.
**Internal Data:**
- `ConversationState` - `FInworldConversationState` containing conversation metadata
**Container Functions:**
- `GetConversationState()` - Returns current conversation state
- `SetConversationState(ConversationState)` - Updates conversation state
- `GetData()` - **Returns `FInworldData_Struct(ConversationState)` for graph consumption**
**Use Cases:**
- Managing multi-character conversations
- Tracking conversation flow and turn-taking
- Implementing conversation state-dependent behavior
- Coordinating between multiple AI participants
### Knowledge Filter
**`UInworldGraphRuntimeData_KnowledgeFilter`** (UObject Container)
A UObject container that manages character knowledge filtering. Stores and manages knowledge filter settings that constrain the amount and scope of knowledge available to a character during conversations. Knowledge filters can be used to limit responses based on character awareness, permissions, or narrative requirements.
**Internal Data:**
- `KnowledgeFilter` - Knowledge filter configuration constraining character's knowledge access with Blueprint exposure
**Container Functions:**
- `Set(KnowledgeFilter)` - Updates the knowledge filter configuration
- `GetKnowledgeFilter()` - Retrieves the current knowledge filter configuration as const reference
- `GetData()` - **Returns `FInworldData_Struct(KnowledgeFilter)` for graph consumption**
**Use Cases:**
- Limiting character knowledge based on story context and narrative requirements
- Implementing information security and access control for sensitive topics
- Creating knowledge-limited character variants for different scenarios
- Managing spoiler-free character interactions in story-driven experiences
---
## FInworldData Types
All data that flows through graphs inherits from `FInworldData`, which serves as the base interface for the graph system. This section covers the predefined `FInworldData` types available in v0.8, defined in `InworldData.h`. These can be used directly in graphs or as the underlying data wrapped by UObject containers.
**Key Concepts:**
- **All graph data is FInworldData**: Whether passed directly or via UObject containers, all runtime data ultimately becomes `FInworldData`
- **FInworldData_Struct**: A special type that can wrap any Unreal struct (like `FInworldPlayerProfile`) to make it `FInworldData`-compatible
- **Direct vs Container usage**: You can use these types directly in graphs, or manage them through `UInworldGraphRuntimeData` containers that provide functions
### Basic Data Types
**`FInworldData_Bool`**
- Simple boolean data wrapper for graph processing
- Property: `Bool` (bool) - The boolean value
**`FInworldData_Int`**
- 32-bit signed integer data wrapper for graph processing
- Property: `Int` (int32) - The integer value
**`FInworldData_Float`**
- Single-precision floating-point data wrapper for graph processing
- Property: `Float` (float) - The floating-point value
**`FInworldData_Object`**
- Unreal Engine object reference wrapper for graph processing
- Property: `Object` (`TObjectPtr<UObject>`) - The object reference
**Example Usage:**
```cpp
// Direct wrapping of any struct
FInworldPlayerProfile profile;
FInworldDataHandle handle = FInworldData_Struct(profile);
// Container usage (what UInworldGraphRuntimeData classes do internally)
FInworldDataHandle UMyRuntimeData::GetData()
{
return FInworldData_Struct(myInternalStruct);
}
```
---
## Architecture Summary
The Inworld Agent Runtime data system follows a unified architecture where:
1. **FInworldData is the foundation**: All runtime data flowing through graphs is `FInworldData` or its derivatives
2. **Two usage patterns**:
- **Direct**: Use `FInworldData` types directly, or wrap structs with `FInworldData_Struct(myStruct)`
- **Container**: Use `UInworldGraphRuntimeData` classes for function-based management, which internally wrap their data as `FInworldData` via `GetData()`
3. **FInworldData_Struct is the bridge**: Enables any Unreal struct to become `FInworldData`-compatible
4. **Consistent graph interface**: Whether using direct types or containers, graphs receive `FInworldDataHandle` containing `FInworldData`
This unified approach provides flexibility while maintaining a consistent interface for graph processing.
---
## Components
### IInworldConversationTarget Interface
The `IInworldConversationTarget` interface defines the contract for entities that can participate in Inworld conversations as recipients of player messages and interactions. Both individual character components and conversation groups implement this interface, enabling unified interaction patterns.
**Key Responsibilities:**
- Conversation interruption handling
- Message processing (audio, text, and data)
- Integration with Inworld conversation system
**Typical Implementers:**
- Individual character components (`UInworldBaseCharacterComponent` via `UInworldConversationTargetComponent`)
- Multi-character conversation groups (`UInworldConversationGroup`)
- Custom conversation handlers
**Interface Methods**
```cpp
// Interrupts the current conversation or ongoing speech
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Inworld")
void Interrupt();
// Sends a message to this conversation target
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Inworld")
FString SendMessage(const FInworldDataHandle& Message);
// Sends a message with player information
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Inworld")
FString SendPlayerMessage(const FInworldDataHandle& Message, const FInworldPlayerProfile& PlayerProfile);
// Sends a message with runtime data
UFUNCTION(BlueprintCallable, BlueprintNativeEvent, Category = "Inworld")
FString SendDataMessage(const FInworldDataHandle& Message, const TMap& RuntimeData);
```
**Unified API Benefits:**
- Player components can interact seamlessly with either individual characters or conversation groups
- Consistent message handling across different conversation target types
- Blueprint-friendly interface for conversation management
- Polymorphic behavior for different conversation scenarios
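Because the interface methods are `BlueprintNativeEvent`s, C++ callers go through the generated `Execute_` wrappers; the same code then works for a single character or a whole group. A sketch (the surrounding function is hypothetical):
```cpp
// Target may be a character component or a conversation group --
// anything implementing IInworldConversationTarget.
void SayTo(TScriptInterface<IInworldConversationTarget> Target,
           const FInworldDataHandle& Message,
           const FInworldPlayerProfile& Profile)
{
    // Stop any speech in progress, then deliver the player's message
    IInworldConversationTarget::Execute_Interrupt(Target.GetObject());
    IInworldConversationTarget::Execute_SendPlayerMessage(Target.GetObject(), Message, Profile);
}
```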
---
### UInworldConversationTargetComponent
The abstract base class for all conversation target implementations. This component provides the foundation for any actor component that can serve as a conversation target in the Inworld system, implementing the `IInworldConversationTarget` interface.
**Key Features**
- Conversation target interface implementation (`IInworldConversationTarget`)
- Readiness state management
- Lifecycle event broadcasting
- Message handling abstraction
- Blueprint and C++ integration
**Core Functions**
```cpp
// Gets the unique identifier for this conversation target
UFUNCTION(BlueprintCallable, Category = "Inworld")
const FString& GetId() const;
// Checks if the conversation target is ready and initialized
UFUNCTION(BlueprintCallable, Category = "Inworld")
bool IsReady();
```
**Properties**
```cpp
// Unique identifier for this conversation target
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "Inworld")
FString Id;
// Whether the conversation target is ready and initialized
UPROPERTY(BlueprintReadOnly, Category = "Inworld")
bool bIsReady;
```
**Events**
```cpp
// Event fired when conversation target component is ready and initialized
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationTargetReady OnConversationTargetReady;
// Event fired when a character response is generated
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationCharacterMessage OnCharacterMessage;
// Event fired when a player message is received
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationPlayerMessage OnPlayerMessage;
```
---
### Inworld Player Components
The Inworld Player Component system provides a hierarchical architecture for handling player interactions with AI characters. This system is built on three main components that extend each other to provide increasing levels of functionality.
#### Component Hierarchy
```
UInworldBasePlayerComponent (Abstract)
  ↳ UInworldSimplePlayerComponent
    ↳ UInworldPlayerComponent
```
#### UInworldBasePlayerComponent
The abstract base class that provides core functionality for player interaction with Inworld characters.
**Key Features**
- Text and audio message sending
- Conversation target management
- Microphone and voice activity detection
- Audio capture and processing
- Comprehensive event delegation system
**Public Functions**
*Message Sending*
```cpp
// Sends a message to the current conversation target
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void SendMessage(const FInworldDataHandle& Message);
```
*Conversation Management*
```cpp
// Sets the current conversation target for this player
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void SetConversationTarget(TScriptInterface<IInworldConversationTarget> NewConversationTarget);
// Gets the current conversation target
UFUNCTION(BlueprintCallable, Category = "Inworld")
TScriptInterface<IInworldConversationTarget> GetConversationTarget();
// Removes the current conversation target
UFUNCTION(BlueprintCallable, Category = "Inworld")
void RemoveConversationTarget();
```
*Audio Control*
```cpp
// Enable/disable microphone
UFUNCTION(BlueprintCallable, Category = "Inworld")
void SetMicrophone(bool Enabled);
```
**Protected Virtual Methods (Common Override Points)**
*Audio Processing*
```cpp
// Called when audio is captured from the microphone (commonly overridden)
virtual void OnAudioCapture(const UInworldAudioCapture* AudioCapture, const FInworldData_Audio& AudioData, bool VAD);
```
**Events**
*Player Ready Event* - Fired when the player component is initialized and ready.
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldPlayerReady OnPlayerReady;
```
*Player Message Event* - Fired when a player sends a message.
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldPlayerMessage OnPlayerMessage;
```
*Audio Capture Event* - Triggered when audio is captured from the player's microphone.
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldPlayerAudioCapture OnPlayerAudioCapture;
```
*Conversation Target Updates* - Fired when the conversation target is changed.
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationTargetUpdate OnConversationTargetUpdate;
```
**Configuration Properties**
*Audio Settings*
```cpp
// Configuration settings for audio capture
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "Audio")
FInworldAudioCaptureConfig AudioCaptureConfig;
// Time buffer for Voice Activity Detection in seconds
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Audio")
float VADBufferTime{0.6f};
// Identifier for the Streaming STT Creation Config settings
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Audio")
FString StreamingSTTComponentId{"Default"};
// Whether to start with microphone enabled
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Audio")
bool bStartMicEnabled{true};
```
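A typical `BeginPlay` setup points the player at a character and opens the microphone. A sketch (the actor and variable names are hypothetical):
```cpp
// Hypothetical setup inside an actor's BeginPlay
UInworldCharacterComponent* Character = NPC->FindComponentByClass<UInworldCharacterComponent>();
PlayerComponent->SetConversationTarget(Character);
PlayerComponent->SetMicrophone(true);   // start capturing voice with VAD

// Later: stop voice input, e.g. when opening a menu
PlayerComponent->SetMicrophone(false);
```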
#### UInworldSimplePlayerComponent
Extends the base component with player profile support for personalized interactions.
**Key Features**
- Player profile integration
- Enhanced messaging with profile context
- Ready-to-use basic implementation
- Spawnable in Blueprint editor
**Enhanced Methods**
*Profile-Aware Messaging*
```cpp
// Sends a message to the player's conversation target with profile context
virtual void SendMessage(const FInworldDataHandle& Message) override;
```
**Properties**
*Player Profile*
```cpp
// Player profile data used to personalize interactions
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
FInworldPlayerProfile PlayerProfile;
```
The player profile struct contains:
- `Name` (`FString`) - Player's name for personalized address
- `Gender` (`FString`) - Player's gender identity for appropriate character responses
- `Age` (`FString`) - Player's age or age category for context-appropriate dialogue
- `Role` (`FString`) - Player's role or position in the game world
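For example, filling in the profile before the first message lets characters address the player by name and role (the values are illustrative):
```cpp
FInworldPlayerProfile Profile;
Profile.Name = TEXT("Avery");
Profile.Gender = TEXT("female");
Profile.Age = TEXT("adult");
Profile.Role = TEXT("caravan guard");
SimplePlayerComponent->PlayerProfile = Profile;   // used to personalize SendMessage
```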
#### UInworldPlayerComponent
The full-featured implementation that adds trigger support for goal-driven interactions.
**Key Features**
- Complete player interaction suite
- Trigger system for goal-based interactions
- Custom parameter support
- Advanced conversation control
**Advanced Functions**
*Trigger System*
```cpp
// Sends a trigger to initiate goal-driven interactions
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void SendTrigger(FName GoalName, const TMap<FString, FString> Parameters);
```
**Trigger Usage:**
- Activate specific character behaviors, goals, or conversation paths
- Include custom parameters to provide context and customize the interaction
- Control advanced AI interactions through goal-driven conversations
**Example Usage:**
```cpp
// Blueprint example - using FName for GoalName as per the API
TMap<FString, FString> TriggerParams;
TriggerParams.Add("location", "tavern");
TriggerParams.Add("mood", "friendly");
PlayerComponent->SendTrigger(FName("greet_player"), TriggerParams);
```
---
### Inworld Character Components
The Inworld Character Component system provides a hierarchical architecture for AI-driven characters. Built on the conversation target foundation (explained above), this system creates increasingly sophisticated character implementations.
#### Component Hierarchy
```
UInworldConversationTargetComponent (Abstract) - see above
  ↳ UInworldBaseCharacterComponent (Abstract)
    ↳ UInworldSimpleCharacterComponent
      ↳ UInworldCharacterComponent
  ↳ UInworldConversationGroupComponent
```
#### UInworldBaseCharacterComponent
The abstract base class that extends `UInworldConversationTargetComponent` and provides core functionality for AI-driven character behavior and conversation management.
**Key Features**
- Conversation target implementation (extends UInworldConversationTargetComponent)
- Audio and text message handling via data handles
- Speech synthesis and viseme blending
- Character graph integration for dialogue generation
- Runtime data management system
- Comprehensive event system for speech and conversation events
- Character data table support
**Public Functions**
*Runtime Data Management*
```cpp
// Gets all runtime data for this character (C++ override)
virtual void GetRuntimeData(TMap& OutRuntimeData) const;
// Gets all runtime data and its associated type for this character
UFUNCTION(BlueprintPure, Category = "Inworld|Graph", meta = (DisplayName = "Get Runtime Data with Type"))
virtual TArray GetRuntimeDataAndType() const;
```
*Character Data Configuration*
```cpp
// Applies data from an FInworldCharacterConfig struct to this component's properties
virtual void ApplyCharacterData(const FInworldCharacterConfig& Data);
// Returns the specific type of this component (e.g., Simple or Full)
virtual EInworldComponentType GetComponentType() const;
```
*Message Handling (Inherited from IInworldConversationTarget)*
```cpp
// Sends a message to this character
virtual FString SendMessage_Implementation(const FInworldDataHandle& Message) override;
// Sends a message to this character with player information
virtual FString SendPlayerMessage_Implementation(
const FInworldDataHandle& Message, const FInworldPlayerProfile& PlayerProfile) override;
// Sends a message to this character with runtime data
virtual FString SendDataMessage_Implementation(
const FInworldDataHandle& Message, const TMap& RuntimeData) override;
```
**Protected Virtual Methods (Common Override Points)**
*Dialogue Processing*
```cpp
// Called when dialogue generation produces a result (commonly overridden)
UFUNCTION(BlueprintNativeEvent, Category = "Inworld")
void OnDialogueGenerationResult(const FString& ExecutionId, const FString& NodeId, const FInworldDataHandle& DataHandle);
```
**Events**
*Speech Events*
```cpp
// Event fired during character speech playback with viseme data
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterUtterancePlayback OnUtterancePlayback;
// Event fired when character starts speaking
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterUtteranceStart OnUtteranceStart;
// Event fired when character speech is interrupted
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterUtteranceInterrupt OnUtteranceInterrupt;
// Event fired when character completes speaking
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterUtteranceComplete OnUtteranceComplete;
```
**Properties**
*Data Table Configuration*
```cpp
// If true, initialize properties from DataTable using Character ID
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
bool bUseDataTable = false;
// The Data Table containing character definitions for this component
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
TSoftObjectPtr<UDataTable> CharacterDataTable;
// The ID (Row Name) of the character to load from the CharacterDataTable
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
FName CharacterID;
```
*Graph Configuration*
```cpp
// Reference to the dialogue generation graph asset
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "Inworld|Graph")
TObjectPtr<UInworldGraphAsset> DialogueGenerationGraphAsset;
// Runtime instance of the dialogue generation graph
UPROPERTY(BlueprintReadOnly, Category = "Graph")
TObjectPtr DialogueGenerationGraph;
```
*Audio Component*
```cpp
// Audio component for character voice playback
UPROPERTY(BlueprintReadOnly, Category = "Audio")
TWeakObjectPtr<UInworldCharacterAudioComponent> InworldCharacterAudioComponent;
```
#### UInworldSimpleCharacterComponent
Simple implementation extending the base character functionality with essential runtime data management for basic AI features.
**Key Features**
- Character profile and personality configuration
- Voice and speech settings
- Event history tracking for context
- Conversation state management
- Spawnable in Blueprint editor
- Complete basic character functionality
**Public Functions**
*Event History Management*
```cpp
// Updates the event history by adding a new speech event entry
UFUNCTION(BlueprintCallable, Category = "Inworld|Event History")
void AddHistoryEvent(const FInworldEventSpeech& EventHistoryEntry) const;
// Clears the character's event history data, resetting all stored interactions
UFUNCTION(BlueprintCallable, Category = "Inworld|Event History")
void ClearHistory() const;
```
**Runtime Data Properties**
*Character Profile* - Contains personality, backstory, and behavior settings
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr CharacterProfile;
```
*Voice Settings* - Controls speech synthesis and voice characteristics
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr Voice;
```
*Event History* - Tracks conversation context and past interactions
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr<UInworldGraphRuntimeData_EventHistory> EventHistory;
```
**Events**
*Dialogue Response*
```cpp
// Event fired when dialogue is generated and ready
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterDialogueResponse OnDialogueResponse;
```
#### UInworldCharacterComponent
Full-featured implementation that adds advanced AI capabilities including emotion tracking, relationship states, and sophisticated behavioral systems.
**Key Features**
- Emotion state tracking and emotional responses
- Relationship state management with dynamic character relationships
- Knowledge base and knowledge filtering
- Goal system with completion tracking
- Intent recognition and processing
- Advanced TTS output processing and buffering
**Advanced Functions**
*Conversation Group Management*
```cpp
// Checks if this character is currently part of a conversation group
UFUNCTION(BlueprintCallable, Category = "Inworld")
bool IsInConversationGroup();
```
**Advanced Runtime Data Properties**
*Emotion State* - Tracks and manages character emotions
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr<UInworldGraphRuntimeData_EmotionState> EmotionState;
```
*Relationship State* - Manages character relationships and dynamics
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr RelationState;
```
*Knowledge System* - Contains character knowledge base and filtering
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr Knowledge;
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr<UInworldGraphRuntimeData_KnowledgeFilter> KnowledgeFilter;
```
*Goal Management* - Manages character objectives and motivations
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr<UInworldGraphRuntimeData_Goals> Goals;
```
*Intent Processing* - Recognizes and processes user intentions
```cpp
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld|Manual Config")
TObjectPtr<UInworldGraphRuntimeData_Intent> Intents;
```
**Events**
*Goal Completion*
```cpp
// Event fired when a character goal is completed
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldGoalComplete OnGoalComplete;
```
#### UInworldConversationGroupComponent
Component wrapper for managing multi-character conversation groups. This component extends `UInworldConversationTargetComponent` and provides an actor component interface to `UInworldConversationGroup`, allowing easy integration of multi-character conversations into actors.
**Key Features**
- Multi-character conversation management
- Speaker selection graph integration
- Participant management (add/remove characters)
- Event history tracking
- Blueprint accessible conversation control
**Participant Management**
```cpp
// Adds a character to the conversation group
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void AddCharacter(UInworldCharacterComponent* CharacterComponent);
// Removes a character from the conversation group
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void RemoveCharacter(FString CharacterId);
// Gets all character participants in the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
TArray<UInworldCharacterComponent*> GetParticipants();
```
**Conversation Control**
```cpp
// Manually invokes the next character response in the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void InvokeNextResponse(const FString& CharacterId = "");
// Gets the current player target for the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
UInworldBasePlayerComponent* GetPlayerTarget();
// Gets the current state of the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
FInworldConversationState GetConversationState();
// Clears the group's event history data
UFUNCTION(BlueprintCallable, Category = "Inworld|Event History")
void ClearHistory(bool IncludeCharacters = false) const;
```
**Properties**
*Speaker Selection Configuration*
```cpp
// Inworld graph for intelligent speaker selection in multi-character conversations
UPROPERTY(EditAnywhere, BlueprintReadOnly, Category = "Graph")
TObjectPtr<UInworldGraphAsset> SpeakerSelectionGraphAsset;
```
**Events**
```cpp
// Event fired when a character is added to the conversation
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterAddedToConversation OnCharacterAdded;
// Event fired when a character is removed from the conversation
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldCharacterRemovedFromConversation OnCharacterRemoved;
```
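Putting it together, a scene script can assemble a group and drive its turn-taking. A sketch (the character component variables are hypothetical):
```cpp
// Hypothetical scene setup: two NPCs join one conversation group
GroupComponent->AddCharacter(InnkeeperComponent);
GroupComponent->AddCharacter(BardComponent);

// Pass a character ID to force a speaker, or "" to let the
// speaker-selection graph choose who responds next
GroupComponent->InvokeNextResponse(TEXT(""));
```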
---
### Inworld Conversation Group (UObject)
The `UInworldConversationGroup` is a UObject (not a component) that manages multi-character conversations, orchestrating interactions between multiple AI characters and a player through intelligent message routing and speaker selection. It implements the `IInworldConversationTarget` interface and can be used directly in C++ or wrapped by `UInworldConversationGroupComponent` for Blueprint/component-based usage.
#### Key Features
- **Multi-character conversation management** - Coordinates multiple AI participants
- **Intelligent speaker selection system** - AI-driven selection of appropriate responders
- **Message broadcasting** - Distributes messages to all participants
- **Conversation state and history tracking** - Maintains context across interactions
- **Player target integration** - Connects with player components
- **Event system** - Comprehensive monitoring of conversation activities
- **Dynamic participant management** - Add/remove characters during conversations
#### Core Functions
**Creation and Setup**
```cpp
// Creates a new conversation group asynchronously (Blueprint version)
UFUNCTION(BlueprintCallable, Category = "Inworld", meta = (WorldContext = "WorldContextObject"))
static void CreateConversation(UObject* WorldContextObject, UObject* Owner,
FOnInworldConversationGroupCreated Callback,
UInworldGraphAsset* SpeakerSelectionGraphAsset = nullptr);
// Creates a new conversation group asynchronously (C++ version)
static void CreateConversation(UObject* WorldContextObject, UObject* Owner,
FOnInworldConversationGroupCreatedNative NativeCallback,
UInworldGraphAsset* SpeakerSelectionGraphAsset = nullptr);
```
**Participant Management**
```cpp
// Adds a character to the conversation group
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void AddCharacter(UInworldCharacterComponent* CharacterComponent);
// Removes a character from the conversation group
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void RemoveCharacter(FString CharacterId);
// Gets all character participants in the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
TArray<UInworldCharacterComponent*> GetParticipants();
```
**Player Integration**
```cpp
// Sets the player target for this conversation group (C++ only)
virtual void SetPlayerTarget(UInworldBasePlayerComponent* PlayerComponent);
// Removes the current player target from this conversation group (C++ only)
virtual void RemovePlayerTarget();
// Gets the current player target for the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
UInworldBasePlayerComponent* GetPlayerTarget();
```
**Message Handling (IInworldConversationTarget Interface)**
```cpp
// Sends a message to the conversation group
virtual FString SendMessage_Implementation(const FInworldDataHandle& Message) override;
// Sends a message to the conversation group with player information
virtual FString SendPlayerMessage_Implementation(
const FInworldDataHandle& Message, const FInworldPlayerProfile& PlayerProfile) override;
// Sends a message to the conversation group with runtime data
virtual FString SendDataMessage_Implementation(
const FInworldDataHandle& Message, const TMap& RuntimeData) override;
// Cancels the current message and interrupts all characters in the group
virtual void Interrupt_Implementation() override;
```
**Conversation Control**
```cpp
// Manually invokes the next character response in the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
virtual void InvokeNextResponse(const FString& CharacterId = "");
// Gets the current state of the conversation
UFUNCTION(BlueprintCallable, Category = "Inworld")
FInworldConversationState GetConversationState();
// Clears the group's event history data
UFUNCTION(BlueprintCallable, Category = "Inworld|Event History")
void ClearHistory(bool IncludeCharacters = false) const;
```
#### Events
**Player Message Events** - Fired when a player sends a message in the conversation
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationPlayerMessage OnPlayerMessage;
```
**Character Message Events** - Fired when a character sends a message in the conversation
```cpp
UPROPERTY(BlueprintAssignable, Category = "Inworld")
FOnInworldConversationCharacterMessage OnCharacterMessage;
```
#### Properties
**AI Guidance**
```cpp
// Instruction text provided to AI for next response generation
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
FString NextResponseInstruction{"consider all individuals present in the conversation."};
```
**Event History Management**
```cpp
// Event history runtime data for tracking conversation context
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Inworld")
TObjectPtr EventHistory;
```
#### Usage Pattern
```cpp
// 1. Create conversation group
UInworldConversationGroup::CreateConversation(this, this,
FOnInworldConversationGroupCreatedNative::CreateLambda([this](UInworldConversationGroup* Group, bool Success)
{
if (Success && Group)
{
ConversationGroup = Group;
// 2. Add characters to the conversation
Group->AddCharacter(Character1Component);
Group->AddCharacter(Character2Component);
Group->AddCharacter(Character3Component);
// 3. Set player target
Group->SetPlayerTarget(PlayerComponent);
// 4. Set up event bindings
Group->OnPlayerMessage.AddDynamic(this, &AMyActor::OnPlayerSpoke);
Group->OnCharacterMessage.AddDynamic(this, &AMyActor::OnCharacterSpoke);
// 5. Send messages to the group using IInworldConversationTarget interface
FInworldData_Text TextMessage;
TextMessage.Text = "Hello everyone!";
FInworldPlayerProfile PlayerProfile;
PlayerProfile.Name = "Player";
Group->SendPlayerMessage(TextMessage, PlayerProfile);
}
}));
```
---
## Usage Guidelines
### Choosing the Right Components
**Use UInworldSimplePlayerComponent when:**
- Basic player interactions are sufficient
- Player profile personalization is needed
- Simple text/audio messaging is the primary requirement
- No advanced trigger system is needed
**Use UInworldPlayerComponent when:**
- Full-featured player interactions are required
- Goal-driven conversations are needed
- Trigger system with custom parameters is necessary
- Maximum flexibility and control is desired
**Use UInworldSimpleCharacterComponent when:**
- Basic AI character functionality is sufficient
- Character profile and voice settings are needed
- Event history tracking is desired
- Advanced AI features are not necessary
**Use UInworldCharacterComponent when:**
- Advanced AI capabilities are required
- Emotion and relationship tracking are needed
- Conversation state management is important
- Goal-driven behavior and intent recognition are necessary
- Complex multi-graph AI processing is required
**Use UInworldConversationGroup when:**
- Multiple AI characters need to participate in conversations
- Intelligent speaker selection is required
- Complex multi-participant dialogue management is needed
- Group conversation state tracking is important
### Best Practices
**Component Setup**
1. Choose the appropriate component level based on feature requirements
2. Configure all necessary runtime data before beginning play
3. Set up event bindings early in component initialization
4. Assign player profiles and character profiles for personalized interactions
**Runtime Data Management**
1. Use direct `FInworldData` types (like `FInworldData_Bool`) for simple data
2. Use `UInworldGraphRuntimeData` containers for complex data that needs manipulation functions
3. Ensure all runtime data is properly initialized before graph execution
4. Use `FInworldData_Struct` to make any Unreal struct graph-compatible
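The `FInworldData` containers above are engine types, but the initialization rule in point 3 can be illustrated with a standalone sketch. The `RuntimeValue` and `IsInitialized` names below are illustrative, not part of the plugin API:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Illustrative stand-in for the engine's runtime-data containers
// (FInworldData_Bool, FInworldData_Text, ...): a string-keyed map of
// tagged values that a graph could read before execution.
using RuntimeValue = std::variant<bool, int, float, std::string>;
using RuntimeData  = std::map<std::string, RuntimeValue>;

// Returns true only if every key a graph requires is present, mirroring
// "ensure all runtime data is properly initialized before graph execution".
inline bool IsInitialized(const RuntimeData& Data,
                          const std::vector<std::string>& RequiredKeys)
{
    for (const auto& Key : RequiredKeys)
    {
        if (Data.find(Key) == Data.end())
            return false;
    }
    return true;
}
```

A check like this, run before execution, turns a hard-to-debug mid-graph failure into an early, explicit error.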
**Conversation Management**
1. Always set a conversation target before sending messages
2. Handle conversation target changes in your UI system
3. Use conversation groups for multi-character scenarios
4. Clean up targets when switching contexts
**Audio Handling**
1. Monitor speech events for responsive UI feedback
2. Configure voice settings appropriately for each character
3. Handle microphone permissions on different platforms
4. Provide visual feedback for audio capture and playback status
**Advanced AI Features**
1. Configure appropriate runtime data for character behavior
2. Set up goals and intents for character behavior patterns
3. Use emotion and relationship states to create dynamic interactions
4. Use player triggers strategically for scripted story moments
5. Leverage conversation groups for multi-character scenarios
---
#### Chat
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/chat
The Chat Template provides a simple chat interface for talking to a character or group of characters powered by the [Character template](/unreal-engine/runtime/templates/character).
## Run the Chat Example Level
1. Open the settings menu in the Content Browser within Unreal.
2. Enable the option for "Show Plugin Content".
3. In the Content Browser, navigate to `/Plugins/InworldChat`.
4. Double-click on the Chat map.
5. Click the Play-in-Editor button (alt+p) to launch your level.
6. Have a conversation! You can speak using the microphone, or uncheck it to type directly in the text box.
## Template Overview
The Chat template provides a sample chat interface in the form of a widget `WBP_Chat2D`. When this widget is added to a level, it displays a messaging interface with a dropdown menu for selecting the conversation target to chat with.
The widget will automatically detect any characters or conversation groups that are added to the level and display them in the dropdown menu.
The sample Chat level has two conversation targets:
- **Algo the Apprentice**: A single character conversation target.
- **Aurora and Marcus**: A group conversation target.
By default **Algo the Apprentice** will be the target character. If you click the Innequin profile icon at the top of the widget a dropdown menu will appear allowing you to change the conversation target.
To send a message to the conversation target, you can either:
1. Type your message in the text box and press enter (or click the send button).
2. Talk to the character or group using your microphone. Simply click the microphone icon at the left end of the text box to toggle the microphone on or off.
| **Microphone Off** | **Microphone On** |
|---|---|
| *(screenshot)* | *(screenshot)* |
The audio capture component will automatically detect when you have completed your message and send the message to the conversation target.
You will hear audio with the character's response, along with seeing the text displayed in the widget.
---
#### Innequin
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/innequin
The Innequin Template builds on the [Character template](/unreal-engine/runtime/templates/character) to provide an example of utilizing the Inworld Character graphs and components to create 3D character interactions.
## Run an Innequin Sample Level
1. Open the settings menu in the Content Browser within Unreal.
2. Enable the option for "Show Plugin Content".
3. In the Content Browser, navigate to `/Plugins/InworldInnequin/Levels`.
4. Double-click on the **SingleCharacter** or **MultiCharacter** map.
5. Click the Play-in-Editor button (alt+p) to launch your level.
6. Have a conversation! You can speak using the microphone, or uncheck it to type directly in the text box.
## Template Overview
The Innequin template provides a few sample levels to demonstrate the [Character template](/unreal-engine/runtime/templates/character) components in a 3D environment.
This also includes a collection of Inworld Innequin character assets, and a sample 3D chat interface widget `WBP_Chat3D`.
### SingleCharacter Level
The SingleCharacter level is a simple level that demonstrates a single character interacting with the player.
This character, "Algo the Apprentice," is an eager, highly trained, and slightly chaotic wizard's apprentice who has mastered her studies and now desperately waits at the tower to be claimed by the right wizard, all while hoping the user might be the one she's been searching for.
To send a message to Algo, you can either:
1. Type your message in the text box and press enter (or click the send button).
2. Talk using your microphone. Simply click the microphone icon at the left end of the text box to toggle the microphone on or off.
| **Microphone Off** | **Microphone On** |
|---|---|
| *(screenshot)* | *(screenshot)* |
The audio capture component will automatically detect when you have completed your message and send the message to Algo.
A speech bubble will appear above Algo's head and she will generate a response.
#### Character Configuration
In the SingleCharacter level, the Algo character is represented by a **BP_Innequin_Character** actor (which inherits from the **BP_InworldCharacter3D** blueprint from the [Character template](/unreal-engine/runtime/templates/character)).
The character is configured in the **Inworld Character** component. In this example, the details are set in the **DT_InworldCharacters** data table and referenced on the component.
For more information on how to configure a character, see the [Character template](/unreal-engine/runtime/templates/character#character-setup).
#### Goals and Intents
Goals allow you to configure predefined character responses and behavior. Intents are sets of phrases used to classify a player's message intention. Intents can be used to activate goals.
Algo has a few goals and intents configured for her. For instance, when you say "Hello", the "Greeting" goal is activated and she will be instructed to:
Introduce yourself and express hopeful excitement about meeting someone new, comically inquiring if the user might be a wizard seeking an apprentice.
**Algo's Goals:**
- **Greeting**
- Instruction: "Introduce yourself and express hopeful excitement about meeting someone new, comically inquiring if the user might be a wizard seeking an apprentice."
- Intents:
- Greeting: "Hello", "Hi there", "Hey"
- **AskForHelp**
- Instruction: "Eagerly frame the user's request for help as an adventurous quest and then ask what you can assist them with."
- **AskForJoke**:
- Verbatim: "Okay, what's the difference between a wizard and a sorcerer? About three years of student loan debt."
- **Goodbye**:
- Instruction: "Express hope that the user will return soon and wish them well in their travels."
For more information on how to configure goals and intents, see the [Character template](/unreal-engine/runtime/templates/character#goals).
### MultiCharacter Level
The MultiCharacter level is a more complex level that demonstrates a group of characters conversing and interacting with the player.
In this conversation we have:
- Aurora Summers, a seasoned, middle-aged detective, is observant and analytical, with a keen eye for detail honed by years of cracking tough cases.
- Marcus Thompson, a curious and charismatic young adult journalist whose persistent, investigative nature makes him formidable in reporting.
In a group conversation, characters automatically take turns speaking with each other. The player can interrupt and join the conversation at any time.
#### Conversation group
In the MultiCharacter level, the conversation group is created by a **BP_ConversationGroup** actor (a blueprint from the [Character template](/unreal-engine/runtime/templates/character)).
This blueprint actor allows you to initialize a conversation group with a set of characters from the level. It also has some basic automated response settings.
In this example we have configured the conversation group to trigger responses from the characters automatically every 5 seconds of silence.
For more information on conversation groups, see the [Character template](/unreal-engine/runtime/templates/character#inworld-conversation-group-uobject).
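The automated response setting above can be sketched as a simple silence timer: any message resets it, and when the limit elapses the group would invoke the next character response. This is an illustrative standalone sketch, not the plugin's actual implementation:

```cpp
#include <cassert>

// Illustrative silence timer for automated group responses: after a fixed
// stretch of conversational silence, the owner would call something like
// InvokeNextResponse() on the conversation group.
class AutoResponseTimer
{
public:
    explicit AutoResponseTimer(float SilenceSeconds) : Limit(SilenceSeconds) {}

    // Any player or character message resets the silence counter.
    void OnAnySpeech() { Elapsed = 0.f; }

    // Called every tick; returns true when a response should be invoked.
    bool Tick(float DeltaSeconds)
    {
        Elapsed += DeltaSeconds;
        if (Elapsed < Limit)
            return false;
        Elapsed = 0.f; // restart the window after firing
        return true;
    }

private:
    float Limit;
    float Elapsed = 0.f;
};
```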
### Director Mode Level
The Director Mode level is essentially a replica of the [MultiCharacter Level](#multicharacter-level), but specifically configured to demonstrate the **Speaker Rule Director**.
While the MultiCharacter level uses dynamic AI speaker selection, this level enforces a strict, turn-based speaker order. This is useful for scenarios like board games or scripted sequences where AI randomness is not desired.
In this example, the characters (Aurora and Marcus) follow a predefined sequence instead of using dynamic AI speaker selection. In the sample configuration, the sequence is defined as: **Aurora -> Aurora -> Aurora**.
#### Level Setup
1. **Map**: `MultiCharacter_Director`
2. **GameMode**: `BP_Innequin_Director_GameMode`
The logic for activating the rule is handled in the GameMode. On `BeginPlay`, the GameMode gets the `RuleDirector` from the conversation group component and activates the "TestRule".
For more information on how to configure the Speaker Rule Director, see the [Character template](/unreal-engine/runtime/templates/character).
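As a rough sketch of what a strict speaker rule replaces, a fixed sequence can stand in for AI speaker selection with a simple wrap-around index. This is an illustrative standalone example, not the Speaker Rule Director's actual code:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical strict, turn-based speaker rule: characters respond in a
// predefined order instead of being chosen by the speaker-selection graph.
class SpeakerSequenceRule
{
public:
    explicit SpeakerSequenceRule(std::vector<std::string> InSequence)
        : Sequence(std::move(InSequence)) {}

    // Returns the next speaker in the fixed order, wrapping around.
    const std::string& NextSpeaker()
    {
        const std::string& Speaker = Sequence[Index];
        Index = (Index + 1) % Sequence.size();
        return Speaker;
    }

private:
    std::vector<std::string> Sequence;
    std::size_t Index = 0;
};
```

This determinism is exactly what makes the rule useful for board-game turns or scripted sequences where AI randomness is unwanted.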
---
#### MetaHuman
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/metahuman
The MetaHuman template provides lipsyncing support for Unreal MetaHumans.
Lipsync support is experimental.
## Get Started
Follow the steps below to set up a MetaHuman in the SingleCharacter level.
1. Select the character's Face in the Components menu.
2. In the Details panel, update the Anim Class to ABP_Metahuman_Face.
3. Add `InworldVoiceAudio`, `InworldCharacter`, and `InworldLipsync` components to the actor.
   Please note you still need to set up your `InworldCharacter` component with the appropriate data and graphs. See the [Character template](/unreal-engine/runtime/templates/character#character-setup) for details on setting up the character.
4. In the SingleCharacter level, replace the current Inworld character actor with your modified MetaHuman actor.
Now you should be able to play the SingleCharacter level, and your MetaHuman character will respond with lipsync animations!
---
#### Command
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/command
The Command Template illustrates a method of extracting information out of the user's voice or text input.
For the example in this template, we extract command information from statements like "move to the box" or "hide behind the wall." In those examples we could extract the command ("move" and "hide"), subject ("box" and "wall") and the adverb ("behind"). From the extracted information we create a waypoint for an AI controlled agent to move to.
The same principles could be used to extract other types of information from user input, such as sentiment, colors, or descriptions.
Key concepts demonstrated:
- Using an LLM to extract content from player input
- Having the LLM respond in JSON format, then de-serializing that JSON into game structures
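To make the JSON-to-game-structure step concrete, here is a minimal standalone sketch of turning the LLM's JSON reply into a command struct. A real Unreal project would use the engine's JSON utilities (e.g. `FJsonSerializer`) instead of string scanning, and the field names below are assumptions based on the examples above:

```cpp
#include <cassert>
#include <string>

// Hypothetical command structure matching the kind of JSON the LLM is asked
// to produce, e.g. {"command":"move","subject":"box","adverb":"to"}.
struct FCommand
{
    std::string Command;
    std::string Subject;
    std::string Adverb;
};

// Minimal field extraction; assumes flat, well-formed JSON with no spaces
// around colons. Returns "" when the key is absent.
inline std::string ExtractField(const std::string& Json, const std::string& Key)
{
    const std::string Needle = "\"" + Key + "\":\"";
    const std::size_t Start = Json.find(Needle);
    if (Start == std::string::npos)
        return "";
    const std::size_t ValueStart = Start + Needle.size();
    const std::size_t End = Json.find('"', ValueStart);
    return Json.substr(ValueStart, End - ValueStart);
}

inline FCommand ParseCommand(const std::string& Json)
{
    return { ExtractField(Json, "command"),
             ExtractField(Json, "subject"),
             ExtractField(Json, "adverb") };
}
```

The resulting struct is what the game logic would consume, e.g. to build a waypoint for the AI-controlled agent.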
## Run the Template
1. Open the settings menu in the Content Browser within Unreal
2. Enable the option for "Show Plugin Content"
3. In the Content Browser, navigate to `/Plugins/InworldCommand/Levels`
4. Double-click on the CommandLevel map
5. Click the Play-in-Editor button (alt+p) to launch your level
6. Use your microphone to direct the AI agent around the map
## Setting up your own game
Now, let's say you want to set up content extraction in your own game.
### Player setup
Let's start with the player.
1. Add the **Inworld Simple Player** component to an Actor or other blueprint in your level. In this template we attached the **Inworld Simple Player** component to a custom Player Controller.
2. Setup the Audio Capture config:
- **Enable AEC** - Enable Acoustic Echo Cancellation (AEC). This helps reduce game audio from being interpreted as player speech when captured by the microphone.
- **AECId** - Identifies the AEC primitive creation configuration (there are currently no creation config settings; leave as `Default`)
- **AECConfig**:
- **Echo Suppression Level** - AEC processing intensity (0.0-1.0, higher values provide stronger echo cancellation)
- **Noise Suppression Level** - Noise suppression intensity (0.0-1.0, higher values provide stronger noise reduction)
- **Enabled Adaptive Filter** - Enable adaptive filtering for better echo cancellation
- **Enable VAD** - Enable Voice Activity Detection (VAD). This is useful for determining when the player is speaking, so audio data is only sent during speech.
- **VADId** - Identifies the VAD primitive creation configuration (there are currently no creation config settings; leave as `Default`)
- **VADConfig**:
- **Speech Threshold** - Sensitivity threshold for detecting speech. This is a `float` value where higher thresholds make the detection more selective (fewer false positives), and lower thresholds make it more sensitive (detecting quieter speech). Valid range: 0.0 - 1.0 (default = 0.4)
- **VADBuffer Time** - The amount of silence required before it is determined that the player has stopped speaking.
- **Start Mic Enabled** - Whether to start the game with the microphone enabled.
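The interplay between **Speech Threshold** and **VADBuffer Time** can be sketched as a gate over audio frames: speech opens the gate, and only a long enough run of silence closes it. This is a simplified illustration (real VADs use trained models, not raw energy), with the buffer time expressed in frames:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative voice-activity gate: a frame counts as speech when its
// energy exceeds SpeechThreshold, and the utterance ends only after
// BufferFrames consecutive silent frames (the "VADBuffer Time" idea).
class SilenceGate
{
public:
    SilenceGate(float SpeechThreshold, std::size_t InBufferFrames)
        : Threshold(SpeechThreshold), BufferFrames(InBufferFrames) {}

    // Feed one frame's energy in [0,1]; returns true when the gate decides
    // the speaker has finished (speech was heard, then enough silence).
    bool OnFrame(float Energy)
    {
        if (Energy >= Threshold)
        {
            bHeardSpeech = true;
            SilentRun = 0;
            return false;
        }
        if (!bHeardSpeech)
            return false; // silence before any speech never ends a turn
        return ++SilentRun >= BufferFrames;
    }

private:
    float Threshold;
    std::size_t BufferFrames;
    std::size_t SilentRun = 0;
    bool bHeardSpeech = false;
};
```

Raising the threshold makes detection more selective; raising the buffer tolerates longer pauses mid-sentence before the message is sent.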
### Target setup
Now, let's set up the target actor.
1. Start by creating a blueprint actor that receives player input and extracts content from it
2. Create the following variables:
- Graph Asset (UInworldGraphAsset)
- Make instance editable and (optionally) blueprint-read only
- After compiling, set the initial value of this asset to be "IG_CommandTranslator" (the graph asset provided with the InworldCommand plugin)
- Later you can change this to a custom graph if desired
- Graph Instance (UInworldGraph)
3. In the BeginPlay function of your actor, set up the graph by calling "GetGraphInstance" on the Graph Asset variable.
4. Drag from the event input and select "Create Event." Create a new function, rename it "OnGraphCompiled", and use its "Graph" input to set the "Graph Instance" variable.
5. You will now be able to execute the graph whenever you receive input from the player. To pipe user input directly into graph execution, add the IInworldConversationTarget interface to your actor (open Class Defaults and add the interface near the bottom).
6. Once the interface has been added, double-click "Send Player Message" in the "Interfaces" section and add the following code.
7. Drag from the event input of the "Execute Graph" function and select "Create Event." Rename the input "OnGraphResult." If the graph executed successfully, the "Data" input to this function will contain your JSON-formatted text. De-serializing this JSON text must be done in C++; see the example in the command plugin.
8. Finally, go back to your player controller blueprint and set the player component's conversation target to your AI actor.
With the conversation target set, when the player speaks, the input will be piped directly to the conversation target, which in this case is the actor we set up earlier.
## Understanding the Command Translator Graph
Now let's walk through how the graph (`IG_CommandTranslator`) that powers the content extraction works. Double-click on the graph asset (it's in `/InworldCommandContent/Data`). You should see the following:
- The node at the top, titled "Input," is the start node (the blue arrow denotes starting nodes). It is a "Routing" node, which has no logic; it's used purely to move data and organize the graph structure
- The expected input to this graph is text data, either directly from user input or as output from speech-to-text (STT)
- The next node is the "Complex Command Prompt" node. This is a custom node that creates the prompt that we are going to send to the large-language model (LLM) in the next step. Double click on this node to see how it works:
- The custom node takes the user input and combines it with instructions to the LLM to generate a JSON with the data we want extracted
- Back in the graph, the next node is the LLM node. This node sends our prompt to an LLM and passes the response to the next node in the graph. We have changed two settings for this node, as seen in the panel on the right side:
- Generally, for content extraction we want to use a smarter model in order to guarantee JSON format correctness and accurate reading of user input. In this case we opted for GPT-4.1; feel free to experiment with other options
- We set "Stream" to false since we want the entire output in one go
- The final node, named "Complex Command Output," is an LLM-to-Text conversion node. It takes the output structure from the LLM and returns only the text
- This node is marked with the green arrow, designating it as an "End Node"
---
#### MassAI
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/massai
The MassAI Template is a powerful demo that shows how to generate tons of Inworld-powered characters using Unreal's Mass Entity System. It also provides a clear C++ example of how to use the Inworld Graph directly in your code.
## Run the MassAI Sample Level
1. Open the settings menu in the Content Browser within Unreal.
2. Enable the option for "Show Plugin Content".
3. In the Content Browser, navigate to `/Plugins/InworldMassAI`.
4. Double-click on the `InworldMassDemo` map.
5. Click the Play-in-Editor button (alt+p) to launch your level.
6. Have a conversation with the character closest to you!
## Template Overview
The **MassAI** template provides a powerful system for creating large-scale AI-powered Inworld characters in your scene using the `MassSpawner` entry point. This template demonstrates how to spawn and manage multiple AI characters efficiently using Unreal's Mass Entity System.
This template also covers how to set up the necessary assets for the `MassSpawner`, including the `MassEntityConfig` and `EQS_SpawnEntity` configurations.
To send a message to the characters, you can either:
1. Type your message in the text box and press enter (or click the send button).
2. Talk using your microphone. Simply click the microphone icon at the left end of the text box to toggle the microphone on or off.
| **Microphone Off** | **Microphone On** |
|---|---|
| *(screenshot)* | *(screenshot)* |
The audio capture component will automatically detect when you have completed your message and send the message to the conversation target.
A speech bubble will appear above the character's head and they will generate a response.
### MassSpawner Configuration
The `MassSpawner` is the entry point for spawning multiple AI characters in your scene. When you place a `MassSpawner` actor in your level, it automatically initializes and spawns entities based on your configuration. The spawner reads the `MassEntityConfig` to determine what components and traits each entity should have, then uses the `EQS_SpawnEntity` generator to figure out where to place them in the world.
To configure the `MassSpawner`, select it in the World Outliner and adjust its properties in the Details panel.
#### Count
The **Count** property determines how many entities the spawner will create. This is a simple integer value; set it to the number of AI characters you want in your scene. For example, setting it to `20` will spawn 20 Inworld-powered characters when the level starts.
#### Entity Config
The **Entity Types** array lets you specify which `MassEntityConfig` asset to use for spawning entities. Each entry in the array contains:
- **Entity Config**: A reference to a `MassEntityConfig` asset that defines what components and traits each spawned entity will have
- **Proportion**: A weight value (typically `1.0`) that determines the relative probability of this entity type being spawned when multiple types are defined
The `MassEntityConfig` asset is a standard Mass Entity System configuration that defines the components and fragments that make up an entity. Most of the configuration in `MassEntityConfig` follows standard Mass Entity System patterns: things like transform components, level of detail (LOD), and other gameplay-related fragments are configured here as they would be for any Mass entity.
However, the **Inworld Character Traits** fragment is what makes these entities into AI-powered Inworld characters. This fragment is specific to the Inworld integration and contains the essential configuration for connecting each spawned entity to the Inworld AI system.
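The **Proportion** weighting can be illustrated with a standalone weighted pick over the entity types; the `FEntityType` struct below is a stand-in for the real Entity Types array entry, and `U` plays the role of a uniform random sample:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for an Entity Types array entry: a config reference plus its
// relative spawn weight.
struct FEntityType
{
    std::string ConfigName; // illustrative stand-in for the MassEntityConfig reference
    float Proportion;       // relative weight, as in the Proportion property
};

// Picks an entity type given a unit random sample U in [0,1), weighting
// each type by Proportion / (sum of all proportions). Assumes a non-empty
// Types array.
inline const FEntityType& PickEntityType(const std::vector<FEntityType>& Types, float U)
{
    float Total = 0.f;
    for (const auto& T : Types)
        Total += T.Proportion;
    float Cursor = U * Total;
    for (const auto& T : Types)
    {
        if (Cursor < T.Proportion)
            return T;
        Cursor -= T.Proportion;
    }
    return Types.back(); // guard against floating-point edge cases
}
```

With proportions 3 and 1, the first type is chosen roughly three times as often as the second.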
##### Inworld Character Traits
The `Inworld Character Traits` fragment is where you configure the Inworld-specific properties for each spawned character. In this demo, the trait system supports batch generation and configuration of `Character Profile` and `Voice` settings for each spawned entity.
The trait fragment contains:
- **CharacterProfile**: The character profile data that defines the character's personality, backstory, and behavioral traits.
- **Voice**: The voice configuration that determines how the character's speech is synthesized and played back.
For more information on how to configure a character, see the [Character template](/unreal-engine/runtime/templates/character#character-setup).
When the `MassSpawner` creates entities, it applies the `Inworld Character Traits` fragment to each one, which automatically sets up the necessary Inworld components and registers the character with the conversation system. This allows each spawned entity to function as a fully interactive AI character that can respond to player input and participate in conversations.
The actual configuration and initialization of these traits happens in C++ code. To understand how traits are defined and how the initialization process works, you can examine the source files:
- **`Source/InworldMassAI/Public/InworldMassAITypes.h`** - See how traits are defined in the trait structure
- **`Source/InworldMassAI/Public/InworldCharacterTrait.h`** - See how traits are initialized when entities are spawned
#### Spawn Data Generators
The `Spawn Data Generators` array determines where entities will be placed in the world. In this demo, the `EQS_SpawnEntity` generator uses a "Donut" pattern to spawn characters in concentric rings around the querier (typically the player or spawner location), with configurable inner and outer radii, number of rings, and points per ring. Beyond this pattern, EQS generators can also spawn entities based on navigation meshes, random points, grid patterns, or custom query logic to suit different gameplay scenarios.
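A donut pattern like the one the generator uses can be sketched by placing evenly spaced points on concentric rings between the inner and outer radii. This is an illustrative reimplementation of the idea, not the EQS generator's code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct FPoint2D { float X; float Y; };

// Sketch of a "Donut" spawn pattern: PointsPerRing points on each of
// NumRings concentric circles spanning [InnerRadius, OuterRadius],
// centered on the querier.
inline std::vector<FPoint2D> GenerateDonutPoints(
    float InnerRadius, float OuterRadius, int NumRings, int PointsPerRing)
{
    std::vector<FPoint2D> Points;
    for (int Ring = 0; Ring < NumRings; ++Ring)
    {
        // Interpolate ring radii evenly across the band.
        const float T = NumRings > 1 ? float(Ring) / float(NumRings - 1) : 0.f;
        const float Radius = InnerRadius + T * (OuterRadius - InnerRadius);
        for (int I = 0; I < PointsPerRing; ++I)
        {
            const float Angle = 2.f * 3.14159265f * float(I) / float(PointsPerRing);
            Points.push_back({ Radius * std::cos(Angle), Radius * std::sin(Angle) });
        }
    }
    return Points;
}
```

A real generator would additionally project these points onto the navigation mesh and discard unreachable ones.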
### Graph
This template uses an Inworld graph to generate character profiles dynamically at runtime. Let's walk through how this graph works.
The graph starts with an input node (denoted by the blue arrow), which allows you to pass an integer variable at runtime. This value is determined by the type specified in the Details panel and represents the number of characters to generate.
This integer value is then passed to a custom node called **InworldNode_GeneratorPrompt**. This custom node constructs a prompt that instructs the LLM to randomly generate character profile information, including attributes like name, job, backstory, gender, personality traits, and example dialogue.
The generated prompt is then sent to the **LLM** node, which processes the request and generates a response. The response is passed through the **LLMResponse To Text** node to convert it into plain text format, which contains the JSON-formatted character data.
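As a sketch of what a generator-prompt node does, the integer input can be folded into an instruction string for the LLM. The wording below is illustrative, not the actual prompt inside `InworldNode_GeneratorPrompt`:

```cpp
#include <cassert>
#include <string>

// Hypothetical generator prompt: the character count from the graph's
// input node is embedded in an instruction asking the LLM for
// JSON-formatted character profiles (field names are illustrative).
inline std::string BuildGeneratorPrompt(int CharacterCount)
{
    return "Generate " + std::to_string(CharacterCount) +
           " random character profiles as a JSON array. Each profile must "
           "contain: name, job, backstory, gender, personality traits, and "
           "example dialogue.";
}
```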
The logic for compiling the Inworld Graph, executing it, and retrieving the results is implemented in C++ code. To understand how to use C++ to compile and execute Inworld Graphs, you can examine the source files:
- **`Source/InworldMassAI/Public/InworldMassAIGraphExecutor.h`** and **`Source/InworldMassAI/Private/InworldMassAIGraphExecutor.cpp`** - See how the graph is compiled and executed, and how results are handled
---
#### Guides
#### Core Concepts
Source: https://docs.inworld.ai/unreal-engine/runtime/core-concepts
Inworld's Unreal Runtime is a powerful tool for building AI-powered experiences in Unreal. Key features include:
- **AI Component Library**: A library of powerful AI components, such as LLMs, Text-to-Speech (TTS), and Speech-to-Text (STT), that can be constructed into **Graphs** to power conversational characters, interactive agents, and other advanced AI-driven experiences in your Unreal projects.
- **Rich Observability**: Dashboards, traces, and logs with no extra setup required. These enable you to debug, observe, and improve your AI interactions.
- **Playgrounds**: Quickly test different models and prompts before adding them to your experience.
## Graphs
At the core of Inworld's Runtime is a high-performance, C++ based graph execution engine. The engine executes **Graphs** organized from **Nodes** and **Edges**, where each node performs a specific processing task—often an AI operation such as language generation (LLM), speech-to-text (STT), or text-to-speech (TTS)—and edges define the flow of data between them.
A graph:
- Contains a collection of nodes
- Defines edges between nodes
- Must have at least one start node
- Must have at least one end node
- Supports both linear and non-linear execution paths
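The rules above can be captured in a minimal standalone sketch of a graph description; this is illustrative only, as the runtime's actual graph types are considerably richer:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Minimal sketch of the graph structure described above: nodes, directed
// edges, and the rule that a valid graph needs at least one start node
// and at least one end node.
struct FGraphSketch
{
    std::vector<std::string> Nodes;
    std::vector<std::pair<std::string, std::string>> Edges; // from -> to
    std::vector<std::string> StartNodes;
    std::vector<std::string> EndNodes;

    bool IsValid() const
    {
        return !Nodes.empty() && !StartNodes.empty() && !EndNodes.empty();
    }
};
```

A typical voice pipeline would be three nodes (STT, LLM, TTS) chained by two edges, with STT as the start node and TTS as the end node.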
The Runtime comes with a **[visual graph editor](/unreal-engine/runtime/graph-editor)** to make it easy to construct and modify your graphs.
### Nodes
Nodes are building blocks that perform a specific processing task, such as speech-to-text conversion, intent detection, or language model interaction. **Built-in Nodes** are provided with pre-built functionality for common use cases, with the option to create **Custom Nodes** to extend the runtime's capabilities.
Nodes:
- Encapsulate ML models or transformations with standard interfaces
- Process input data and produce output data
- Include built-in telemetry to support performance monitoring and debugging capabilities
- Have built-in error handling
- Handle lifecycle management, including standardized initialization and cleanup on graph shutdown
See the [Runtime Reference](/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode) for more details on available nodes.
#### Primitives
Many of the built-in nodes rely upon **primitives**: fundamental components like Large Language Models (LLMs), Text-to-Speech (TTS), and Text Embedders. These are the "raw ingredients" of any AI-powered application.
Think of them as a library of high-performance AI modules, designed to abstract away the complexities of working with various providers, models, and hardware—allowing you to build on a consistent, provider-agnostic foundation. We recommend using primitives through our built-in nodes, but you can also leverage them directly in custom nodes.
See this [guide](/unreal-engine/runtime/configuring-primitives) for more details about configuring primitives.
### Edges
Edges define the flow of data between nodes, creating a processing pipeline. The runtime supports sophisticated edge configurations including:
- **Conditions**: Control data flow based on conditions
- **Connection Types**: Optional vs. required connections
- **Loops**: Iterative processing capabilities
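As a rough sketch of the conditions idea (illustrative names, not the actual Runtime API), an edge can carry a predicate that gates data flow, optionally negated:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Illustrative conditional edge: data crosses only when the predicate holds.
// bNegate flips the predicate, mirroring the editor's negation option.
struct ConditionalEdge {
    std::function<bool(const std::string&)> Condition;
    bool bNegate = false;

    bool Allows(const std::string& Data) const {
        const bool bResult = Condition(Data);
        return bNegate ? !bResult : bResult;
    }
};
```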
## Observability
Runtime provides rich observability tools with no extra setup required. You can monitor your AI interactions through:
- **[Traces](/portal/traces)**: Understand the flow of your application with detailed execution traces. Use them to identify latency bottlenecks and debug issues when they arise.
- **[Logs](/portal/logs)**: Review historical data to monitor errors and debug issues.
- **[Dashboards](/portal/dashboards)**: Get real-time visibility into your application health. Track performance, resource usage, and application KPIs through comprehensive dashboards and detailed data views.
## Experiments
Runtime lets you iterate on prompts, models, and other configs (for example LLM and TTS) without redeploying code for already shipped builds. See the [Experiments guide](/unreal-engine/runtime/experiments) for detailed information on setting up and running A/B experiments.
## Playgrounds
Inworld Portal provides interactive **Playgrounds** that let you experiment with different models and tune prompts before deploying them in graph variants:
- **LLM Playground**: Experiment with different language models, prompts, and response settings.
- **TTS Playground**: Try different models, voices, and clone your own voice.
---
#### Configuring Primitives
Source: https://docs.inworld.ai/unreal-engine/runtime/configuring-primitives
At the heart of Inworld Agent Runtime are *primitives*: foundational components like Large Language Models (LLMs), Speech-to-Text (STT), Text-to-Speech (TTS), and Text Embedders. These are the "raw ingredients" of any AI-powered application.
Think of them as a library of high-performance AI modules, designed to abstract away the complexities of working with various providers, models, and hardware - allowing you to build on a consistent, provider-agnostic foundation. These primitives are then used within Inworld's graph system to power various nodes.
Each primitive type can be configured through Unreal Engine's Project Settings interface, allowing you to:
- Configure multiple named instances of each primitive type (LLM, STT, TTS, Text Embedder)
- Set up provider-specific configurations (Inworld, OpenAI, Anthropic, Google, etc.)
- Define reusable configurations that can be referenced by nodes in your graphs
- Easily access and modify your primitives across your project
## Accessing Primitives Configuration
To configure primitives in your project:
1. Open **Edit > Project Settings** from the main menu
2. Navigate to **Plugins > Inworld** in the left sidebar
3. Scroll down to the **Primitives** section
Here you'll find configuration maps for each primitive type:
- **LLM Creation Config** - Configure Large Language Models
- **STT Creation Config** - Configure Speech-to-Text models
- **TTS Creation Config** - Configure Text-to-Speech models
- **Text Embedder Creation Config** - Configure text embedding models
## LLM
LLMs are powerful models that can be used to understand and generate text and other content. To configure an LLM:
1. In the **LLM Creation Config** map, click the **+** button to add a new entry
2. Set a descriptive name for this configuration. This name will be used for selecting this configuration in your graph.
3. Select either `Local` or `Remote` for the **Compute Host**
- **Remote:** this means that the models will be run by cloud servers.
- **Provider**: Select from available [service providers](/models#chat-completion).
- **Model**: Choose the specific [model](/models#chat-completion). Make sure you specify a model that is provided by the selected provider. See [Adding models or providers](#adding-models-or-providers).
If a model or service provider is not available in the dropdown, you can add additional options under the **LLM** section of the Inworld Agent Runtime Settings.
- **Local:** this means that the models will run locally
- **Local Compute Host**: Choose `CPU` or `GPU`
- **Model**: Path to the local model.
4. Your configuration will now be available for selection in LLM powered nodes throughout your graphs.
### Adding models or providers
All service providers and models listed under [Chat Completion](/models#chat-completion) are supported. If a model or service provider is not available in the **Remote** LLM Creation Config dropdown, you can add additional options under the **LLM** section of the Inworld Agent Runtime Settings.
1. In the **Remote LLMProviders** list, click the **+** button to add any additional service providers you want to use.
2. In the **Remote LLMModels** list, click the **+** button to add any additional models you want to use.
## TTS
Text-to-Speech converts text into audio. To configure TTS:
1. In the **TTS Creation Config** map, click the **+** button to add a new entry
2. Set a descriptive name for this configuration. This name will be used for selecting this configuration in your graph.
3. Select either `Local` or `Remote` for the **Compute Host**
- **Remote:** this means that the models will be run by cloud servers.
- **Provider**: Select `INWORLD`
- Configure provider-specific settings like model selection and voice parameters.
- **Local:** this means that the models will run locally
- **Local Compute Host**: Choose `CPU` or `GPU`
- **Model**: Path to the local model.
- **Speech Synthesis Config**: Adjust parameters like sample rate, temperature, and speaking rate.
4. Your configuration will now be available for selection in TTS nodes throughout your graphs.
## STT
Speech-to-Text converts audio into text. To configure STT:
1. In the **STT Creation Config** map, click the **+** button to add a new entry
2. Set a descriptive name for this configuration. This name will be used for selecting this configuration in your graph.
3. Select either `Local` or `Remote` for the **Compute Host**
- **Remote:** this means that the models will be run by cloud servers.
- **Local:** this means that the models will run locally
- **Local Compute Host**: Choose `CPU` or `GPU`
- **Model**: Path to the local model.
4. Your configuration will now be available for selection in STT nodes throughout your graphs.
## Text Embedder
Text Embedders convert text into numerical vectors for semantic operations, and power features like intent detection and knowledge retrieval. To configure embedders:
1. In the **Text Embedder Creation Config** map, click the **+** button to add a new entry
2. Set a descriptive name for this configuration. This name will be used for selecting this configuration in your graph.
3. Select either `Local` or `Remote` for the **Compute Host**
- **Remote:** this means that the models will be run by cloud servers.
- **Provider**: Select from available providers
- **Model**: Choose the embedding model
- **Local:** this means that the models will run locally
- **Local Compute Host**: Choose `CPU` or `GPU`
- **Model**: Path to the local model.
4. Your configuration will now be available for selection in embeddings powered nodes throughout your graphs.
---
#### Graph Editor
Source: https://docs.inworld.ai/unreal-engine/runtime/graph-editor
## Overview
The **Inworld Graph Editor** (part of the **InworldRuntime** Unreal plugin) enables developers to build their graphs in Unreal using an easy-to-use visual editor. It lets you build graphs where **nodes** or **subgraphs** are connected by **edges**. Nodes/Subgraphs process data; edges define execution flow. Graphs are saved as assets and can be executed at runtime.
## Graph workflow summary
1. Create a graph asset from the Content Browser context menu.
2. Open the editor; add nodes and subgraphs, configure nodes, and connect edges. Graphs can contain multiple Start and End nodes.
3. Get a graph instance using the graph asset.
4. Call Execute, passing Input and RuntimeData if needed.
5. Handle result callbacks from End nodes.
## Editor Layout
1. **Canvas** -- place and connect nodes. Drag to move. **Ctrl+C/Ctrl+V** to copy/paste. *Del* to delete node/edge.
2. **Details Panel (right)** -- properties of the selected node/edge.
3. **Graph Settings (below Details)** -- shows the description, execution mode and list of validation messages.
**Execution mode** defines how multiple executions are handled. Three modes are available:
- Simultaneous: Allow multiple executions to run simultaneously.
- Latest Only: Cancels previous execution when starting a new one.
- Sequential: Queues new executions until the current one finishes (FIFO), no cancellations.
- **Context Menu (right-click on canvas)** -- add nodes, comments, or subgraphs; right-click an **edge icon** to define the edge type.
- Start typing to **filter** by name.
- Hover a menu item to see its description, inputs/outputs.
**Tooltips:** Hover an **input/output pin** to see its data type in a tooltip.
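The **Latest Only** execution mode above can be sketched with a generation counter: each new execution bumps the counter, and results from older generations are dropped. This is illustrative only, not the actual Runtime implementation:

```cpp
#include <cassert>

// Illustrative "Latest Only" executor: starting a new execution cancels
// the previous one. A generation counter marks stale results.
class LatestOnlyExecutor {
public:
    // Begins an execution and returns its generation id.
    int Start() { return ++Generation; }

    // A result is delivered only if its execution is still the latest.
    bool ShouldDeliver(int ExecutionGeneration) const { return ExecutionGeneration == Generation; }

private:
    int Generation = 0;
};
```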
## Creating a Graph Asset
1. **Add New > Inworld > Graph**.
2. Name and save the asset > Double-click to open Inworld Graph Editor.
## Nodes
- **Adding**:
- right-click on the canvas and select a node from the menu.
- drag from a pin and release to open node selection menu. In this case, the menu is filtered to show only nodes that can accept the input of the dragged pin. (can be changed in the settings)
- **Move/Copy/Paste/Delete**: drag to move; **Ctrl+C/Ctrl+V** to copy/paste; select a node and press *Del* to delete.
- **Hover for Info**: hover a node to see description and pins info.
- **Pins & Types**:
- Each input/output pin has a **data type** (pins are color-coded).
- Optional inputs are indicated by * and noted in the tooltip.
- Type checks prevent invalid connections; behavior configurable in *Settings*.
- **Pin Display Modes** (Settings): label + color, color only, or label only. Labels can show **Pin Name** or **Data Type**.
- **Double-click node body** > opens the node implementation for custom nodes or subgraphs (the source file is opened at the line of the node implementation).
- **Double-click node name** > **rename** the node inline.
- **Getter Nodes**: set IsGetter to true for nodes that require no input. Such nodes do not need an input connection (optionally, you can still wire an input to enforce explicit ordering).
### Custom Node
Custom nodes can be implemented using Blueprint or C++. In this example, we will implement a custom node that processes PlayerProfile RuntimeData and two Text inputs to combine them into one Text output.
1. **Create Custom Node**
Create via *Content Browser* or the **New Custom Node** button in the Graph Editor.
Custom nodes implemented in Blueprints have an icon in the Graph Editor.
2. **Class Defaults**
Define:
- Node Name
- Node Description (Description of the node functionality and purpose)
- Is Getter (special getter node mode for nodes that require no input)
3. **Process Function**
To make a function available as a processing node, its name must start with **Process**.
Examples: **`Process`**, **`ProcessDialogue`**, **`ProcessAudio`**, **`ProcessLogic`**.
Each function defines the node's execution logic.
The defined inputs and outputs (names and types) are automatically reflected as pins in the Graph Editor.
A single node can be linked to multiple processing functions.
When executing, the node automatically selects the appropriate function based on matching input types.
4. **Save**
The node appears in the context menu immediately.
1. **Create Custom Node class**
Create a C++ class inheriting from UInworldNode_Custom.
Custom nodes implemented in C++ have an icon in the Graph Editor.
2. **Define at constructor**
Define:
- Node Name
- IsGetter (special getter node mode for nodes that require no input)
3. **Process Function**
To make a function available as a processing node:
- Mark it with UFUNCTION()
- Its name must start with **Process**.
Examples: **`Process`**, **`ProcessDialogue`**, **`ProcessAudio`**, **`ProcessLogic`**.
Each function defines the node's execution logic.
The defined inputs and outputs (names and types) are automatically reflected as pins in the Graph Editor.
A single node can be linked to multiple processing functions.
When executing, the node automatically selects the appropriate function based on matching input types.
You can also optionally define the process context as the first argument to access runtime information:
```cpp
FInworldData_Text Process(const UInworldProcessContext* ProcessContext, const FInworldData_Text& A, const FInworldData_Text& B) const
```
The UInworldProcessContext parameter provides access to contextual data and runtime state during graph execution.
If present, it will be automatically passed by the graph system.
```cpp
#pragma once
#include "CoreMinimal.h"
#include "Graph/Nodes/InworldNode_Custom.h"
#include "MyInworldNode_AppendText.generated.h"
/**
* @class UMyInworldNode_AppendText
* @brief A node that appends two text inputs.
*/
UCLASS()
class MYGAME_API UMyInworldNode_AppendText : public UInworldNode_Custom
{
GENERATED_BODY()
public:
UMyInworldNode_AppendText()
{
NodeName = "Append Text";
}
protected:
UFUNCTION()
UPARAM(meta = (DisplayName = "A + B"))
FInworldData_Text Process(const FInworldData_Text& A, const FInworldData_Text& B) const
{
if (A.Text.IsEmpty() || B.Text.IsEmpty())
{
return FInworldData_Text::Error(TEXT("A or B is empty."));
}
FInworldData_Text ResultData;
ResultData.Text = A.Text + " " + B.Text;
return ResultData;
}
};
```
4. **Compile**
The node appears in the context menu.
## Edges
- **Connect** by dragging between pins.
- **Delete**: select the edge by clicking its icon, then press **Del**.
- **Input/Output Types Checking**:
- **Disallow connection - checked** (default): incompatible types **cannot** connect.
- **Disallow connection - unchecked**: connection allowed but a **warning** is shown.
- **Conditional Edges**:
- Set an edge type: right-click the edge > choose a **Condition type** (e.g., *General*, *IsSafe*).
- Each condition can be **negated** at design time in the Details panel.
- **IsSafe** is a special conditional edge based on a safety node's result.
- **Edge Properties**: **Title**, **Negation**, **Required**.
- Not Required edges are shown as **dashed** lines.
- **To Open Implementation**: double-click on conditional edge widget to navigate to implementation (C++ or Blueprint).
### Custom Conditional Edge
Custom edges can be implemented using Blueprint or C++.
1. **Create Custom Edge**
Create via *Content Browser* context menu or the **New Custom Edge** button in the Graph Editor.
2. **Class Defaults**
Define:
- Edge Title
- Execute Meets Condition in Game Thread
3. **MeetsCondition Function**
To make a function available as a condition, its name must start with **MeetsCondition**.
Examples: **`MeetsCondition`**, **`MeetsConditionCheck`**, **`MeetsConditionIsValid`**, **`MeetsConditionLogic`**.
Each function defines the edge's conditional logic and must return a bool value.
A single edge can be linked to multiple condition functions.
When executing, the edge automatically selects the appropriate function based on matching input type.
4. **Save**
The edge appears in the context menu immediately.
1. **Create Custom Condition Edge class**
Create a C++ class inheriting from UInworldEdge_WithCondition.
2. **At Constructor**
Define:
- Edge Title
- Execute Meets Condition in Game Thread
3. **MeetsCondition Function**
To make a function available as a condition:
- Mark it with UFUNCTION()
- Its name must start with **MeetsCondition**
Examples: **`MeetsCondition`**, **`MeetsConditionCheck`**, **`MeetsConditionIsValid`**, **`MeetsConditionLogic`**.
Each function defines the edge's conditional logic and must return a bool value.
A single edge can be linked to multiple condition functions.
When executing, the edge automatically selects the appropriate function based on matching input type.
You can also optionally define the process context as the first argument to access runtime information:
```cpp
bool MeetsCondition(const UInworldProcessContext* ProcessContext,
const FInworldData_Text& Input) const
```
The UInworldProcessContext parameter provides access to contextual data and runtime state during graph execution.
If present, it will be automatically passed by the graph system.
```cpp
#pragma once
#include "CoreMinimal.h"
#include "Graph/Edges/InworldEdge_WithCondition.h"
#include "MyInworldEdge_IsAudio.generated.h"
UCLASS()
class MYGAME_API UInworldEdge_MyEdge : public UInworldEdge_WithCondition
{
GENERATED_BODY()
public:
UInworldEdge_MyEdge()
{
EdgeTitle = "My Edge C++";
}
protected:
UFUNCTION()
bool MeetsCondition(const FInworldData_Text& Input) const
{
return (Input.Text.Contains(TEXT("Inworld")));
}
};
```
4. **Compile**
The edge appears in the context menu.
## Start & End Nodes
- **Start Node**:
- Mark a node as **Start** (right-click on node > Mark as Start).
- At runtime, all **Start** nodes **receive the top-level input data** you pass into `Execute(...)` and RuntimeData.
- A node with **no incoming edges** can be marked as a start point.
- **End Node**:
- Mark a node **End** (right-click on node > Mark as End).
- When an End node finishes, it emits a **result callback** carrying that node's output.
- A node with **no outgoing edges** can be marked as an end point.
- **Dual-role Node**: A node with **no inputs and outputs** can be marked as both **Start and End**.
## Subgraphs
Each saved graph asset that has exactly one Start and one End node can be added to another graph as a Subgraph node.
The subgraph's inputs correspond to its Start node inputs, and its output comes from the End node.
All RuntimeData is automatically passed to the subgraph during execution.
- **Adding**:
- Right-click on the canvas and select a subgraph from the `Subgraph` section of the context menu.
- Or drag from a pin and release to open the selection menu.
## Settings
## Executing a Graph using Graph Asset at Runtime
1. Get an instance: `GetGraphInstance`
2. Execute the graph: call **`Execute()`**.
- **Start nodes** will receive `Input` and `RuntimeData` map.
- Each **End node** triggers **`OnResultCallback`** and returns the result along with the corresponding `NodeId` and `ExecutionId`.
3. Process results in your system.
```cpp
#pragma once
#include "CoreMinimal.h"
#include "GameFramework/Actor.h"
#include "Graph/InworldGraph.h"
#include "Graph/Assets/InworldGraphAsset.h"
#include "AGraphExecution.generated.h"
UCLASS()
class MYGAME_API AGraphExecution : public AActor
{
GENERATED_BODY()
public:
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Graph")
TObjectPtr<UInworldGraphAsset> GraphAsset;
protected:
virtual void BeginPlay() override
{
Super::BeginPlay();
GraphAsset->GetGraphInstance(FOnGraphCompiledNative::CreateUObject(this, &AGraphExecution::OnGetGraph));
};
void OnGetGraph(UInworldGraph* CompiledGraph, bool bSuccess)
{
if (bSuccess)
{
FInworldData_Text InputData;
InputData.Text = TEXT("Hi LLM");
CompiledGraph->Execute(this, InputData, {}, FOnGraphResultNative::CreateUObject(this, &AGraphExecution::OnResult));
}
}
void OnResult(const FString& ExecutionId, const FString& NodeId, const FInworldDataHandle& DataHandle)
{
if (FInworldData_Text* ResultTextData = DataHandle.Unwrap<FInworldData_Text>())
{
GEngine->AddOnScreenDebugMessage(-1, 5.0f, FColor::Green, ResultTextData->Text);
}
}
};
```
## Inworld Data Handle
Graph input and result data must be of type **FInworldDataHandle**.
Input data must be "wrapped" into **FInworldDataHandle**.
```cpp
// FInworldData
FInworldData_Text InworldText;
FInworldDataHandle InworldDataHandle = InworldText;
// Custom Struct Data
FMyCustomStruct MyCustomStruct;
FInworldDataHandle InworldDataHandle = FInworldData_Struct(MyCustomStruct);
```
Result data must be "unwrapped" out of **FInworldDataHandle**.
```cpp
// FInworldData
if (FInworldData_Text* InworldText = InworldDataHandle.Unwrap<FInworldData_Text>(); InworldText != nullptr)
{
// InworldText is valid
}
// Custom Struct Data
if (FInworldData_Struct* InworldDataStruct = InworldDataHandle.Unwrap<FInworldData_Struct>(); InworldDataStruct != nullptr)
{
if (FMyCustomStruct MyCustomStruct; InworldDataStruct->GetData(MyCustomStruct))
{
// MyCustomStruct is valid
}
}
```
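Conceptually, the handle is a type-erased container: wrapping stores a typed payload, and unwrapping returns it only when the requested type matches. A stand-alone C++ analogue using `std::any` (illustrative; not the actual FInworldDataHandle implementation):

```cpp
#include <any>
#include <cassert>
#include <string>
#include <utility>

struct TextData { std::string Text; };
struct AudioData { int SampleRate = 0; };

// Illustrative type-erased handle: Unwrap<T> returns a pointer to the
// payload when the stored type matches, and nullptr otherwise.
class DataHandle {
public:
    template <typename T>
    DataHandle(T Value) : Payload(std::move(Value)) {}

    template <typename T>
    T* Unwrap() { return std::any_cast<T>(&Payload); }

private:
    std::any Payload;
};
```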
---
#### Using LLMs
Source: https://docs.inworld.ai/unreal-engine/runtime/llm
Large Language Models (LLMs) are a key component for building AI-powered experiences. They can power capabilities like dialog generation, game state changes, intent detection, and more.
## Overview
The LLM node, powered by the UInworldNode_LLM class, provides a high-level interface that integrates LLM clients to generate text responses within your graph. It works with Chat Request and Chat Response data to enable conversational AI capabilities.
The system abstracts away backend complexity, exposing a consistent API across models and providers for:
- Chat-based text generation with message history
- Configurable generation parameters (token limits, temperature, etc.)
- Streaming and non-streaming response modes
- Integration with Chat Request/Response workflow
## Working with the LLM node
To add an LLM node to your graph (or create a graph with just an LLM node) in the Graph Editor:
1. Right click to add the LLM node to the graph editor from the available node library
2. In the node's details panel:
- Under `LLM Model`, select the desired model. If your desired model is not in the dropdown, you can configure additional models by following the instructions [here](/unreal-engine/runtime/configuring-primitives#llm)
- Adjust the `Text Generation Config` property to set the desired text generation parameters, such as token limits and temperature.
- Leave `Stream` checked if you want to stream text token outputs or unchecked `Stream` to receive the complete text output.
3. Connect the input of the LLM node to an [LLMChatRequest](#creating-chat-requests) data source, typically a custom node. The Chat Request corresponds to the prompt, messages, and configuration that will be provided to the LLM.
- If this is the first node in your graph, make sure to mark the node as the start node by right clicking on it and selecting "Set As Start".
4. Configure the LLM node output:
- If this is the final node in your graph, mark it as an end node by right-clicking and selecting "Set As End"
- Otherwise, connect the LLM Chat Response output to other nodes that process `FInworldData_LLMChatResponse`
- The node outputs a complete Chat Response containing generated text and metadata
5. Save and run your graph!
### Creating Chat Requests
To generate a Chat Request to be provided as input to the LLM node:
1. Create a custom node in the graph editor by selecting the “New Custom Node” button at the top left of the graph editor. Give the node a name, and save.
2. After saving, open the custom node's Blueprint. In the Blueprint, create a new function prefixed with "Process" (e.g. "Process_Default").
3. In the function's Details panel add an Output of type `Inworld Data LLM Chat Request`.
4. Right click in the function blueprint and search for “Make InworldData_LLMChatRequest”. Select it.
5. Construct your chat request. To construct a simple prompt that only contains a single user message:
- Drag the output of the **Make InworldData_LLMChatRequest** node to the return node of the function.
- From the **Chat Messages** input of the **Make InworldData_LLMChatRequest** node, drag and select **Make Array**.
- From the **Make Array** node's input, drag and select **Make InworldLLMMessage**.
- In the **Role** parameter, select **User**. In the **Content** parameter, type in your desired prompt.
If you want to add a system message, or include multiple messages in your prompt, you can add additional elements to the array.
6. This custom node can now be added to the graph, select your new node from the context menu and drag this node's output to the input of the LLM node.
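In plain C++ terms, the request built above is just an ordered list of role-tagged messages. A minimal sketch (the struct names below are illustrative, not the plugin's actual types):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative shapes mirroring the Blueprint construction above.
enum class EChatRole { System, User, Assistant };

struct FChatMessage {
    EChatRole Role;
    std::string Content;
};

struct FChatRequest {
    std::vector<FChatMessage> Messages; // ordered conversation history
};

// Builds a request with an optional system message followed by a user prompt.
FChatRequest MakeSimpleRequest(const std::string& SystemPrompt, const std::string& UserPrompt) {
    FChatRequest Request;
    if (!SystemPrompt.empty()) {
        Request.Messages.push_back({EChatRole::System, SystemPrompt});
    }
    Request.Messages.push_back({EChatRole::User, UserPrompt});
    return Request;
}
```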
## UInworldNode_LLM Class
The core LLM node that processes chat messages and generates text responses using configured language models.
```cpp
/**
* @class UInworldNode_LLM
* @brief A workflow node that processes chat messages and produces either
* complete text output or a stream of text tokens based on the stream
* parameter.
*
* UInworldNode_LLM encapsulates the functionality of an LLM client to generate text
* responses within a workflow graph. It can be configured to output complete
* text or stream tokens as they are generated.
*
* @input FInworldData_LLMChatRequest
* @output FInworldData_LLMChatResponse
*/
UCLASS(Blueprintable, BlueprintType)
class INWORLDRUNTIME_API UInworldNode_LLM : public UInworldNode
{
GENERATED_BODY()
public:
UInworldNode_LLM();
/**
* @brief Native utility function. Creates a new LLM node instance with the specified configuration
* @param Outer The outer object that will own this node
* @param NodeName The name to assign to the node
* @param InExecutionConfig Execution configuration settings for text generation
* @return Newly created LLM node instance
*/
static UInworldNode_LLM* CreateNative(
class UObject* Outer, const FString& NodeName, const FInworldLLMChatNodeExecutionConfig& InExecutionConfig);
/**
* Used to define and manage parameters such as token limits, randomness,
* and other options that influence the behavior of text generation. This
* configuration ensures fine-grained control over the output quality and
* style of generated text by the Large Language Model.
*/
UPROPERTY(EditDefaultsOnly, BlueprintReadWrite, Category = "Inworld")
FInworldLLMChatNodeExecutionConfig ExecutionConfig;
private:
virtual EInworldRuntimeErrors GetJsonConfig(const TSharedRef<FJsonObject>& GraphJson) override;
UFUNCTION()
UPARAM(meta = (DisplayName = "Chat Response"))
FInworldData_LLMChatResponse Process_LLM(const FInworldData_LLMChatRequest& ChatRequest)
{
checkNoEntry();
return {};
}
};
```
## Chat Request and Response Data Flow
The LLM node operates on a simple input/output model using structured chat data:
### Input: FInworldData_LLMChatRequest
Contains the conversation context and response format preferences:
- **Chat Messages**: Array of `FInworldLLMMessage` with role (System/User/Assistant) and content
- **Response Format**: Desired LLM response format (TEXT, JSON, or JSON with schema)
*Note: Generation parameters like token limits and temperature are configured in the node's `ExecutionConfig` property, not in the request data.*
### Output: FInworldData_LLMChatResponse
Contains the generated response with streaming support:
- **Content**: The LLM's generated response text
- **Is Streaming**: Boolean indicating whether this response is part of a streaming response
- **Stream Support**: Inherits from `FInworldData_Stream` allowing iteration through response chunks
## API Reference
### UInworldNode_LLM Methods
#### Constructor
```cpp
UInworldNode_LLM()
```
- **Description:** Default constructor for the LLM node
- **Usage:** Initializes the node with default settings for Large Language Model processing
#### CreateNative
```cpp
static UInworldNode_LLM* CreateNative(
class UObject* Outer, const FString& NodeName, const FInworldLLMChatNodeExecutionConfig& InExecutionConfig);
```
- **Description:** Native utility function to create a new LLM node instance with specified configuration
- **Parameters:**
- `Outer`: The outer object that will own this node
- `NodeName`: The name to assign to the node
- `InExecutionConfig`: Execution configuration settings for text generation
- **Return Value:** Newly created LLM node instance
### UInworldNode_LLM Properties
#### ExecutionConfig
- **Type:** `FInworldLLMChatNodeExecutionConfig`
- **Category:** Inworld
- **Description:** Configuration settings for LLM execution including token limits, temperature, and other generation parameters
- **Usage:** Configure in Blueprint editor to control text generation behavior
---
#### Text Embedding
Source: https://docs.inworld.ai/unreal-engine/runtime/text-embedder
Text embeddings are numerical representations of text data that capture semantic and contextual information about words and phrases. They are useful for a variety of natural language processing (NLP) tasks.
## Overview
In the Unreal SDK we utilize text embeddings for knowledge retrieval and intent detection. In order to optimize the process of embedding text, we provide a tool that can be utilized to embed text directly within the Unreal editor.
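Both use cases boil down to comparing embedding vectors; cosine similarity is the standard measure. A minimal sketch (illustrative; the Runtime computes this internally):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Cosine similarity between two equal-length vectors:
// 1.0 for identical directions, 0.0 for orthogonal ones.
double CosineSimilarity(const std::vector<double>& A, const std::vector<double>& B) {
    double Dot = 0.0, NormA = 0.0, NormB = 0.0;
    for (size_t i = 0; i < A.size(); ++i) {
        Dot += A[i] * B[i];
        NormA += A[i] * A[i];
        NormB += B[i] * B[i];
    }
    return Dot / (std::sqrt(NormA) * std::sqrt(NormB));
}
```

Knowledge retrieval then amounts to picking the record whose embedding has the highest similarity to the query embedding.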
## Inworld Text Embedder tool
To create an Inworld Text Embeddings asset:
1. Open the Inworld Text Embedder through the Level **Tools** menu:
2. Input the text you would like embedded into the large input box of the tool.
#### Input Format
The tool expects input as a comma-separated list of text records:
- Commas are used as a delimiter (e.g. `Text Record 1, Text Record 2`)
- Each text record can optionally be enclosed in quotes (e.g. `"Text Record"`)
- Any comma within quotes will not be used as a delimiter (e.g. `"Text, Record"`)
- The entire input can optionally be wrapped in brackets (e.g. `("Text Record 1", "Text Record 2")`)
```Text Example Input
("What is your favorite color?","Which color do you like the most?","Do you have a preferred color?","Is there a color that you consider your favorite?","What color do you feel most drawn to?")
```
3. Press the `Embed` button to embed the text and store the result in a newly created **Inworld Text Embeddings Asset**.
4. Here you can see the result of our example input:
This asset can now be used with the Knowledge or Intent node.
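The input-format rules above can be expressed as a small parser. This is an illustrative sketch of the parsing rules, not the tool's actual implementation:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits the Text Embedder input format: comma-delimited records,
// optional quotes (commas inside quotes are literal), optional
// surrounding brackets, surrounding whitespace trimmed.
std::vector<std::string> ParseRecords(std::string Input) {
    // Strip optional wrapping brackets.
    if (!Input.empty() && Input.front() == '(' && Input.back() == ')')
        Input = Input.substr(1, Input.size() - 2);

    std::vector<std::string> Records;
    std::string Current;
    bool bInQuotes = false;
    for (char C : Input) {
        if (C == '"') { bInQuotes = !bInQuotes; continue; } // quotes delimit, they are not content
        if (C == ',' && !bInQuotes) { Records.push_back(Current); Current.clear(); continue; }
        Current += C;
    }
    Records.push_back(Current);

    // Trim surrounding whitespace from each record.
    for (std::string& R : Records) {
        const size_t Begin = R.find_first_not_of(" \t");
        const size_t End = R.find_last_not_of(" \t");
        R = (Begin == std::string::npos) ? "" : R.substr(Begin, End - Begin + 1);
    }
    return Records;
}
```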
---
#### Gameplay Debugger
Source: https://docs.inworld.ai/unreal-engine/runtime/gameplay-debugger
Inworld Agent Runtime adds a new category **Inworld Graph** to the Unreal Engine Gameplay Debugger system. See the official Unreal documentation: [Using the Gameplay Debugger in Unreal Engine](https://dev.epicgames.com/documentation/en-us/unreal-engine/using-the-gameplay-debugger-in-unreal-engine).
## Launch the Gameplay Debugger
- Press the apostrophe key `'` to toggle the Gameplay Debugger. This is the Unreal Engine default and can be changed in your input settings.
## What it shows
This debugger mirrors much of the information you can access via **Portal** (see [Portal Traces](/portal/traces)), but it links directly to the actors running the graphs so you can visually inspect what is happening in real time within your game. You can inspect executions on a per-actor/object basis as they run.
## Views
### Startup state
When you first open the debugger, depending on the map you're running, there may be no graph executions yet. In that case, you'll see an empty state like this:
_Startup view — no executions detected._
### Live execution updates
When a graph starts running, the debugger immediately updates with the execution steps of the graph:
_Live execution history._
### Inworld Graph category layout
Graph executions are listed in the left column (chronological, oldest to newest). Details for the selected execution appear in the right column:
_Inworld Graph category UI._
## Controls
- Switch actor: **Shift + Page Up** or **Shift + Page Down**
- Switch execution history: **Page Up** or **Page Down**
- Clear all history: **Backspace**
### Editor utility widget
If you're running in the editor, use the Editor Utility Widget **EUW_HistoryViewer** to browse and inspect graph executions.
- Open via Content Browser: `/Plugins/Inworld Content/Utilities`
_Editor Utility Widget location._
In the widget:
- Left panel: Select an execution history.
- Middle panel: View the node execution history for the selected graph.
- Right panel: Send feedback to Portal for the selected execution to support controlled evolution workflows.
_Editor Utility Widget panels._
---
#### Experiments
Source: https://docs.inworld.ai/unreal-engine/runtime/experiments
## Overview
**Inworld's Unreal Runtime** lets you iterate on prompts, models, custom node/edge parameters, and configs (LLM and TTS) without redeploying code. This guide covers the full workflow: graph creation, Portal setup, and experiment rollout.
## Experiments workflow summary
Design your graph and enable remote config.
Upload JSON configs for the baseline graph and its variants.
Define targeting rules and rollout percentages in the Portal.
Track metrics, then promote the winning variant.
## Step 1 - Create a Graph asset
### 1. Code your solution
In this example we developed a simple Blueprint that instantiates the graph, runs it, and prints the response on screen.
### 2. Design your graph
The sample graph makes an LLM call:
It includes a custom node that joins inputs A and B, prepending "PREFIX" when its `AddPrefix` boolean is enabled:
```
PREFIX --- {A} --- {B}
```
### 3. Enable remote config
Enable remote config inside the graph editor:
Graph executions return results as:
- **AddPrefix is false**: `{Request text} --- {LLM response}`
- **AddPrefix is true**: `PREFIX --- {Request text} --- {LLM response}`
The baseline uses `Ministral-8b-Latest` with `AddPrefix = false`:
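Concretely, the custom node's output shaping can be sketched as a small helper. This is plain illustrative C++, not the actual node implementation; the function name is made up:

```cpp
#include <string>

// Illustrative sketch of the sample custom node's output shaping:
// when addPrefix is enabled, "PREFIX --- " is prepended to the joined inputs.
std::string FormatNodeOutput(bool addPrefix, const std::string& a, const std::string& b)
{
    const std::string joined = a + " --- " + b;
    return addPrefix ? "PREFIX --- " + joined : joined;
}
```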
## Step 2 - Register variants
Registering a variant tells Portal which configuration can participate in Experiments. Start with the baseline, then add variants.
### 1. Register the baseline variant
- **Register Graph**:
- **Copy the Graph ID** from the Graph Editor:
- **Enter or paste the Graph ID** to the Portal:
### 2. Create baseline variant
- **Export baseline JSON config** from the Graph Editor:
- **Click on created Graph** then **Create Variant** in the Portal:
- **Upload baseline JSON config** to the Portal:
### 3. Create GPT-5 variant
- **Export GPT-5 config** from the Graph Editor:
- **Create and upload GPT-5 variant** to the Portal:
### 4. Create GPT-5 with AddPrefix variant
- **Export GPT-5 config** from the Graph Editor:
- **Create and upload GPT-5 with prefix variant** to the Portal
## Step 3 - Start an experiment
Open the **Targeting & Rollout** tab and set default variant:
Changing the default targeting updates the configuration for:
- All currently running graph instances (on their next execution)
- All newly compiled graphs
Targeting changes may take time to propagate.
### Example results
**Baseline execution**:
**Switch targeting to GPT-5**:
Next execution result:
**Switch targeting to GPT-5 with prefix**:
Next execution result:
## Step 4 - Monitor & roll out
Monitor your experiment results and deploy the winner:
- **Watch metrics**: Monitor dashboards, traces, and logs while the experiment runs
- **Gradual rollout**: Increase the winning variant's allocation gradually (50/50 → 70/30 → 90/10), then set it to 100% and retire old rules
- **Rollback**: Roll back or tweak allocations if latency, errors, or business KPIs regress
## How Experiments work
When a request hits your graph, the runtime decides whether to use the local configuration or a remote variant from Experiments:
1. Remote config must be enabled
2. The graph ID must be registered in Experiments and have at least one active rule that returns a variant
```mermaid
flowchart TD
A[Graph Execution Request] --> B{Enable Remote Config == true}
B -->|NO/DEFAULT| C[Static Config]
C --> D[Use local graph configuration]
B -->|YES| E[Remote Config]
E --> F{Experiments returns a variant?}
F -->|NO| G[Use local graph configuration]
F -->|YES| H[Use Experiments variant]
style A fill:#e1f5fe
style D fill:#c8e6c9
style G fill:#c8e6c9
style H fill:#fff3e0
```
**If remote config is enabled**, Experiments evaluates each request as follows:
1. **Local cache check**: If the compiled variant for this user is cached, it executes immediately; otherwise Experiments is queried
2. **Variant fetch**: Experiments evaluates your targeting rules, returns the selected variant, and falls back to the local configuration if no rule applies or the fetch fails
3. **Compile & cache**: The runtime compiles the variant payload, caches it, and executes the graph with the new configuration
## Troubleshooting
### Failed to resolve struct flag
**Error**: `LogInworldRuntime: Failed to resolve struct flag: ....... with targeting key: ....... with error: Flag not found`
If you see this error in the output log, it usually means one of the following:
- Remote Config is enabled, but the graph was not registered in Portal
- The Graph ID registered in Portal does not match the Graph ID in Unreal Engine
### Targeting changes not appearing
**Issue**: I changed targeting and don't see changes on the next executions
Targeting updates can take time to propagate. Wait a few moments and retry; the new variant is applied on the next graph execution once the change has fully propagated.
## Supported via Experiments
### Applied on next executions (no code redeploy)
- Switch LLM/STT/TTS models or providers
- Adjust node configuration (temperature, token limits, prompts)
- Reorder/add/remove nodes while preserving the same inputs/outputs
- Update processing logic (edges, preprocessing steps, data flow)
- Modify custom node parameters
- Modify custom edge conditions parameters
### Requires a code deployment
- Deploying new custom nodes/edges
- Changing processing functions for custom nodes/edges
- Changing the graph's input/output interface
---
#### Inworld Lip-sync Component
Source: https://docs.inworld.ai/unreal-engine/runtime/inworld-lipsync
The Inworld Lip-sync plugin provides a component that lets you detect the visemes being spoken by an Inworld Character.
Lip-sync support is experimental.
## Get Started
Follow the steps below to set up the Inworld Lip-sync component on your character to detect visemes.
Once the Animation Blueprint has been opened, navigate to the **Event Graph**.
We will use this event to initialize the functions necessary for lip-sync.
From the Begin Play node, get the Inworld Lip-sync component from the owning actor, then bind a function to the **On Viseme Update** event on the lip-sync component.
The easiest way to do this is to use the **Create Event** node from the Bind Event node and select the **Create a matching function** option. The image below shows what your graph should look like.
For the purposes of this example, we are naming the function **UpdateViseme**.
In the function you will have an input variable called **Viseme Blend** — this variable is what we will use to determine the viseme the character is currently speaking.
To get the current dominant viseme, you can break the Viseme Blend variable and get the largest float out of the structure. Below you will find a simple example of how this can be done.
Next, we want to convert this value into an Inworld Viseme enumeration that we can use to adjust the animations.
First, convert the max value to a byte.
Then convert the byte to an Inworld Viseme enumeration.
You can then store that into a local variable.
Once you have the current viseme, you can use a simple Select node to determine which animations to play. For this example we use simple animation assets, but you can use this viseme to drive whatever lip-sync behavior you need.
To set up the Select node, simply create it from the Current Viseme variable, then right-click and use the **Change Pin Type** action to switch the pin to whatever asset type you want to use for your lip-sync animations.
You are now ready to apply lip-sync animations to your character!
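The "largest float out of the structure" step is just an argmax over the blend weights. Here is an illustrative plain C++ version, with a float array standing in for the Viseme Blend struct; the Blueprint nodes above do the same thing visually:

```cpp
#include <cstddef>
#include <vector>

// Returns the index of the dominant viseme (largest blend weight),
// or -1 for an empty blend. In the Blueprint graph this index is then
// converted to a byte and cast to the Inworld Viseme enumeration.
int DominantVisemeIndex(const std::vector<float>& blendWeights)
{
    if (blendWeights.empty())
        return -1;
    std::size_t best = 0;
    for (std::size_t i = 1; i < blendWeights.size(); ++i)
        if (blendWeights[i] > blendWeights[best])
            best = i;
    return static_cast<int>(best);
}
```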
---
#### SDK Reference > Runtime
#### Overview
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/overview
## Classes
- [Inworld Audio Buffer](./InworldAudioBuffer)
- [Inworld Audio Capture](./InworldAudioCapture)
- [Inworld Edge](./InworldEdge/InworldEdge)
- [Inworld Edge Is Data Type](./InworldEdge/InworldEdge_IsDataType)
- [Inworld Edge Safety Result](./InworldEdge/InworldEdge_SafetyResult)
- [Inworld Edge With Condition](./InworldEdge/InworldEdge_WithCondition)
- [Inworld Graph](./InworldGraph)
- [Inworld Graph Component](./InworldGraphComponent)
- [Inworld Graph Runtime Data](./InworldGraphRuntimeData/InworldGraphRuntimeData)
- [Inworld Graph Runtime Data Event History](./InworldGraphRuntimeData/InworldGraphRuntimeData_EventHistory)
- [Inworld Graph Runtime Data Intent](./InworldGraphRuntimeData/InworldGraphRuntimeData_Intent)
- [Inworld Graph Runtime Data Knowledge](./InworldGraphRuntimeData/InworldGraphRuntimeData_Knowledge)
- [Inworld Graph Runtime Data Voice](./InworldGraphRuntimeData/InworldGraphRuntimeData_Voice)
- [Inworld Node](./InworldNode/InworldNode)
- [Inworld Node Custom](./InworldNode/InworldNode_Custom)
- [Inworld Node General Text Processor](./InworldNode/InworldNode_GeneralTextProcessor)
- [Inworld Node Keyword Matcher](./InworldNode/InworldNode_KeywordMatcher)
- [Inworld Node LLM](./InworldNode/InworldNode_LLM)
- [Inworld Node Random Canned Text](./InworldNode/InworldNode_RandomCannedText)
- [Inworld Node STT](./InworldNode/InworldNode_STT)
- [Inworld Node Subgraph Node](./InworldNode/InworldNode_SubgraphNode)
- [Inworld Node TTS](./InworldNode/InworldNode_TTS)
- [Inworld Node Text Aggregator](./InworldNode/InworldNode_TextAggregator)
- [Inworld Node Text Chunking](./InworldNode/InworldNode_TextChunking)
- [Inworld Node Text Classifier](./InworldNode/InworldNode_TextClassifier)
- [Inworld Process Context](./InworldProcessContext)
- [Inworld Agent Runtime Subsystem](./InworldRuntimeSubsystem)
- [Inworld Text Embedder](./InworldTextEmbedder)
- [Inworld Voice Audio Component](./InworldVoiceAudioComponent)
## Blueprint Function Libraries
- [Inworld Blueprint Function Library](./InworldBlueprintFunctionLibrary)
## Assets
- [Inworld Graph Asset](./InworldGraphAsset)
## Class Inheritance Hierarchy
This diagram shows the inheritance relationships between all documented classes:
```
Actor Component
└── Inworld Graph Component
Audio Component
└── Inworld Voice Audio Component
Blueprint Function Library
└── Inworld Blueprint Function Library
Engine Subsystem
└── Inworld Agent Runtime Subsystem
Inworld Edge
└── Inworld Edge With Condition
Inworld Edge With Condition
├── Inworld Edge Is Data Type
└── Inworld Edge Safety Result
Inworld Graph Runtime Data
├── Inworld Graph Runtime Data Event History
├── Inworld Graph Runtime Data Intent
├── Inworld Graph Runtime Data Knowledge
└── Inworld Graph Runtime Data Voice
Inworld Node
├── Inworld Node Custom
├── Inworld Node General Text Processor
├── Inworld Node Keyword Matcher
├── Inworld Node LLM
├── Inworld Node Random Canned Text
├── Inworld Node STT
├── Inworld Node Subgraph Node
├── Inworld Node TTS
├── Inworld Node Text Aggregator
├── Inworld Node Text Chunking
└── Inworld Node Text Classifier
Object
├── Inworld Audio Buffer
├── Inworld Audio Capture
├── Inworld Edge
├── Inworld Graph
├── Inworld Graph Asset
├── Inworld Graph Runtime Data
├── Inworld Node
├── Inworld Process Context
└── Inworld Text Embedder
```
## Node Input/Output Reference
This table provides a quick reference for all workflow nodes and their input/output data types:
| Node Name | Input Types | Output Types |
|-----------|-------------|--------------|
| **Core LLM Nodes** |
| [Inworld Node LLM](./InworldNode/InworldNode_LLM) | `FInworldData_LLMChatRequest` | `FInworldData_LLMChatResponse` |
| **Audio Processing** |
| [Inworld Node STT](./InworldNode/InworldNode_STT) | `FInworldData_Audio OR FInworldData_DataStream_AudioChunk (Audio data to convert to text)` | `FInworldData_Text` |
| [Inworld Node TTS](./InworldNode/InworldNode_TTS) | `FInworldData_Text OR FInworldData_DataStream_String (Text data to convert to speech)`
`FInworldData_Text (Optional emotion text to influence speech synthesis)` | `FInworldData_DataStream_TTSOutput` |
| **Text Processing** |
| [Inworld Node General Text Processor](./InworldNode/InworldNode_GeneralTextProcessor) | `FInworldData_DataStream_String` | `FInworldData_DataStream_String` |
| [Inworld Node Random Canned Text](./InworldNode/InworldNode_RandomCannedText) | `FInworldData` (anything) | `FInworldData_Text (One of the configured canned responses)` |
| [Inworld Node Text Aggregator](./InworldNode/InworldNode_TextAggregator) | `FInworldData_DataStream_String` | `FInworldData_Text` |
| [Inworld Node Text Chunking](./InworldNode/InworldNode_TextChunking) | `FInworldData_Text OR FInworldData_DataStream_String` | `FInworldData_DataStream_String` |
| [Inworld Node Text Classifier](./InworldNode/InworldNode_TextClassifier) | `FInworldData_Text` | `FInworldData_ClassificationResult` |
---
#### SDK Reference > Runtime > Classes
#### Inworld Audio Buffer
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldAudioBuffer
[Overview](./overview) > Inworld Audio Buffer
**Class:** `UInworldAudioBuffer` | **Inherits from:** `UObject`
Audio buffer for collecting and managing voice recording data. This class provides functionality for buffering audio chunks during voice recording, allowing for efficient collection and processing of audio data before sending it for further processing. It manages sample rate consistency, data accumulation, and provides thread-safe operations for audio data handling.
Key features:
- Thread-safe audio data collection
- Sample rate validation and consistency checking
- Minimum duration enforcement before flushing
- Event-driven architecture with flush notifications
## Methods
- [Add](#add)
- [CreateAudioBuffer](#createaudiobuffer)
- [Flush](#flush)
## Reference
### Add
Adds an audio chunk to the audio buffer.
This function appends the provided audio data to the buffer.
Ensures that the incoming audio chunk matches the buffer's sample rate and channel configuration.
If the chunk is incompatible, the function will abort.
## Examples
```c++
void Add(const FInworldData_Audio& AudioChunk)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| AudioChunk | `const FInworldData_Audio&` | The audio data to add to the buffer. Must match the buffer's sample rate and channels. |
---
### CreateAudioBuffer
Creates a new audio buffer to record voice.
This function initializes an instance of UInworldAudioBuffer using the provided configuration.
It allows customization of the sample rate and channels.
## Examples
```c++
UInworldAudioBuffer* CreateAudioBuffer(const FInworldAudioBufferConfig& Config)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Config | `const FInworldAudioBufferConfig&` | The configuration struct containing sample rate and channels. |
#### Returns
**Type:** `UInworldAudioBuffer*`
**Description:** A pointer to the newly created UInworldAudioBuffer instance.
---
### Flush
Flushes the collected audio data from the buffer. If the buffer contains audio data, it constructs an FInworldData_Audio and broadcasts it via the OnAudioBufferFlush delegate. After broadcasting, the buffer is cleared. If the buffer is empty, this function does nothing. Events: OnAudioBufferFlush
## Examples
```c++
FInworldData_Audio Flush()
```
#### Returns
**Type:** `FInworldData_Audio`
---
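Putting the three methods together, here is a minimal usage sketch. This is Unreal C++ for illustration only: the config field names and the delegate binding style are assumptions, so adjust them to the actual types.

```c++
// Sketch: create a buffer, feed it audio chunks, then flush.
FInworldAudioBufferConfig Config;       // sample rate / channel fields assumed
UInworldAudioBuffer* Buffer = UInworldAudioBuffer::CreateAudioBuffer(Config);

// Flush() broadcasts collected audio via OnAudioBufferFlush.
Buffer->OnAudioBufferFlush.AddDynamic(this, &UMyRecorder::HandleFlushedAudio);

// Chunks with a mismatched sample rate or channel count are rejected.
Buffer->Add(IncomingChunk);

// Emit the accumulated FInworldData_Audio and clear the buffer.
Buffer->Flush();
```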
---
#### Inworld Audio Capture
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldAudioCapture
[Overview](./overview) > Inworld Audio Capture
**Class:** `UInworldAudioCapture` | **Inherits from:** `UObject`
High-level audio capture system with advanced processing capabilities. This class provides comprehensive audio capture functionality with integrated AEC (Acoustic Echo Cancellation) and VAD (Voice Activity Detection) processing. It manages both input and output audio streams, applies real-time filtering, and provides event-driven notifications for captured audio data.
Key features:
- Dual audio stream capture (input and output for AEC)
- Real-time AEC and VAD processing
- Configurable capture rates and processing settings
- Thread-safe audio buffer management
- Event-driven architecture with capture notifications
- Asynchronous processing pipeline for optimal performance
## Methods
- [CreateInworldAudioCapture](#createinworldaudiocapture)
- [StartCapture](#startcapture)
- [StopCapture](#stopcapture)
## Reference
### CreateInworldAudioCapture
Creates an instance of UInworldAudioCapture for capturing audio input.
This function initializes a new audio capture object, which can be used to start and stop audio recording.
## Examples
```c++
void CreateInworldAudioCapture(
UObject* WorldContextObject,
const FInworldAudioCaptureConfig& Config,
FOnInworldAudioCaptureCreated Callback
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| WorldContextObject | `UObject*` | |
| Config | `const FInworldAudioCaptureConfig&` | |
| Callback | `FOnInworldAudioCaptureCreated` | |
---
### StartCapture
Starts capturing audio from the selected input device. This function begins recording audio data from the active capture device. The captured audio is processed and can be retrieved using OnAudioCapture event callbacks.
## Examples
```c++
void StartCapture()
```
---
### StopCapture
Stops capturing audio. This function halts the audio capture process and ensures that no further data is recorded. Any buffered audio data is discarded, and no further events will be broadcast.
## Examples
```c++
void StopCapture()
```
---
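A minimal capture lifecycle might look like the sketch below. The callback's parameter list and the delegate binding style are assumptions; the three documented calls are the only confirmed API here.

```c++
// Sketch: create a capture object, start recording, stop later.
FInworldAudioCaptureConfig Config; // capture rate / AEC / VAD settings assumed

FOnInworldAudioCaptureCreated OnCreated;
OnCreated.BindDynamic(this, &UMyVoiceComponent::HandleCaptureCreated);
UInworldAudioCapture::CreateInworldAudioCapture(this, Config, OnCreated);

// Assumed callback signature: keep the object and begin recording.
void UMyVoiceComponent::HandleCaptureCreated(UInworldAudioCapture* Capture)
{
    ActiveCapture = Capture;
    ActiveCapture->StartCapture(); // audio arrives via OnAudioCapture events
}

// Later, e.g. on push-to-talk release; buffered data is discarded.
if (ActiveCapture)
{
    ActiveCapture->StopCapture();
}
```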
---
#### SDK Reference > Runtime > Classes > Inworld Edge
#### Inworld Edge
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldEdge/InworldEdge
[Overview](../overview) > Inworld Edge
**Class:** `UInworldEdge` | **Inherits from:** `UObject`
Represents a directed connection between nodes in an InworldGraph. An edge defines a relationship between two nodes (Source and Destination) in the graph, controlling flow and transitions between different states or operations. Each edge can be marked as required or optional, and can form loops in the graph structure.
## Methods
- [DestroyEdge](#destroyedge)
- [InitializeEdge](#initializeedge)
## Reference
### DestroyEdge
## Examples
```c++
void DestroyEdge()
```
---
### InitializeEdge
## Examples
```c++
void InitializeEdge()
```
---
## Edge Implementations
The following specialized edge types extend this base class:
- [Inworld Edge Is Data Type](./InworldEdge_IsDataType) - Represents an edge with a specific data type as its evaluation condition. This class allows for the creation and evaluation of edges in a graph with a condition that checks if the input data matches a specified data type. It extends UInworldEdge_WithCondition to provide additional functionality related to data type checking. This edge can be created and configured at runtime or within the editor. Users can override its behavior by modifying the MeetsCondition method in derived classes.
- [Inworld Edge Safety Result](./InworldEdge_SafetyResult) - An edge that controls data flow based on content safety evaluation results This edge class evaluates safety check results to determine if data should flow through the graph. It can be configured to either allow only safe content or only flagged content to pass through, making it useful for implementing content filtering and safety workflows.
- [Inworld Edge With Condition](./InworldEdge_WithCondition) - Base class for creating custom edge conditions in the InworldGraph This abstract class allows developers to implement custom conditional logic for graph edges through Blueprint or C++ implementations. Custom edges can evaluate input data to determine if data flow should be allowed through the edge.
---
#### Inworld Edge Is Data Type
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldEdge/InworldEdge_IsDataType
[Overview](../overview) > [Inworld Edge](./InworldEdge) > Inworld Edge Is Data Type
**Class:** `UInworldEdge_IsDataType` | **Inherits from:** `UInworldEdge_WithCondition`
Represents an edge with a specific data type as its evaluation condition. This class allows for the creation and evaluation of edges in a graph with a condition that checks if the input data matches a specified data type. It extends UInworldEdge_WithCondition to provide additional functionality related to data type checking. This edge can be created and configured at runtime or within the editor. Users can override its behavior by modifying the MeetsCondition method in derived classes.
---
#### Inworld Edge Safety Result
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldEdge/InworldEdge_SafetyResult
[Overview](../overview) > [Inworld Edge](./InworldEdge) > Inworld Edge Safety Result
**Class:** `UInworldEdge_SafetyResult` | **Inherits from:** `UInworldEdge_WithCondition`
An edge that controls data flow based on content safety evaluation results This edge class evaluates safety check results to determine if data should flow through the graph. It can be configured to either allow only safe content or only flagged content to pass through, making it useful for implementing content filtering and safety workflows.
---
#### Inworld Edge With Condition
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldEdge/InworldEdge_WithCondition
[Overview](../overview) > [Inworld Edge](./InworldEdge) > Inworld Edge With Condition
**Class:** `UInworldEdge_WithCondition` | **Inherits from:** `UInworldEdge`
Base class for creating custom edge conditions in the InworldGraph This abstract class allows developers to implement custom conditional logic for graph edges through Blueprint or C++ implementations. Custom edges can evaluate input data to determine if data flow should be allowed through the edge.
Key features:
- Blueprintable: Can be extended in Blueprints to create custom edge conditions
- Configurable execution thread: Condition evaluation can occur in game thread or background thread
- Custom condition logic: Evaluates FInworldDataHandle inputs for flexible condition checking
To implement a custom edge:
1. Create a Blueprint or C++ class inheriting from UInworldEdge_WithCondition
2. Implement the MeetsCondition function to define custom condition logic
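For illustration, a hypothetical C++ edge following those two steps might look like this. The `MeetsCondition` signature and the text accessor are assumptions based on the `FInworldDataHandle` inputs mentioned above:

```c++
// Hypothetical condition edge: only passes non-empty text through.
UCLASS()
class UMyNonEmptyTextEdge : public UInworldEdge_WithCondition
{
    GENERATED_BODY()

public:
    // Assumed override signature; evaluates the incoming data handle.
    virtual bool MeetsCondition(const FInworldDataHandle& Input) override
    {
        FString Text;
        // TryGetTextFromHandle is a made-up helper standing in for the
        // actual data-handle accessor in the Runtime SDK.
        return TryGetTextFromHandle(Input, Text) && !Text.IsEmpty();
    }
};
```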
---
#### SDK Reference > Runtime > Classes
#### Inworld Graph
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraph
[Overview](./overview) > Inworld Graph
**Class:** `UInworldGraph` | **Inherits from:** `UObject`
Represents a blueprintable and programmable runtime graph. Provides functionality for creating, compiling, and executing graphs asynchronously using callbacks. The graph can manage nodes, edges, and execution states for real-time execution purposes.
## Methods
- [CancelAllExecutions](#cancelallexecutions)
- [CancelExecution](#cancelexecution)
- [CheckNodeByName](#checknodebyname)
- [GetAllExecutions](#getallexecutions)
- [GetExecutionInfo](#getexecutioninfo)
- [GetExecutionMode](#getexecutionmode)
- [GetExecutionStatus](#getexecutionstatus)
- [GetGraphInstance](#getgraphinstance)
- [GetGraphInstance](#getgraphinstance)
- [GetNodeByName](#getnodebyname)
- [GetNodeByName](#getnodebyname)
- [GetNodeByName](#getnodebyname)
- [GetPendingExecutionsCount](#getpendingexecutionscount)
- [GetRunningExecutionsCount](#getrunningexecutionscount)
- [IsCompiled](#iscompiled)
- [IsExecutionCanceled](#isexecutioncanceled)
- [Lock](#lock)
- [SetExecutionMode](#setexecutionmode)
- [SetGraphId](#setgraphid)
## Reference
### CancelAllExecutions
Cancels all ongoing and pending executions
## Examples
```c++
void CancelAllExecutions()
```
---
### CancelExecution
Cancels an ongoing graph execution
## Examples
```c++
void CancelExecution(const FString& ExecutionId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ExecutionId | `const FString&` | The identifier of the execution to cancel |
---
### CheckNodeByName
Utility function for determining if a node exists in the graph
## Examples
```c++
bool CheckNodeByName()
```
#### Returns
**Type:** `bool`
**Description:** true if a node with the name exists, false otherwise
---
### GetAllExecutions
Gets all current executions (running and pending)
## Examples
```c++
TArray<FInworldGraphExecution> GetAllExecutions()
```
#### Returns
**Type:** `TArray<FInworldGraphExecution>`
**Description:** Array of all current executions
---
### GetExecutionInfo
Gets information about a specific execution
## Examples
```c++
FInworldGraphExecution GetExecutionInfo(const FString& ExecutionId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ExecutionId | `const FString&` | The identifier of the execution |
#### Returns
**Type:** `FInworldGraphExecution`
**Description:** The execution information, or empty struct if not found
---
### GetExecutionMode
Gets the current execution mode
## Examples
```c++
EInworldGraphExecutionMode GetExecutionMode()
```
#### Returns
**Type:** `EInworldGraphExecutionMode`
**Description:** The current execution mode
---
### GetExecutionStatus
Gets the status of a specific execution
## Examples
```c++
EInworldGraphExecutionStatus GetExecutionStatus(const FString& ExecutionId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ExecutionId | `const FString&` | The identifier of the execution |
#### Returns
**Type:** `EInworldGraphExecutionStatus`
**Description:** The current status of the execution
---
### GetGraphInstance
Creates and compiles an instance of the given InworldGraph asset.
## Examples
```c++
void GetGraphInstance(FOnGraphCompiled Callback)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Callback | `FOnGraphCompiled` | Called when the graph initialization is finished |
---
### GetGraphInstance
## Examples
```c++
void GetGraphInstance()
```
---
### GetNodeByName
Utility function for grabbing a member node by its unique id
## Examples
```c++
UInworldNode* GetNodeByName(
const FString& NodeName,
TSubclassOf<UInworldNode> NodeClass
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| NodeName | `const FString&` | unique id of the node to look for |
| NodeClass | `TSubclassOf<UInworldNode>` | The class to convert the node to on return |
#### Returns
**Type:** `UInworldNode*`
**Description:** Pointer to the node with the given name, or nullptr if not found
---
### GetNodeByName
## Examples
```c++
UInworldNode* GetNodeByName()
```
#### Returns
**Type:** `UInworldNode*`
---
### GetNodeByName
## Examples
```c++
UInworldNode* GetNodeByName()
```
#### Returns
**Type:** `UInworldNode*`
---
### GetPendingExecutionsCount
Gets the number of pending executions
## Examples
```c++
int32 GetPendingExecutionsCount()
```
#### Returns
**Type:** `int32`
**Description:** Number of pending executions
---
### GetRunningExecutionsCount
Gets the number of currently running executions
## Examples
```c++
int32 GetRunningExecutionsCount()
```
#### Returns
**Type:** `int32`
**Description:** Number of running executions
---
### IsCompiled
Checks if the graph has been compiled
## Examples
```c++
bool IsCompiled()
```
#### Returns
**Type:** `bool`
**Description:** True if the graph is compiled, false otherwise
---
### IsExecutionCanceled
Checks if an execution is canceled
## Examples
```c++
bool IsExecutionCanceled(const FString& ExecutionId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ExecutionId | `const FString&` | The identifier of the execution to check |
#### Returns
**Type:** `bool`
**Description:** True if the execution is canceled, false otherwise
---
### Lock
## Examples
```c++
FScopeLock Lock()
```
#### Returns
**Type:** `FScopeLock`
---
### SetExecutionMode
Sets the execution mode for the graph
## Examples
```c++
void SetExecutionMode(EInworldGraphExecutionMode Mode)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Mode | `EInworldGraphExecutionMode` | The execution mode to set |
---
### SetGraphId
Utility function for changing the graph's string identifier
## Examples
```c++
void SetGraphId()
```
---
---
#### Inworld Graph Component
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphComponent
[Overview](./overview) > Inworld Graph Component
**Class:** `UInworldGraphComponent` | **Inherits from:** `UActorComponent`
Actor component for managing Inworld graphs within the game world. This abstract component provides the foundation for integrating Inworld graph functionality into actors. It manages graph creation, compilation, and execution while providing delegates for monitoring graph events and results.
## Methods
- [CreateGraph](#creategraph)
- [ExecuteGraph](#executegraph)
## Reference
### CreateGraph
Creates the graph from the associated GraphAsset.
## Examples
```c++
void CreateGraph()
```
---
### ExecuteGraph
Executes the graph with the provided input data and runtime data.
## Examples
```c++
UPARAM(meta = (DisplayName = "Execution Id")) FString ExecuteGraph(
const FInworldDataHandle& InputData,
const TMap<FName, UInworldGraphRuntimeData*>& RuntimeData
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InputData | `const FInworldDataHandle&` | The input data to pass to the graph. |
| RuntimeData | `const TMap<FName, UInworldGraphRuntimeData*>&` | Runtime data map accessible by nodes during execution. |
#### Returns
**Type:** `UPARAM(meta = (DisplayName = "Execution Id")) FString`
---
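A typical call sequence on an actor that owns this component is sketched below. The runtime-data map's key and value types are assumptions, and input construction is omitted:

```c++
// Sketch (from within an actor): compile the component's GraphAsset, then execute it.
UInworldGraphComponent* GraphComp = FindComponentByClass<UInworldGraphComponent>();
GraphComp->CreateGraph(); // builds the graph from the associated GraphAsset

FInworldDataHandle Input;                           // construction omitted
TMap<FName, UInworldGraphRuntimeData*> RuntimeData; // key/value types assumed

// ExecuteGraph returns an execution id that can be used with the graph's
// GetExecutionStatus / CancelExecution utilities.
const FString ExecutionId = GraphComp->ExecuteGraph(Input, RuntimeData);
```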
---
#### SDK Reference > Runtime > Classes > Inworld Graph RuntimeData
#### Inworld Graph Runtime Data
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphRuntimeData/InworldGraphRuntimeData
[Overview](../overview) > Inworld Graph Runtime Data
**Class:** `UInworldGraphRuntimeData` | **Inherits from:** `UObject`
Abstract base class for data the graph can access at runtime
## Methods
- [GetData](#getdata)
- [GetKey](#getkey)
## Reference
### GetData
Retrieves the actual data wrapped in an InworldDataHandle.
## Examples
```c++
virtual FInworldDataHandle GetData()
```
#### Returns
**Type:** `virtual FInworldDataHandle`
**Description:** The data handle containing the runtime data
---
### GetKey
Retrieves the key identifier for this runtime data.
## Examples
```c++
FName GetKey()
```
#### Returns
**Type:** `FName`
**Description:** The name key used to identify this runtime data in the graph
---
## Runtime Data Implementations
The following specialized runtime data types extend this base class:
- [Inworld Graph Runtime Data Event History](./InworldGraphRuntimeData_EventHistory) - Runtime data class for managing event history in Inworld graphs. This class creates and manages an event history system that tracks various types of events (primarily speech events) during graph execution. It provides functionality for adding, retrieving, and formatting event history with configurable capacity and formatting options.
- [Inworld Graph Runtime Data Intent](./InworldGraphRuntimeData_Intent) - Runtime data class for managing intent recognition in Inworld graphs. This class compiles and stores all necessary data for intent matching operations, including text embeddings for intent recognition, enable/disable states, and asset management. It supports dynamic adding, removing, and toggling of intents during graph execution.
- [Inworld Graph Runtime Data Knowledge](./InworldGraphRuntimeData_Knowledge) - Runtime data class for managing knowledge retrieval in Inworld graphs. This class manages knowledge sets used for information retrieval and question-answering operations within Inworld graphs. It supports dynamic adding, removing, enabling, and disabling of knowledge sets, with text embeddings used for semantic search and retrieval.
- [Inworld Graph Runtime Data Voice](./InworldGraphRuntimeData_Voice) - Runtime data class for managing voice configuration in Inworld graphs. This class stores and manages voice properties used for text-to-speech operations within Inworld graphs. It provides simple get/set functionality for voice configuration including voice ID and language settings.
---
#### Inworld Graph Runtime Data Event History
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphRuntimeData/InworldGraphRuntimeData_EventHistory
[Overview](../overview) > [Inworld Graph Runtime Data](./InworldGraphRuntimeData) > Inworld Graph Runtime Data Event History
**Class:** `UInworldGraphRuntimeData_EventHistory` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data class for managing event history in Inworld graphs. This class creates and manages an event history system that tracks various types of events (primarily speech events) during graph execution. It provides functionality for adding, retrieving, and formatting event history with configurable capacity and formatting options.
## Methods
- [AddGenericEvent](#addgenericevent)
- [AddSpeechEvent](#addspeechevent)
- [Clear](#clear)
- [DebugDumpEventHistory](#debugdumpeventhistory)
- [GetHistoryEventData](#gethistoryeventdata)
- [GetHistoryString](#gethistorystring)
- [GetSpeechEvents](#getspeechevents)
- [Set](#set)
## Reference
### AddGenericEvent
Adds a generic event to the event history.
This function stores the provided text event and appends it to the history.
If the history reaches its capacity, the oldest event is removed.
## Examples
```c++
void AddGenericEvent(const FString& Event)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Event | `const FString&` | The text event to be added to the history. |
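The capacity-bounded behavior described above (oldest event evicted when full) can be sketched in standard C++. This is an illustrative model only: `std::string` and `std::deque` stand in for the engine types, and the method subset and capacity value are assumptions, not the SDK implementation.

```cpp
#include <deque>
#include <string>

// Illustrative sketch of a capacity-bounded event history.
class EventHistory {
public:
    explicit EventHistory(size_t Capacity) : Capacity(Capacity) {}

    // Mirrors AddGenericEvent: append, evicting the oldest entry at capacity.
    void AddGenericEvent(const std::string& Event) {
        if (Events.size() == Capacity) {
            Events.pop_front();  // the oldest event is removed
        }
        Events.push_back(Event);
    }

    // Mirrors GetHistoryString: concatenated representation of all events.
    std::string GetHistoryString() const {
        std::string Result;
        for (const std::string& Event : Events) {
            Result += Event + "\n";
        }
        return Result;
    }

    size_t Num() const { return Events.size(); }

private:
    std::deque<std::string> Events;
    size_t Capacity;
};
```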
---
### AddSpeechEvent
Adds a speech event to the event history.
This function records a speech event, storing the agent's name and utterance.
The event is formatted based on the SpeechEventFormatter and appended to the history.
If the history reaches its capacity, the oldest event is removed.
## Examples
```c++
void AddSpeechEvent(const FInworldEventSpeech& Speech)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Speech | `const FInworldEventSpeech&` | The speech event data containing the agent's name and utterance. |
---
### Clear
Clear the event history. This function removes all stored events and resets the history string.
## Examples
```c++
void Clear()
```
---
### DebugDumpEventHistory
Outputs debug information about the event history to the console. This function prints detailed information about the current event history state, including all stored events, for debugging and development purposes.
## Examples
```c++
void DebugDumpEventHistory()
```
---
### GetHistoryEventData
Retrieves the speech event history as structured data.
## Examples
```c++
FInworldHistoryEventData GetHistoryEventData()
```
#### Returns
**Type:** `FInworldHistoryEventData`
**Description:** Structured event history data containing all speech events
---
### GetHistoryString
Retrieves the full event history as a formatted string.
This function returns a concatenated string representation of all stored events.
## Examples
```c++
const FString& GetHistoryString()
```
#### Returns
**Type:** `const FString&`
**Description:** A reference to the history string containing all recorded events.
---
### GetSpeechEvents
Retrieves the speech event history.
This function returns a representation of all stored speech events.
## Examples
```c++
const TArray<FInworldEventSpeech>& GetSpeechEvents()
```
#### Returns
**Type:** `const TArray<FInworldEventSpeech>&`
**Description:** A reference to the history containing all recorded speech events.
---
### Set
Sets the initial event history data table.
## Examples
```c++
void Set(UDataTable* InEventHistoryTable)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InEventHistoryTable | `UDataTable*` | The data table containing initial event history entries |
---
---
#### Inworld Graph Runtime Data Intent
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphRuntimeData/InworldGraphRuntimeData_Intent
[Overview](../overview) > [Inworld Graph Runtime Data](./InworldGraphRuntimeData) > Inworld Graph Runtime Data Intent
**Class:** `UInworldGraphRuntimeData_Intent` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data class for managing intent recognition in Inworld graphs. This class compiles and stores all necessary data for intent matching operations, including text embeddings for intent recognition, enable/disable states, and asset management. It supports dynamic adding, removing, and toggling of intents during graph execution.
## Methods
- [AddIntentTextEmbeddingAsset](#addintenttextembeddingasset)
- [AddIntentTextEmbeddings](#addintenttextembeddings)
- [DisableIntent](#disableintent)
- [EnableIntent](#enableintent)
- [GetIntentMapKeys](#getintentmapkeys)
- [RemoveIntent](#removeintent)
- [Set](#set)
## Reference
### AddIntentTextEmbeddingAsset
Adds a new intent to the runtime data, with an option to replace an existing set.
## Examples
```c++
void AddIntentTextEmbeddingAsset(
FName Id,
const UInworldTextEmbeddingAsset* Intent,
bool Replace
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the intent set to be added. |
| Intent | `const UInworldTextEmbeddingAsset*` | The intent to be added. |
| Replace | `bool` | If true, replaces an existing intent with the same ID. Defaults to false. |
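The `Replace` semantics shared by the `Add*` methods can be sketched with a plain map. This is illustrative only: std types stand in for `FName` and the asset type, and the `bool` return is an addition for testability — the SDK method returns `void`.

```cpp
#include <map>
#include <string>

// Sketch of the add-with-Replace contract (std types are stand-ins).
using IntentMap = std::map<std::string, std::string>;

// Returns true if the entry was stored; an existing Id is only
// overwritten when Replace is true.
bool AddIntent(IntentMap& Intents, const std::string& Id,
               const std::string& Intent, bool Replace = false) {
    auto It = Intents.find(Id);
    if (It != Intents.end()) {
        if (!Replace) {
            return false;  // keep the existing intent set
        }
        It->second = Intent;
        return true;
    }
    Intents.emplace(Id, Intent);
    return true;
}
```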
---
### AddIntentTextEmbeddings
Adds a new intent to the runtime data, with an option to replace an existing set.
## Examples
```c++
void AddIntentTextEmbeddings(
FName Id,
const TArray& Intent,
bool Replace
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the intent set to be added. |
| Intent | `const TArray&` | The intent to be added. |
| Replace | `bool` | If true, replaces an existing intent with the same ID. Defaults to false. |
---
### DisableIntent
Disables a specific intent within the runtime data.
## Examples
```c++
void DisableIntent(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier for the intent to be disabled. |
---
### EnableIntent
Enables a specific intent within the runtime data.
## Examples
```c++
void EnableIntent(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier for the intent to be enabled. |
---
### GetIntentMapKeys
Retrieves all intent set IDs currently stored in the runtime data.
## Examples
```c++
TArray<FName> GetIntentMapKeys()
```
#### Returns
**Type:** `TArray<FName>`
---
### RemoveIntent
Removes a specific intent identified by its unique ID.
## Examples
```c++
void RemoveIntent(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the intent to be removed. |
---
### Set
Sets the initial intent assets for the runtime data.
## Examples
```c++
void Set(const TMap& IntentAssets)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| IntentAssets | `const TMap&` | Map of intent identifiers to their corresponding text embedding assets |
---
---
#### Inworld Graph Runtime Data Knowledge
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphRuntimeData/InworldGraphRuntimeData_Knowledge
[Overview](../overview) > [Inworld Graph Runtime Data](./InworldGraphRuntimeData) > Inworld Graph Runtime Data Knowledge
**Class:** `UInworldGraphRuntimeData_Knowledge` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data class for managing knowledge retrieval in Inworld graphs. This class manages knowledge sets used for information retrieval and question-answering operations within Inworld graphs. It supports dynamic adding, removing, enabling, and disabling of knowledge sets, with text embeddings used for semantic search and retrieval.
## Methods
- [AddKnowledgeTextEmbeddingAsset](#addknowledgetextembeddingasset)
- [AddKnowledgeTextEmbeddings](#addknowledgetextembeddings)
- [DisableKnowledge](#disableknowledge)
- [EnableKnowledge](#enableknowledge)
- [GetKnowledgeMapKeys](#getknowledgemapkeys)
- [RemoveKnowledge](#removeknowledge)
- [Set](#set)
## Reference
### AddKnowledgeTextEmbeddingAsset
Adds a new knowledge set to the runtime data, with an option to replace an existing set.
## Examples
```c++
void AddKnowledgeTextEmbeddingAsset(
FName Id,
const UInworldTextEmbeddingAsset* Knowledge,
bool Replace
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the knowledge set to be added. |
| Knowledge | `const UInworldTextEmbeddingAsset*` | The TextEmbeddings Asset to be added. |
| Replace | `bool` | If true, replaces an existing knowledge set with the same ID. Defaults to false. |
---
### AddKnowledgeTextEmbeddings
Adds a new knowledge set to the runtime data, with an option to replace an existing set.
## Examples
```c++
void AddKnowledgeTextEmbeddings(
FName Id,
const TArray& Knowledge,
bool Replace
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the knowledge set to be added. |
| Knowledge | `const TArray&` | The TextEmbeddings to be added. |
| Replace | `bool` | If true, replaces an existing knowledge set with the same ID. Defaults to false. |
---
### DisableKnowledge
Disables a specific knowledge set within the runtime data.
## Examples
```c++
void DisableKnowledge(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier for the knowledge set to be disabled. |
---
### EnableKnowledge
Enables a specific knowledge set within the runtime data.
## Examples
```c++
void EnableKnowledge(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier for the knowledge set to be enabled. |
---
### GetKnowledgeMapKeys
Retrieves all knowledge set IDs currently stored in the runtime data.
## Examples
```c++
TArray<FName> GetKnowledgeMapKeys()
```
#### Returns
**Type:** `TArray<FName>`
---
### RemoveKnowledge
Removes a specific knowledge set identified by its unique ID.
## Examples
```c++
void RemoveKnowledge(FName Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `FName` | The unique identifier of the knowledge set to be removed. |
---
### Set
Sets the initial knowledge assets for the runtime data.
## Examples
```c++
void Set(const TMap& KnowledgeAssets)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| KnowledgeAssets | `const TMap&` | Map of knowledge set identifiers to their corresponding text embedding assets |
---
---
#### Inworld Graph Runtime Data Voice
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphRuntimeData/InworldGraphRuntimeData_Voice
[Overview](../overview) > [Inworld Graph Runtime Data](./InworldGraphRuntimeData) > Inworld Graph Runtime Data Voice
**Class:** `UInworldGraphRuntimeData_Voice` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data class for managing voice configuration in Inworld graphs. This class stores and manages voice properties used for text-to-speech operations within Inworld graphs. It provides simple get/set functionality for voice configuration including voice ID and language settings.
## Methods
- [Get](#get)
- [Set](#set)
## Reference
### Get
Retrieves the current voice configuration.
## Examples
```c++
const FInworldVoice& Get()
```
#### Returns
**Type:** `const FInworldVoice&`
**Description:** Reference to the current voice configuration
---
### Set
Sets the voice configuration for this runtime data.
## Examples
```c++
void Set(const FInworldVoice& InVoice)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InVoice | `const FInworldVoice&` | The new voice configuration to set |
---
---
#### Inworld Node
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode
[Overview](../overview) > Inworld Node
**Class:** `UInworldNode` | **Inherits from:** `UObject`
Abstract base class for all workflow nodes. Node provides the core interface for workflow processing units. Each node can process a set of inputs and produce an output.
## Methods
- [DestroyNode](#destroynode)
- [GetNodeName](#getnodename)
- [GetProcessFunctions](#getprocessfunctions)
- [InitializeNode](#initializenode)
- [IsGetter](#isgetter)
## Reference
### DestroyNode
## Examples
```c++
void DestroyNode()
```
---
### GetNodeName
Retrieves the unique name identifier of this node.
## Examples
```c++
FString GetNodeName()
```
#### Returns
**Type:** `FString`
**Description:** The node's name as a string
---
### GetProcessFunctions
Retrieves the array of Process functions available for this node.
## Examples
```c++
TArray GetProcessFunctions()
```
#### Returns
**Type:** `TArray`
---
### InitializeNode
## Examples
```c++
void InitializeNode()
```
---
### IsGetter
Checks if this node is configured as a getter node.
## Examples
```c++
bool IsGetter()
```
#### Returns
**Type:** `bool`
**Description:** True if this node is a getter node, false otherwise
---
## Node Implementations
The following specialized node types extend this base class:
- [Inworld Node Custom](./InworldNode_Custom) - Base class for creating custom node behaviors in the InworldGraph. This abstract class allows developers to implement custom processing logic for graph nodes through Blueprint or C++ implementations. Custom nodes can process input data and produce output data based on custom logic, extending the graph system's capabilities.
- [Inworld Node General Text Processor](./InworldNode_GeneralTextProcessor) - Performs optional text processing operations on the data stream.
- [Inworld Node Keyword Matcher](./InworldNode_KeywordMatcher) - A node that matches text content against keyword groups.
- [Inworld Node LLM](./InworldNode_LLM) - A workflow node that processes chat messages and produces either complete text output or a stream of text tokens.
- [Inworld Node Random Canned Text](./InworldNode_RandomCannedText) - A node that selects a random text from a list of predefined phrases.
- [Inworld Node STT](./InworldNode_STT) - A node that converts audio input to text using Speech-to-Text.
- [Inworld Node Subgraph Node](./InworldNode_SubgraphNode) - A node that executes another graph as a subgraph within the current graph. This node allows for modular graph design by embedding one graph asset within another. It manages the execution of the subgraph and handles data flow between the parent graph and the embedded subgraph. This enables graph reusability and complex hierarchical workflows.
- [Inworld Node TTS](./InworldNode_TTS) - A node that converts text input to synthesized speech using Text-to-Speech.
- [Inworld Node Text Aggregator](./InworldNode_TextAggregator) - A node that aggregates a text stream into a single text output.
- [Inworld Node Text Chunking](./InworldNode_TextChunking) - A node that chunks text input into a stream of token chunks.
- [Inworld Node Text Classifier](./InworldNode_TextClassifier) - A node that classifies text content into categories.
---
#### Inworld Node Custom
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_Custom
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Custom
**Class:** `UInworldNode_Custom` | **Inherits from:** `UInworldNode`
Base class for creating custom node behaviors in the InworldGraph. This abstract class allows developers to implement custom processing logic for graph nodes through Blueprint or C++ implementations. Custom nodes can process input data and produce output data based on custom logic, extending the graph system's capabilities.
Key features:
- Blueprintable: Can be extended in Blueprints to create custom node types
- Configurable execution thread: Processing can occur in game thread or background thread
To implement a custom node:
1. Create a Blueprint or C++ class inheriting from UInworldNode_Custom
2. Implement the Process function to define custom behavior
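The two steps above can be sketched in plain C++. Note this is a structural illustration only: the `UCLASS`/Blueprint machinery, the real base class, and the actual `Process` signature are engine specifics omitted here, and all names are hypothetical.

```cpp
#include <cctype>
#include <string>

// Stand-in for the custom node base (not the real UInworldNode_Custom).
class CustomNodeBase {
public:
    virtual ~CustomNodeBase() = default;
    // Custom behavior goes in Process: input data in, output data out.
    virtual std::string Process(const std::string& Input) = 0;
};

// Step 1: inherit from the custom node base.
// Step 2: implement Process to define the behavior.
class UppercaseNode : public CustomNodeBase {
public:
    std::string Process(const std::string& Input) override {
        std::string Output = Input;
        for (char& C : Output) {
            C = static_cast<char>(std::toupper(static_cast<unsigned char>(C)));
        }
        return Output;
    }
};
```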
---
#### Inworld Node General Text Processor
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_GeneralTextProcessor
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node General Text Processor
**Class:** `UInworldNode_GeneralTextProcessor` | **Inherits from:** `UInworldNode_TextProcessing`
Performs the following (optional) text processing operations on the data stream.
**Input Types:**
- FInworldData_DataStream_String
**Output Types:**
- FInworldData_DataStream_String
---
#### Inworld Node Keyword Matcher
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_KeywordMatcher
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Keyword Matcher
**Class:** `UInworldNode_KeywordMatcher` | **Inherits from:** `UInworldNode`
A node that matches text content against keyword groups.
This node takes text input and matches it against configured keyword groups. It can be used for various purposes such as safety checking, content categorization, sentiment analysis, or any other keyword-based filtering. It inherits from TypedNode with MatchedKeywords as output and Text as input.
**Input Types:**
- `FInworldData_Text`
**Output Types:**
- `FInworldData_ClassificationResult`
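As a rough sketch of keyword-group matching in standard C++ — the container types, group names, and the substring matching rule are assumptions for illustration, not the node's actual implementation:

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the names of every group with at least one keyword
// found in the text (illustrative matching rule).
std::vector<std::string> MatchKeywordGroups(
    const std::string& Text,
    const std::map<std::string, std::vector<std::string>>& Groups) {
    std::vector<std::string> Matched;
    for (const auto& [GroupName, Keywords] : Groups) {
        for (const std::string& Keyword : Keywords) {
            if (Text.find(Keyword) != std::string::npos) {
                Matched.push_back(GroupName);  // one hit is enough
                break;
            }
        }
    }
    return Matched;
}
```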
---
#### Inworld Node LLM
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_LLM
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node LLM
**Class:** `UInworldNode_LLM` | **Inherits from:** `UInworldNode`
A workflow node that processes chat messages and produces either complete text output or a stream of text tokens based on the stream parameter.
`UInworldNode_LLM` encapsulates the functionality of an LLM client to generate text responses within a workflow graph. It can be configured to output complete text or stream tokens as they are generated.
**Input Types:**
- `FInworldData_LLMChatRequest`
**Output Types:**
- `FInworldData_LLMChatResponse`
---
#### Inworld Node Random Canned Text
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_RandomCannedText
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Random Canned Text
**Class:** `UInworldNode_RandomCannedText` | **Inherits from:** `UInworldNode`
A node that selects a random text from a list of predefined phrases.
This node does not require a specific input and produces a Text output containing a randomly selected phrase from its internal list. It inherits from the base Node class and is useful for generating varied responses or fallback content.
**Input Types:**
- `FInworldData` (Anything)
**Output Types:**
- `FInworldData_Text` (One of the configured canned responses)
---
#### Inworld Node STT
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_STT
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node STT
**Class:** `UInworldNode_STT` | **Inherits from:** `UInworldNode`
A node that converts audio input to text using Speech-to-Text functionality.
This node takes audio input and processes it through a Speech-to-Text service to generate text output. It inherits from TypedNode with Text as output and Audio as input.
**Input Types:**
- `FInworldData_Audio` OR `FInworldData_DataStream_AudioChunk` (Audio data to convert to text)
**Output Types:**
- `FInworldData_Text`
---
#### Inworld Node Subgraph Node
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_SubgraphNode
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Subgraph Node
**Class:** `UInworldNode_SubgraphNode` | **Inherits from:** `UInworldNode`
A node that executes another graph as a subgraph within the current graph. This node allows for modular graph design by embedding one graph asset within another. It manages the execution of the subgraph and handles data flow between the parent graph and the embedded subgraph. This enables graph reusability and complex hierarchical workflows.
## Methods
- [GetGraphAsset](#getgraphasset)
- [RefreshDetails](#refreshdetails)
- [SetGraphAsset](#setgraphasset)
## Reference
### GetGraphAsset
Retrieves the currently configured graph asset.
## Examples
```c++
UInworldGraphAsset* GetGraphAsset()
```
#### Returns
**Type:** `UInworldGraphAsset*`
**Description:** Pointer to the graph asset, or nullptr if none is set
---
### RefreshDetails
Refreshes the node's details based on the current graph asset. This function updates the description and runtime data requirements based on the currently set graph asset.
## Examples
```c++
void RefreshDetails()
```
---
### SetGraphAsset
Sets the graph asset to be used as a subgraph.
## Examples
```c++
void SetGraphAsset()
```
---
---
#### Inworld Node TTS
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_TTS
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node TTS
**Class:** `UInworldNode_TTS` | **Inherits from:** `UInworldNode`
A node that converts text input to synthesized speech using Text-to-Speech.
This node takes either a single text input or a stream of text inputs and converts them to a stream of TTSOutput objects containing both the original text and the synthesized audio. It inherits from TypedNode with specific input/output types for handling text and audio data.
**Input Types:**
- `FInworldData_Text` OR `FInworldData_DataStream_String` (Text data to convert to speech)
- `FInworldData_Text` (Optional emotion text to influence speech synthesis)
**Output Types:**
- `FInworldData_DataStream_TTSOutput`
---
#### Inworld Node Text Aggregator
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_TextAggregator
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Text Aggregator
**Class:** `UInworldNode_TextAggregator` | **Inherits from:** `UInworldNode`
A node that aggregates a text stream into a single text output.
This node takes a stream of text inputs and converts that into a single text output.
**Input Types:**
- `FInworldData_DataStream_String`
**Output Types:**
- `FInworldData_Text`
---
#### Inworld Node Text Chunking
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_TextChunking
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Text Chunking
**Class:** `UInworldNode_TextChunking` | **Inherits from:** `UInworldNode`
A node that chunks text input into a stream of token chunks.
This node takes either a single text input or a stream of text inputs and converts them into a stream of text chunks. It inherits from TypedNode with specific input/output types for handling text data streams.
**Input Types:**
- `FInworldData_Text` OR `FInworldData_DataStream_String`
**Output Types:**
- `FInworldData_DataStream_String`
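A minimal chunking sketch in standard C++: the node chunks into token chunks, whereas splitting by character count, as below, is a simplification for illustration only.

```cpp
#include <string>
#include <vector>

// Split a single text into a stream of fixed-size chunks.
std::vector<std::string> ChunkText(const std::string& Text, size_t ChunkSize) {
    std::vector<std::string> Chunks;
    for (size_t Pos = 0; Pos < Text.size(); Pos += ChunkSize) {
        Chunks.push_back(Text.substr(Pos, ChunkSize));
    }
    return Chunks;
}
```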
---
#### Inworld Node Text Classifier
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldNode/InworldNode_TextClassifier
[Overview](../overview) > [Inworld Node](./InworldNode) > Inworld Node Text Classifier
**Class:** `UInworldNode_TextClassifier` | **Inherits from:** `UInworldNode`
A node that classifies text content into categories.
This node takes text input and classifies it into configured categories. It can be used for various purposes such as content categorization, sentiment analysis, topic detection, or any other classification task. It inherits from TypedNode with ClassificationResult as output and Text as input.
**Input Types:**
- `FInworldData_Text`
**Output Types:**
- `FInworldData_ClassificationResult`
---
#### Inworld Process Context
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldProcessContext
[Overview](./overview) > Inworld Process Context
**Class:** `UInworldProcessContext` | **Inherits from:** `UObject`
Context object for managing runtime data during graph execution. This class provides access to runtime data that can be shared across nodes during graph execution. It allows nodes to store and retrieve data using key-value pairs, supporting various data types including basic types, objects, and structures.
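The key-value contract can be sketched in standard C++. One plausible reading of the `bool` return on the typed getters — a lookup that fails when the key is absent or holds a different type — is modeled below; std types stand in for `FName` and the engine value types, and the method subset is illustrative.

```cpp
#include <string>
#include <unordered_map>
#include <variant>

// Sketch of a typed key-value runtime data store.
class ProcessContext {
public:
    using Value = std::variant<bool, int, float>;

    void SetRuntimeData(const std::string& Key, Value Data) {
        Data_[Key] = Data;
    }

    bool GetRuntimeBool(const std::string& Key, bool& OutBool) const {
        return Get(Key, OutBool);
    }

    bool GetRuntimeInt(const std::string& Key, int& OutInt) const {
        return Get(Key, OutInt);
    }

private:
    template <typename T>
    bool Get(const std::string& Key, T& Out) const {
        auto It = Data_.find(Key);
        if (It == Data_.end()) return false;     // key missing
        const T* Stored = std::get_if<T>(&It->second);
        if (!Stored) return false;               // stored under a different type
        Out = *Stored;
        return true;
    }

    std::unordered_map<std::string, Value> Data_;
};
```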
## Methods
- [Create](#create)
- [GetProcessContext](#getprocesscontext)
- [GetRuntimeBool](#getruntimebool)
- [GetRuntimeData](#getruntimedata)
- [GetRuntimeFloat](#getruntimefloat)
- [GetRuntimeInt](#getruntimeint)
- [GetRuntimeObject](#getruntimeobject)
- [GetRuntimeObject](#getruntimeobject)
- [GetTextEmbedder](#gettextembedder)
- [SetProcessContext](#setprocesscontext)
- [SetRuntimeData](#setruntimedata)
## Reference
### Create
## Examples
```c++
UInworldProcessContext* Create()
```
#### Returns
**Type:** `UInworldProcessContext*`
---
### GetProcessContext
## Examples
```c++
inworld::ProcessContext* GetProcessContext()
```
#### Returns
**Type:** `inworld::ProcessContext*`
---
### GetRuntimeBool
## Examples
```c++
bool GetRuntimeBool(
FName Key,
bool& OutBool
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| OutBool | `bool&` | |
#### Returns
**Type:** `bool`
---
### GetRuntimeData
## Examples
```c++
bool GetRuntimeData(
FName Key,
FInworldDataHandle& OutData
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| OutData | `FInworldDataHandle&` | |
#### Returns
**Type:** `bool`
---
### GetRuntimeFloat
## Examples
```c++
bool GetRuntimeFloat(
FName Key,
float& OutFloat
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| OutFloat | `float&` | |
#### Returns
**Type:** `bool`
---
### GetRuntimeInt
## Examples
```c++
bool GetRuntimeInt(
FName Key,
int32& OutInt
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| OutInt | `int32&` | |
#### Returns
**Type:** `bool`
---
### GetRuntimeObject
## Examples
```c++
bool GetRuntimeObject(
FName Key,
TSubclassOf<UObject> ObjectClass,
UObject*& OutObject
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| ObjectClass | `TSubclassOf<UObject>` | |
| OutObject | `UObject*&` | |
#### Returns
**Type:** `bool`
---
### GetRuntimeObject
## Examples
```c++
bool GetRuntimeObject()
```
#### Returns
**Type:** `bool`
---
### GetTextEmbedder
## Examples
```c++
inworld::StatusOr_TextEmbedderInterface GetTextEmbedder()
```
#### Returns
**Type:** `inworld::StatusOr_TextEmbedderInterface`
---
### SetProcessContext
## Examples
```c++
void SetProcessContext()
```
---
### SetRuntimeData
## Examples
```c++
void SetRuntimeData(
FName Key,
FInworldDataHandle Data
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Key | `FName` | |
| Data | `FInworldDataHandle` | |
---
---
#### Inworld Agent Runtime Subsystem
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldRuntimeSubsystem
[Overview](./overview) > Inworld Agent Runtime Subsystem
**Class:** `UInworldRuntimeSubsystem` | **Inherits from:** `UEngineSubsystem`
The engine subsystem that manages the instantiation of graphs and their execution.
## Methods
- [ClearGraphExecutionHistory](#cleargraphexecutionhistory)
- [Convert](#convert)
- [GetAEC](#getaec)
- [GetAllExecutionHistory](#getallexecutionhistory)
- [GetAllUniqueOwner](#getalluniqueowner)
- [GetHistoryList](#gethistorylist)
- [GetNodeHistoryOuput](#getnodehistoryouput)
- [GetOrCreateTextEmbedder](#getorcreatetextembedder)
- [GetVAD](#getvad)
- [IsNodeValid](#isnodevalid)
- [RegisterCreationConfigType](#registercreationconfigtype)
- [RegisterCustomEdgeConditionType](#registercustomedgeconditiontype)
- [RegisterNodeType](#registernodetype)
- [RegisterProcessContext](#registerprocesscontext)
- [SendGraphFeedback](#sendgraphfeedback)
- [UnregisterComponent](#unregistercomponent)
- [UnregisterProcessContext](#unregisterprocesscontext)
## Reference
### ClearGraphExecutionHistory
Clears the stored graph execution history.
## Examples
```c++
void ClearGraphExecutionHistory()
```
---
### Convert
## Examples
```c++
FInworldDataHandle Convert()
```
#### Returns
**Type:** `FInworldDataHandle`
---
### GetAEC
Gets the Acoustic Echo Cancellation (AEC) filter interface.
## Examples
```c++
inworld::AECFilterInterface& GetAEC()
```
#### Returns
**Type:** `inworld::AECFilterInterface&`
**Description:** Reference to the AEC filter interface.
---
### GetAllExecutionHistory
Retrieves all execution history entries for a specific owner object.
## Examples
```c++
void GetAllExecutionHistory()
```
---
### GetAllUniqueOwner
Retrieves all unique owner objects from the execution history for debugging.
## Examples
```c++
void GetAllUniqueOwner(TArray& OwnerList)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| OwnerList | `TArray&` | Output array of unique owner objects. |
---
### GetHistoryList
Gets the complete execution history list (primarily for editor utility widgets).
## Examples
```c++
const TArray& GetHistoryList()
```
#### Returns
**Type:** `const TArray&`
**Description:** Reference to the array of all graph execution history entries.
---
### GetNodeHistoryOuput
Extracts and formats the output data from a node history entry.
## Examples
```c++
FString GetNodeHistoryOuput(const FInworldNodeHistory& Node)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Node | `const FInworldNodeHistory&` | The node history entry to extract output from. |
#### Returns
**Type:** `FString`
**Description:** Formatted string representation of the node's output.
---
### GetOrCreateTextEmbedder
Gets a cached TextEmbedder instance for the given ConfigId, or creates a new one if it doesn't exist.
## Examples
```c++
UInworldTextEmbedder* GetOrCreateTextEmbedder()
```
#### Returns
**Type:** `UInworldTextEmbedder*`
**Description:** A pointer to the UInworldTextEmbedder instance, or nullptr if the config is invalid or creation fails.
---
### GetVAD
Gets the Voice Activity Detection (VAD) interface.
## Examples
```c++
inworld::VADInterface& GetVAD()
```
#### Returns
**Type:** `inworld::VADInterface&`
**Description:** Reference to the VAD interface.
---
### IsNodeValid
## Examples
```c++
bool IsNodeValid()
```
#### Returns
**Type:** `bool`
---
### RegisterCreationConfigType
## Examples
```c++
bool RegisterCreationConfigType()
```
#### Returns
**Type:** `bool`
---
### RegisterCustomEdgeConditionType
## Examples
```c++
bool RegisterCustomEdgeConditionType()
```
#### Returns
**Type:** `bool`
---
### RegisterNodeType
## Examples
```c++
bool RegisterNodeType()
```
#### Returns
**Type:** `bool`
---
### RegisterProcessContext
## Examples
```c++
void RegisterProcessContext()
```
---
### SendGraphFeedback
Records user feedback with OpenTelemetry for analytics and improvement.
## Examples
```c++
void SendGraphFeedback(const FInworldGraphFeedback& Feedback)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Feedback | `const FInworldGraphFeedback&` | The feedback data including graph ID, execution ID, rating, and description. |
---
### UnregisterComponent
## Examples
```c++
bool UnregisterComponent()
```
#### Returns
**Type:** `bool`
---
### UnregisterProcessContext
## Examples
```c++
void UnregisterProcessContext()
```
---
---
#### Inworld Text Embedder
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldTextEmbedder
[Overview](./overview) > Inworld Text Embedder
**Class:** `UInworldTextEmbedder` | **Inherits from:** `UObject`
Provides functionality to embed text as vector embeddings using Inworld AI's Text Embedding services. This class offers both synchronous and asynchronous methods for embedding text inputs into high-dimensional vector representations. Delegates can be used to handle asynchronous results, which include success indicators and the resulting embeddings. Text embeddings are useful for semantic similarity comparisons, intent matching, and knowledge retrieval operations.
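Downstream, such embeddings are typically compared with cosine similarity for semantic matching; a minimal standalone version is shown below (this is not the SDK's scoring code, just the standard formula):

```cpp
#include <cmath>
#include <vector>

// Cosine similarity between two equal-length embedding vectors:
// dot(A, B) / (|A| * |B|), in [-1, 1] for nonzero vectors.
float CosineSimilarity(const std::vector<float>& A,
                       const std::vector<float>& B) {
    float Dot = 0.0f, NormA = 0.0f, NormB = 0.0f;
    for (size_t i = 0; i < A.size(); ++i) {
        Dot += A[i] * B[i];
        NormA += A[i] * A[i];
        NormB += B[i] * B[i];
    }
    if (NormA == 0.0f || NormB == 0.0f) return 0.0f;  // degenerate input
    return Dot / (std::sqrt(NormA) * std::sqrt(NormB));
}
```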
## Methods
- [CreateInworldTextEmbedder](#createinworldtextembedder)
- [Embed](#embed)
- [EmbedAsync](#embedasync)
- [EmbedAsync](#embedasync)
## Reference
### CreateInworldTextEmbedder
Creates an instance of UInworldTextEmbedder with the given configuration ID.
## Examples
```c++
UInworldTextEmbedder* CreateInworldTextEmbedder(const FString& ConfigId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ConfigId | `const FString&` | Identifier for the configuration to use for creating the text embedder. |
#### Returns
**Type:** `UInworldTextEmbedder*`
**Description:** A pointer to the instance of UInworldTextEmbedder.
---
### Embed
Embeds the input text into embeddings synchronously.
## Examples
```c++
bool Embed(
const TArray& Text,
TArray& OutTextEmbeddings
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Text | `const TArray&` | An array of text strings to embed. |
| OutTextEmbeddings | `TArray&` | Outputs the resulting embeddings if embedding succeeds. |
#### Returns
**Type:** `bool`
**Description:** True if the embedding operation is successful, false otherwise.
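To make the role of these embeddings concrete, the sketch below scores two embedding vectors with cosine similarity, a common way to measure semantic closeness. This is plain, engine-free C++ for illustration only; `CosineSimilarity` is a hypothetical helper, not part of the Inworld API:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Cosine similarity between two embedding vectors: 1.0 means the same
// direction (semantically close), 0.0 means orthogonal (unrelated).
float CosineSimilarity(const std::vector<float>& A, const std::vector<float>& B)
{
    float Dot = 0.f, NormA = 0.f, NormB = 0.f;
    for (size_t i = 0; i < A.size(); ++i)
    {
        Dot += A[i] * B[i];
        NormA += A[i] * A[i];
        NormB += B[i] * B[i];
    }
    return Dot / (std::sqrt(NormA) * std::sqrt(NormB));
}
```

In practice you would embed a query and a set of candidate texts, then pick the candidate with the highest similarity score for intent matching or knowledge retrieval.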
---
### EmbedAsync
Embeds the input text into embeddings asynchronously.
## Examples
```c++
void EmbedAsync(
const TArray& Text,
FOnTextEmbedded Callback
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Text | `const TArray&` | An array of text strings to embed. |
| Callback | `FOnTextEmbedded` | A delegate to handle the embedding result asynchronously. |
---
### EmbedAsync
## Examples
```c++
void EmbedAsync()
```
---
---
#### Inworld Voice Audio Component
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldVoiceAudioComponent
[Overview](./overview) > Inworld Voice Audio Component
**Class:** `UInworldVoiceAudioComponent` | **Inherits from:** `UAudioComponent`
Specialized audio component for TTS voice playback with advanced features. This component extends UAudioComponent to provide specialized functionality for playing synthesized voice audio from the Inworld TTS system. It manages voice queuing, playback interruption, and playback timing events.
Key features:
- Queue-based voice playback management
- Playback interruption and cleanup
- Real-time playback timing events
- Thread-safe voice queue operations
- Event-driven architecture for playback notifications
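The queue-based, interruptible playback described above can be illustrated with a minimal engine-free sketch. `VoiceChunkQueue` and its members are hypothetical stand-ins for the component's internal queue, not Inworld API:

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <vector>

// Minimal illustration of queue-based voice playback management:
// chunks are appended under a lock, and Interrupt() discards
// everything still pending in one step.
struct VoiceChunkQueue
{
    void Queue(std::vector<float> Samples)
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        Pending.push(std::move(Samples));
    }

    // Mirrors the idea behind Interrupt(): drop all pending chunks.
    void Interrupt()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        std::queue<std::vector<float>> Empty;
        Pending.swap(Empty);
    }

    size_t NumPending()
    {
        std::lock_guard<std::mutex> Lock(Mutex);
        return Pending.size();
    }

private:
    std::mutex Mutex;
    std::queue<std::vector<float>> Pending;
};
```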
## Methods
- [Interrupt](#interrupt)
- [QueueVoice](#queuevoice)
## Reference
### Interrupt
Interrupts the current voice playback. This function stops all pending voice chunks and immediately halts playback. It broadcasts an event to notify that the voice has been interrupted. Events: OnVoiceAudioInterrupt
## Examples
```c++
void Interrupt()
```
---
### QueueVoice
Queues a voice chunk for playback.
This function adds the provided voice data to the playback queue.
If this is the first chunk in the queue, it sets the sample rate and channel configuration
based on the incoming audio data.
Events: OnVoiceAudioStart, OnVoiceAudioComplete
## Examples
```c++
void QueueVoice(const FInworldData_TTSOutput& Voice)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Voice | `const FInworldData_TTSOutput&` | The voice data to queue for playback. |
---
---
#### SDK Reference > Runtime
#### Inworld Blueprint Function Library
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldBlueprintFunctionLibrary
[Overview](./overview) > Inworld Blueprint Function Library
**Class:** `UInworldBlueprintFunctionLibrary` | **Inherits from:** `UBlueprintFunctionLibrary`
## Methods
- [BreakAudio](#breakaudio)
- [CreateBooleanJinjaArgument](#createbooleanjinjaargument)
- [CreateFloatJinjaArgument](#createfloatjinjaargument)
- [CreateIntJinjaArgument](#createintjinjaargument)
- [CreateInworldDataError](#createinworlddataerror)
- [CreateInworldDataError](#createinworlddataerror)
- [CreateStringJinjaArgument](#createstringjinjaargument)
- [CreateStructJinjaArgument](#createstructjinjaargument)
- [CreateStructJinjaArgument](#createstructjinjaargument)
- [GetNext](#datastreamttsoutputgetnext)
- [HasNext](#datastreamttsoutputhasnext)
- [ErrorOnDestroying](#errorondestroying)
- [GetRuntimeStruct](#getruntimestruct)
- [GetVisemeBlends](#getvisemeblends)
- [MakeAudio](#makeaudio)
- [TrimLLMOutput](#trimllmoutput)
- [UnwrapInworldDataHandle](#unwrapinworlddatahandle)
- [WrapInworldData](#wrapinworlddata)
- [WrapInworldData](#wrapinworlddata)
## Reference
### BreakAudio
## Examples
```c++
void BreakAudio(
const FInworldData_Audio& Audio,
TArray& OutWaveform,
int32& OutSampleRate
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Audio | `const FInworldData_Audio&` | |
| OutWaveform | `TArray&` | |
| OutSampleRate | `int32&` | |
---
### CreateBooleanJinjaArgument
## Examples
```c++
FJinjaArgument CreateBooleanJinjaArgument(bool Value)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Value | `bool` | |
#### Returns
**Type:** `FJinjaArgument`
---
### CreateFloatJinjaArgument
## Examples
```c++
FJinjaArgument CreateFloatJinjaArgument(float Value)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Value | `float` | |
#### Returns
**Type:** `FJinjaArgument`
---
### CreateIntJinjaArgument
## Examples
```c++
FJinjaArgument CreateIntJinjaArgument(int32 Value)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Value | `int32` | |
#### Returns
**Type:** `FJinjaArgument`
---
### CreateInworldDataError
## Examples
```c++
void CreateInworldDataError(
UScriptStruct* InworldDataStructType,
const FString& Reason,
int32& Data
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InworldDataStructType | `UScriptStruct*` | |
| Reason | `const FString&` | |
| Data | `int32&` | |
---
### CreateInworldDataError
## Examples
```c++
void CreateInworldDataError()
```
---
### CreateStringJinjaArgument
## Examples
```c++
FJinjaArgument CreateStringJinjaArgument(FString Value)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Value | `FString` | |
#### Returns
**Type:** `FJinjaArgument`
---
### CreateStructJinjaArgument
## Examples
```c++
FJinjaArgument CreateStructJinjaArgument(const int32& Data)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Data | `const int32&` | |
#### Returns
**Type:** `FJinjaArgument`
---
### CreateStructJinjaArgument
## Examples
```c++
FJinjaArgument CreateStructJinjaArgument()
```
#### Returns
**Type:** `FJinjaArgument`
---
### GetNext
## Examples
```c++
FInworldDataHandle DataStreamTTSOutputGetNext(FInworldData_DataStream_TTSOutput TTSOutputStream)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| TTSOutputStream | `FInworldData_DataStream_TTSOutput` | |
#### Returns
**Type:** `FInworldDataHandle`
---
### HasNext
## Examples
```c++
bool DataStreamTTSOutputHasNext(const FInworldData_DataStream_TTSOutput& TTSOutputStream)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| TTSOutputStream | `const FInworldData_DataStream_TTSOutput&` | |
#### Returns
**Type:** `bool`
---
### ErrorOnDestroying
## Examples
```c++
FInworldDataHandle ErrorOnDestroying()
```
#### Returns
**Type:** `FInworldDataHandle`
---
### GetRuntimeStruct
## Examples
```c++
void GetRuntimeStruct(
UInworldProcessContext* ProcessContext,
const FName& Key,
UScriptStruct* ScriptStruct,
int32& Struct,
bool& Successful
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ProcessContext | `UInworldProcessContext*` | |
| Key | `const FName&` | |
| ScriptStruct | `UScriptStruct*` | |
| Struct | `int32&` | |
| Successful | `bool&` | |
---
### GetVisemeBlends
Calculates viseme blend weights for a given playback time based on VisemeInfos.
Smoothly interpolates between visemes.
## Examples
```c++
FInworldVisemeBlends GetVisemeBlends(
const float PlaybackTimeSeconds,
const TArray& VisemeInfos
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| PlaybackTimeSeconds | `const float` | The current playback time in seconds. |
| VisemeInfos | `const TArray&` | Array of viseme infos sorted by timestamp. |
#### Returns
**Type:** `FInworldVisemeBlends`
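The smooth interpolation this function performs can be sketched in plain C++. `VisemeInfo` and `BlendVisemes` below are hypothetical illustrations of the cross-fade idea, not the actual engine types:

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <vector>

// Hypothetical viseme info: a viseme ID and the time at which it peaks.
struct VisemeInfo
{
    std::string Viseme;
    float TimeSeconds;
};

// Returns per-viseme blend weights at PlaybackTime by linearly
// cross-fading between the two surrounding entries. Infos must be
// sorted by timestamp.
std::map<std::string, float> BlendVisemes(float PlaybackTime, const std::vector<VisemeInfo>& Infos)
{
    std::map<std::string, float> Blends;
    if (Infos.empty()) return Blends;
    if (PlaybackTime <= Infos.front().TimeSeconds) { Blends[Infos.front().Viseme] = 1.f; return Blends; }
    if (PlaybackTime >= Infos.back().TimeSeconds) { Blends[Infos.back().Viseme] = 1.f; return Blends; }
    for (size_t i = 0; i + 1 < Infos.size(); ++i)
    {
        const VisemeInfo& A = Infos[i];
        const VisemeInfo& B = Infos[i + 1];
        if (PlaybackTime >= A.TimeSeconds && PlaybackTime <= B.TimeSeconds)
        {
            // Alpha goes 0 -> 1 as playback moves from A's peak to B's peak.
            const float Alpha = (PlaybackTime - A.TimeSeconds) / (B.TimeSeconds - A.TimeSeconds);
            Blends[A.Viseme] += 1.f - Alpha;
            Blends[B.Viseme] += Alpha;
            break;
        }
    }
    return Blends;
}
```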
---
### MakeAudio
## Examples
```c++
FInworldData_Audio MakeAudio(
const TArray& Waveform,
int32 SampleRate
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Waveform | `const TArray&` | |
| SampleRate | `int32` | |
#### Returns
**Type:** `FInworldData_Audio`
---
### TrimLLMOutput
Removes leading and trailing quotes, collapses double spaces, and cleans up other artifacts of LLM text generation.
## Examples
```c++
void TrimLLMOutput(
const FString& LLMOutputString,
FString& OutTrimmedLLMOutputString
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| LLMOutputString | `const FString&` | The LLM output string to edit |
| OutTrimmedLLMOutputString | `FString&` | The result string with spaces and quotes removed |
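The cleanup described above can be approximated with a standalone sketch. This plain C++ version illustrates the behavior (quote stripping, whitespace trimming, double-space collapsing); the actual implementation may differ:

```cpp
#include <cassert>
#include <string>

// Illustrative cleanup of raw LLM output: trim outer whitespace,
// strip one pair of surrounding quotes, collapse runs of spaces.
std::string TrimLLMOutput(const std::string& Raw)
{
    std::string S = Raw;
    // Trim leading/trailing whitespace.
    const char* WS = " \t\r\n";
    size_t Begin = S.find_first_not_of(WS);
    size_t End = S.find_last_not_of(WS);
    S = (Begin == std::string::npos) ? "" : S.substr(Begin, End - Begin + 1);
    // Strip a matching pair of surrounding quotes.
    if (S.size() >= 2 && S.front() == '"' && S.back() == '"')
    {
        S = S.substr(1, S.size() - 2);
    }
    // Collapse consecutive spaces into one.
    std::string Out;
    for (char C : S)
    {
        if (C == ' ' && !Out.empty() && Out.back() == ' ') continue;
        Out += C;
    }
    return Out;
}
```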
---
### UnwrapInworldDataHandle
## Examples
```c++
void UnwrapInworldDataHandle(
const FInworldDataHandle& DataHandle,
UScriptStruct* ScriptStruct,
int32& Data,
bool& UnwrapSuccessful
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| DataHandle | `const FInworldDataHandle&` | |
| ScriptStruct | `UScriptStruct*` | |
| Data | `int32&` | |
| UnwrapSuccessful | `bool&` | |
---
### WrapInworldData
## Examples
```c++
FInworldDataHandle WrapInworldData(
UScriptStruct* ScriptStruct,
const int32& Data
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ScriptStruct | `UScriptStruct*` | |
| Data | `const int32&` | |
#### Returns
**Type:** `FInworldDataHandle`
---
### WrapInworldData
## Examples
```c++
FInworldDataHandle WrapInworldData()
```
#### Returns
**Type:** `FInworldDataHandle`
---
---
#### SDK Reference > Runtime > Assets
#### Inworld Graph Asset
Source: https://docs.inworld.ai/unreal-engine/runtime/runtime-reference/InworldGraphAsset
[Overview](./overview) > Inworld Graph Asset
**Class:** `UInworldGraphAsset` | **Inherits from:** `UObject`
Asset representation of an Inworld graph for serialization and storage. This class serves as the asset representation of an Inworld graph, containing all nodes, edges, and configuration data necessary to create and execute runtime graphs. It provides functionality for graph validation, configuration generation, and runtime graph instance creation.
## Methods
- [GetGraphConfig](#getgraphconfig)
- [GetGraphInstance](#getgraphinstance)
- [GetGraphInstance](#getgraphinstance)
- [GetLevelNum](#getlevelnum)
- [GetNodesByLevel](#getnodesbylevel)
- [IsValidAsSubgraph](#isvalidassubgraph)
- [Print](#print)
## Reference
### GetGraphConfig
Retrieves the graph configuration as a string.
## Examples
```c++
const FString& GetGraphConfig()
```
#### Returns
**Type:** `const FString&`
**Description:** The graph configuration string.
---
### GetGraphInstance
Creates and returns a compiled graph instance asynchronously.
## Examples
```c++
void GetGraphInstance(FOnGraphCompiled Callback)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Callback | `FOnGraphCompiled` | Delegate called when the graph instance is ready. |
---
### GetGraphInstance
## Examples
```c++
void GetGraphInstance()
```
---
### GetLevelNum
Gets the number of levels in the graph hierarchy.
## Examples
```c++
int GetLevelNum()
```
#### Returns
**Type:** `int`
**Description:** The number of levels in the graph.
---
### GetNodesByLevel
Retrieves all nodes at a specific level in the graph hierarchy.
## Examples
```c++
void GetNodesByLevel(
int Level,
TArray& Nodes
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Level | `int` | The level to query for nodes. |
| Nodes | `TArray&` | Output array of nodes at the specified level. |
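The level structure queried by `GetLevelNum` and `GetNodesByLevel` can be illustrated with an engine-free sketch that derives levels from an edge list: roots sit at level 0 and every other node sits one past its deepest predecessor. `ComputeLevels` is a hypothetical helper, not part of the SDK:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Assigns each node a level in a small acyclic graph given as
// (from, to) edges. Roots (no incoming edges) are level 0.
std::map<std::string, int> ComputeLevels(
    const std::vector<std::pair<std::string, std::string>>& Edges)
{
    std::map<std::string, int> Levels;
    std::set<std::string> Nodes;
    std::set<std::string> HasIncoming;
    for (const auto& E : Edges)
    {
        Nodes.insert(E.first);
        Nodes.insert(E.second);
        HasIncoming.insert(E.second);
    }
    for (const auto& N : Nodes)
    {
        if (!HasIncoming.count(N)) Levels[N] = 0;
    }
    // Relax edges repeatedly; sufficient for small acyclic graphs.
    for (size_t Pass = 0; Pass < Nodes.size(); ++Pass)
    {
        for (const auto& E : Edges)
        {
            if (Levels.count(E.first))
            {
                int Candidate = Levels[E.first] + 1;
                if (!Levels.count(E.second) || Levels[E.second] < Candidate)
                {
                    Levels[E.second] = Candidate;
                }
            }
        }
    }
    return Levels;
}
```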
---
### IsValidAsSubgraph
Checks if this graph asset is valid for use as a subgraph.
## Examples
```c++
bool IsValidAsSubgraph()
```
#### Returns
**Type:** `bool`
**Description:** True if the graph can be used as a subgraph, false otherwise.
---
### Print
Prints debug information about the graph.
## Examples
```c++
void Print(
bool ToConsole,
bool ToScreen
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| ToConsole | `bool` | If true, prints to console output. |
| ToScreen | `bool` | If true, prints to screen overlay. |
---
---
#### SDK Reference > Character
#### Overview
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/overview
Complete documentation for all Inworld Character components, providing AI-driven character functionality, player interactions, and conversation management systems.
## Classes
### Core Character Components
- [**Inworld Base Character Component**](./InworldBaseCharacterComponent/InworldBaseCharacterComponent) - Foundation for all AI-driven characters
- [**Inworld Simple Character Component**](./InworldSimpleCharacterComponent/InworldSimpleCharacterComponent) - Essential character functionality with basic runtime data
- [**Inworld Character Component**](./InworldCharacterComponent/InworldCharacterComponent) - Full-featured AI character with advanced capabilities
### Player Components
- [**Inworld Simple Player Component**](./InworldSimplePlayerComponent/InworldSimplePlayerComponent) - Basic player interaction capabilities
- [**Inworld Player Component**](./InworldPlayerComponent/InworldPlayerComponent) - Full-featured player component with trigger support
### Conversation Systems
- [**Inworld Conversation Group Component**](./InworldConversationGroupComponent/InworldConversationGroupComponent) - Multi-character conversation management
- [**Inworld Conversation Group**](./InworldConversationGroup/InworldConversationGroup) - Core conversation group functionality
### Runtime Data Systems
- [**CharacterProfile Data**](./InworldGraphRuntimeData_CharacterProfile/InworldGraphRuntimeData_CharacterProfile) - Character profile and personality data
- [**EmotionState Data**](./InworldGraphRuntimeData_EmotionState/InworldGraphRuntimeData_EmotionState) - Emotion state tracking and management
- [**Goals Data**](./InworldGraphRuntimeData_Goals/InworldGraphRuntimeData_Goals) - Character objectives and goal management
- [**KnowledgeFilter Data**](./InworldGraphRuntimeData_KnowledgeFilter/InworldGraphRuntimeData_KnowledgeFilter) - Knowledge filtering and access control
- [**RelationState Data**](./InworldGraphRuntimeData_RelationState/InworldGraphRuntimeData_RelationState) - Relationship dynamics and trust management
### Utilities & Systems
- [**Inworld Character BFL**](./InworldCharacterBFL/InworldCharacterBFL) - Blueprint Function Library for character utility functions
- [**Inworld Character Subsystem**](./InworldCharacterSubsystem/InworldCharacterSubsystem) - Character management subsystem
- [**Inworld Node Goals**](./InworldNode_Goals/InworldNode_Goals) - Goal processing workflow node
- [**Inworld Node Dialog Response Processor**](./InworldNode_DialogResponseProcessor/InworldNode_DialogResponseProcessor) - Dialog response processing node
## Key Features
### 🎭 **Advanced AI Characters**
- **Emotion tracking** with dynamic emotional responses
- **Relationship management** including trust, familiarity, and respect
- **Goal-driven behavior** with completion tracking
- **Intent recognition** for understanding player intentions
### 👥 **Multi-Character Conversations**
- **Speaker selection** for group conversations
- **Participant management** with dynamic add/remove
- **Conversation state tracking** across multiple characters
### 🎯 **Player Interaction**
- **Trigger system** for goal-based interactions
- **Custom parameters** for complex conversation control
- **Microphone integration** for voice input
### 📊 **Runtime Data Management**
- **Character profiles** with personality configuration
- **Event history** tracking for conversation context
- **Knowledge systems** with filtering capabilities
- **Voice settings** and speech configuration
## Getting Started
1. **Basic Character**: Start with `InworldSimpleCharacterComponent` for essential AI functionality
2. **Advanced Character**: Use `InworldCharacterComponent` for full emotional and social intelligence
3. **Player Interaction**: Add `InworldPlayerComponent` for trigger-based interactions
4. **Group Conversations**: Use `InworldConversationGroupComponent` for multi-character scenarios
Each component includes detailed API documentation with Blueprint node examples and comprehensive parameter descriptions.
---
#### SDK Reference > Character > Character Components
#### Inworld Base Character Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldBaseCharacterComponent/InworldBaseCharacterComponent
[Overview](../overview) > Inworld Base Character Component
**Class:** `UInworldBaseCharacterComponent` | **Inherits from:** `UInworldConversationTargetComponent`
Base abstract class for Inworld character components. This component serves as the foundation for AI-driven characters in the Inworld system. It provides core functionality for character communication, audio playback, conversation management, and integration with the Inworld graph system. Characters can receive and respond to messages, manage conversation targets, and handle speech synthesis.
Key Features:
- Conversation target implementation (IInworldConversationTarget)
- Audio and text message handling
- Speech synthesis and viseme blending
- Player target management
- Character graph integration
- Comprehensive event system for speech and conversation events
This is an abstract base class that must be inherited to create concrete character implementations.
## Methods
- [Get Runtime Data with Type](#getruntimedataandtype)
- [UpdateCharacterFromDataTable](#updatecharacterfromdatatable)
## Reference
### Get Runtime Data with Type
Gets all runtime data and its associated type for this character.
## Examples
```c++
virtual TArray GetRuntimeDataAndType()
```
#### Returns
**Type:** `virtual TArray`
**Description:** Array of FRuntimeDataInfo structures.
---
### UpdateCharacterFromDataTable
Attempts to dynamically update the character's configuration from a Data Table at runtime.
## Examples
```c++
bool UpdateCharacterFromDataTable(
const TSoftObjectPtr& NewCharacterDataTableRef,
FName NewCharacterID
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| NewCharacterDataTableRef | `const TSoftObjectPtr&` | |
| NewCharacterID | `FName` | |
#### Returns
**Type:** `bool`
**Description:** Returns true if the character was successfully updated, false otherwise.
---
---
#### Inworld Character Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldCharacterComponent/InworldCharacterComponent
[Overview](../overview) > Inworld Character Component
**Class:** `UInworldCharacterComponent` | **Inherits from:** `UInworldSimpleCharacterComponent`
Full-featured Inworld character component with advanced AI capabilities. This component represents the complete implementation of an Inworld AI character, extending the simple character with advanced features like emotion tracking, relationship states, dynamic knowledge systems, goal management, and trigger support.
Advanced Features:
- Emotion state tracking and emotional responses
- Relationship state management with dynamic character relationships
- Knowledge base and knowledge filtering
- Goal system with completion tracking
- Intent recognition and processing
- Trigger system for scripted interactions
This is the recommended component for complex AI characters requiring sophisticated behavioral AI, emotional intelligence, and advanced conversation capabilities.
## Methods
- [IsInConversationGroup](#isinconversationgroup)
## Reference
### IsInConversationGroup
Checks if this character is currently part of a conversation group.
## Examples
```c++
bool IsInConversationGroup()
```
#### Returns
**Type:** `bool`
**Description:** True if the character is in a conversation group, false otherwise
---
---
#### Inworld Simple Character Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldSimpleCharacterComponent/InworldSimpleCharacterComponent
[Overview](../overview) > Inworld Simple Character Component
**Class:** `UInworldSimpleCharacterComponent` | **Inherits from:** `UInworldBaseCharacterComponent`
Simple implementation of Inworld character component with essential runtime data. This component extends the base character functionality by adding core runtime data management including character profile, voice settings, event history, and conversation state. It provides a ready-to-use character implementation with fundamental AI features.
Key Features:
- Character profile and personality configuration
- Voice and speech settings
- Event history tracking for context
- Conversation state management
- Spawnable in Blueprint editor
- Complete basic character functionality
This serves as the foundation for more advanced character implementations while providing all essential features for basic AI character interactions.
## Methods
- [AddHistoryEvent](#addhistoryevent)
- [ClearHistory](#clearhistory)
## Reference
### AddHistoryEvent
Updates the event history by adding a new speech event entry.
This method allows the character's event history to be updated with a new
speech event. The event is captured and stored in the runtime data's event
history, providing context for future interactions.
## Examples
```c++
void AddHistoryEvent(const FInworldEventSpeech& EventHistoryEntry)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| EventHistoryEntry | `const FInworldEventSpeech&` | The speech event entry to be added to the event history. |
---
### ClearHistory
Clears the character's event history data, resetting all stored interactions. This method ensures that the event history runtime data is wiped clean, removing all previously recorded conversation or event entries. It logs an error message if the event history data is null or invalid.
## Examples
```c++
void ClearHistory()
```
---
---
#### SDK Reference > Character > Player Components
#### Inworld Player Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldPlayerComponent/InworldPlayerComponent
[Overview](../overview) > Inworld Player Component
**Class:** `UInworldPlayerComponent` | **Inherits from:** `UInworldSimplePlayerComponent`
Full-featured Inworld player component with trigger support. This component extends the simple player implementation by adding the ability to send triggers with custom parameters, enabling more complex interactions and goal-driven conversations with Inworld characters.
Key Features:
- All functionality from UInworldSimplePlayerComponent
- Trigger system for goal-based interactions
- Custom parameter support for triggers
- Advanced conversation control
- Complete player interaction suite
## Methods
- [SendTrigger](#sendtrigger)
## Reference
### SendTrigger
Sends a trigger to initiate goal-driven interactions.
Triggers are used to activate specific character behaviors, goals, or conversation paths.
They can include custom parameters to provide context and customize the interaction.
## Examples
```c++
virtual void SendTrigger(
FName GoalName,
const TMap Parameters
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| GoalName | `FName` | The name of the goal/trigger to activate |
| Parameters | `const TMap` | Key-value pairs of custom parameters to send with the trigger |
#### Returns
**Type:** `virtual void`
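The trigger-with-parameters pattern can be illustrated with a small engine-free registry. `TriggerRegistry` and its members are a hypothetical sketch of the dispatch idea, not the Inworld API:

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical trigger registry: goal names map to handlers that
// receive the custom key-value parameters sent with the trigger.
using TriggerParams = std::map<std::string, std::string>;

struct TriggerRegistry
{
    void Register(const std::string& GoalName,
                  std::function<void(const TriggerParams&)> Handler)
    {
        Handlers[GoalName] = std::move(Handler);
    }

    // Mirrors the idea behind SendTrigger: look up the goal by name
    // and forward the parameters to its handler.
    bool SendTrigger(const std::string& GoalName, const TriggerParams& Params)
    {
        auto It = Handlers.find(GoalName);
        if (It == Handlers.end()) return false;
        It->second(Params);
        return true;
    }

private:
    std::map<std::string, std::function<void(const TriggerParams&)>> Handlers;
};
```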
---
---
#### Inworld Simple Player Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldSimplePlayerComponent/InworldSimplePlayerComponent
[Overview](../overview) > Inworld Simple Player Component
**Class:** `UInworldSimplePlayerComponent` | **Inherits from:** `UInworldBasePlayerComponent`
Simple implementation of Inworld player component with player profile support. This component extends the base player functionality by adding player profile management, which allows for personalized interactions and character behavior based on player data. It overrides core messaging methods to integrate with the player profile system.
Key Features:
- Player profile integration
- Enhanced text and audio messaging with profile context
- Spawnable in Blueprint editor
- Ready-to-use implementation for basic player interactions
---
#### SDK Reference > Character > Conversation Components
#### Inworld Conversation Group
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldConversationGroup/InworldConversationGroup
[Overview](../overview) > Inworld Conversation Group
**Class:** `UInworldConversationGroup` | **Inherits from:** `UObject`
Manages multi-character conversations in the Inworld system. This class orchestrates conversations between multiple AI characters and a player, handling message routing, speaker selection, conversation state management, and event history. It implements the IInworldConversationTarget interface to act as a unified conversation endpoint.
Key Features:
- Multi-character conversation management
- Intelligent speaker selection system
- Rule-based speaker selection for deterministic turn order
- Message broadcasting to all participants
- Conversation state and history tracking
- Player target integration
- Event system for conversation monitoring
- Dynamic participant management
Usage:
- Create via static CreateConversation methods
- Add characters using AddCharacter
- Set player target for interaction
- Configure rule director for deterministic speaker sequences (optional)
- Messages sent to the group are intelligently routed to appropriate characters
- Participants respond based on context and speaker selection algorithms
## Methods
- [AddCharacter](#addcharacter)
- [ClearHistory](#clearhistory)
- [CreateConversation](#createconversation)
- [GetConversationState](#getconversationstate)
- [GetParticipants](#getparticipants)
- [GetPlayerTarget](#getplayertarget)
- [InvokeNextResponse](#invokenextresponse)
- [RemoveCharacter](#removecharacter)
- [RemovePlayerTarget](#removeplayertarget)
- [SetPlayerTarget](#setplayertarget)
- [SetRuleDirector](#setruledirector)
## Reference
### AddCharacter
Adds a character to the conversation group.
The character will receive messages sent to the group and can participate
in speaker selection for responses.
## Examples
```c++
virtual void AddCharacter(UInworldCharacterComponent* CharacterComponent)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterComponent | `UInworldCharacterComponent*` | The character component to add to the conversation |
#### Returns
**Type:** `virtual void`
---
### ClearHistory
Clears the group's event history data, resetting all stored interactions.
This method ensures that the event history runtime data is wiped clean,
removing all previously recorded conversation or event entries. It logs an
error message if the event history data is null or invalid.
## Examples
```c++
void ClearHistory(bool IncludeCharacters)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| IncludeCharacters | `bool` | (optional) If true (false by default), clear the history of all characters in the conversation. |
---
### CreateConversation
Creates a new conversation group asynchronously (Blueprint version).
## Examples
```c++
void CreateConversation(
UObject* WorldContextObject,
UObject* Owner,
FOnInworldConversationGroupCreated Callback,
UInworldGraphAsset* SpeakerSelectionGraphAsset
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| WorldContextObject | `UObject*` | World context for the conversation |
| Owner | `UObject*` | The object that owns this conversation group |
| Callback | `FOnInworldConversationGroupCreated` | Delegate called when creation is complete |
| SpeakerSelectionGraphAsset | `UInworldGraphAsset*` | (optional) The graph asset for the speaker selection graph |
---
### GetConversationState
Gets the current state of the conversation.
## Examples
```c++
FInworldConversationState GetConversationState()
```
#### Returns
**Type:** `FInworldConversationState`
**Description:** The conversation state data
---
### GetParticipants
Gets all character participants in the conversation.
## Examples
```c++
TArray GetParticipants()
```
#### Returns
**Type:** `TArray`
**Description:** Array of character components participating in the conversation
---
### GetPlayerTarget
Gets the current player target for the conversation.

#### Examples
```c++
UInworldBasePlayerComponent* GetPlayerTarget()
```
#### Returns
**Type:** `UInworldBasePlayerComponent*`
**Description:** The player component currently interacting with this conversation
---
### InvokeNextResponse
Manually invokes the next character response in the conversation.
## Examples
```c++
virtual void InvokeNextResponse(const FString& CharacterId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterId | `const FString&` | Optional specific character ID to respond. If empty, uses speaker selection. |
#### Returns
**Type:** `virtual void`
---
### RemoveCharacter
Removes a character from the conversation group.
## Examples
```c++
virtual void RemoveCharacter(FString CharacterId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterId | `FString` | The unique ID of the character to remove |
#### Returns
**Type:** `virtual void`
---
### RemovePlayerTarget
Removes the current player target from this conversation group.
## Examples
```c++
void RemovePlayerTarget()
```
---
### SetPlayerTarget
Sets the player target for this conversation group.
## Examples
```c++
void SetPlayerTarget()
```
---
### SetRuleDirector
Sets the rule director for deterministic speaker selection in this conversation group. The rule director allows you to override the AI-driven speaker selection with a predefined sequence of speakers, useful for turn-based games or scripted dialogue scenarios.
For information on configuring the Rule Director property in the editor, see the [Inworld Conversation Group Component](../InworldConversationGroupComponent/InworldConversationGroupComponent#ruledirector) documentation.
## Examples
```c++
void SetRuleDirector(USpeakerRuleDirector* InRuleDirector)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InRuleDirector | `USpeakerRuleDirector*` | The rule director instance to use for speaker selection. This is typically set in the editor on the component. |
---
---
#### Inworld Conversation Group Component
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldConversationGroupComponent/InworldConversationGroupComponent
[Overview](../overview) > Inworld Conversation Group Component
**Class:** `UInworldConversationGroupComponent` | **Inherits from:** `UInworldConversationTargetComponent`
Component wrapper for managing multi-character conversation groups. This component provides an actor component interface to UInworldConversationGroup, allowing easy integration of multi-character conversations into actors. It manages participant addition/removal, speaker selection, and conversation state while implementing the conversation target interface.
Key Features:
- Multi-character conversation management
- Speaker selection graph integration
- Rule-based speaker selection for deterministic turn order
- Participant management (add/remove characters)
- Event history tracking
- Blueprint accessible conversation control
## Methods
- [AddCharacter](#addcharacter)
- [ClearHistory](#clearhistory)
- [GetConversationState](#getconversationstate)
- [GetParticipants](#getparticipants)
- [GetPlayerTarget](#getplayertarget)
- [InvokeNextResponse](#invokenextresponse)
- [RemoveCharacter](#removecharacter)
## Properties
### RuleDirector
Optional director object to manage rule-based, deterministic speaker selection. If assigned, this will be checked before the `SpeakerSelectionGraphAsset` for determining the next speaker.
Configure this property in the editor's Details panel to set up predefined speaker sequences for turn-based games or scripted dialogue scenarios.
## Reference
### AddCharacter
Adds a character to the conversation group.
The character will receive messages sent to the group and can participate
in speaker selection for responses.

#### Examples
```c++
virtual void AddCharacter(UInworldCharacterComponent* CharacterComponent)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterComponent | `UInworldCharacterComponent*` | The character component to add to the conversation |
#### Returns
**Type:** `virtual void`
---
### ClearHistory
Clears the group's event history data, resetting all stored interactions.
This method ensures that the event history runtime data is wiped clean,
removing all previously recorded conversation or event entries. It logs an
error message if the event history data is null or invalid.

#### Examples
```c++
void ClearHistory(bool IncludeCharacters)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| IncludeCharacters | `bool` | (optional, defaults to false) If true, also clears the history of all characters in the conversation. |
---
### GetConversationState
Gets the current state of the conversation.

#### Examples
```c++
FInworldConversationState GetConversationState()
```
#### Returns
**Type:** `FInworldConversationState`
**Description:** The conversation state data
---
### GetParticipants
Gets all character participants in the conversation.

#### Examples
```c++
TArray<UInworldCharacterComponent*> GetParticipants()
```
#### Returns
**Type:** `TArray<UInworldCharacterComponent*>`
**Description:** Array of character components participating in the conversation
---
### GetPlayerTarget
Gets the current player target for the conversation.

#### Examples
```c++
UInworldBasePlayerComponent* GetPlayerTarget()
```
#### Returns
**Type:** `UInworldBasePlayerComponent*`
**Description:** The player component currently interacting with this conversation
---
### InvokeNextResponse
Manually invokes the next character response in the conversation.

#### Examples
```c++
virtual void InvokeNextResponse(const FString& CharacterId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterId | `const FString&` | Optional specific character ID to respond. If empty, uses speaker selection. |
#### Returns
**Type:** `virtual void`
---
### RemoveCharacter
Removes a character from the conversation group.

#### Examples
```c++
virtual void RemoveCharacter(FString CharacterId)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| CharacterId | `FString` | The unique ID of the character to remove |
#### Returns
**Type:** `virtual void`
---
---
#### SDK Reference > Character > Character Subsystem
#### Inworld Character Subsystem
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldCharacterSubsystem/InworldCharacterSubsystem
[Overview](../overview) > Inworld Character Subsystem
**Class:** `UInworldCharacterSubsystem` | **Inherits from:** `UWorldSubsystem`
World subsystem for managing Inworld character interactions and conversations. This subsystem provides centralized management of Inworld conversation targets, character registration, and spatial queries for finding conversation partners. It tracks all active conversation targets in the world and provides utilities for distance and angle-based character selection.
## Methods
- [GetCharacterData](#getcharacterdata)
- [GetClosestConversationTarget](#getclosestconversationtarget)
- [GetClosestConversationTargetToConversationTarget](#getclosestconversationtargettoconversationtarget)
- [GetClosestConversationTargetToReticle](#getclosestconversationtargettoreticle)
- [GetConversationTarget](#getconversationtarget)
- [GetConversationTargetMap](#getconversationtargetmap)
## Reference
### GetCharacterData

#### Examples
```c++
bool GetCharacterData(
    UDataTable* DataTableToSearch,
    const FName& CharacterID,
    FInworldCharacterConfig& OutData
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| DataTableToSearch | `UDataTable*` | |
| CharacterID | `const FName&` | |
| OutData | `FInworldCharacterConfig&` | |
#### Returns
**Type:** `bool`
---
### GetClosestConversationTarget
#### Examples
```c++
UInworldConversationTargetComponent* GetClosestConversationTarget(
    const FTransform& Transform,
    float MaxDistance,
    float ToleranceAngle
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Transform | `const FTransform&` | |
| MaxDistance | `float` | |
| ToleranceAngle | `float` | |
#### Returns
**Type:** `UInworldConversationTargetComponent*`
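To make the selection behavior concrete, here is an engine-independent sketch of a combined distance and angle filter of the kind this query applies. All names here are hypothetical stand-ins; the real method operates on `FTransform` and conversation target components:

```c++
#include <cmath>
#include <vector>

// Engine-independent sketch of a distance/angle spatial query.
// Positions are relative to the querying transform; Forward is assumed
// to be a unit vector. These are illustrative types, not engine types.
struct FTargetSketch { float X; float Y; int Id; };

// Returns the Id of the closest target within MaxDistance whose direction
// lies within ToleranceAngle degrees of Forward, or -1 if none qualifies.
int ClosestTarget(float ForwardX, float ForwardY,
                  const std::vector<FTargetSketch>& Targets,
                  float MaxDistance, float ToleranceAngle)
{
    int BestId = -1;
    float BestDist = MaxDistance;
    const float CosTolerance = std::cos(ToleranceAngle * 3.14159265f / 180.f);
    for (const FTargetSketch& T : Targets)
    {
        const float Dist = std::sqrt(T.X * T.X + T.Y * T.Y);
        if (Dist == 0.f || Dist > BestDist)
            continue;
        // Compare the cosine of the angle to the target against the tolerance.
        const float CosAngle = (ForwardX * T.X + ForwardY * T.Y) / Dist;
        if (CosAngle >= CosTolerance)
        {
            BestId = T.Id;
            BestDist = Dist;
        }
    }
    return BestId;
}
```

Targets outside the angular cone are rejected even when they are nearer, which matches the intent of a "tolerance angle" parameter.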
---
### GetClosestConversationTargetToConversationTarget

#### Examples
```c++
UInworldConversationTargetComponent* GetClosestConversationTargetToConversationTarget(
    UInworldConversationTargetComponent* TargetCharacter,
    float MaxDistance,
    float ToleranceAngle
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| TargetCharacter | `UInworldConversationTargetComponent*` | |
| MaxDistance | `float` | |
| ToleranceAngle | `float` | |
#### Returns
**Type:** `UInworldConversationTargetComponent*`
---
### GetClosestConversationTargetToReticle

#### Examples
```c++
UInworldConversationTargetComponent* GetClosestConversationTargetToReticle(
    APlayerController* PlayerController,
    float MaxDistance
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| PlayerController | `APlayerController*` | |
| MaxDistance | `float` | |
#### Returns
**Type:** `UInworldConversationTargetComponent*`
---
### GetConversationTarget
Retrieves a conversation target by its unique identifier.

#### Examples
```c++
UInworldConversationTargetComponent* GetConversationTarget(const FString& Id)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Id | `const FString&` | The unique identifier of the conversation target to find |
#### Returns
**Type:** `UInworldConversationTargetComponent*`
**Description:** The conversation target with the specified ID, or nullptr if not found
---
### GetConversationTargetMap
Retrieves the complete map of all registered conversation targets.

#### Examples
```c++
const TMap<FString, UInworldConversationTargetComponent*>& GetConversationTargetMap()
```
#### Returns
**Type:** `const TMap<FString, UInworldConversationTargetComponent*>&`
**Description:** Reference to the map of conversation targets (ID -> Component)
---
---
#### SDK Reference > Character > Blueprint Function Library
#### Inworld Character BFL
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldCharacterBFL/InworldCharacterBFL
[Overview](../overview) > Inworld Character BFL
**Class:** `UInworldCharacterBFL` | **Inherits from:** `UBlueprintFunctionLibrary`
Blueprint Function Library for Inworld Character utility functions. This class provides Blueprint-accessible utility functions for working with Inworld character data structures, including break functions for complex data types like relations and emotions.
## Methods
- [BreakInworldEmotion](#breakinworldemotion)
- [BreakInworldRelation](#breakinworldrelation)
## Reference
### BreakInworldEmotion

#### Examples
```c++
void BreakInworldEmotion(
    const FInworldEmotion& Emotion,
    EInworldEmotionLabel& Label,
    FString& LabelString
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Emotion | `const FInworldEmotion&` | |
| Label | `EInworldEmotionLabel&` | |
| LabelString | `FString&` | |
---
### BreakInworldRelation

#### Examples
```c++
void BreakInworldRelation(
    const FInworldRelation& Relation,
    int32& Trust,
    int32& Respect,
    int32& Familiar,
    int32& Flirtatious,
    int32& Attraction,
    EInworldRelationLabel& Label,
    FString& LabelString
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Relation | `const FInworldRelation&` | |
| Trust | `int32&` | |
| Respect | `int32&` | |
| Familiar | `int32&` | |
| Flirtatious | `int32&` | |
| Attraction | `int32&` | |
| Label | `EInworldRelationLabel&` | |
| LabelString | `FString&` | |
---
---
#### SDK Reference > Character > Graph Runtime Data
#### Inworld Graph Runtime Data Character Profile
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldGraphRuntimeData_CharacterProfile/InworldGraphRuntimeData_CharacterProfile
[Overview](../overview) > Inworld Graph Runtime Data Character Profile
**Class:** `UInworldGraphRuntimeData_CharacterProfile` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data component for managing character personality and dialogue style. This runtime data component stores and provides access to a character's profile, including personality traits, backstory, dialogue style, and behavioral characteristics. The profile data is used by LLM nodes to generate contextually appropriate and character-consistent responses during conversations.
## Methods
- [Set](#set)
## Reference
### Set
Updates the character profile with new data.

#### Examples
```c++
void Set(const FInworldCharacterProfile& InCharacterProperties)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InCharacterProperties | `const FInworldCharacterProfile&` | The new character profile data to store. |
---
---
#### Inworld Graph Runtime Data Emotion State
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldGraphRuntimeData_EmotionState/InworldGraphRuntimeData_EmotionState
[Overview](../overview) > Inworld Graph Runtime Data Emotion State
**Class:** `UInworldGraphRuntimeData_EmotionState` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data component for managing character emotional states. This runtime data component tracks and manages a character's current emotional state, which can influence their behavior, dialogue tone, and responses during interactions. Emotion states can be explicitly set or derived from AI-generated responses.
## Methods
- [GetEmotion](#getemotion)
- [GetEmotionLabel](#getemotionlabel)
- [GetEmotionLabelString](#getemotionlabelstring)
- [SetEmotionLabel](#setemotionlabel)
- [UpdateEmotion](#updateemotion)
## Reference
### GetEmotion
Retrieves the current emotion state.

#### Examples
```c++
FInworldEmotion GetEmotion()
```
#### Returns
**Type:** `FInworldEmotion`
**Description:** The current emotion state structure.
---
### GetEmotionLabel
Retrieves the current emotion label enum value.

#### Examples
```c++
EInworldEmotionLabel GetEmotionLabel()
```
#### Returns
**Type:** `EInworldEmotionLabel`
**Description:** The current emotion label enum.
---
### GetEmotionLabelString
Retrieves the emotion label as a human-readable string.

#### Examples
```c++
FString GetEmotionLabelString()
```
#### Returns
**Type:** `FString`
**Description:** String representation of the current emotion label.
---
### SetEmotionLabel
Sets the character's emotion to a specific label.

#### Examples
```c++
void SetEmotionLabel(const EInworldEmotionLabel& InEmotionLabel)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InEmotionLabel | `const EInworldEmotionLabel&` | The new emotion label to set. |
---
### UpdateEmotion
Updates the emotion state by parsing an AI-generated text response.
#### Examples
```c++
void UpdateEmotion()
```
---
---
#### Inworld Graph Runtime Data Goals
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldGraphRuntimeData_Goals/InworldGraphRuntimeData_Goals
[Overview](../overview) > Inworld Graph Runtime Data Goals
**Class:** `UInworldGraphRuntimeData_Goals` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data component for managing character goals and intent-driven behaviors. This runtime data component stores and manages a character's goals, which define intent-based triggers and responses. Goals can be activated through dialogue intents or explicit triggers, enabling dynamic character behaviors based on conversation context. The component also supports dynamic parameter resolution through the IInworldGoalParameterInterface.
## Methods
- [AddGoal](#addgoal)
- [ClearAllGoals](#clearallgoals)
- [CompleteGoal](#completegoal)
- [GetGoalsMapKeys](#getgoalsmapkeys)
- [GetParameterSourceObject](#getparametersourceobject)
- [RemoveGoal](#removegoal)
- [Set](#set)
- [SetParameterSourceObject](#setparametersourceobject)
- [TryGetGoal](#trygetgoal)
- [TryGetGoalFromIntent](#trygetgoalfromintent)
## Reference
### AddGoal
Adds a goal to the goals map.
#### Examples
```c++
void AddGoal(
    FName Name,
    const FInworldGoal& Goal
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Name | `FName` | |
| Goal | `const FInworldGoal&` | The goal to add to the goals map. |
---
### ClearAllGoals
Removes all goals from the goals map.

#### Examples
```c++
void ClearAllGoals()
```
---
### CompleteGoal
Removes a goal from the goals map if it is not repeatable.
#### Examples
```c++
void CompleteGoal(FName GoalName)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| GoalName | `FName` | The name of the goal to complete. |
---
### GetGoalsMapKeys
Retrieves all goal names currently stored in the goals map.

#### Examples
```c++
TArray<FName> GetGoalsMapKeys()
```
#### Returns
**Type:** `TArray<FName>`
**Description:** Array of goal names (keys from the goals map).
---
### GetParameterSourceObject
Returns a pointer to the parameter source object.
#### Examples
```c++
IInworldGoalParameterInterface* GetParameterSourceObject()
```
#### Returns
**Type:** `IInworldGoalParameterInterface*`
**Description:** A pointer to the parameter source object.
---
### RemoveGoal
Removes a goal from the goals map.
#### Examples
```c++
void RemoveGoal(FName GoalName)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| GoalName | `FName` | The name of the goal to remove. |
---
### Set
Sets the goals data table and loads its contents into the goals map.

#### Examples
```c++
void Set(UDataTable* InGoalTable)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| InGoalTable | `UDataTable*` | The data table containing goal definitions. |
---
### SetParameterSourceObject
Register a UObject that implements IInworldGoalParameterInterface as the parameter source. When a goal is "text"
triggered (meaning triggered by FInworldData_Text and NOT FInworldData_Trigger), the source object will be
queried for parameter data.

#### Examples
```c++
void SetParameterSourceObject(UObject* Source)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Source | `UObject*` | The object to register as the parameter source. It must implement the `IInworldGoalParameterInterface`. |
---
### TryGetGoal
Retrieves an Inworld Goal by name via the out parameter.
#### Examples
```c++
bool TryGetGoal(
    FName GoalName,
    FInworldGoal& GoalOut
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| GoalName | `FName` | Name of the goal to retrieve. |
| GoalOut | `FInworldGoal&` | Reference to store the retrieved goal in. |
#### Returns
**Type:** `bool`
**Description:** True if a goal exists with the given name, false if not.
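The try-get shape used here is the common Unreal out-parameter idiom: return a success flag and fill a reference on success. A minimal, engine-independent sketch with hypothetical stand-in types:

```c++
#include <map>
#include <string>

// Engine-independent sketch of the out-parameter lookup pattern behind
// TryGetGoal. FGoalSketch stands in for FInworldGoal; names are illustrative.
struct FGoalSketch { std::string Intent; bool bRepeatable; };

bool TryGetGoalSketch(const std::map<std::string, FGoalSketch>& Goals,
                      const std::string& Name, FGoalSketch& OutGoal)
{
    const auto It = Goals.find(Name);
    if (It == Goals.end())
        return false;   // OutGoal is left untouched when the key is missing.
    OutGoal = It->second;
    return true;
}
```

Callers should always check the returned flag before reading the out parameter, since it is unchanged on failure.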
---
### TryGetGoalFromIntent
Retrieves the first Inworld Goal in the goals map that has the specified intent.
#### Examples
```c++
bool TryGetGoalFromIntent(
    FName IntentName,
    FInworldGoal& GoalOut
)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| IntentName | `FName` | Name of the intent. |
| GoalOut | `FInworldGoal&` | Reference to store the retrieved goal in. |
#### Returns
**Type:** `bool`
**Description:** True if a goal exists with the given intent, false if not.
---
---
#### Inworld Graph Runtime Data Knowledge Filter
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldGraphRuntimeData_KnowledgeFilter/InworldGraphRuntimeData_KnowledgeFilter
[Overview](../overview) > Inworld Graph Runtime Data Knowledge Filter
**Class:** `UInworldGraphRuntimeData_KnowledgeFilter` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data component for managing character knowledge filtering. This runtime data component stores and manages knowledge filter settings that constrain the amount and scope of knowledge available to a character during conversations. Knowledge filters can be used to limit responses based on character awareness, permissions, or narrative requirements.
## Methods
- [GetKnowledgeFilter](#getknowledgefilter)
- [Set](#set)
## Reference
### GetKnowledgeFilter
Retrieves the current knowledge filter configuration.

#### Examples
```c++
const FInworldKnowledgeFilter& GetKnowledgeFilter()
```
#### Returns
**Type:** `const FInworldKnowledgeFilter&`
**Description:** Const reference to the knowledge filter settings.
---
### Set
Updates the knowledge filter configuration.

#### Examples
```c++
void Set()
```
---
---
#### Inworld Graph Runtime Data Relation State
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldGraphRuntimeData_RelationState/InworldGraphRuntimeData_RelationState
[Overview](../overview) > Inworld Graph Runtime Data Relation State
**Class:** `UInworldGraphRuntimeData_RelationState` | **Inherits from:** `UInworldGraphRuntimeData`
Runtime data component for managing character relationship states. This runtime data component tracks and manages the relationship state between a character and another entity (typically the player). Relationship states include multi-dimensional metrics like trust, respect, familiarity, flirtation, and attraction that influence how characters interact and respond in conversations.
## Methods
- [GetRelation](#getrelation)
- [GetRelationshipLabel](#getrelationshiplabel)
- [GetRelationshipLabelString](#getrelationshiplabelstring)
- [UpdateRelation](#updaterelation)
- [UpdateRelation](#updaterelation)
## Reference
### GetRelation
Retrieves the complete relationship state.

#### Examples
```c++
FInworldRelation GetRelation()
```
#### Returns
**Type:** `FInworldRelation`
**Description:** The full relationship state structure with all metrics.
---
### GetRelationshipLabel
Retrieves the primary relationship label derived from the metrics.

#### Examples
```c++
EInworldRelationLabel GetRelationshipLabel()
```
#### Returns
**Type:** `EInworldRelationLabel`
**Description:** The relationship label enum (e.g., FRIENDLY, HOSTILE, NEUTRAL).
---
### GetRelationshipLabelString
Retrieves the relationship label as a human-readable string.

#### Examples
```c++
FString GetRelationshipLabelString()
```
#### Returns
**Type:** `FString`
**Description:** String representation of the relationship label.
---
### UpdateRelation
Updates the relationship state with new relation data.

#### Examples
```c++
void UpdateRelation(const FInworldRelation& Relation)
```
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| Relation | `const FInworldRelation&` | The new relationship state to apply. |
---
### UpdateRelation
Updates the relationship state by parsing an AI-generated text response.

#### Examples
```c++
void UpdateRelation()
```
---
---
#### SDK Reference > Character > Nodes
#### Inworld Node Dialog Response Processor
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldNode_DialogResponseProcessor/InworldNode_DialogResponseProcessor
[Overview](../overview) > Inworld Node Dialog Response Processor
**Class:** `UInworldNode_DialogResponseProcessor` | **Inherits from:** `UInworldNode_GeneralTextProcessor`
A text processing node responsible for post-processing a character's dialog response.
**Input Types:**
- `FInworldData_DataStream_String`
**Output Types:**
- `FInworldData_DataStream_String`
---
#### Inworld Node Goals
Source: https://docs.inworld.ai/unreal-engine/runtime/character-reference/InworldNode_Goals/InworldNode_Goals
[Overview](../overview) > Inworld Node Goals
**Class:** `UInworldNode_Goals` | **Inherits from:** `UInworldNode_Custom`
Node that processes and advances character goals based on intents or explicit triggers.
**Input Types:**
- `FInworldData_MatchedIntents OR FInworldCharacterTrigger OR FInworldCharacterAction`
**Output Types:**
- `FInworldCharacterAction`
---
#### SDK Reference
#### Assets
Source: https://docs.inworld.ai/unreal-engine/runtime/templates/assets
The InworldAssets plugin is a lightweight content library used by Runtime templates. It ships meshes, materials, and textures for quick prototyping. It does not include gameplay logic or example levels.
## Where to find
1. In the Content Browser, open the Settings menu.
2. Enable "Show Plugin Content".
3. Navigate to `/Plugins/InworldAssets`.
Do not edit plugin assets in place. Copy what you need into your project (`/Game/...`) and modify the copies so you can update the plugin safely later.
## What's included
Folder structure and representative assets:
- Animals
- Bird
- Meshes:
- Static `SM_Pigeon_L`
- Skeleton: `SM_Pigeon_L_SM_Pigeon_L`
- Materials: `M_pigeon`
- Textures: `T_pigeon_AlbedoTransparency`, `T_pigeon_Normal`
- Physics/Skeleton: `SM_Pigeon_L_SM_Pigeon_L_PhysicsAsset`, `SM_Pigeon_L_SM_Pigeon_L_Skeleton`
- Cat
- Meshes:
- Static `SM_Cat_L`
- Skeleton: `SM_Cat_L_SM_Cat_L`
- Materials: `M_cat`
- Textures: `T_cat_AlbedoTransparency`, `T_cat_Normal`
- Physics/Skeleton: `SM_Cat_L_SM_Cat_L_PhysicsAsset`, `SM_Cat_L_SM_Cat_L_Skeleton`
- Dog
- Meshes:
- Static `SM_Dog_L`
- Skeleton: `SM_Dog_L_SM_Dog_L`
- Materials: `M_dog`
- Textures: `T_dog_AlbedoTransparency`, `T_dog_Normal`
- Physics/Skeleton: `SM_Dog_L_SM_Dog_L_PhysicsAsset`, `SM_Dog_L_SM_Dog_L_Skeleton`
- Building
- Door
- Meshes: `SM_Door_L`
- Materials: `M_wall_door`
- Textures: `wall_door_AlbedoTransparency`, `wall_door_Normal`
- Wall
- Meshes:
`SM_Wall_L`
`SM_Wall_Door_L`
`SM_HalfWall_L`
- Materials: `M_wall`, `M_halfWall`
- Textures: `wall_*`, `halfWall_*`
- Windows
- Meshes:
`SM_WallWindow1_L`
`SM_WallWindow2_L`
- Materials: `M_glass`, `M_wallWindow`, `M_wallWindow1`
- Textures: `wallWindow1_*`, `wallWindow2_*`
- Objects
- Bulb
- Meshes: `SM_Bulb_L`
- Materials: `M_bulb_emissive`, `M_bulb_glass`, `M_bulb_tin`
- Textures: `T_bulb_tin_AlbedoTransparency`, `T_bulb_tin_Normal`
- Button
- Meshes: `SM_Button_L`, `SM_Button_Base_L`
- Materials: `M_button`
- Textures: `button_AlbedoTransparency`, `button_Normal`
- Lever (single/double)
- Meshes:
`SM_Lever_Single_L`
`SM_Lever_Single_Base_L`
`SM_Lever_Double_L`
`SM_Lever_Double_Base_L`
- Materials: `M_lever`, `M_lever_double`
- Textures: `lever_*`
- Rocks (1–3)
- Meshes:
`SM_Rock_1_L`
`SM_Rock_2_L`
`SM_Rock_3_L`
- Materials: `M_rock_1`, `M_rock_2`, `M_rock_3`
- Textures: `T_rock_?_AlbedoTransparency`, `T_rock_?_Normal`
- Shapes
- Cone: `SM_Cone_L`, `M_cone`, `cone_*`
- Cube: `SM_Cube_L`, `M_cube`, `cube_*`
- Cylinder (Regular/Tall/Wide):
`SM_Cylinder50cmTall_L`
`SM_Cylinder1mTall_L`
`SM_CylinderWide_L`
- Icosahedron: `SM_Icosahedron_L`, `M_icosahedron`
- Octahedron: `SM_Octahedron_L`, `M_octahedron`
- Plane: `SM_FlatPlane_L`, `M_flatPlane`, `flatPlane_*`
- Pyramid: `SM_Pyramid_L`, `M_pyramid`
- Torus: `SM_Torus_L`, `M_torus`, `torus_*`
- Switch
- Meshes: `SM_Switch_L`, `SM_Switch_Base_L`
- Materials: `M_switch`
- Textures: `switch_*`
- Trees (1–3)
- Meshes:
`SM_Tree_1_L`
`SM_Tree_2_L`
`SM_Tree_3_L`
- Materials: `M_tree_1`, `M_tree2`, `M_tree3`
- Textures: `T_tree?_AlbedoTransparency`, `T_tree?_Normal`
- Weapons
- Gun: `SM_Gun_L`, `M_gun`, `T_gun_*`
- Sword: `SM_Sword_L`, `M_sword`, `T_sword_*`
## Naming conventions
- Meshes: `SM_*`
- Materials: `M_*`
- Textures: `T_*` (e.g., `*AlbedoTransparency`, `*Normal`)
- Physics assets: `*_PhysicsAsset`
- Skeletons: `*_Skeleton`
## How to use
- Drag meshes (SM_*) into your level; default materials are already assigned.
- Create Material Instances from base materials (M_*) to tweak colors/roughness without editing plugin assets.
- Replace textures (T_*) in your instances to reskin objects quickly.
- For characters/animals, physics and skeleton assets are included; retarget or extend in your project as needed.
## Troubleshooting
- Can't see the content: enable "Show Plugin Content" and ensure the plugin is enabled in `Edit > Plugins`.
- Materials not showing: verify the mesh uses the expected material slot or apply your Material Instance.
- Want to customize: copy assets to `/Game/...` first; avoid editing in `/Plugins/...`.
---
#### Resources
#### Release Notes
Source: https://docs.inworld.ai/release-notes/runtime-unreal
*Coming soon*
---
#### Authentication
Source: https://docs.inworld.ai/unreal-engine/runtime/authentication
The Runtime SDK uses API keys to authenticate requests to Inworld's server.
## Getting an API key
The Runtime SDK currently only supports Basic authorization, although you can use JWT authentication with our standalone Model APIs.
Make sure to keep your Base64 credentials safe, as anyone with your credentials can make requests on your behalf. It is recommended that credentials are stored as environment variables and read at run time.
Do not expose your Base64 API credentials in client-side code (browsers, apps, game builds), as it may be compromised. Runtime SDK support for JWT authentication is coming soon.
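To make the credential handling concrete, here is a minimal, engine-independent sketch of producing a Basic authorization value from an environment variable. The function names and the `INWORLD_API_KEY` variable are assumptions for illustration; inside Unreal you would typically rely on the engine's own Base64 utilities rather than rolling your own encoder.

```c++
#include <cstdlib>
#include <string>

// Illustrative sketch only: builds a Basic authorization value from a
// "key:secret" credential pair read from the environment.
std::string Base64Encode(const std::string& In)
{
    static const char* Table =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string Out;
    size_t i = 0;
    // Encode complete 3-byte groups into 4 output characters.
    for (; i + 2 < In.size(); i += 3)
    {
        const unsigned Block = (unsigned char)In[i] << 16 |
                               (unsigned char)In[i + 1] << 8 |
                               (unsigned char)In[i + 2];
        Out += Table[(Block >> 18) & 63];
        Out += Table[(Block >> 12) & 63];
        Out += Table[(Block >> 6) & 63];
        Out += Table[Block & 63];
    }
    // Encode the 1- or 2-byte tail with '=' padding.
    if (i < In.size())
    {
        unsigned Block = (unsigned char)In[i] << 16;
        const bool bHasTwo = i + 1 < In.size();
        if (bHasTwo) Block |= (unsigned char)In[i + 1] << 8;
        Out += Table[(Block >> 18) & 63];
        Out += Table[(Block >> 12) & 63];
        Out += bHasTwo ? Table[(Block >> 6) & 63] : '=';
        Out += '=';
    }
    return Out;
}

// Read the credential from an environment variable instead of hard-coding it.
std::string GetAuthHeaderValue()
{
    const char* Credential = std::getenv("INWORLD_API_KEY"); // hypothetical name
    return Credential ? "Basic " + Base64Encode(Credential) : std::string();
}
```

Keeping the raw credential out of source control and client builds is the point of this pattern: only the process environment on a trusted machine ever holds it.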
---
#### Support
Source: https://docs.inworld.ai/unreal-engine/runtime/resources/support
---
### Unity
#### Get Started
#### Overview
Source: https://docs.inworld.ai/Unity/runtime/overview
## Get Started
Get started with the Unity AI Runtime.
Create and chat with an AI character with Unity AI Runtime.
## Watch it in Action
## Explore
Explore demos showcasing the use of individual primitive modules, such as LLM, Speech-to-Text (STT), Text-to-Speech (TTS), etc.
See how each graph, edge and official node works in Unity’s Graph Node demos, with ready-made graphs to explore.
Get started building with Graph Editor in Unity AI Runtime.
Explore the Runtime reference for class definitions and functions.
---
#### Quickstart
Source: https://docs.inworld.ai/Unity/runtime/get-started
This guide covers setup and installation for Inworld’s Unity AI Runtime SDK.
## Prerequisites
Before you get started, please make sure you have the following installed:
* Windows or macOS (a Mac is required for iOS builds).
* [Unity](https://unity.com/download): Version 2022.3 LTS or later (recommended: 6000.0.41 or newer)
We now support Windows, macOS, Android, and iOS!
## Installation
Download the Inworld Agent Runtime .unitypackage from here.
If you don't already have an existing Unity project, create a new project.
You can simply drag the downloaded .unitypackage into your project,
or, in your project, go to **Assets > Import Package > Custom Package...**, and select the .unitypackage you downloaded in Step 1.
In the dialog box that pops up, click **Import** to finish importing the package.
After the package is imported, you may see a prompt to restart the Unity Editor.
If you're using Unity 6000 or newer, click **Yes** — no manual restart needed.
Otherwise, click **No** and wait for the remaining assets to finish importing.
After you click **OK**, the package will install files into your project.
Once those are imported, we will automatically download essential assets (Plugins and essential models) for you.
After essential assets are downloaded, a dialog will appear prompting you to enter your API key.
Once you've entered it, the system checks the essential assets; any that have been downloaded appear as installed.
To use the Inworld Agent Runtime SDK with the cloud APIs, go to Assets/InworldRuntime/Resources.
In the Inspector, add your Base64-encoded [Runtime API key](/node/authentication#runtime-api-key).
To get a first impression, we provide several demos. Any of the following will work:
* [Assets/InworldRuntime/Scenes/Primitives/CharacterInteraction.unity](./demos/primitives/character)
* [Assets/InworldRuntime/Scenes/Nodes/CharacterInteractionNode.unity](./demos/nodes/character)
* [Assets/InworldRuntime/Scenes/Nodes/CharacterInteractionNodeWithJson.unity](./demos/nodes/json)
The demos above demonstrate a simple character interaction using our LLM, speech-to-text (STT), and text-to-speech (TTS) primitives.
When you play a demo, you can converse with the character via text or audio (audio is push-to-talk only).
Type your input, or hold the Record button to capture audio and release it to send.
For **Primitives/CharacterInteraction:** You will also need to set the character’s name, gender, role, voice, description, and motivation.
When done, click **Proceed**. This creates a character for the duration of the session (it will not be saved).
For the other two demos, the character data and prompt are stored under `Assets/InworldRuntime/Data/GraphNodes/CharacterConversation/SampleCharacter`.
You can modify those ScriptableObjects to see the effect.
With these steps, you'll be all set to explore the capabilities of the Inworld Agent Runtime!
## Next steps
You can check out the following demos for an overview of the features and modules available in the Inworld Unity AI Runtime.
Explore demos showcasing the use of individual primitive modules, such as LLM, Speech-to-Text (STT), Text-to-Speech (TTS), etc.
See how each graph, edge and official node works in Unity's Graph Node demos, with ready-made graphs to explore.
---
#### Build your own character
Source: https://docs.inworld.ai/Unity/runtime/build-your-own
## Overview
This tutorial focuses on creating your own character by updating the related data assets.
Graph creation is out of scope here, so we will reuse the default GraphNodeTemplate from the demos.
Clone the scene at `Assets/InworldRuntime/Scenes/Nodes/CharacterInteractionNode.unity`.
Rename the clone to anything you like, e.g. `Demo`, and open it.
Test the scene by pressing the `Play` button.
You can type to the character, or hold the record button to capture audio and release it to send.
Note that the default character is Harry Potter; we will replace it later on this page.
Select `CharacterInteractionCanvas` and find the `InworldGraphExecutor` component in the Inspector.
Click the graph field to navigate to the graph asset currently in use, then clone it.
Rename the clone `DemoGraphAsset`.
Assign `DemoGraphAsset` to the `InworldGraphExecutor`, replacing the original asset.
Open the `DemoGraphAsset` you just created.
Under `CharacterInteractionData`, you can see the default character asset.
Double-click to open it; the data shown belongs to Harry Potter.
Navigate to the current `CharacterAsset` and clone it.
Rename the clone `DemoCharacter`, and edit its data to describe your new character, for example Mickey Mouse.
Assign `DemoCharacter` to `DemoGraphAsset` as its character asset.
Test the scene again by pressing the `Play` button.
The character is now Mickey Mouse and behaves accordingly.
---
#### Examples & Demos
#### Primitives and Node Demos
Source: https://docs.inworld.ai/Unity/runtime/demos/overview
The Inworld Agent Runtime provides two ways to interact with Inworld AI modules.
## Primitives
You can use API calls to control these modules directly. This approach suits users who prefer to write code.
For more information about the primitive modules, please check [this page](./primitives/overview)
We've provided demos for each of the following primitives:
Interact directly with the Large Language Model (LLM) primitive.
Transcribe audio input to text using the Speech-to-Text primitive.
Generate audio from text via the Text-to-Speech primitive.
Try out Acoustic Echo Cancellation with the AEC module.
Query and interact with the Knowledge primitive.
Combine primitives to build an interactive character demo.
## Nodes
You can also use Inworld's Graph system to compose nodes and send or receive data in the `InworldBaseData` format via the graph.
We provide demos for node-based integrations.
For more information about the Graph/Node System, please check [this page](./nodes/overview)
Some features are only supported in the node-based form.
See a demonstration of the LLM node working within a graph workflow.
Try out Speech-to-Text node integration in a graph.
Explore Text-to-Speech node usage in a sample graph.
See how the Intent node processes and routes input data.
Experiment with safety checks using the Safety node.
Learn how to use and implement a custom node in the graph system.
Inspect data flow in graphs by examining edges between nodes.
See how to create loops and iterative processes using edges.
Experience a full conversation pipeline using node-based character interaction.
Try the JSON-based character graph template for cross-platform sharing.
Asset-based graphs and compiled JSON graphs behave the same at runtime.
The difference is that the compiled JSON makes it easier to share graphs across platforms (Node.js/Unreal).
---
#### Examples & Demos > Primitive Demos
#### Primitives Overview
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/overview
The Primitive demos call the runtime APIs directly to interact with individual modules.
Interact with the Large Language Model (LLM) primitive directly.
Try out the Speech-to-Text primitive for transcribing audio input to text.
Generate audio output from text using the Text-to-Speech primitive.
Explore Acoustic Echo Cancellation with the AEC primitive module.
Query and interact with the Knowledge primitive.
Experience a full pipeline by combining primitives in the Character Interaction demo.
## InworldController
`InworldController` is the main GameObject in the Unity scene.
Each primitive module is organized as a child under this object.
At runtime, `InworldController` executes modules in batches according to the configured loading process.
### Loading Process
Some modules depend on others having already started.
For example, `Knowledge`, `Safety`, and `Intent` require `TextEmbedder` to be initialized first.
To handle this, `InworldController` maintains a list of loading batches.
A batch only starts once the previous batch has fully completed.
After all batches have finished, the controller raises the event `OnFrameworkInitialized?.Invoke()`.
Any objects subscribed to this event can then proceed with their own initialization.
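The batching rule above can be sketched in language-agnostic pseudocode (Python here, not the SDK's C# API); the batch contents and callback are illustrative:

```python
# Conceptual sketch (not the SDK API): modules are loaded in batches, and a
# batch starts only after every module in the previous batch has finished.
import asyncio

async def init_module(name: str, order: list) -> None:
    await asyncio.sleep(0)          # stand-in for real initialization work
    order.append(name)

async def initialize(batches: list, on_framework_initialized) -> list:
    order = []
    for batch in batches:
        # modules inside one batch may initialize concurrently...
        await asyncio.gather(*(init_module(m, order) for m in batch))
        # ...but the loop does not advance until the whole batch is done
    on_framework_initialized()      # analogous to OnFrameworkInitialized
    return order

order = asyncio.run(initialize(
    [["TextEmbedder"], ["Knowledge", "Safety", "Intent"]],
    lambda: print("framework initialized"),
))
print(order[0])  # TextEmbedder always finishes before the second batch starts
```

This mirrors why `Knowledge`, `Safety`, and `Intent` can safely assume `TextEmbedder` is ready: their batch never begins until the previous batch completes.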
## Primitive Modules
Each module has a prefab stored at `Assets/InworldRuntime/Prefabs/Primitives`.
These modules expose API methods for you to call, and they communicate directly with our graph system through DLL interop.
Primitive modules are also required in Graph/Node templates.
Make sure to include the relevant primitives when working with graph or node-based setups.
### Structure
Each module class includes a `Factory`, an `Interface`, and at least one `Config`.
When `InworldController` starts, each module creates its `Factory`.
The factory then creates the appropriate `Interface` based on the active `Config` (for example, Local vs. Remote, whether to use CUDA, etc.).
The `Interface` is the surface that ultimately interacts with the user or with the Graph.
For example, when you call `InworldController.LLM.GenerateTextAsync()`, the controller first verifies that the LLM module exists and that its interface is initialized, then it calls `LLMInterface.GenerateText`.
The same is true for the Node System.
Each node (for example, `LLMNode`) requires an `LLMInterface` when `CreateRuntime()` is called to instantiate its runtime handle.
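As a rough illustration of this Config-to-Factory-to-Interface dispatch (a Python sketch with illustrative names, not the actual C# SDK types):

```python
# Hypothetical sketch of the factory pattern described above: the factory
# inspects the active Config and builds the matching Interface, or None.
from dataclasses import dataclass

@dataclass
class RemoteLLMConfig:
    api_key: str

@dataclass
class LocalLLMConfig:
    model_path: str
    use_cuda: bool = False

class LLMInterface:
    def __init__(self, backend: str):
        self.backend = backend
    def generate_text(self, prompt: str) -> str:
        return f"[{self.backend}] reply to: {prompt}"

class LLMFactory:
    def create_interface(self, config):
        if isinstance(config, RemoteLLMConfig):
            return LLMInterface("remote")
        if isinstance(config, LocalLLMConfig):
            device = "cuda" if config.use_cuda else "cpu"
            return LLMInterface(f"local/{device}")
        return None  # unknown config type, mirroring the C# null return

factory = LLMFactory()
llm = factory.create_interface(LocalLLMConfig("model.gguf", use_cuda=True))
print(llm.generate_text("hello"))
```

The caller only ever talks to the returned interface, which is why `InworldController.LLM.GenerateTextAsync()` can verify the interface exists before delegating to it.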
---
#### Acoustic Echo Cancellation (AEC) Primitive Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/aec
The Acoustic Echo Cancellation (AEC) template demonstrates how to use our AEC primitive to filter out speaker echo.
Without AEC, if you're not using headphones, the character's voice may be fed back into the STT input.
The AEC module works only with the local model and runs entirely on the CPU.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `AECTemplate` scene.
2. When the game starts, play the two example audio clips (`Farend` and `Nearend`).
The far-end audio comes from the speaker; the near-end audio is captured by the microphone.
3. Press `Generate` to produce the filtered audio.
4. Then press `Play` to hear the result.
## Understanding the Template
### Structure
- This demo has only one prefab under `InworldController`: `AEC`. It contains `InworldAECModule`.
- When `InworldController` initializes, it calls `InitializeAsync()` on the AEC module (see [Primitives Overview](./overview)).
- This creates an `AECFactory`, which then creates an `AECInterface` based on the current `AECConfig`.
### Workflow
Pressing the `Generate` button invokes `AECCanvas.FilterAudio()`.
It first converts the two audio clips (`Farend` and `Nearend`) into `AudioChunks`, then calls `InworldController.AEC.FilterAudio()` to generate the filtered audio.
```c# AECCanvas.cs
public void FilterAudio()
{
    AudioChunk farendChunk = WavUtility.GenerateAudioChunk(m_Farend);
    AudioChunk nearendChunk = WavUtility.GenerateAudioChunk(m_Nearend);
    m_FilteredChunk = InworldController.AEC.FilterAudio(nearendChunk, farendChunk);
    if (m_FilteredChunk == null)
        return;
    if (m_CompleteText)
        m_CompleteText.text = "Audio Generated!";
    if (m_PlayFilteredButton)
        m_PlayFilteredButton.interactable = true;
}
```
## Tips for Better AEC Performance
When filtering audio, always use the data taken directly from the speaker output rather than the recorded raw audio.
Playback characteristics vary by device, and even small timing or amplitude differences can significantly affect the AEC algorithm's output.
This is especially noticeable on laggy devices: the output audio may be delayed or slightly retimed.
That retimed version is exactly what you should use as the far-end reference, not the original raw file.
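A toy numerical example (not the real AEC algorithm) makes the point: a delayed echo only cancels cleanly when the far-end reference carries the same delay as the actual playback.

```python
# Toy illustration of why the far-end reference must match what the speaker
# actually played: subtraction removes the echo only when delays line up.
def residual(mic, reference):
    return [m - r for m, r in zip(mic, reference)]

farend = [0.5, -0.5, 0.5, -0.5, 0.0, 0.0]    # what the app asked to play
delay = 2
played = [0.0] * delay + farend[:-delay]     # what the speaker actually emitted
near = [0.1, 0.0, 0.0, 0.1, 0.0, 0.1]        # the user's own voice
mic = [n + p for n, p in zip(near, played)]  # microphone = voice + echo

good = residual(mic, played)   # reference matches actual playback
bad = residual(mic, farend)    # reference is the raw, un-delayed file

print(good)  # recovers (approximately) the near-end voice
print(sum(abs(x) for x in bad) > sum(abs(x) for x in good))  # True
```

Real echo paths also distort amplitude, which is why production AEC estimates a filter rather than subtracting directly, but the alignment requirement is the same.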
---
#### Character Interaction Demo by Primitives
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/character
The Character Interaction template demonstrates how to create a simple character interaction using the LLM, TTS, and STT primitives.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `CharacterInteractionTemplate` scene.
2. Once loaded, select your preferred character icon and enter the name, role, description, and motivation.
3. Click `Proceed`.
4. Type your message and press `Enter` or click `Send` to submit text.
5. Hold the `Record` button to record, then release to send the audio.
## Understanding the Template
### Structure
This demo combines all primitives using the API approach.
Check `InworldController`; it contains all primitive modules provided in the Inworld Unity AI Runtime SDK.
Its `AudioManager` also contains all `AudioModules` showcased in the [STT Primitive Demo](./stt#inworldaudiomanager).
### Character Creation Panel
In this demo, the `CharacterCreationPanel` holds a `ConversationalCharacterData` asset.
All input and edits modify this data asset.
When you press `Proceed`, the panel invokes its `Proceed` function, which switches to the next panel, `CharacterInteractionPanel`, passing the `ConversationalCharacterData`.
```c# CharacterCreationPanel.cs
public class CharacterCreationPanel : MonoBehaviour
{
    [SerializeField] Toggle m_MaleToggle;
    [SerializeField] Toggle m_FemaleToggle;
    [SerializeField] TMP_Dropdown m_VoiceDropDown;
    [SerializeField] List<string> m_MaleVoices;
    [SerializeField] List<string> m_FemaleVoices;
    [SerializeField] CharacterInteractionPanel m_InteractionPanel;
    ConversationalCharacterData m_CharacterData = new ConversationalCharacterData();
    ...
    public void Proceed()
    {
        m_InteractionPanel.OnCharacterCreated(m_CharacterData, m_CurrentVoiceID);
    }
}
```
### Conversation Prompt
This data asset is located under `Assets/InworldRuntime/Data/General`.
When you click `Proceed`, the character data is inserted into this prompt.
The asset requires the prompt, the character data, and the player name.
### Register Events for All Primitive Modules
In this demo, the `CharacterInteractionPanel` starts by registering each module's events.
This lets the panel handle responses from each primitive.
For example, when STT responds, it captures the text and calls `PlayerSpeaks()`,
which composes the message in the dialog so that the dialog history can be used to generate the LLM prompt.
```c# CharacterInteractionPanel.cs
void OnEnable()
{
    if (m_ConversationPrompt.NeedClearHistoryOnStart)
        m_ConversationPrompt.ClearHistory();
    if (!InworldController.LLM)
        return;
    InworldController.LLM.OnTask += OnLLMProcessing;
    InworldController.LLM.OnTaskFinished += OnLLMRespond;
    if (!InworldController.STT)
        return;
    InworldController.STT.OnTaskFinished += OnSTTFinished;
    if (!InworldController.Audio)
        return;
    InworldController.Audio.Event.onStartCalibrating.AddListener(() => Debug.LogWarning("Start Calibration"));
    InworldController.Audio.Event.onStopCalibrating.AddListener(() => Debug.LogWarning("Calibrated"));
    InworldController.Audio.Event.onPlayerStartSpeaking.AddListener(() => Debug.LogWarning("Player Started Speaking"));
    InworldController.Audio.Event.onPlayerStopSpeaking.AddListener(() => Debug.LogWarning("Player Stopped Speaking"));
    InworldController.Audio.Event.onAudioSent.AddListener(SendAudio);
}
```
### Workflow
- The `InworldController` initializes all modules in sequence.
- Each module creates its factory, which then creates its interfaces.
#### Text to the Character
Submitting text in the input field calls `Submit()`, which:
##### 1. PlayerSpeaks()
Adds the incoming message to the dialog (rendered as bubbles in the panel).
##### 2. RequestResponse()
- Adds this message to `SpeechEvents` in the conversation prompt's event history.
- Uses `InworldFrameworkUtil.RenderJinja()` to incorporate the knowledge, character data, and speech history, rendering the template prompt into a Jinja prompt that is sent to the LLM.
- Eventually calls `InworldController.LLM.GenerateTextAsync()`.
```c# CharacterInteractionPanel.cs
public void Submit()
{
    if (!m_ConversationPrompt)
    {
        Debug.LogError("Cannot find prompt field!");
        return;
    }
    if (!InworldController.LLM)
    {
        Debug.LogError("Cannot find LLM Module!");
        return;
    }
    PlayerSpeaks(m_InputField.text);
    if (m_InputField)
        m_InputField.text = string.Empty;
    RequestResponse();
}

public void PlayerSpeaks(string content)
{
    Utterance utterance = new Utterance
    {
        agentName = PlayerName,
        utterance = content
    };
    m_ConversationPrompt.AddUtterance(utterance);
    InsertBubble(m_BubbleRight, utterance);
}

public async void RequestResponse()
{
    string json = JsonConvert.SerializeObject(m_ConversationPrompt.conversationData);
    string data = InworldFrameworkUtil.RenderJinja(m_ConversationPrompt.prompt, json);
    if (!string.IsNullOrEmpty(data))
    {
        Debug.Log("Write data completed!");
        m_ConversationPrompt.jinjaPrompt = data;
    }
    await InworldController.LLM.GenerateTextAsync(m_ConversationPrompt.jinjaPrompt);
}
```
#### Speak to the Character
Releasing the record button sends audio and triggers the audio thread process (see [STT Primitive Demo](./stt));
it eventually calls `InworldController.STT.RecognizeSpeechAsync`.
Then it follows the same flow as `PlayerSpeaks()` and `RequestResponse()` in "Text to the Character".
```c# CharacterInteractionPanel.cs
void OnEnable()
{
    ...
    InworldController.STT.OnTaskFinished += OnSTTFinished;
    ...
    InworldController.Audio.Event.onAudioSent.AddListener(SendAudio);
}

async void SendAudio(List<float> audioData)
{
    if (InworldController.STT)
    {
        AudioChunk chunk = new AudioChunk();
        InworldVector<float> floatArray = new InworldVector<float>();
        foreach (float data in audioData)
        {
            floatArray.Add(data);
        }
        chunk.SampleRate = 16000;
        chunk.Data = floatArray;
        await InworldController.STT.RecognizeSpeechAsync(chunk);
    }
}

void OnSTTFinished(string sttData)
{
    PlayerSpeaks(sttData);
    RequestResponse();
}
```
#### Get Response from the Character
After calling `InworldController.LLM.GenerateTextAsync()`, the LLM Module starts to work.
It repeatedly invokes the `OnTask` event to stream generated data, which `OnLLMProcessing` captures to render bubbles in the UI.
When generation finishes, the module invokes `OnTaskFinished` to notify the panel.
The panel then sends the generated text to the TTS module to synthesize audio.
```c# CharacterInteractionPanel.cs
void OnEnable()
{
    if (m_ConversationPrompt.NeedClearHistoryOnStart)
        m_ConversationPrompt.ClearHistory();
    if (!InworldController.LLM)
        return;
    InworldController.LLM.OnTask += OnLLMProcessing;
    InworldController.LLM.OnTaskFinished += OnLLMRespond;
    if (!InworldController.STT)
        return;
    ...
}

void OnLLMProcessing(string llmData)
{
    if (m_CurrentCharacterUtterance == null)
    {
        m_CurrentCharacterUtterance = new Utterance
        {
            agentName = Character.name,
            utterance = llmData,
        };
        InsertBubble(m_BubbleLeft, m_CurrentCharacterUtterance);
    }
    else
    {
        m_CurrentCharacterUtterance.utterance = llmData;
        InsertBubble(m_BubbleLeft, m_CurrentCharacterUtterance, m_Bubbles.Count - 1);
    }
}

void OnLLMRespond(string response)
{
    if (!m_ConversationPrompt)
    {
        Debug.LogError("Cannot find prompt field!");
        return;
    }
    if (!string.IsNullOrEmpty(m_CurrentVoiceID))
        InworldController.TTS.TextToSpeechAsync(m_CurrentCharacterUtterance.utterance, m_CurrentVoiceID);
    m_ConversationPrompt.AddUtterance(m_CurrentCharacterUtterance);
    m_CurrentCharacterUtterance = null;
}
```
---
#### LLM Primitive Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/llm
The LLM template demonstrates how to make a simple call to an LLM using the LLM primitive.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `LLMTemplate` scene.
2. Switch LLM models, configure the parameters, then press the `Connect` button.
3. Type messages to the agent.
The agent remembers the chat history within the current conversation.
4. You can also switch to local models by toggling the `Remote` button.
By default it uses `StreamingAssets/llm/Meta-Llama-3.1-8B-Instruct-Q8_0.gguf`, but you can also use your own models.
If you're using the `Local` models, we recommend setting the `Device` to `CUDA` for better performance.
## Understanding the Template
### Structure
- This demo has a single prefab under `InworldController`, `LLM`, which contains `InworldLLMModule`.
- When `InworldController` initializes, it calls `InworldLLMModule.InitializeAsync()` (see [Primitives Overview](./overview)).
- This function creates an `LLMFactory`, then creates an `LLMInterface` based on the current `LLMConfig`.
#### Parameters
- **Max Tokens:** Maximum number of tokens to generate. Longer outputs may cost more and are truncated at this limit.
- **Max Prompt Length:** Maximum tokens allowed in the prompt. Total context window = input + output, so available output ≈ window − input.
- **Temperature:** Controls randomness/creativity. Lower = more deterministic; higher = more diverse.
- **Top P:** Nucleus sampling. Samples only from tokens within cumulative probability P. Usually tune this or Temperature, not both.
- **Repetition Penalty:** Down-weights previously generated tokens to reduce loops and verbosity.
- **Frequency Penalty:** Penalizes tokens the more frequently they appear to curb repetition.
- **Presence Penalty:** Penalizes tokens after their first appearance to encourage introducing new topics.
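To build intuition for Temperature and Top P, here is a toy Python sketch of how they reshape a next-token distribution before sampling; the numbers and helper names are illustrative, not the SDK's sampler:

```python
# Illustrative sketch: temperature scales logits before softmax; top-p keeps
# only the smallest set of tokens whose cumulative probability reaches P.
import math

def apply_temperature(logits, temperature):
    scaled = [l / temperature for l in logits]
    z = sum(math.exp(s) for s in scaled)
    return [math.exp(s) / z for s in scaled]

def top_p_filter(probs, p):
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, prob in ranked:
        kept.append(idx)
        total += prob
        if total >= p:
            break                       # cumulative probability reached P
    return sorted(kept)

logits = [2.0, 1.0, 0.1, -1.0]
cool = apply_temperature(logits, 0.5)   # sharper: the top token dominates
warm = apply_temperature(logits, 2.0)   # flatter: more diverse sampling
print(top_p_filter(apply_temperature(logits, 1.0), 0.9))
```

This is also why the parameter list suggests tuning Temperature or Top P, but not both: each already controls how much probability mass low-ranked tokens receive.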
```c#
public override InworldInterface CreateInterface(InworldConfig config)
{
    if (config is LLMRemoteConfig remoteConfig)
        return CreateInterface(remoteConfig);
    if (config is LLMLocalConfig localConfig)
        return CreateInterface(localConfig);
    return null;
}

public InworldInterface CreateInterface(LLMRemoteConfig config)
{
    IntPtr result = InworldFrameworkUtil.Execute(
        InworldInterop.inworld_LLMFactory_CreateLLM_rcinworld_RemoteLLMConfig(m_DLLPtr, config.ToDLL),
        InworldInterop.inworld_StatusOr_LLMInterface_status,
        InworldInterop.inworld_StatusOr_LLMInterface_ok,
        InworldInterop.inworld_StatusOr_LLMInterface_value,
        InworldInterop.inworld_StatusOr_LLMInterface_delete
    );
    return result != IntPtr.Zero ? new LLMInterface(result) : null;
}

public InworldInterface CreateInterface(LLMLocalConfig config)
{
    Debug.Log("ModelPath: " + config.ModelPath);
    Debug.Log("Device: " + config.Device.Info.Name);
    IntPtr result = InworldFrameworkUtil.Execute(
        InworldInterop.inworld_LLMFactory_CreateLLM_rcinworld_LocalLLMConfig(m_DLLPtr, config.ToDLL),
        InworldInterop.inworld_StatusOr_LLMInterface_status,
        InworldInterop.inworld_StatusOr_LLMInterface_ok,
        InworldInterop.inworld_StatusOr_LLMInterface_value,
        InworldInterop.inworld_StatusOr_LLMInterface_delete
    );
    return result != IntPtr.Zero ? new LLMInterface(result) : null;
}
```
### Workflow
At runtime, when `InworldController` invokes `OnFrameworkInitialized`, the demo's `LLMChatPanel` listens to this event and enables the previously disabled UI.
When the user presses `Enter` or clicks `SEND`, `LLMChatPanel` first builds the chat history and inserts it into the prompt.
#### Prompt
This example uses the prompt asset at `Assets/InworldRuntime/Data/BasicLLM.asset`.
```jinja Simple Dialogue Prompt Template
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are Inworld.AI, in conversation with the user, who is the Player.
# Context for the conversation
## Overview
The conversation is a live dialogue between Inworld.AI and Player. It should NOT include any actions, nonverbal cues, or stage directions—ONLY dialogue.
## Inworld.AI's Dialogue Style
Shorter, natural response lengths and styles are encouraged. Inworld.AI should respond engagingly to Player in a natural manner.
# Response Instructions
Respond as Inworld.AI while maintaining consistency with the provided profile and context. Use the specified dialect, tone, and style.
<|eot_id|>
```
Both the user's messages and the agent's replies are stored in the prompt's `Conversation` list.
When `InworldController.LLM.GenerateTextAsync()` is invoked, the prompt is rendered via Jinja into the following format:
```jinja Simple Dialogue Prompt Template
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are Inworld.AI, in conversation with the user, who is the Player.
# Context for the conversation
## Overview
The conversation is a live dialogue between Inworld.AI and Player. It should NOT include any actions, nonverbal cues, or stage directions—ONLY dialogue.
## Inworld.AI's Dialogue Style
Shorter, natural response lengths and styles are encouraged. Inworld.AI should respond engagingly to Player in a natural manner.
# Response Instructions
Respond as Inworld.AI while maintaining consistency with the provided profile and context. Use the specified dialect, tone, and style.
<|eot_id|>
<|start_header_id|>Player<|end_header_id|>
how is it going
<|start_header_id|>Inworld<|end_header_id|>
It's going well, thanks for asking! How about you?
<|start_header_id|>Player<|end_header_id|>
sure things
<|start_header_id|>Inworld.AI<|end_header_id|>
```
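A minimal stand-in for this rendering step (illustrative Python, not `InworldFrameworkUtil.RenderJinja` or real Jinja) that flattens a conversation list into the header-delimited format shown above:

```python
# Hypothetical sketch: flatten system text plus conversation turns into the
# <|start_header_id|> format, leaving an open header for the model to complete.
def render_prompt(system: str, turns: list, agent_name: str) -> str:
    parts = [
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>",
        system,
        "<|eot_id|>",
    ]
    for speaker, utterance in turns:
        parts.append(f"<|start_header_id|>{speaker}<|end_header_id|>")
        parts.append(utterance)
    # the trailing open header prompts the model to write the agent's next turn
    parts.append(f"<|start_header_id|>{agent_name}<|end_header_id|>")
    return "\n".join(parts)

prompt = render_prompt(
    "You are Inworld.AI, in conversation with the user, who is the Player.",
    [("Player", "how is it going"),
     ("Inworld.AI", "It's going well, thanks for asking! How about you?")],
    "Inworld.AI",
)
print(prompt)
```

The real renderer does this with a Jinja template so the prompt layout stays data-driven rather than hard-coded.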
---
#### Speech To Text (STT) Primitive Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/stt
The Speech To Text (STT) template demonstrates how to perform STT using the STT primitive.
This demo also uses a Voice Activity Detection (VAD) module to detect when the player is speaking.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `STTTemplate` scene.
2. When the game starts, stay quiet for a moment to let the microphone calibrate to background noise.
3. Once you see the `Calibrated` message, speak into the microphone.
4. You'll see the transcribed text appear on screen.
## Understanding the Template
### Structure
- This demo has two prefabs under `InworldController`: `STT` (contains `InworldSTTModule`) and `VAD` (contains `InworldVADModule`).
- When `InworldController` initializes, it calls `InitializeAsync()` on both modules (see [Primitives Overview](./overview)).
- These functions create `STTFactory` and `VADFactory`, and each factory creates its `STTInterface` or `VADInterface` based on the current `STT/VADConfig`.
### InworldAudioManager
`InworldAudioManager` handles audio processing and is also modular. In this demo, it uses four components:
- **AudioCapturer**: Manages microphone on/off and input devices. Uses Unity's `Microphone` by default, and can be extended via third‑party plugins.
- **AudioCollector**: Collects raw samples from the microphone.
- **PlayerVoiceDetector**: Implements `IPlayerAudioEventHandler` and `ICalibrateAudioHandler` to emit player audio events and decide which timestamped segments to keep from the stream.
For example, `TurnBasedVoiceDetector` automatically pauses capture while the character is speaking to prevent echo.
In this demo, `VoiceActivityDetector` extends `PlayerVoiceDetector` and leverages an AI model to accurately detect when the player is speaking.
- **AudioDispatcher**: Sends the captured microphone data for downstream processing.
### Workflow
**Audio Thread:**
At startup, the microphone calibrates to background noise.
The VAD (Voice Activity Detection) module listens for speech, and when speech is detected, the `AudioDispatcher` streams audio frames to the STT module.
Both partial and final transcriptions are produced and displayed in the UI.
Since this section mainly covers STT, detailed explanations about audio capture will be described later.
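As a rough intuition for calibration followed by speech detection (the shipped VAD uses an AI model, not a simple threshold), consider this energy-based Python sketch:

```python
# Simplified energy-based sketch: learn the background noise floor during
# calibration, then flag frames whose RMS energy clearly exceeds it.
def rms(frame):
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def calibrate(noise_frames, margin=3.0):
    # require speech to be several times louder than the measured noise floor
    floor = max(rms(f) for f in noise_frames)
    return floor * margin

def is_speech(frame, threshold):
    return rms(frame) > threshold

noise = [[0.01, -0.01, 0.02, -0.02] for _ in range(5)]
threshold = calibrate(noise)
print(is_speech([0.01, -0.02, 0.01, -0.01], threshold))  # background noise
print(is_speech([0.4, -0.5, 0.45, -0.3], threshold))     # clear speech
```

This is why the template asks you to stay quiet at startup: the calibration frames must contain only background noise, or the threshold is set too high.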
**Main Thread:**
In this demo's `STTCanvas`, each audio-thread event is registered in the `OnEnable` method.
Certain simple events, such as starting or stopping calibration, are handled directly (for example, updating on-screen text):
```c# STTCanvas.cs
void OnEnable()
{
    if (!m_Audio)
        return;
    m_Audio.Event.onStartCalibrating.AddListener(() => Title("Calibrating"));
    m_Audio.Event.onStopCalibrating.AddListener(Calibrated);
    m_Audio.Event.onPlayerStartSpeaking.AddListener(() => Title("PlayerSpeaking"));
    m_Audio.Event.onPlayerStopSpeaking.AddListener(() =>
    {
        Title("");
        if (m_STTResult)
            m_STTResult.text = "";
    });
    m_Audio.Event.onAudioSent.AddListener((audioData) =>
    {
        AudioChunk chunk = new AudioChunk();
        InworldVector<float> floatArray = new InworldVector<float>();
        foreach (float data in audioData)
        {
            floatArray.Add(data);
        }
        chunk.SampleRate = 16000;
        chunk.Data = floatArray;
        _ = InworldController.STT.RecognizeSpeechAsync(chunk);
    });
    InworldController.Instance.OnFrameworkInitialized += OnFrameworkInitialized;
    InworldController.STT.OnTaskFinished += OnSpeechRecognized;
}
```
When the `onAudioSent` event is received, we assemble the audio data into an `AudioChunk`—the audio should be resampled to mono with a sample rate of 16,000 Hz—and call `InworldController.STT.RecognizeSpeechAsync()`.
This function checks whether the STT module exists and has been initialized (i.e., the `STTInterface` is valid).
If so, it directly calls `sttInterface.RecognizeSpeech`, returns the transcription string, and displays it on the `STTCanvas`.
```c# InworldController.cs
public async Awaitable<string> RecognizeSpeechAsync(AudioChunk audioChunk)
{
    string result = "";
    if (!Initialized || !(m_Interface is STTInterface sttInterface))
        return result;
    m_SpeechRecognitionConfig ??= new SpeechRecognitionConfig();
    if (m_InputStream != null)
    {
        m_InputStream.Dispose();
        m_InputStream = null;
    }
    m_InputStream ??= sttInterface.RecognizeSpeech(audioChunk, m_SpeechRecognitionConfig);
    ...
```
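The 16,000 Hz mono requirement can be sketched as follows (illustrative Python, not the SDK's audio pipeline): downmix interleaved stereo, then linearly resample to the target rate.

```python
# Hypothetical sketch of preparing microphone data for RecognizeSpeechAsync:
# average interleaved stereo to mono, then linearly resample to 16 kHz.
def stereo_to_mono(samples):
    # interleaved [L, R, L, R, ...] -> averaged mono
    return [(samples[i] + samples[i + 1]) / 2 for i in range(0, len(samples), 2)]

def resample(samples, src_rate, dst_rate):
    if src_rate == dst_rate:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate                 # fractional source index
        j = int(pos)
        frac = pos - j
        nxt = samples[min(j + 1, len(samples) - 1)]
        out.append(samples[j] * (1 - frac) + nxt * frac)  # linear interpolation
    return out

mono = stereo_to_mono([0.2, 0.4, 0.6, 0.8])
chunk = resample([0.0] * 480, src_rate=48000, dst_rate=16000)
print(len(chunk))  # 480 samples at 48 kHz become 160 samples at 16 kHz
```

Production code would typically use a windowed-sinc or polyphase resampler for quality, but the shape of the transformation is the same.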
---
#### Text-to-Speech (TTS) Primitive Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/tts
The Text-to-Speech (TTS) template shows how to make a simple TTS call using the TTS primitive.
The TTS primitive module uses the legacy TTS system.
This module supports both remote and local models.
However, newer versions of Inworld TTS can only be used remotely and as a Node in the Graph system.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `TTSTemplate` scene.
2. Toggle between `Remote` and `Local`, then press `Connect`.
3. Once connected, select a voice ID, then press `Preview` to listen.
4. You can also type any text and press `SEND` to generate the TTS result.
## Understanding the Template
### Structure
- This demo has a single prefab under `InworldController`: `TTS`, which contains `InworldTTSModule`.
- When `InworldController` initializes, it calls `InitializeAsync()` on the TTS module (see [Primitives Overview](./overview)).
- It creates a `TTSFactory`, which then creates a `TTSInterface` based on the current `TTSConfig`.
### Workflow
At runtime, when `InworldController` invokes `OnFrameworkInitialized`, the demo's `TTSConfigPanel` in `TTSCanvas` listens for this event and enables the previously disabled UI.
When the user switches the voice ID, the system terminates the current `TTSInterface` and creates a new one with the updated voice ID.
When the user clicks the `Preview` button, the `TTSConfigPanel` calls `InworldController.TTS.TextToSpeechAsync()` with "Hello" as the parameter.
When the user presses Enter or clicks `Send`, `TTSConfigPanel` calls `InworldController.TTS.TextToSpeechAsync()` with the input text.
### VoiceID
The voice ID identifies which voice is speaking. You can find the full list on the character page in `Character Studio`.
The primitive TTS module uses our legacy model. Voice IDs may differ from those in the `TTS Playground`.
To find available voices, open `Character Studio`, go to the character editing page, and select voices there.
The `TTSInterface` in the current Inworld Agent Runtime DLL does not support changing voice IDs at runtime.
Therefore, changing the voice ID terminates the current `TTSInterface` and creates a new interface with the selected voice ID.
---
#### Knowledge Primitive Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/primitives/knowledge
The Knowledge template shows how to use the Knowledge primitive.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Primitives` and play the `KnowledgeTemplate` scene.
2. After pressing Play, click `Connect`.
All currently available knowledge records will be displayed on the screen.
3. Click `Add Knowledge` to add more records.
Enter the content you want to add and click `Add`.
In this demo, all added records go to the `knowledge/test` collection.
4. Click `Query Knowledge` to search the knowledge records.
Enter your query and click `Query`.
Matching records will be displayed on the screen.
5. Clicking the trash icon next to a knowledge record deletes all records in that knowledge collection for this game session.
## Understanding the Template
### Structure
- This demo has two prefabs under `InworldController`: `TextEmbedder` (contains `TextEmbedderModule`) and `Knowledge` (contains `KnowledgeModule`).
- These modules create `TextEmbedderFactory` and `KnowledgeFactory`, and each factory creates a `TextEmbedderInterface` or `KnowledgeInterface` based on the current `TextEmbedder/KnowledgeConfig`.
### KnowledgeData
KnowledgeData is the asset this demo interacts with.
After the text embedder initializes, whenever you submit a query, the Knowledge module searches this KnowledgeData and retrieves the most relevant entries.
By default, it's stored under `Assets/InworldRuntime/Data/General`.
Knowledge is stored in groups.
Each group can contain multiple pieces of information.
Each group ID should always start with `knowledge/`.
See the example data in our Unity project.
### Workflow
- Initialization is sequenced because `Knowledge` depends on `TextEmbedder`.
- When `InworldController` initializes, it calls `InitializeAsync()` on `TextEmbedderModule`.
When that finishes, `InworldController` invokes `OnProcessInitialized`, which triggers `KnowledgeModule` initialization (see [Primitives Overview](./overview)).
- Once `KnowledgeModule` is initialized, `InworldController` invokes `OnFrameworkInitialized`.
In this demo, the `KnowledgeConfigPanel` on `KnowledgeCanvas > ConfigPanel` is registered to this event.
- Then `KnowledgeConfigPanel` compiles knowledge using the existing TextEmbedder module.
```c# KnowledgeConfigPanel.cs
void OnFrameworkInitialized()
{
    if (m_ConnectButton)
        m_ConnectButton.Switch(true);
    foreach (InworldUIElement element in m_UIElements)
        element.Interactable = true;
    if (!InworldController.Knowledge || !m_KnowledgeData || !InworldController.TextEmbedder)
        return;
    m_KnowledgeData.CompileKnowledges();
}
```
#### Add Knowledge
- When you press `Add Knowledge` and submit the input text, the data is added to the current `KnowledgeData`.
- In this demo, this invokes `KnowledgeInteractionPanel._AddKnowledge()`.
- It tries to add a new knowledge group `knowledge/test`, then adds your entry to it.
- It then automatically calls `CompileKnowledges()` to activate the new piece of knowledge.
```c# KnowledgeInteractionPanel.cs
void _AddKnowledge()
{
    if (m_KnowledgeData)
        m_KnowledgeData.AddKnowledge("test", m_InputContent.text);
}
```
```c# KnowledgeData.cs
public List<string> AddKnowledge(string knowledgeID, string content)
{
    if (string.IsNullOrEmpty(knowledgeID))
        knowledgeID = "knowledge/new";
    else if (!knowledgeID.StartsWith("knowledge/"))
        knowledgeID = $"knowledge/{knowledgeID}";
    Knowledges knowledge = knowledges.FirstOrDefault(k => k.title == knowledgeID);
    if (knowledge == null)
    {
        Knowledges newKnowledge = new Knowledges
        {
            title = knowledgeID,
            content = new List<string>{content}
        };
        knowledges.Add(newKnowledge);
    }
    else
        knowledge.content.Add(content);
    return CompileKnowledges();
}
```
#### Query Knowledge
- As long as `InworldController` is connected and each module is initialized, you can click `Query Knowledge`.
- Enter any question; if it matches the knowledge database above, related entries are listed.
- More relevant entries appear higher in the list.
- When you press Enter or click `Query`, it first triggers `KnowledgeInteractionPanel._QueryKnowledge()`.
- This calls `InworldController.Knowledge.GetKnowledges()`, which in turn calls `KnowledgeInterface.GetKnowledge()`.
```c#
void _QueryKnowledge()
{
    InworldEvent inworldEvent = new InworldEvent();
    InworldAgentSpeech speech = inworldEvent.Speech;
    speech.AgentName = InworldFrameworkUtil.PlayerName;
    speech.Utterance = m_InputContent.text;
    if (InworldController.Knowledge)
        InworldController.Knowledge.GetKnowledges(m_KnowledgeData.IDs, new List<InworldEvent>{inworldEvent});
}
```
```c# InworldController.cs
public List<string> GetKnowledges(List<string> knowledgeIDs, List<InworldEvent> eventHistory = null)
{
    if (!Initialized || !(m_Interface is KnowledgeInterface knowledgeInterface))
        return null;
    InworldVector<string> inputKnowledges = new InworldVector<string>();
    inputKnowledges.AddRange(knowledgeIDs);
    InworldVector<InworldEvent> inputEvents = new InworldVector<InworldEvent>();
    if (eventHistory != null && eventHistory.Count != 0)
        inputEvents.AddRange(eventHistory);
    InworldVector<string> output = knowledgeInterface.GetKnowledge(inputKnowledges, inputEvents);
    int nSize = output?.Size ?? -1;
    List<string> result = new List<string>();
    for (int i = 0; i < nSize; i++)
        result.Add(output?[i]);
    OnKnowledgeRespond?.Invoke(result);
    return result;
}
```
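Conceptually, the query ranks entries by embedding similarity and returns the most relevant first. A toy Python sketch, with word overlap standing in for the real `TextEmbedder` embeddings:

```python
# Toy retrieval sketch (the real Knowledge primitive uses learned embeddings):
# score each record against the query and return the best matches first.
def embed(text):
    # stand-in "embedding": a bag of lowercase words
    return set(text.lower().split())

def similarity(a, b):
    # Jaccard overlap as a stand-in for embedding cosine similarity
    return len(a & b) / len(a | b) if a | b else 0.0

def get_knowledge(records, query, top_k=2):
    q = embed(query)
    ranked = sorted(records, key=lambda r: similarity(embed(r), q), reverse=True)
    return ranked[:top_k]

records = [
    "The castle gate opens at dawn",
    "Dragons sleep in the northern caves",
    "The gate password changes every week",
]
print(get_knowledge(records, "when does the gate open"))
```

Learned embeddings capture meaning rather than shared words, which is why the real module can match a query even when it uses none of the record's vocabulary.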
---
#### Examples & Demos > Graph Node demos
#### Node Demos Overview
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/overview
Below are all available node demos—please select one to try out.
See a demonstration of the LLM node working within a graph workflow.
Try out Speech-to-Text node integration in a graph.
Explore Text-to-Speech node usage in a sample graph.
See how the Intent node processes and routes input data.
Experiment with safety checks using the Safety node.
Learn how to use and implement a custom node in the graph system.
Inspect data flow in graphs by examining edges between nodes.
See how to create loops and iterative processes using edges.
Experience a full conversation pipeline using node-based character interaction.
Try the JSON-based character graph template for cross-platform sharing.
The node demos above are built with our graph node editor (or by creating node assets directly).
At runtime, the assets interact with the corresponding primitive module interfaces and the GraphExecutor to send and receive `InworldBaseData`.
## InworldController
`InworldController` is the main GameObject in the Unity scene.
Each primitive module is organized as a child of this object.
At runtime, `InworldController` executes modules in batches according to the configured loading process.
Once initialized, the corresponding module interfaces become available to the Graph/Node system.
## InworldGraphExecutor
`InworldGraphExecutor` contains a Graph ScriptableObject asset.
### Workflow
1. At runtime, it first creates runtime handles for every graph, edge, and node by calling `CreateRuntime()`.
2. It then calls `Compile()` to make the graph executable.
3. After compilation, other scripts can interact with the executor by calling its `ExecuteGraphAsync()` method with input.
4. When execution finishes, it raises the `OnGraphResult` event.
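The lifecycle above can be sketched with a minimal stand-in (illustrative names only; the real `InworldGraphExecutor` API differs):

```csharp
using System;
using System.Threading.Tasks;

// Conceptual stand-in for the executor lifecycle described above;
// this is NOT the real InworldGraphExecutor API.
class MiniGraphExecutor
{
    bool m_Compiled;
    public event Action<string> OnGraphResult;

    public void CreateRuntime() { /* step 1: create runtime handles for graphs, edges, nodes */ }

    public void Compile() => m_Compiled = true; // step 2: make the graph executable

    // Step 3: run with input; step 4: raise OnGraphResult when finished.
    public async Task ExecuteGraphAsync(string input)
    {
        if (!m_Compiled)
            throw new InvalidOperationException("Call Compile() before executing.");
        await Task.Yield();
        OnGraphResult?.Invoke($"processed: {input}");
    }
}
```

The key constraint to remember is step ordering: executing before `Compile()` is an error, which is why the demos gate their UI on the `OnGraphCompiled` event.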
## Graph
A graph represents a workflow for solving one or more tasks.
It contains nodes and edges that describe the system, and it is executed by the graph executor.
## Nodes
A node is a functional unit that processes input data (either the initial input or the output from a previous node).
It sends the processed result to the next node via edges, or outputs directly if it is an end node.
The InworldRuntime system is implemented in a C++ DLL and is type-sensitive.
Ensure that data types match from one node to the next.
If they do not match, create a custom node to perform the conversion.
## InworldBaseData
This is the fundamental data type that flows through the graph via edges.
All nodes can only receive and send `InworldBaseData` or its subclasses.
Although all data processing is based on `InworldBaseData`, nodes—especially those that call the C++ DLL—are still very strict about concrete types.
Passing different subclasses, even if they inherit from the same base, can sometimes cause crashes.
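To illustrate the strictness with plain stand-in types (these are not the SDK's classes): a node that expects one concrete subclass rejects any sibling subclass, so a conversion step must run first.

```csharp
using System;

// Illustrative stand-ins, not the actual SDK types.
abstract class BaseData { }
class TextData : BaseData { public string Text = ""; }
class AudioData : BaseData { public float[] Samples = Array.Empty<float>(); }

static class StrictTextNode
{
    // A strict node accepts exactly TextData; any other subclass is an error.
    public static string Process(BaseData input)
    {
        if (input is not TextData text)
            throw new InvalidOperationException($"Expected TextData, got {input.GetType().Name}.");
        return text.Text.ToUpperInvariant();
    }
}

static class AudioToTextConverter
{
    // A conversion node placed before the strict node, e.g. an STT step.
    public static TextData Convert(AudioData audio) =>
        new TextData { Text = $"[{audio.Samples.Length} samples transcribed]" };
}
```

In the real graph, the conversion would be a custom node (or an STT node) inserted between the producer and the strict consumer.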
### Inworld Nodes
Inworld nodes are built on Inworld’s primitive modules.
In the current SDK, we provide:
- [LLM](../../runtime-reference/InworldNodeAsset/LLMNodeAsset)
- [TTS](../../runtime-reference/InworldNodeAsset/TTSNodeAsset)
- [STT](../../runtime-reference/InworldNodeAsset/STTNodeAsset)
- [RandomCannedText](../../runtime-reference/InworldNodeAsset/RandomCannedTextNodeAsset)
- [TextChunking](../../runtime-reference/InworldNodeAsset/TextChunkingNodeAsset)
- [TextAggregator](../../runtime-reference/InworldNodeAsset/TextAggregatorNodeAsset)
- [Subgraph](../../runtime-reference/InworldNodeAsset/SubgraphNodeAsset)
- [Safety](../../runtime-reference/InworldNodeAsset/SafetyNodeAsset)
- [Intent](../../runtime-reference/InworldNodeAsset/IntentNodeAsset)
Some of these nodes require their corresponding interfaces to be initialized first.
These nodes do not expose `ProcessBaseData()` and are not intended to be subclassed for custom data processing.
### Custom Nodes
Custom nodes are authored in Unity and are primarily used to convert between data types.
Examples include:
- GetPlayerName
- FilterInput
- FormatPrompt
- AddSpeechEvent
- ConversationEndPoint
Each custom node can be subclassed. They expose a key method, `ProcessBaseData()`, which converts the incoming `InworldBaseData` to the outgoing `InworldBaseData`.
## Edges
Edges connect nodes and can also define conditions that block certain data from passing through—for example, to prevent incompatible data types from reaching later nodes.
---
#### Character Interaction Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/character
This demo uses the full graph node system to build a relatively complete single graph that combines modules including LLM, STT, TTS, and Safety.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `CharacterInteractionNode` scene.
2. After the scene loads, you can enter text and press Enter or click the `SEND` button to submit.
3. You can also hold the `Record` button to record audio, then release it to send.
4. The AI agent responds with both audio and text. If you send audio, it will be transcribed to text first.
## Understanding the Graph
You can find the graph on the `InworldGraphExecutor` of `CharacterInteractionCanvas`.
The graph is relatively complex—let's use the graph editor to illustrate:
On the left, `FilterInputNode` acts as the `StartNode` and processes user input.
If the data is `InworldText` or `InworldAudio`, it passes downstream; otherwise, it returns an error and stops.
If the input is `InworldAudio`, it first goes through `STTNode` for transcription to text, then into `SafetyNode`. If it is text, it goes directly into `SafetyNode`.
Note that the two outgoing edges from `FilterInput` are not default edges. One is `TextEdge` and the other is `AudioEdge`. Their `MeetsCondition` checks are simple: for `InworldText`, `TextEdge` passes; for `InworldAudio`, `AudioEdge` passes. Otherwise they block.
You can assume that, by default, data tries to flow forward in the graph node system. When designing:
- Add a `CustomNode` before or after to convert the data into the expected type (slower), or
- Configure a custom `Edge` that allows only the types needed by the next node and blocks the rest.
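Either way, the gate is the edge's `MeetsCondition` check. A minimal sketch with illustrative stand-in types (not the SDK's):

```csharp
// Illustrative stand-ins, not the SDK's types.
abstract class BaseData { }
class TextData : BaseData { }
class AudioData : BaseData { }

// An edge that only lets one concrete type through, mirroring the
// TextEdge/AudioEdge checks described above.
class TypedEdge<T> where T : BaseData
{
    public bool MeetsCondition(BaseData data) => data is T;
}
```

A `TypedEdge<TextData>` passes `TextData` and blocks everything else, so the downstream node never receives an incompatible type.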
`SafetyNode` checks input text against its `SafetyData` categories and thresholds.
If the input is safe, it proceeds to `AddPlayerSpeech`, then on through `LLM` into `AddCharacterSpeech`.
Otherwise, the user's input is ignored and the flow goes to a `SafetyResponse`, which is a `RandomCannedText` node that randomly selects one predefined message and sends it directly to `AddCharacterSpeech`.
In this demo, no `SafetyData` is configured, which means all inputs are allowed.
To change this, click `SafetyNode`.
The Inspector will highlight the node, and you can adjust `SafetyData` in the panel below.
`SafetyNode` has two outgoing edges.
The upper edge is a special `SafetyEdge` whose `MeetsCondition` simply checks whether the input is safe.
If safe, it proceeds to `AddCharacterSpeech`; otherwise, it goes to `RandomCannedText`.
`AddPlayerSpeech` is an `AddSpeechEventNode` that inherits from `CustomNode`.
It converts various upstream types into text when possible.
During creation, it uses the boolean `m_IsPlayer` to obtain the player or agent name, so the final output can be tagged with the correct speaker.
In this demo, `AddPlayerSpeech` connects to an early exit `PlayerFinal` to notify Unity that the graph has the player's input portion available.
```c# AddSpeechEventNodeAsset.cs
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
    if (!(m_Graph is CharacterInteractionGraphAsset charGraph))
    {
        return new InworldError("AddSpeechEvent Node can only be used on a Character Interaction Graph.", StatusCode.FailedPrecondition);
    }
    InworldBaseData inputData = inputs[0];
    string outResult = TryProcessSafetyResult(inputData);
    if (string.IsNullOrEmpty(outResult))
        outResult = TryProcessTTSOutput(inputData);
    if (string.IsNullOrEmpty(outResult))
        outResult = TryProcessLLMResponse(inputData);
    if (string.IsNullOrEmpty(outResult))
        outResult = TryProcessText(inputData);
    if (string.IsNullOrEmpty(outResult))
        return new InworldError($"Unsupported data type {inputData.GetType()}.", StatusCode.Unimplemented);
    AddUtterance(m_SpeakerName, outResult);
    return new InworldText(outResult);
}
```
This node also passes its output to `FormatPrompt`, then on to `LLM` and `AddCharacterSpeech`.
This is an `EndNode`.
It emits the `PlayerSpeech` output, because sometimes we need an early return while the rest of the graph continues.
In this demo, this node lets the handler registered to the graph executor's `OnGraphResult` capture the user's own message (especially STT‑transcribed text) to render a UI bubble, etc.
This is also a `CustomNode`.
It stores the `AddSpeechEvent` result into the runtime `DialogHistory`, renders the prompt from the Jinja template, then wraps it into an `LLMChatRequest` and sends it to `LLMNode`.
Here is the `Prompt Template` used in this demo.
```jinja Prompt Template
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are {{Character.name}}, in conversation with the user, who is pretending to be {{Player}}.
# Context for the conversation
## Overview
The conversation is a live dialogue between {{Character.name}} and {{Player}}. It should NOT include any actions, nonverbal cues, or stage directions—ONLY dialogue.
## {{Character.name}}'s Dialogue Style
Shorter, natural response lengths and styles are encouraged. {{Character.name}} should respond engagingly to {{Player}} in a natural manner.
## Profile of {{Character.name}}
Name: {{Character.name}}
Role: {{Character.role}}
Pronouns: {{Character.pronouns}}
## Personality and Background
{{Character.description}}
## Relevant Facts
{% for record in Knowledge.records %}
{{record}}
{% endfor %}
## Motivation
{{Character.motivation}}
# Response Instructions
Respond as {{Character.name}} while maintaining consistency with the provided profile and context. Use the specified dialect, tone, and style.
<|eot_id|>
{% for speechEvent in EventHistory.speechEvents %}
<|start_header_id|>{{speechEvent.agentName}}<|end_header_id|>
{{speechEvent.utterance}}
{% endfor %}
<|start_header_id|>{{Character.name}}<|end_header_id|>
```
And here is the Jinja prompt after filling it with `CharacterData`, `DialogHistory`, `PlayerData`, etc.
```jinja Jinja Prompt
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are Harry Potter, in conversation with the user, who is pretending to be Player.
# Context for the conversation
## Overview
The conversation is a live dialogue between Harry Potter and Player. It should NOT include any actions, nonverbal cues, or stage directions—ONLY dialogue.
## Harry Potter's Dialogue Style
Shorter, natural response lengths and styles are encouraged. Harry Potter should respond engagingly to Player in a natural manner.
## Profile of Harry Potter
Name: Harry Potter
Role:
Pronouns:
## Personality and Background
Harry Potter is a brave and loyal wizard known for his role in defeating the dark wizard Lord Voldemort. He has unruly black hair, green eyes, and a lightning-shaped scar on his forehead. Harry is humble despite his fame in the wizarding world, and values friendship, courage, and doing what's right over what's easy.
## Relevant Facts
## Motivation
To protect the people I care about, stand against dark magic, and ensure peace in the wizarding world.
# Response Instructions
Respond as Harry Potter while maintaining consistency with the provided profile and context. Use the specified dialect, tone, and style.
<|eot_id|>
<|start_header_id|>Player<|end_header_id|>
how much is 2+2
<|start_header_id|>Harry Potter<|end_header_id|>
Well, even in the wizarding world, 2 plus 2 is 4.
<|start_header_id|>Player<|end_header_id|>
So what's your name and what's your favorite sports?
<|start_header_id|>Harry Potter<|end_header_id|>
```
You can compare the two prompts.
Like `AddPlayerSpeech`, `AddCharacterSpeech` is an `AddSpeechEventNode` that inherits from `CustomNode` and converts upstream types to text when possible.
During creation, it uses `m_IsPlayer` to obtain either the player's or the agent's name so the final output is tagged with the speaker.
In this demo, `AddCharacterSpeech` receives the value returned from the LLM and prefixes it with the character's name.
`AddCharacterSpeech` also connects to an early exit `CharFinal` to notify Unity that the character's output portion is available.
These two nodes trim the text generated by the LLM, because some models stream segmented output.
`TextChunking` merges those segments into a single string.
`TextProcessor` is a `CustomNode` that removes undesirable content before sending to TTS (e.g., brackets, emojis).
Some TTS models will literally read those symbols.
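As a sketch of what such a cleanup step might do (an assumption about `TextProcessor`'s behavior, not its actual code):

```csharp
using System.Text;
using System.Text.RegularExpressions;

static class TtsTextCleaner
{
    // Matches parenthetical/bracketed asides that TTS would read aloud.
    static readonly Regex Bracketed = new Regex(@"[\(\[\{][^\)\]\}]*[\)\]\}]");

    public static string Clean(string input)
    {
        string text = Bracketed.Replace(input, "");
        // Drop surrogate-pair characters (most emoji live outside the BMP).
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
            if (!char.IsSurrogate(c))
                sb.Append(c);
        // Collapse whitespace left behind by the removals.
        return Regex.Replace(sb.ToString(), @"\s{2,}", " ").Trim();
    }
}
```

For example, `TtsTextCleaner.Clean("Hello (waves) world 👋")` yields `"Hello world"`, which reads naturally when synthesized.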
This is the third and final `EndNode`.
It takes text produced by either `RandomCannedText` or the LLM, processes it through the two text nodes above, and then synthesizes speech in `TTSNode`.
### InworldController
The `InworldController` contains all the primitive modules and an `InworldAudioManager`, which also contains all the audio modules.
For details about the primitive module, see the [Primitive Demos](../primitives/overview).
For details about the AudioManager, see the [Speech-to-text Node Demo](./stt#inworldaudiomanager).
## Workflow
1. When the game starts, `InworldController` initializes all its primitive modules.
Each module creates a factory and then builds its interface based on the provided configs.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, the `CharacterInteractionNodeTemplate` of the `CharacterInteractionPanel` subscribes to it and configures the prompt. Users can then interact with the graph system.
```c# CharacterInteractionNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
    if (!(obj is CharacterInteractionGraphAsset charGraph))
        return;
    m_CharacterName = charGraph.prompt.conversationData.Character.name;
}
```
5. If the user sends text, it reaches the `Submit()` function, which converts the input into `InworldText`.
```c# CharacterInteractionNodeTemplate.cs
public async void Submit()
{
    if (!m_InputField)
        return;
    string input = m_InputField.text;
    m_InputField.text = string.Empty;
    await m_InworldGraphExecutor.ExecuteGraphAsync("Text", new InworldText(input));
}
```
6. If the user sends audio, the `AudioDispatchModule` of `InworldAudioManager` raises the `onAudioSent` event.
`CharacterInteractionNodeTemplate` subscribes to this event and handles it in `SendAudio()`.
```c# CharacterInteractionNodeTemplate.cs
protected override void OnEnable()
{
    base.OnEnable();
    if (!InworldController.Audio)
        return;
    InworldController.Audio.Event.onStartCalibrating.AddListener(() => Debug.LogWarning("Start Calibration"));
    InworldController.Audio.Event.onStopCalibrating.AddListener(() => Debug.LogWarning("Calibrated"));
    InworldController.Audio.Event.onPlayerStartSpeaking.AddListener(() => Debug.LogWarning("Player Started Speaking"));
    InworldController.Audio.Event.onPlayerStopSpeaking.AddListener(() => Debug.LogWarning("Player Stopped Speaking"));
    InworldController.Audio.Event.onAudioSent.AddListener(SendAudio);
}
async void SendAudio(List<float> audioData)
{
    if (m_InworldGraphExecutor.Graph.IsJsonInitialized || InworldController.STT)
    {
        InworldVector<float> floatArray = new InworldVector<float>();
        foreach (float data in audioData)
        {
            floatArray.Add(data);
        }
        InworldAudio audio = new InworldAudio(floatArray, 16000);
        await m_InworldGraphExecutor.ExecuteGraphAsync("Audio", audio);
    }
}
```
7. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `CharacterInteractionNodeTemplate` subscribes to in order to receive the data.
If the result is user text (or STT‑transcribed text), a bubble is created directly.
If it is a character reply, the bubble is updated (created if not found, otherwise appended).
If the result is audio, it is converted into an `AudioClip` and played.
```c# CharacterInteractionNodeTemplate.cs
protected override async void OnGraphResult(InworldBaseData obj)
{
    InworldText text = new InworldText(obj);
    if (text.IsValid)
    {
        string speech = text.Text;
        string[] speechData = speech.Split(':', 2);
        if (speechData.Length <= 1)
            return;
        if (speechData[0] == InworldFrameworkUtil.PlayerName)
            PlayerSpeaks(speechData[1]);
        else
            LLMSpeaks(speechData[1]);
        return;
    }
    InworldDataStream outputStream = new InworldDataStream(obj);
    if (!outputStream.IsValid)
        return;
    InworldInputStream stream = outputStream.ToInputStream();
    int sampleRate = 0;
    float[] finalData = null;
    List<float> buffer = new List<float>(64 * 1024);
    await Awaitable.BackgroundThreadAsync();
    while (stream != null && stream.HasNext)
    {
        TTSOutput ttsOutput = stream.Read();
        if (ttsOutput == null)
            continue;
        InworldAudio ttsOutputAudio = ttsOutput.Audio;
        sampleRate = ttsOutputAudio.SampleRate;
        List<float> wf = ttsOutputAudio.Waveform?.ToList();
        if (wf != null && wf.Count > 0)
            buffer.AddRange(wf);
    }
    await Awaitable.MainThreadAsync();
    finalData = buffer.Count > 0 ? buffer.ToArray() : null;
    if (sampleRate <= 0 || finalData == null || finalData.Length == 0)
        return;
    AudioClip clip = AudioClip.Create("TTS", finalData.Length, 1, sampleRate, false);
    clip.SetData(finalData, 0);
    m_AudioSource?.PlayOneShot(clip);
}
```
---
#### LLM Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/llm
This demo showcases how to use the `LLMNode`.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `LLMNode` scene.
2. Enter your request. The AI agent will respond.
## Understanding the Graph
You can find the graph on the `InworldGraphExecutor` of `LLMNodeCanvas`.
The graph is very simple. It contains a single node, `LLMNode`, with no edges.
`LLMNode` is both the `StartNode` and the `EndNode`.
### InworldController
The `InworldController` is also simple; it contains only one primitive module: `LLM`.
For details about the primitive module, see the [LLM Primitive Demo](../primitives/llm).
## Workflow
1. When the game starts, `InworldController` initializes its only module, `LLMModule`, which creates the `LLMInterface`.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, only `LLMNode.CreateRuntime()` is called, using the created `LLMInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `LLMNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# LLMNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
    foreach (InworldUIElement element in m_UIElements)
        element.Interactable = true;
}
```
5. When the user sends text, this demo wraps the message into an `LLMChatRequest` and sends it to the LLM as raw input.
```c# LLMNodeTemplate.cs
public async void SendText()
{
    if (!InworldController.LLM)
    {
        Debug.LogError("Cannot find LLM Module!");
        return;
    }
    // Compose the input into `LLMChatRequest`
    InworldMessage message = PlayerSpeaks(m_InputField.text);
    m_InputField.text = "";
    LLMChatRequest llmChatRequest = new LLMChatRequest(m_Messages);
    await m_InworldGraphExecutor.ExecuteGraphAsync("LLM", llmChatRequest);
}
```
6. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `LLMCanvas` subscribes to in order to receive the data.
---
#### STT (Speech-to-text) Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/stt
This demo showcases how to use the `STTNode`.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `STTNode` scene.
2. Once the graph is compiled, speak into the microphone to generate text.
## Understanding the Graph
You can find the graph on the `InworldGraphExecutor` of `STTCanvas`.
The graph is very simple. It contains a single node, `STTNode`, with no edges.
`STTNode` is both the `StartNode` and the `EndNode`.
### InworldController
The `InworldController` is also simple; it contains only one primitive module: `STT`.
For details about the primitive module, see the [STT Primitive Demo](../primitives/stt).
### InworldAudioManager
`InworldAudioManager` handles audio processing and is also modular.
In this demo, it uses four components:
- **AudioCapturer**: Manages microphone on/off and input devices. Uses Unity's `Microphone` by default, and can be extended via third‑party plugins.
- **AudioCollector**: Collects raw samples from the microphone.
- **PlayerVoiceDetector**: Implements `IPlayerAudioEventHandler` and `ICalibrateAudioHandler` to emit player audio events and decide which timestamped segments to keep from the stream.
- **AudioDispatcher**: Sends the captured microphone data for downstream processing.
### Workflow
**Audio Thread:**
At startup, the microphone calibrates to background noise.
`PlayerVoiceDetector` listens for speech using SNR (Signal‑to‑Noise Ratio).
When it exceeds the threshold, `AudioDispatcher` streams audio frames to `InworldAudio`.
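The detection step can be sketched as comparing a frame's RMS energy to the calibrated noise floor in decibels (illustrative only; the SDK's detector is more involved):

```csharp
using System;

static class VoiceDetector
{
    // Root-mean-square energy of one audio frame.
    public static double Rms(float[] frame)
    {
        double sum = 0;
        foreach (float s in frame)
            sum += (double)s * s;
        return Math.Sqrt(sum / frame.Length);
    }

    // Speech is assumed when the frame's energy exceeds the noise floor
    // (measured during calibration) by more than thresholdDb.
    public static bool IsSpeech(float[] frame, double noiseFloorRms, double thresholdDb = 10.0)
    {
        double snrDb = 20.0 * Math.Log10(Rms(frame) / Math.Max(noiseFloorRms, 1e-9));
        return snrDb > thresholdDb;
    }
}
```

Calibration amounts to measuring `noiseFloorRms` over a short stretch of ambient audio before the player speaks.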
**Main Thread:**
1. When the game starts, `InworldController` initializes its only module, `STTModule`, which creates the `STTInterface`.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, only `STTNode.CreateRuntime()` is called, using the created `STTInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `STTNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# STTNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
    foreach (InworldUIElement element in m_UIElements)
        element.Interactable = true;
}
```
5. When `AudioDispatcher` sends data, `STTNodeTemplate` handles its `onAudioSent` event with the `SendAudio()` function, converting the `List<float>` audio data into `InworldAudio`.
```c# STTNodeTemplate.cs
protected override void OnEnable()
{
    base.OnEnable();
    if (!m_Audio)
        return;
    m_Audio.Event.onStartCalibrating.AddListener(() => Title("Calibrating"));
    m_Audio.Event.onStopCalibrating.AddListener(Calibrated);
    m_Audio.Event.onPlayerStartSpeaking.AddListener(() => Title("PlayerSpeaking"));
    m_Audio.Event.onPlayerStopSpeaking.AddListener(() =>
    {
        Title("");
        if (m_STTResult)
            m_STTResult.text = "";
    });
    m_Audio.Event.onAudioSent.AddListener(SendAudio);
}
void SendAudio(List<float> audioData)
{
    if (!m_ModuleInitialized)
        return;
    InworldVector<float> wave = new InworldVector<float>();
    wave.AddRange(audioData);
    _ = m_InworldGraphExecutor.ExecuteGraphAsync("STT", new InworldAudio(wave, wave.Size));
}
```
6. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `STTNodeTemplate` subscribes to in order to receive the data.
```c# STTNodeTemplate.cs
protected override void OnGraphResult(InworldBaseData obj)
{
    InworldText text = new InworldText(obj);
    if (text.IsValid && m_STTResult)
        m_STTResult.text += text.Text;
}
```
---
#### TTS (Text-to-speech) Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/tts
This demo showcases how to use the `TTSNode`.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `TTSNode` scene.
2. Once the graph is compiled, enter text or send a preview message to generate speech.
## Understanding the Graph
You can find the graph on the `InworldGraphExecutor` of `TTSCanvas`.
The graph is very simple. It contains a single node, `TTSNode`, with no edges.
`TTSNode` is both the `StartNode` and the `EndNode`.
### InworldController
The `InworldController` is also simple; it contains only one primitive module: `TTS`.
For details about the primitive module, see the [TTS Primitive Demo](../primitives/tts).
### Workflow
1. When the game starts, `InworldController` initializes its only module, `TTSModule`, which creates the `TTSInterface` using the voice ID selected in the dropdown.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, only `TTSNode.CreateRuntime()` is called, using the created `TTSInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `TTSNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# TTSNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
    foreach (InworldUIElement element in m_UIElements)
        element.Interactable = true;
}
```
5. After the UI is initialized, pressing the `Preview` button sends "Hello, I'm {voiceID}" as `InworldText` to the graph.
6. When you enter a sentence and press `Enter` or the `Send` button, your message is also sent as `InworldText`.
7. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `TTSNodeTemplate` subscribes to in order to receive the data.
```c# TTSNodeTemplate.cs
protected override async void OnGraphResult(InworldBaseData obj)
{
    InworldDataStream outputStream = new InworldDataStream(obj);
    InworldInputStream stream = outputStream.ToInputStream();
    int sampleRate = 0;
    List<float> result = new List<float>();
    await Awaitable.BackgroundThreadAsync();
    while (stream != null && stream.HasNext)
    {
        TTSOutput ttsOutput = stream.Read();
        if (ttsOutput == null)
            continue;
        InworldAudio chunk = ttsOutput.Audio;
        sampleRate = chunk.SampleRate;
        List<float> data = chunk.Waveform?.ToList();
        if (data != null && data.Count > 0)
            result.AddRange(data);
        await Awaitable.NextFrameAsync();
    }
    await Awaitable.MainThreadAsync();
    string output = $"SampleRate: {sampleRate} Sample Count: {result.Count}";
    Debug.Log(output);
    int sampleCount = result.Count;
    if (sampleRate == 0 || sampleCount == 0)
        return;
    AudioClip audioClip = AudioClip.Create("TTS", sampleCount, 1, sampleRate, false);
    audioClip.SetData(result.ToArray(), 0);
    m_AudioSource?.PlayOneShot(audioClip);
}
```
8. The returned data type from `TTSNode` is `InworldDataStream`, which does not expose read APIs. Convert it to `InworldInputStream` first.
9. In this demo, we read on a background thread using Unity’s `Awaitable`.
10. After all waveform data is collected and we switch back to the main thread, we play it using the attached `AudioSource`.
### Switching voiceID
The InworldGraphSystem must be compiled before it can be used, and the voice ID is set during the compilation phase.
Therefore, to switch the voice ID at runtime, you need to terminate the current graph executor and restart the initialization process with the new ID.
---
#### Intent Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/intent
This demo showcases how to use the `IntentNode`.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `IntentNode` scene.
2. Once the graph is compiled, enter text.
The system finds the closest intents from the intent list and provides a similarity score (0 = lowest, 1 = highest).
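The score is an embedding-space similarity; cosine similarity is one common choice for this kind of 0-to-1 match score (a sketch only; the SDK's exact metric is not documented here):

```csharp
using System;

static class IntentScorer
{
    // Cosine similarity between two embedding vectors; 1 means identical direction.
    public static double Cosine(double[] a, double[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb) + 1e-12);
    }
}
```

Conceptually, each `IntentSample` embedding is scored against the input embedding, and the best-matching intents are reported.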
## Understanding the Graph
`IntentNodeCanvas` contains an `InworldGraphExecutor` whose graph asset includes only a single `IntentNode`.
The graph is very simple. It contains a single node, `IntentNode`, with no edges.
`IntentNode` is both the `StartNode` and the `EndNode`.
### IntentData
The `IntentData` is defined directly within the `IntentNodeAsset`.
The data structure works as follows: each topic includes several `IntentSample` entries to help the AI module recognize those categories.
### InworldController
`InworldController` includes only one primitive module: `TextEmbedder`.
This is because `IntentNode` requires a `TextEmbedder` interface during `CreateRuntime()`.
```c# IntentNodeAsset.cs
public override bool CreateRuntime(InworldGraphAsset graphAsset)
{
    m_Graph = graphAsset;
    ComponentStore componentStore = new ComponentStore();
    componentStore.AddTextEmbedderInterface(NodeName, InworldController.TextEmbedder.Interface as TextEmbedderInterface);
    InworldCreationContext creationContext = new InworldCreationContext(componentStore);
    IntentNodeCreationConfig creationCfg = GetNodeCreationConfig() as IntentNodeCreationConfig;
    IntentNodeExecutionConfig executionCfg = GetNodeExecutionConfig() as IntentNodeExecutionConfig;
    Runtime = new IntentNode(NodeName, creationContext, creationCfg, executionCfg);
    return Runtime?.IsValid ?? false;
}
```
### Workflow
1. When the game starts, `InworldController` initializes its only module, `TextEmbedderModule`, which creates the `TextEmbedderInterface`.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, only `IntentNode.CreateRuntime()` is called, using the created `TextEmbedderInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `IntentNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# IntentNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
foreach (InworldUIElement element in m_UIElements)
element.Interactable = true;
}
```
5. After the UI is initialized, pressing `Enter` or the `SEND` button sends the input text to the graph.
6. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `IntentNodeTemplate` subscribes to in order to receive the data.
```c# IntentNodeTemplate.cs
protected override void OnGraphResult(InworldBaseData output)
{
    MatchedIntents matched = new MatchedIntents(output);
    if (matched.IsValid)
        DisplayMatchedIntents(matched);
    else
        Debug.LogError(output);
}
```
---
#### Safety Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/safety
This demo showcases how to use the `SafetyNode`.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `SafetyNode` scene.
2. Once the graph is compiled, enter text.
3. The system compares the input against the safety word lists and returns a similarity score (0 = lowest, 1 = highest).
4. If the score exceeds the threshold, the input is marked unsafe.
## Understanding the Graph
`SafetyNodeCanvas` contains an `InworldGraphExecutor` whose graph asset includes only a single `SafetyNode`.
The graph is very simple. It contains a single node, `SafetyNode`, with no edges.
`SafetyNode` is both the `StartNode` and the `EndNode`.
### SafetyData
The `SafetyData` is defined directly within the `SafetyNodeAsset`.
You can select multiple topics and configure a threshold for each detection.
Higher thresholds are more permissive (1 = almost everything is considered safe).
Lower thresholds are stricter (0 = almost everything is considered unsafe).
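Per topic, the check reduces to a score-versus-threshold comparison (a sketch of the semantics, not the SDK's code):

```csharp
using System.Collections.Generic;

static class SafetyCheck
{
    // Returns true (safe) only if every topic score stays at or below its threshold.
    // A threshold of 1 lets almost everything pass; 0 blocks almost everything.
    public static bool IsSafe(IReadOnlyDictionary<string, double> topicScores,
                              IReadOnlyDictionary<string, double> thresholds)
    {
        foreach (var (topic, score) in topicScores)
            if (thresholds.TryGetValue(topic, out double t) && score > t)
                return false;
        return true;
    }
}
```

This matches the description above: raising a topic's threshold makes the check more permissive, lowering it makes the check stricter.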
### InworldController
`InworldController` includes 2 primitive modules: `TextEmbedder` and `SafetyChecker`.
This is because `SafetyNode` requires both the `TextEmbedder` interface and the `SafetyChecker` configuration during `CreateRuntime()`.
The safety module also depends on `TextEmbedderModule` being initialized first.
```c# SafetyNodeAsset.cs
public override NodeCreationConfig GetNodeCreationConfig()
{
    SafetyCheckerNodeCreationConfig nodeCreationCfg = new SafetyCheckerNodeCreationConfig();
    nodeCreationCfg.SafetyConfig = InworldController.Safety.CreationConfig; // <== This requires SafetyCheckerInterface.
    nodeCreationCfg.EmbedderComponentID = ComponentID;
    m_CreationConfig = nodeCreationCfg;
    return m_CreationConfig;
}
public override bool CreateRuntime(InworldGraphAsset graphAsset)
{
    m_Graph = graphAsset;
    InworldSafetyModule safety = InworldController.Safety;
    if (!safety)
        return false;
    safety.SetupSafetyThreshold(m_SafetyData);
    ComponentStore componentStore = new ComponentStore();
    componentStore.AddTextEmbedderInterface(NodeName, InworldController.TextEmbedder.Interface as TextEmbedderInterface);
    InworldCreationContext creationContext = new InworldCreationContext(componentStore);
    SafetyCheckerNodeCreationConfig creationCfg = GetNodeCreationConfig() as SafetyCheckerNodeCreationConfig;
    SafetyCheckerNodeExecutionConfig executionCfg = GetNodeExecutionConfig() as SafetyCheckerNodeExecutionConfig;
    Runtime = new SafetyCheckerNode(NodeName, creationContext, creationCfg, executionCfg);
    return Runtime?.IsValid ?? false;
}
```
### Workflow
1. When the game starts, `InworldController` initializes `TextEmbedderModule`, which creates the `TextEmbedderInterface`.
2. Next, `SafetyChecker` initializes `InworldSafetyModule`.
3. Then `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, `SafetyNode.CreateRuntime()` is called with both the `TextEmbedderInterface` and the `SafetyChecker` configuration as input. After that, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `SafetyNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# SafetyNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
foreach (InworldUIElement element in m_UIElements)
element.Interactable = true;
}
```
5. After the UI is initialized, pressing `Enter` or the `SEND` button sends the input text to the graph.
6. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `SafetyNodeTemplate` subscribes to in order to receive the data.
```c# SafetyNodeTemplate.cs
protected override void OnGraphResult(InworldBaseData output)
{
SafetyResult safetyResult = new SafetyResult(output);
if (safetyResult.IsValid)
DisplaySafety(safetyResult);
}
```
---
#### Custom Node Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/custom
This demo showcases how to create your `CustomNode` and make your own implementations.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `CustomNode` scene.
2. Once the graph is compiled, you can either enter text, or input audio by holding the record button and releasing it to send.
3. If you send text, this node converts your input to uppercase.
4. If you send audio, it raises the pitch.
## Understanding the Graph
`CustomNodeSampleCanvas` contains an `InworldGraphExecutor` whose graph asset includes only a single `CustomSample`.
The graph is very simple. It contains a single node, `CustomSampleNode`, with no edges.
`CustomSampleNode` is both the `StartNode` and the `EndNode`.
### CustomNode details
The node uses `CustomSampleNode`, a custom implementation that inherits from `CustomNodeAsset`.
In its overridden `ProcessBaseData()`, it inspects the input `InworldBaseData` and processes it based on its type.
```c# CustomSampleNodeAsset.cs
public class CustomSampleNodeAsset : CustomNodeAsset
{
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
if (inputs.Size == 0)
{
return new InworldError("No input data", StatusCode.DataLoss);
}
InworldBaseData inputData = inputs[0]; // YAN: Let's only process the last single input.
InworldText textResult = new InworldText(inputData);
if (textResult.IsValid)
return ProcessTextData(textResult);
InworldAudio audioResult = new InworldAudio(inputData);
if (audioResult.IsValid)
return ProcessAudioData(audioResult);
return new InworldError($"Unsupported data type: {inputData.GetType()}", StatusCode.Unimplemented);
}
...
}
```
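The `ProcessTextData()` helper is elided above. Based on the demo's described behavior (uppercasing text input), it might look roughly like this sketch; the exact implementation in the sample may differ:

```c# CustomSampleNodeAsset.cs (sketch)
// Hypothetical sketch of the elided text branch: the demo description
// says text input is converted to uppercase before being returned.
InworldBaseData ProcessTextData(InworldText text)
{
    return new InworldText(text.Text.ToUpperInvariant());
}
```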
### InworldAudioManager
In this demo, the `InworldController` does not include any primitive modules.
It includes an `InworldAudioManager` for sampling the microphone input.
`InworldAudioManager` handles audio processing and is also modular.
In this demo, it uses four components:
- **AudioCapturer**: Manages microphone on/off and input devices. Uses Unity's `Microphone` by default, and can be extended via third‑party plugins.
- **AudioCollector**: Collects raw samples from the microphone.
- **PlayerVoiceDetector**: Implements `IPlayerAudioEventHandler` and `ICalibrateAudioHandler` to emit player audio events and decide which timestamped segments to keep from the stream.
- **AudioDispatcher**: Sends the captured microphone data for downstream processing.
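As a rough illustration of what a capturer/collector pair does, a minimal loop over Unity's built-in `Microphone` API might look like the sketch below. The class and method names here are illustrative assumptions, not the actual Inworld component API:

```c# MinimalMicSampler.cs (sketch)
using UnityEngine;

// Illustrative stand-in for AudioCapturer + AudioCollector built on
// Unity's Microphone API. The real Inworld components are modular and
// considerably more capable.
public class MinimalMicSampler : MonoBehaviour
{
    AudioClip m_Clip;
    int m_LastPos;

    void OnEnable()
    {
        // AudioCapturer's role: start the default microphone device.
        m_Clip = Microphone.Start(null, true, 1, 16000);
    }

    void Update()
    {
        // AudioCollector's role: pull newly recorded samples each frame.
        int pos = Microphone.GetPosition(null);
        if (m_Clip == null || pos == m_LastPos)
            return;
        int count = (pos - m_LastPos + m_Clip.samples) % m_Clip.samples;
        float[] samples = new float[count];
        m_Clip.GetData(samples, m_LastPos);
        m_LastPos = pos;
        // Hand the samples to voice detection / dispatching here.
    }

    void OnDisable() => Microphone.End(null);
}
```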
### Workflow
1. When the game starts, `InworldController` initializes immediately and invokes the `OnFrameworkInitialized` event, because there are no primitive modules.
2. Then `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`.
In this case, the custom node's `CreateRuntime()` is called; it is implemented in its parent class.
```c# CustomNodeAsset.cs
public override bool CreateRuntime(InworldGraphAsset graphAsset)
{
m_Graph = graphAsset;
m_Executor = new CustomNodeProcessExecutor(ProcessBaseDataIO);
Runtime = new CustomNodeWrapper(NodeName, m_Executor);
return Runtime?.IsValid ?? false;
}
protected virtual void ProcessBaseDataIO(IntPtr contextPtr)
{
try
{
// Here is the virtual ProcessBaseData for override.
CustomNodeProcessExecutor.SetLastOutput(ProcessBaseData(CustomNodeProcessExecutor.LastIntputs));
}
...
}
```
3. After compilation, the `OnGraphCompiled` event is invoked.
In this demo, `CustomNodeTemplate` subscribes to it and enables the UI components.
Users can then interact with the graph system.
```c# CustomNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
foreach (InworldUIElement element in m_UIElements)
element.Interactable = true;
}
```
4. After the UI is initialized, send the input text or audio to the graph.
5. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `CustomNodeTemplate` subscribes to in order to receive the data.
```c# CustomNodeTemplate.cs
protected override void OnGraphResult(InworldBaseData obj)
{
InworldText text = new InworldText(obj);
if (text.IsValid)
{
m_ResultText.text = "Result: " + text.Text;
return;
}
InworldAudio inworldAudio = new InworldAudio(obj);
if (!inworldAudio.IsValid)
return;
m_AudioSource.clip = inworldAudio.AudioClip;
m_AudioSource.Play();
}
```
---
#### Edge Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/edge
This demo shows how an `Edge` works in the graph node system.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `EdgeDemo` scene.
2. The overall experience is the same as in the [LLM Node Demo](./llm).
Type your text; the AI agent will respond.
## Understanding the Graph
`NodeConnectionCanvas` contains an `InworldGraphExecutor`.
The graph contains two nodes: a custom node `TextToLLM` and an `LLMNode`, connected by an edge `Edge_SampleText_To_LLM`.
`TextToLLM` is the `StartNode` and `LLMNode` is the `EndNode`.
You can also see this connection in the Graph Editor.
### CustomNode details
This node is a `TxtToPromptSampleNodeAsset`, implemented by inheriting from `CustomNodeAsset`.
In its overridden `ProcessBaseData()`, it inspects the input `InworldBaseData`, wraps it into an `LLMChatRequest` (the type accepted by `LLMNode`), and sends it onward.
```c# TxtToPromptSampleNodeAsset.cs
public class TxtToPromptSampleNodeAsset : CustomNodeAsset
{
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
string strInput = "";
if (inputs != null)
{
for (int i = 0; i < inputs.Size; i++)
{
InworldText text = new InworldText(inputs[i]);
if (text.IsValid)
{
strInput += text.Text;
}
}
}
InworldVector messages = new InworldVector();
InworldMessage message = new InworldMessage();
message.Role = Role.User;
message.Content = strInput;
messages.Add(message);
return new LLMChatRequest(messages);
}
}
```
### Edge
This edge uses the default behavior: it simply forwards all output from the previous node to the next node.
### InworldController
The `InworldController` is also simple; it contains only one primitive module: `LLM`.
## Workflow
1. When the game starts, `InworldController` initializes its only module, `LLMModule`, which creates the `LLMInterface`.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`.
In this case, the `TextToLLM` node initializes immediately.
When `LLMNode.CreateRuntime()` is called, it uses the created `LLMInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked.
In this demo, `NodeConnectionTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.
```c# CustomNodeAsset.cs
public override bool CreateRuntime(InworldGraphAsset graphAsset)
{
m_Graph = graphAsset;
m_Executor = new EdgeNodeProcessExecutor(ProcessBaseDataIO);
Runtime = new EdgeNodeWrapper(NodeName, m_Executor);
return Runtime?.IsValid ?? false;
}
protected virtual void ProcessBaseDataIO(IntPtr contextPtr)
{
try
{
// Here is the virtual ProcessBaseData for override.
EdgeNodeProcessExecutor.SetLastOutput(ProcessBaseData(EdgeNodeProcessExecutor.LastIntputs));
}
...
}
```
```c# NodeConnectionTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
foreach (InworldUIElement element in m_UIElements)
element.Interactable = true;
}
```
5. After the UI is initialized, send the input text to the graph.
6. Calling `ExecuteGraphAsync()` eventually produces a result and invokes `OnGraphResult()`, which `NodeConnectionTemplate` subscribes to in order to receive the data.
```c# NodeConnectionTemplate.cs
protected override void OnGraphResult(InworldBaseData obj)
{
LLMChatResponse response = new LLMChatResponse(obj);
Debug.Log(obj);
if (response.IsValid && response.Content != null && response.Content.IsValid)
{
string message = response.Content.ToString();
InsertBubble(m_BubbleLeft, Role.Assistant.ToString(), message);
}
}
```
---
#### Loop Edge Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/loop
This demo shows how Edge conditions and loops work in the graph node system.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `EdgeLoopDemo` scene.
2. Whenever you type something, it responds with the input prefixed by `*` characters on the left.
## Understanding the Graph
`NodeConnectionCanvas` contains an `InworldGraphExecutor`.
The graph contains three nodes:
- `TextCombiner`: A custom node that adds `*` in front of sentences.
- `NodeFinal`: A custom conversation‑endpoint node that prints the text.
- `FilterInput`: The start node, a custom node that filters out mismatched input types.
There are three edges:
- `FilterInput` to `TextCombiner`
- `TextCombiner` to `FilterInput`: This is a customized LoopEdge, with `IsLoop` toggled.
- `FilterInput` to `NodeFinal`.
`FilterInput` is the `StartNode` and `NodeFinal` is the `EndNode`.
You can also see these connections more clearly in the Graph Editor.
## CustomNode details
### TextCombiner
In its overridden `ProcessBaseData()`, it adds `*` to the left of the input text and sets it as the output.
```c# TextCombinerNodeAsset.cs
public class TextCombinerNodeAsset : CustomNodeAsset
{
public override string NodeTypeName => "TextCombinerNode";
public string currentText = "";
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
if (inputs.Size == 0)
{
return new InworldError("No input data", StatusCode.DataLoss);
}
InworldBaseData inputData = inputs[0];
InworldText textResult = new InworldText(inputData);
if (textResult.IsValid)
currentText = $"* {textResult.Text}";
return new InworldText(currentText);
}
void OnEnable()
{
currentText = "";
}
}
```
### FilterInput
This is a custom node also used in the [Character Interaction](./character) demo.
It filters out input `InworldBaseData` that are neither `InworldText` nor `InworldAudio`.
```c# FilterInputNodeAsset.cs
public class FilterInputNodeAsset : CustomNodeAsset
{
public override string NodeTypeName => "FilterInputNode";
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
if (inputs.Size == 0)
{
return new InworldError("No input data", StatusCode.DataLoss);
}
InworldBaseData inputData = inputs[0]; // YAN: Let's only process the last single input.
InworldText textResult = new InworldText(inputData);
if (textResult.IsValid)
return textResult;
InworldAudio audioResult = new InworldAudio(inputData);
if (audioResult.IsValid)
return audioResult;
return new InworldError($"Unsupported data type: {inputData.GetType()}", StatusCode.Unimplemented);
}
}
```
### NodeFinal
`NodeFinal` uses the custom node `ConversationEndpointNodeAsset`.
During `CreateRuntime()`, it stores the speaker's name and later returns output text containing both the speaker's name and the result.
It is typically used as the end node in [Character Interaction](./character).
```c# ConversationEndpointNodeAsset.cs
public override bool CreateRuntime(InworldGraphAsset graphAsset)
{
if (graphAsset is CharacterInteractionGraphAsset charGraph)
m_SpeakerName = m_IsPlayer ? InworldFrameworkUtil.PlayerName : charGraph.characters[0].characterName;
return base.CreateRuntime(graphAsset);
}
protected override InworldBaseData ProcessBaseData(InworldVector inputs)
{
if (inputs.Size <= 0)
return new InworldError("No input data", StatusCode.DataLoss);
InworldText text = new InworldText(inputs[0]);
if (text.IsValid)
{
return new InworldText($"{m_SpeakerName}: {text.Text}");
}
return inputs[0];
}
```
## Edge
The loop edge from `TextCombiner` to `FilterInput` is a customized edge with `IsLoop` toggled.
In its overridden checking function `MeetsCondition()`:
if the current loop count exceeds the limit, the edge blocks passage;
otherwise, it allows passage (sending control back to the loop start `FilterInput`).
Because each iteration prefixes another `*` before passing the result back to `FilterInput`, you will see the number of `*` increase with each loop.
```c# LoopEdgeAsset.cs
public class LoopEdgeAsset : InworldEdgeAsset
{
public int echoTimes = 3;
int m_CurrentLoop = 0;
public override string EdgeTypeName => "LoopEdge";
protected override bool MeetsCondition(InworldBaseData inputData)
{
Debug.Log($"Current Loop: {m_CurrentLoop} -> {echoTimes}");
m_CurrentLoop++;
if (m_CurrentLoop < echoTimes)
return m_AllowedPassByDefault;
m_CurrentLoop = 0;
return !m_AllowedPassByDefault;
}
void OnEnable()
{
m_CurrentLoop = 0;
}
}
```
Be mindful of memory allocation when using Loop Edges.
Because the graph node system executes inside C++, each iteration may allocate new memory.
Other edges use the default behavior: they simply forward all output from the previous node to the next node.
### InworldController
The `InworldController` contains no primitive modules.
## Workflow
1. When the game starts, `InworldController` initializes immediately because there are no primitives.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`.
For how `CreateRuntime` works on custom nodes, see the [CustomNode Demo](./custom).
For edges, during `CreateRuntime()` the system calls `SetEdgeCondition()` and registers `OnConditionCheck` as the function pointer for the condition.
Inside `OnConditionCheck`, the system calls the overridden `MeetsCondition()` virtual function implemented by each edge subclass.
```c# InworldEdgeAsset.cs
public bool CreateRuntime(EdgeWrapper wrapper)
{
if (wrapper == null || !wrapper.IsValid)
return false;
m_RuntimeWrapper = wrapper;
if (IsLoop)
m_RuntimeWrapper.SetToLoop();
if (!IsRequired)
m_RuntimeWrapper.SetToOptional();
SetEdgeCondition(); // <==
m_RuntimeWrapper.SetCondition(m_Executor);
m_RuntimeWrapper.Build();
return true;
}
protected void SetEdgeCondition(string customEdgeName = "")
{
if (!m_IsMultiThread)
{
EdgeConditionExecutor executor = new EdgeConditionExecutor(OnConditionCheck, this);
if (!string.IsNullOrEmpty(customEdgeName))
InworldComponentManager.RegisterCustomEdgeCondition(customEdgeName, executor);
m_Executor = executor;
}
else
{
EdgeConditionThreadedExecutor executor = new EdgeConditionThreadedExecutor(OnConditionCheck);
if (!string.IsNullOrEmpty(customEdgeName))
InworldComponentManager.RegisterCustomEdgeCondition(customEdgeName, executor);
m_Executor = executor;
}
}
static void OnConditionCheck(IntPtr data)
{
InworldEdgeAsset edgeAsset = GCHandle.FromIntPtr(data).Target as InworldEdgeAsset;
if (edgeAsset == null)
return;
InworldBaseData inputData = new InworldBaseData(InworldInterop.inworld_EdgeConditionExecutor_GetLastInput());
InworldInterop.inworld_EdgeConditionExecutor_SetNextOutput(edgeAsset.MeetsCondition(inputData));
}
```
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked.
In this demo, `NodeConnectionTemplate` subscribes to it and enables the UI components.
Users can then interact with the graph system.
```c# LoopEdgeNodeTemplate.cs
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
foreach (InworldUIElement element in m_UIElements)
element.Interactable = true;
}
```
5. After the UI is initialized, send the input text to the graph.
6. Calling `ExecuteGraphAsync()` causes the graph to loop, sending the `TextCombiner` result back to `FilterInput` until the loop count configured by `LoopEdge` is reached.
It then produces a result and invokes `OnGraphResult()`, which `NodeConnectionTemplate` subscribes to in order to receive the data.
```c# LoopEdgeNodeTemplate.cs
protected override void OnGraphResult(InworldBaseData obj)
{
InworldText response = new InworldText(obj);
if (response.IsValid)
{
string message = response.Text;
InsertBubble(m_BubbleLeft, Role.User.ToString(), message);
}
}
```
---
#### Character Interaction Node Json Demo
Source: https://docs.inworld.ai/Unity/runtime/demos/nodes/json
A demo that uses all Primitive modules and behaves the same as the [Character Interaction Demo](./character).
That demo compiles by calling the API directly, while this one creates the runtime by supplying a JSON file to `CreateRuntime()`.
A well‑formed JSON can speed up compilation, and the JSON file can be reused across other platforms such as Unreal and Node.js.
Creating the runtime from JSON is still experimental.
Behavior may change at any time, and JSON generated by the Graph View Editor may encounter various issues.
## Run the Template
1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `CharacterInteractionNodeWithJson` scene.
2. After the scene loads, you can enter text and press Enter or click the `SEND` button to submit.
3. You can also hold the `Record` button to record audio, then release it to send.
4. All behavior should be exactly the same as the [Character Interaction Demo](./character).
## Understanding the Graph
The graph should be exactly the same as the [Character Interaction Demo](./character).
## JSON structure
A valid JSON contains the following sections: `schema_version`, `components`, and `main`. In `main`, provide the `id`, then list `nodes` and `edges`, and finally define `start_nodes` and `end_nodes`.
```json unity_character_engine.json
{
"schema_version": "1.0.0",
"components": [
...
{
"id": "stt_component",
"type": "STTInterface",
"creation_config": {
"type": "LocalSTTConfig",
"properties": {
"model_path": "{{STT_MODEL_PATH}}",
"device": {
"type": "CUDA",
"index": -1,
"info": {
"name": "",
"timestamp": 0,
"free_memory_bytes": 0,
"total_memory_bytes": 0
}
},
"default_config": {}
}
}
},
...
{
"id": "text_edge",
"type": "TextEdge"
},
...
],
"main": {
"id": "main_graph",
"nodes": [
{
"id": "FilterInput",
"type": "FilterInputNode"
},
{
"id": "Safety",
"type": "SafetyCheckerNode",
"creation_config": {
"type": "SafetyCheckerNodeCreationConfig",
"properties": {
"embedder_component_id": "bge_embedder_component",
"safety_config": {
"model_weights_path": "{{SAFETY_MODEL_PATH}}"
}
}
},
"execution_config": {
"type": "SafetyCheckerNodeExecutionConfig",
"properties": {
}
}
},
...
],
"edges": [
{
"from_node": "FilterInput",
"to_node": "STT",
"condition_id": "audio_edge",
"optional": true
},
...
],
"start_nodes": [
"FilterInput"
],
"end_nodes": [
"TTS",
"CharFinal",
"PlayerFinal"
]
}
}
```
### schema_version
For now, use 1.0.0.
### components
The only difference between the JSON-based approach and the pure API approach is that this ScriptableObject contains a list of components, which effectively mirrors the JSON. These components are also ScriptableObjects, but they have no effect in the Unity AI Runtime itself.
Their sole purpose is to generate the JSON, because a valid Inworld JSON must include the `components` field.
```json unity_character_engine.json
{
"schema_version": "1.0.0",
"components": [
{
"id": "qwen_llm_component",
"type": "LLMInterface",
"creation_config": {
"type": "RemoteLLMConfig",
"properties": {
"provider": "inworld",
"model_name": "Qwen/Qwen2-72B-Instruct",
"api_key": "{{INWORLD_API_KEY}}",
"default_config": {
"max_new_tokens": 160,
"max_prompt_length": 8000,
"temperature": 0.7,
"top_p": 0.95,
"repetition_penalty": 1.0,
"frequency_penalty": 0.0,
"presence_penalty": 0.0,
"stop_sequences": [
"\n\n"
]
}
}
}
},
...
]
}
```
When using the Graph View Editor, whenever you create a node, if it is an Inworld node (LLM, TTS, etc.), a default component will be created for you if one doesn’t already exist.
#### Custom edges and components
All custom edges are also components. When writing JSON, explicitly declare them inside components.
```json unity_character_engine.json
{
"schema_version": "1.0.0",
"components": [
{
"id": "qwen_llm_component",
"type": "LLMInterface",
"creation_config": {
...
}
},
...
{
"id": "text_edge",
"type": "TextEdge"
},
{
"id": "audio_edge",
"type": "AudioEdge"
},
{
"id": "safety_edge",
"type": "SafetyEdge"
}
],
...
}
```
#### Relationship between nodes and components
When creating the runtime from JSON, the component field in the `NodeAsset` (if present) must be filled in.
In the `nodes` section, the component id referenced inside `properties` must match the component `id` defined in `components`.
```json
{
"schema_version": "1.0.0",
"components": [
{
"id": "stt_component", // <=== This is Component ID!
"type": "STTInterface",
"creation_config": {
"type": "LocalSTTConfig",
"properties": {
"model_path": "{{STT_MODEL_PATH}}",
"device": {
"type": "CUDA",
"index": -1,
"info": {
"name": "",
"timestamp": 0,
"free_memory_bytes": 0,
"total_memory_bytes": 0
}
},
"default_config": {}
}
}
},
...
],
"main": {
"id": "main_graph",
"nodes": [
{
"id": "FilterInput",
"type": "FilterInputNode"
},
...
{
"id": "STT",
"type": "STTNode",
"execution_config": {
"type": "STTNodeExecutionConfig",
"properties": {
"stt_component_id": "stt_component" // <=== Here need to be the same!
}
}
},
]
}
}
```
### main
The entire graph (its `nodes`, `edges`, `start_nodes`, and `end_nodes`) must be defined here.
#### node
For user-defined nodes, provide the name of the C# class, for example:
```json
{
"id": "FilterInput",
"type": "FilterInputNode"
},
```
This corresponds to the C# class’s NodeTypeName.
```c# FilterInputNodeAsset.cs
public class FilterInputNodeAsset : CustomNodeAsset
{
public override string NodeTypeName => "FilterInputNode";
...
}
```
For Inworld nodes, formats vary; follow the sample JSON in this demo.
#### edge
For edges, ensure the ids match.
```json
{
"from_node": "FilterInput",
"to_node": "STT",
"condition_id": "audio_edge", //<==
"optional": true
},
```
For example, for the edge connecting FilterInputNode to STTNode, the condition_id must match the id defined in components.
```json
{
"id": "audio_edge", //<== Here.
"type": "AudioEdge"
},
```
### InworldController
The `InworldController` only provides the `InworldAudioManager` for audio output and does not require any Primitive modules.
For details about the AudioManager, see the [Speech-to-Text Node Demo](./stt#inworldaudiomanager).
## Workflow
1. When the game starts, `InworldController` initializes immediately because there are no modules.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`.
3. When the graph runs `CreateRuntime()`, if a JSON file is present it first runs `ParseJson()`, which uses Newtonsoft.Json to parse the file and store all JSON data in dictionaries.
```c# InworldGraphAsset.cs
public bool CreateRuntime()
{
string graphName = !string.IsNullOrEmpty(m_GraphName) ? m_GraphName : "UnnamedGraph";
if (!m_GraphJson || string.IsNullOrEmpty(m_GraphJson.text))
m_RuntimeGraph ??= new InworldGraph(graphName);
else
ParseJson(); // <============================================= Here
if (!CreateNodesRuntime())
return false;
if (!CreateEdgesRuntime())
return false;
if (!SetupStartNodesRuntime())
return false;
if (!SetupEndNodesRuntime())
return false;
return true;
}
public void ParseJson()
{
if (m_ParsedRoot != null || string.IsNullOrEmpty(m_GraphJson?.text))
return;
m_ParsedRoot = JObject.Parse(m_GraphJson.text);
if (!(m_ParsedRoot["main"] is JObject main))
return;
m_JsonNodeRegistry ??= new Dictionary<string, bool>();
if (main["nodes"] is JArray nodes)
{
foreach (JToken n in nodes)
{
string id = (string)n["id"];
if (!string.IsNullOrEmpty(id))
m_JsonNodeRegistry[id] = false;
}
}
m_JsonEdgeRegistry ??= new Dictionary<(string, string), bool>();
if (main["edges"] is JArray edges)
{
foreach (JToken e in edges)
{
string fromNode = (string)e["from_node"];
string toNode = (string)e["to_node"];
if (!string.IsNullOrEmpty(fromNode) && !string.IsNullOrEmpty(toNode))
m_JsonEdgeRegistry[(fromNode, toNode)] = false;
}
}
}
```
4. When each node/edge creates its runtime, it first calls `RegisterJson()` to apply the values from `m_JsonNodeRegistry` and `m_JsonEdgeRegistry`.
If this succeeds, the subsequent `CreateRuntime()` API calls are skipped.
```c# InworldGraphAsset.cs
public bool CreateNodesRuntime()
{
foreach (InworldNodeAsset nodeAsset in m_Nodes)
{
if (nodeAsset.IsValid)
continue;
if (nodeAsset.RegisterJson(this)) // <============= Here
continue;
if (m_RuntimeGraph == null)
{
Debug.LogError($"[InworldFramework] Creating Runtime Node Failed. Runtime Graph is Null.");
continue; //return false;
}
if (!nodeAsset.CreateRuntime(this))
{
Debug.LogError($"[InworldFramework] Creating Runtime for Node: {nodeAsset.NodeName} Type: {nodeAsset.NodeTypeName} failed");
return false;
}
...
}
}
public bool RegisterJsonNode(string nodeName)
{
if (m_JsonNodeRegistry == null || !m_JsonNodeRegistry.ContainsKey(nodeName))
return false;
m_JsonNodeRegistry[nodeName] = true;
return true;
}
```
5. During compilation, `InitializeRegistries()` is called to initialize all Primitive modules in a special way.
This function lives inside the Inworld library and is not yet fully complete.
When called in the Unity Editor, it runs independently of the editor lifecycle; its effects persist even after you stop play mode.
To mitigate this, a hard-coded safeguard ensures it is only called once per Unity Editor session.
If the registries change and you need to call `InitializeRegistries()` again, restart the Unity Editor.
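A safeguard of this kind can be approximated in user code with Unity's `UnityEditor.SessionState`, which persists across domain reloads but resets when the editor restarts. This sketch is an assumption about the pattern, not the library's actual implementation:

```c# RegistryGuard.cs (sketch)
#if UNITY_EDITOR
using UnityEditor;

// Hypothetical sketch of a once-per-editor-session guard. The real
// safeguard lives inside the Inworld library and may work differently.
static class RegistryGuard
{
    const string k_Key = "Inworld.RegistriesInitialized";

    public static bool ShouldInitialize()
    {
        if (SessionState.GetBool(k_Key, false))
            return false; // already initialized this editor session
        SessionState.SetBool(k_Key, true);
        return true;
    }
}
#endif
```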
6. After that, the system runs `ConfigParser.ParseGraph()`, passing other user data as needed, to produce the compiled graph.
```c# InworldGraphAsset.cs
public bool CompileRuntime()
{
if (m_CompiledGraph == null || !m_CompiledGraph.IsValid)
{
if (m_ParsedRoot != null)
{
InitializeRegistries();
ConfigParser parser = new ConfigParser();
if (!parser.IsValid)
{
Debug.LogError("[InworldFramework] Graph compiled Error: Unable to create parser.");
return false;
}
m_CompiledGraph = parser.ParseGraph(m_GraphJson.text, m_UserData?.ToHashMap);
}
```
7. The remaining flow is identical to the [Character Interaction Node](./character)
---
#### Editor Tutorials
#### Overview
Source: https://docs.inworld.ai/Unity/runtime/editor/overview
In the Unity AI Runtime, you can interact with Inworld AI models in the following three ways:
* Using C# code to directly access the Primitive Modules.
* Generating a ScriptableObject GraphAsset, along with generated EdgeAssets and NodeAssets that interact with the primitives automatically.
* Using the GraphViewEditor to generate the graph system automatically.
Using the GraphViewEditor is generally the easiest, although in some cases you will still need to modify the ScriptableObjects, or even the code, directly.
We will cover those cases later.
In this tutorial, let's focus on creating graphs with GraphViewEditor.
## Life Cycle
For the graph/node system, the life cycle of the runtime (when you actually click the `Play` button) looks like this:
You will need an `InworldGraphExecutor` to hold the `GraphAsset`; it performs the steps above.
You will also need a `NodeTemplate` that communicates with the `InworldGraphExecutor` to send and receive data from Inworld.
In this overview, let's focus on creating the graph, so we will use the current default GraphNodeTemplate from the demos.
Let's clone a scene from `Assets/InworldRuntime/Scenes/Nodes/CharacterInteractionNode.unity`.
Rename the clone as you like, e.g. `Demo`, and open it.
The graph is held under the `InworldGraphExecutor`, which is at the root hierarchy of `CharacterInteractionCanvas`.
Navigate there and double-click the asset to open it in the Inspector.
Scroll down until you see the `Open Graph Editor` button, and click it.
You'll see the graph in the editor and get a glimpse of how the nodes work together.
You can scroll the mouse wheel to zoom in and zoom out.
You can press and hold the mouse wheel to move around.
Let's not edit this graph to avoid polluting the demo. We will create our own later.
Right click on the `Project` tab and select `Create > Inworld > Create Graph > CharacterInteraction`.
In the created graph asset, fill in the following data:
* User Data
* Character Data (in the Characters list of the Character Interaction Data)
* Voice (if you're planning to use TTS)
* Prompt (if you're planning to use LLM)
You can click the `◎` icon to select the default one first, then clone and update your own data later.
Once the character data and the user data have been set, click the `Open in Graph Editor` button.
Right click on the Graph Editor, select `Create Node > Custom Nodes... > FilterInputNode`.
Rename it to `Input`.
Each node must be given a unique name after it is created; the graph compiler will fail if it finds duplicate names.
The `FilterInputNode` is a good choice for the StartNode, because you can use it to receive multiple kinds of input.
It checks whether the incoming data has a supported format: `InworldText` and `InworldAudio` pass, and anything else fails.
Currently, this node only supports `InworldText` and `InworldAudio`, but feel free to inherit from `FilterInputNodeAsset` and add the input types you need (each must still be an existing subclass of `InworldBaseData`).
From the output port of `FilterInput`, hold the left mouse button and drag an edge out.
Create a `STTNode` and rename it to `STT`.
The `STT` (speech-to-text) node generates text from the player's audio input.
It accepts `InworldAudio` as input and outputs `InworldText`.
Did you notice that the background color of the `STTNode` is different?
That's because it's a DLL node, not a custom node.
Avoid inheriting from a DLL node; doing so may break the graph system.
Right click on the Graph Editor, select `Create Node > Custom Nodes... > AddSpeechEventNode`.
Rename it to `PlayerSpeech` and set `IsPlayer` to `true`.
The `AddSpeechEventNode` converts everything to `InworldText`.
It also stores this piece of data in the `ConversationData` of the `PromptAsset`, which is later used to generate the prompt.
You are expected to have a pair of `AddSpeechEventNode`s, one for the player and one for the agent.
With `IsPlayer` set to `true`, the generated data is inserted into the prompt as the player's speech; if it is not set, the agent may become confused about its own identity.
Connect `Input` to `STT`, `STT` to `PlayerSpeech`, and `Input` to `PlayerSpeech`.
Right click the edge from `Input` to `STT`, select `Set Edge Type > Audio`.
Right click the edge from `Input` to `PlayerSpeech`, select `Set Edge Type > Text`.
By default, an edge passes everything through without checking, but each node can only accept certain data types as input.
A data type mismatch may even **crash** the Unity Editor!
It is therefore necessary to set a data type on each edge to mark the specific data it transfers.
Extend the `PlayerSpeech` node out to create a `FormatPromptNode`.
The `FormatPromptNode` is a custom node. It converts `InworldText` to an `LLMChatRequest`, which the `LLMNode` takes as input.
The `FormatPromptNode` uses the current graph's prompt data, set under the `InworldGraphAsset` you just filled in.
If you double-click to open it, you'll find two fields: `Prompt` and `JinjaPrompt`.
Jinja takes the previous node's output (`InworldText`), together with the previously filled `UserData` and `CharacterData`, and substitutes them into the `{{}}` placeholders defined in `Prompt` to generate the `JinjaPrompt`.
The `JinjaPrompt` is the actual data that gets sent to the `LLMNode`.
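To illustrate the substitution, a `Prompt` template might look like the following. This is a hypothetical sketch: the placeholder names here are illustrative, not the asset's actual keys.

```jinja
{# Hypothetical placeholders for illustration only #}
You are {{ character_name }}. {{ character_description }}
{{ user_name }} says: "{{ player_speech }}"
Respond in character.
```

At runtime, each `{{ }}` placeholder is replaced with the corresponding value from the node input, `UserData`, and `CharacterData`, and the filled-in text becomes the `JinjaPrompt`.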
Extend the `FormatPromptNode` out to create an `LLMNode`.
The `LLMNode` is a DLL node that cannot be inherited.
It takes `LLMChatRequest` as input, returns `LLMChatResponse` as output.
Extend the `LLMNode` out to create another `AddSpeechEventNode`.
Rename it to be `CharacterSpeech`.
Set the `IsPlayer` to be `false`.
Extend `CharacterSpeech` node out to create a `TTSNode`.
Rename it to be `TTS`.
Set the voice to an available voice ID. (The default is not currently a valid ID, so replace it with a valid voice ID.)
Done!
You've finished a basic character conversation graph, with `InworldText` and `InworldAudio` as input and `InworldAudio` as output.
It takes your typed or spoken speech as input, uses STT to convert audio into text, inserts that text into the current prompt, gets the response, then speaks it out with TTS.
Press `Ctrl + S` or the `Save` button at the top to save this graph.
After saving, you will find that all the related NodeAssets and EdgeAssets have been created,
and all the Start/End Nodes are set.
These ScriptableObjects are stored by default under `Assets/Data/{GraphName}`.
Find the `InworldGraphExecutor`, by default it's at the `CharacterInteractionCanvas`.
Replace the graph asset with the one you just created.
You will find that it generally works, except that no chat bubbles pop up.
That's because we only have one end node, the `TTSNode`, so only the TTS output is sent out during graph execution.
Open the Graph Editor and add a `ConversationEndpointNode` right after each `AddSpeechEventNode` (`CharacterSpeech` and `PlayerSpeech`).
The `ConversationEndpointNode` is a custom node often used as an end node. It generates a sentence like `{Speaker}: {Contents}`
for you to extract and render accordingly.
* Rename the one after `CharacterSpeech` to `CharacterFinal` and set the `IsPlayer` to be `false`.
* Rename the one after `PlayerSpeech` to `PlayerFinal` and set the `IsPlayer` to be `true`.
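Since each `ConversationEndpointNode` emits a sentence of the form `{Speaker}: {Contents}`, a bubble renderer can split it on the first separator. A minimal C# sketch (the variable names and example sentence are illustrative):

```csharp
// Split a "{Speaker}: {Contents}" sentence for rendering a chat bubble.
string sentence = "Alice: Hello there!";
int sep = sentence.IndexOf(": ");
string speaker  = sentence.Substring(0, sep);   // "Alice"
string contents = sentence.Substring(sep + 2);  // "Hello there!"
```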
Press "Save", and you will see that the data of the `GraphAsset` has changed:
`EndNodes` now contains 3 nodes.
Done!
You will find that it works, and the bubbles pop up as well!
---
#### Build Graphs with the Graph Editor
Source: https://docs.inworld.ai/Unity/runtime/editor/graph
Everything you can do in the Graph Editor can also be achieved by creating, editing, or removing `ScriptableObject` assets.
In highly customized scenarios, configuring via `ScriptableObjects` may offer finer control (for example, defining your own fields).
The Graph Editor, however, is faster and requires less code.
## Create a Graph
You still need to create a `ScriptableObject` for your graph.
In the Project tab, right‑click and choose "Create > Inworld > Graph > Default" or "Character Interaction".
## Fill in Required Fields
Do not open the Graph Editor yet.
First, fill in some required fields, otherwise the graph will not run.
The Default Graph is suitable for general tasks that are not related to character interaction.
You can also inherit from Default Graph to implement more customized functionality.
UserData is required.
The Character Interaction Graph inherits from Default Graph.
It is a template focused on character interaction.
In addition to UserData, it has a mandatory field: CharacterData.
If you plan to use a TTS node, also fill in the `Voice ID`.
If you plan to use an LLM node, provide a `Prompt`.
Once these are set, you can open the Graph Editor.
## Zoom and Move the Graph
You can scroll the mouse wheel to zoom in and zoom out.
You can press and hold the mouse wheel to move around.
## Create a Node
After opening the Graph Editor, right‑click inside the canvas and select `Create Node`.
You can add an Inworld node, or open Custom Nodes to add an existing custom node.
## Create a New Custom Node Type
While adding a node, you can also select `Create New Custom Node Script`.
A dialog will appear asking for a file name and a path.
Click `Create` to proceed.
If the IDE configured for your Unity Editor is already open, it will switch to that IDE and open the newly created file.
After that, reopen the Graph Editor and you will find your new custom node under Custom Nodes.
## Create an Edge
Drag from an output port of one node to an input port of another node to automatically create an edge.
You can also drag out from an output port into empty space.
The editor will open the Create Node dialog; after you select a node to create, an edge will be created automatically between the two nodes.
## Change Edge Type
Right‑click an edge to choose its type.
The editor currently provides only a set of predefined edge types.
Creating your own edge type is not yet supported within the Graph Editor.
Instead, locate the corresponding `EdgeAsset` in your project and replace it with your custom ScriptableObject `EdgeAsset`.
## Delete Nodes or Edges
Select an edge and either `Right‑click > Delete` or press the `Delete` key.
Select a node and press the `Delete` key to remove it.
When you delete a node, any edges that can no longer connect will also be removed.
## Save the Graph
Click Save Graph in the top left to save your changes to the current graph.
You can also press "Ctrl + S" to do the same.
After saving, ScriptableObjects for the nodes and edges are generated under `Assets/Data/{Your graph name}`.
The Start and End nodes are also computed and set automatically.
## Export JSON
You can also click `Save Json` to generate a JSON representation of the graph.
This feature is experimental; some component settings may be incomplete and may need to be filled in manually.
---
#### Build Graphs by ScriptableObjects
Source: https://docs.inworld.ai/Unity/runtime/editor/so
## Create a Graph
In the Project tab, right‑click and choose `Create > Inworld > Graph > Default` or `Character Interaction`.
## Fill in Required Fields
First, fill in the required fields; otherwise the graph will not run.
The Default Graph is suitable for general tasks that are not related to character interaction.
You can also inherit from Default Graph to implement more customized functionality.
UserData is required.
The Character Interaction Graph inherits from Default Graph.
It is a template focused on character interaction.
In addition to UserData, it has a mandatory field: CharacterData.
If you plan to use a TTS node, also fill in the `Voice ID`.
If you plan to use an LLM node, provide a `Prompt`.
## Create a Node
In the Project tab, right‑click and choose `Create > Inworld > Create Node`, then choose your node type.
The custom nodes used in Character Interaction (such as `FilterInput`, `FormatPrompt`, etc.) are stored under `CharacterInteraction`.
After you create it, rename the ScriptableObject — especially its `Node Name` under `General Configuration`.
## Add a node to the graph
You can drag it directly into the graph's `Nodes` list.
## Create an Edge
In the Project tab, right‑click and choose `Create > Inworld > Create Edge`, then choose your edge type.
Then assign the target `GraphAsset`, the source `NodeAsset`, and the target `NodeAsset` to the edge.
Finally, don't forget to add the created edge to the graph's `Edges` list.
## Create a New Custom Node/Edge Type
Without the Graph Editor, this is more involved. First, create a Unity script that inherits from `InworldNodeAsset` (or the appropriate base type for edges).
Then add the `[CreateAssetMenu]` attribute to the class so it appears in the Create menu.
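Putting those two steps together, a minimal custom node script might look like this. This is a sketch assuming `InworldNodeAsset` can be subclassed directly; the class name, menu path, and field are illustrative, not part of the SDK.

```csharp
using UnityEngine;

// Hypothetical custom node asset; names and menu path are illustrative.
[CreateAssetMenu(fileName = "MyCustomNode",
                 menuName = "Inworld/Graph/Nodes/MyCustomNode")]
public class MyCustomNodeAsset : InworldNodeAsset
{
    // Your own serialized configuration fields go here.
    [SerializeField] string m_CustomSetting;
}
```

Once the script compiles, the new asset type appears under the Create menu at the path given in `menuName`.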
## Configure Start/End Nodes
Although this step is handled automatically by the Graph Editor, it is required when creating a graph via ScriptableObjects.
Drag the corresponding `NodeAsset` into the `Start/End Node` fields on the `GraphAsset`.
### About End Node and End Nodes
We provide both `End Node` and `End Nodes` fields. The only difference is that `End Nodes` supports multiple end nodes.
If you have only one end node, you can use either field.
## Validate with Graph Editor
At any time, you can open the Graph Editor to review your graph.
Any edits you make directly in the GraphAsset will immediately be reflected in the Graph Editor.
---
#### Runtime Reference
#### Overview
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/overview
## Classes
References for C# classes. These classes implement the interactions between Unity and the underlying Inworld C++ library.
- [InworldController](./InworldController)
- [InworldGraphExecutor](./InworldGraphExecutor)
- [InworldAudioManager](./InworldAudioModule/InworldAudioManager)
- [InworldAudioModule](./InworldAudioModule/InworldAudioModule)
- [AudioCaptureModule](./InworldAudioModule/AudioCaptureModule)
- [AudioCollectModule](./InworldAudioModule/AudioCollectModule)
- [AudioDispatchModule](./InworldAudioModule/AudioDispatchModule)
- [VoiceActivityDetector](./InworldAudioModule/VoiceActivityDetector)
- [InworldFrameworkModule](./InworldFrameworkModule/InworldFrameworkModule)
- [InworldAECModule](./InworldFrameworkModule/InworldAECModule)
- [InworldLLMModule](./InworldFrameworkModule/InworldLLMModule)
- [InworldSafetyModule](./InworldFrameworkModule/InworldSafetyModule)
- [InworldSTTModule](./InworldFrameworkModule/InworldSTTModule)
- [InworldTTSModule](./InworldFrameworkModule/InworldTTSModule)
- [InworldVADModule](./InworldFrameworkModule/InworldVADModule)
- [KnowledgeModule](./InworldFrameworkModule/KnowledgeModule)
- [TelemetryModule](./InworldFrameworkModule/TelemetryModule)
- [TextEmbedderModule](./InworldFrameworkModule/TextEmbedderModule)
## Scriptable Objects
References for Scriptable Objects. These are configurable assets that you can create via the asset menu.
### Graph Asset
Graph is the container for an AI chain execution path, composed of Nodes, Edges and Components (experimental).
At runtime, they are compiled (to their respective runtime classes you can access) and executed by the Graph Executor.
- [InworldGraphAsset](./InworldGraphAsset/InworldGraphAsset)
- [CharacterInteractionGraphAsset](./InworldGraphAsset/CharacterInteractionGraphAsset)
### Node Asset
- [InworldNodeAsset](./InworldNodeAsset/InworldNodeAsset)
- [CustomNodeAsset](./InworldNodeAsset/CustomNodeAsset)
- [IntentNodeAsset](./InworldNodeAsset/IntentNodeAsset)
- [LLMNodeAsset](./InworldNodeAsset/LLMNodeAsset)
- [RandomCannedTextNodeAsset](./InworldNodeAsset/RandomCannedTextNodeAsset)
- [SafetyNodeAsset](./InworldNodeAsset/SafetyNodeAsset)
- [STTNodeAsset](./InworldNodeAsset/STTNodeAsset)
- [SubgraphNodeAsset](./InworldNodeAsset/SubgraphNodeAsset)
- [TextAggregatorNodeAsset](./InworldNodeAsset/TextAggregatorNodeAsset)
- [TextChunkingNodeAsset](./InworldNodeAsset/TextChunkingNodeAsset)
- [TextProcessorNodeAsset](./InworldNodeAsset/TextProcessorNodeAsset)
- [TTSNodeAsset](./InworldNodeAsset/TTSNodeAsset)
### Edge Asset
- [InworldEdgeAsset](./InworldEdgeAsset/InworldEdgeAsset)
- [InworldAudioEdgeAsset](./InworldEdgeAsset/InworldAudioEdgeAsset)
- [InworldJsonEdgeAsset](./InworldEdgeAsset/InworldJsonEdgeAsset)
- [InworldLLMEdgeAsset](./InworldEdgeAsset/InworldLLMEdgeAsset)
- [InworldSafetyEdgeAsset](./InworldEdgeAsset/InworldSafetyEdgeAsset)
- [InworldTextEdgeAsset](./InworldEdgeAsset/InworldTextEdgeAsset)
### Component Asset (Experimental)
- [InworldComponentAsset](./InworldComponentAsset/InworldComponentAsset)
The component class is created to support Graph JSON serialization. This feature is currently experimental.
### Data Asset
- [CharacterData](./DataClasses/CharacterData)
- [GoalData](./DataClasses/GoalData)
- [IntentData](./DataClasses/IntentData)
- [KnowledgeData](./DataClasses/KnowledgeData)
## Node Input/Output Reference
This table provides a quick reference for all workflow nodes and their input/output data types:
| Node Name | Input Types | Output Types |
|-----------|-------------|--------------|
| **Inworld Nodes** |
| [LLM](./InworldNodeAsset/LLMNodeAsset) | `LLMChatRequest` | `LLMChatResponse` |
| [STT](./InworldNodeAsset/STTNodeAsset) | `InworldAudioChunk` | `InworldText` |
| [TTS](./InworldNodeAsset/TTSNodeAsset) | `InworldText` | `InworldDataStream` |
| | `TTSRequest` | `InworldDataStream` |
| [TextAggregator](./InworldNodeAsset/TextAggregatorNodeAsset) | `InworldText` | `InworldText` |
| | `InworldDataStream` | `InworldText` |
| | `LLMCompletionResponse` | `InworldText` |
| | `LLMChatResponse` | `InworldText` |
| [TextChunking](./InworldNodeAsset/TextChunkingNodeAsset) | `InworldText` | `InworldText` |
| | `InworldDataStream` | `InworldText` |
| | `LLMCompletionResponse` | `InworldText` |
| | `LLMChatResponse` | `InworldText` |
| [RandomCannedText](./InworldNodeAsset/RandomCannedTextNodeAsset) | N/A | `InworldText` |
| [Safety](./InworldNodeAsset/SafetyNodeAsset) | `InworldText` | `SafetyResult` |
| [Intent](./InworldNodeAsset/IntentNodeAsset) | `InworldText` | `MatchedIntents` |
| **Unity Nodes (CustomNodes Used in the Character Interaction)** |
| `AddSpeechEventNode` | `SafetyResult` | `InworldText` |
| | `LLMChatResponse` | `InworldText` |
| | `InworldText` | `InworldText` |
| | `InworldDataStream` | `InworldText` |
| `ConversationEndpointNode` | `InworldText` | `InworldText` |
| `CustomSampleNode` | `InworldText` | `InworldText` |
| | `InworldAudioChunk` | `InworldAudioChunk` |
| `FilterInputNode` | Any | `InworldText` |
| | | `InworldAudioChunk` |
| `FormatPromptNode` | `InworldText` | `LLMChatRequest` |
| `GetPlayerNameNode` | N/A | `InworldText` |
| `TextCombinerNode` | `InworldText` | `InworldText` |
| `TextProcessorNode` | `InworldDataStream` | `InworldDataStream` |
| `TxtToPromptSampleNode` | `InworldText` | `LLMChatRequest` |
| **Inworld Nodes (Available via API Only, no Unity NodeAsset yet)** |
| `LLMCompletionNode` | `InworldText` | `LLMCompletionResponse` |
| `LLMPromptBuilderNode` | `InworldJson` | `InworldText` |
| `LLMChatRequestBuilderNode` | `InworldJson` | `LLMChatRequest` |
| `KnowledgeNode` | `InworldText` | `KnowledgeRecords` |
| `KeywordMatcherNode` | `InworldText` | `MatchedKeywords` |
| `TextClassifierNode` | `InworldText` | `ClassificationResult` |
| `MemoryUpdateNode` | `InworldText` | `MemoryState` |
| | `EventHistory` | `MemoryState` |
| | `MemoryState` | `MemoryState` |
| | `Text` | `MemoryState` |
| `MemoryRetrieveNode` | `EventHistory` | `KnowledgeRecords` |
| | `MemoryState` | `KnowledgeRecords` |
| | `Text` | `KnowledgeRecords` |
| `GoalAdvancementNode` | `EventHistory` | `GoalAdvancement` |
| | `Json` | `GoalAdvancement` |
| | `Text` | `GoalAdvancement` |
| `MCPListToolsNode` | N/A | `ListToolsData` |
| `MCPCallToolNode` | `ListToolCallData` | `ListToolCallsResults` |
## More Reference Pages
Explore the class inheritance relationships in the Unity AI Runtime.
Discover all runtime interfaces and the contracts they define.
Reference for all available enums, their meanings, and values.
Learn about callback delegates used throughout the runtime.
---
#### Runtime Reference > Classes
#### Inworld Controller
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldController
[Overview](./overview) > Inworld Controller
**Class:** `InworldController` | **Inherits from:** `SingletonBehavior`
The singleton class of the controller. It's the primitive handler that manages all framework modules and their initialization process. The InworldController coordinates the initialization of various modules including LLM, TTS, STT, VAD, Safety, AEC, Text Embedder, and Knowledge modules.
## Properties
- [IsDebugMode](#isdebugmode)
- [LLM](#llm)
- [TTS](#tts)
- [STT](#stt)
- [AEC](#aec)
- [VAD](#vad)
- [TextEmbedder](#textembedder)
- [Safety](#safety)
- [Knowledge](#knowledge)
- [Audio](#audio)
## Methods
- [InitializeAsync](#initializeasync)
## Events
- [OnProcessInitialized](#onprocessinitialized)
- [OnFrameworkInitialized](#onframeworkinitialized)
## Reference
### IsDebugMode
Set/Get DebugMode. Controls verbose logging throughout the framework.
#### Returns
**Type:** `bool`
---
Below are all the modules we currently support. They are stored as static fields after initialization.
They must be added manually to the scene where the `InworldController` component is attached.

The modules referenced need to be created alongside the controller. It's usually helpful to organize them in one Prefab and modify it based on your needs.

### LLM
Get the [LLM Module](./InworldFrameworkModule/InworldLLMModule).
#### Returns
**Type:** `InworldLLMModule`
---
### TTS
Get the [TTS Module](./InworldFrameworkModule/InworldTTSModule).
#### Returns
**Type:** `InworldTTSModule`
---
### STT
Get the [STT Module](./InworldFrameworkModule/InworldSTTModule).
#### Returns
**Type:** `InworldSTTModule`
---
### AEC
Get the [AEC Module](./InworldFrameworkModule/InworldAECModule).
#### Returns
**Type:** `InworldAECModule`
---
### VAD
Get the [VAD Module](./InworldFrameworkModule/InworldVADModule).
#### Returns
**Type:** `InworldVADModule`
---
### TextEmbedder
Get the [Text Embedder Module](./InworldFrameworkModule/TextEmbedderModule).
#### Returns
**Type:** `TextEmbedderModule`
---
### Safety
Get the [Safety Module](./InworldFrameworkModule/InworldSafetyModule).
This module should be initialized after the Text Embedder module has been initialized.
#### Returns
**Type:** `InworldSafetyModule`
---
### Knowledge
Get the [Knowledge Module](./InworldFrameworkModule/KnowledgeModule).
This module should be initialized after the Text Embedder module has been initialized.
#### Returns
**Type:** `KnowledgeModule`
---
### Audio
Gets the [InworldAudioManager](./InworldAudioModule/InworldAudioManager) component responsible for audio input, processing, and output.
Automatically finds and caches the audio manager in the scene if not already set.
Handles microphone capture, voice detection, and audio streaming to the Inworld service.
#### Returns
**Type:** `InworldAudioManager`
---
### InitializeAsync
Asynchronously initializes all framework modules and establishes connections to required services.
Validates the API key, builds telemetry configuration, and sequentially initializes each module group.
This method should be called before using any Inworld functionality.
#### Returns
**Type:** `void`
---
### OnProcessInitialized
Event triggered when a loading process phase is completed during initialization.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| processIndex | `int` | The index of the completed process phase |
---
### OnFrameworkInitialized
Event triggered when all framework modules have been successfully initialized.
#### Parameters
None
---
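Tying the members above together, initialization might be driven like this. This is a hedged sketch: the `Instance` accessor is an assumption based on the `SingletonBehavior` base class, and the event handler signatures follow the parameter tables above.

```csharp
using UnityEngine;

public class InworldBootstrap : MonoBehaviour
{
    void Start()
    {
        // Assumed singleton accessor from SingletonBehavior.
        var controller = InworldController.Instance;

        controller.OnProcessInitialized += processIndex =>
            Debug.Log($"Initialization phase {processIndex} completed");
        controller.OnFrameworkInitialized += () =>
            Debug.Log("All framework modules initialized");

        // Kicks off module initialization; listen to the events above for progress.
        controller.InitializeAsync();
    }
}
```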
---
#### Inworld Graph Executor
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldGraphExecutor
[Overview](./overview) > Inworld Graph Executor
**Class:** `InworldGraphExecutor` | **Inherits from:** `MonoBehaviour`
Component responsible for executing Inworld AI graph assets within Unity. Manages the lifecycle of graph compilation, execution, and result handling for AI-powered workflows. Provides event-driven architecture for monitoring graph state and receiving output data.
## Properties
- [Graph](#graph)
- [Error](#error)
- [IsCompiled](#iscompiled)
- [IsExecuting](#isexecuting)
- [RuntimeGraph](#runtimegraph)
- [IsInitialized](#isinitialized)
## Methods
- [LoadData](#loaddata)
- [Compile](#compile)
- [ExecuteGraphAsync](#executegraphasync)
- [StopExecution](#stopexecution)
- [CleanupRuntimeObjects](#cleanupruntimeobjects)
## Events
- [OnGraphCompiled](#ongraphcompiled)
- [OnGraphStarted](#ongraphstarted)
- [OnGraphResult](#ongraphresult)
- [OnGraphFinished](#ongraphfinished)
- [OnGraphError](#ongrapherror)
## Reference
The graph executor is the central hub of the graph engine. Assign a graph asset in the editor or at runtime, then compile it to make it executable.

### Graph
Gets or sets the graph asset to be executed by this component.
When set, clears the current compilation state and runtime objects.
#### Returns
**Type:** `InworldGraphAsset`
---
### Error
Gets or sets the current error message from graph operations.
When set, automatically logs the error and triggers the OnGraphError event.
#### Returns
**Type:** `string`
---
### IsCompiled
Gets a value indicating whether the current graph has been successfully compiled.
#### Returns
**Type:** `bool`
---
### IsExecuting
Gets a value indicating whether the graph is currently executing.
#### Returns
**Type:** `bool`
---
### RuntimeGraph
Gets the runtime instance of the compiled graph.
Returns null if no graph is loaded or compilation failed.
#### Returns
**Type:** `InworldGraph`
---
### IsInitialized
Gets a value indicating whether the graph executor is initialized and ready for execution.
#### Returns
**Type:** `bool`
---
### LoadData
Loads a new graph asset for execution, replacing the current one.
Cannot be called while a graph is currently executing.
Automatically clears compilation state and runtime objects.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| graphAsset | `InworldGraphAsset` | The graph asset to load for execution. |
#### Returns
**Type:** `void`
---
### Compile
Compiles the current graph asset into a runtime executable form.
Must be called before executing the graph.
Triggers OnGraphCompiled event on success or OnGraphError on failure.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| compiled | `CompiledGraphInterface` | Optional compiled graph interface. |
#### Returns
**Type:** `bool`
**Description:** True if compilation succeeded, false otherwise.
---
### ExecuteGraphAsync
Executes the compiled graph asynchronously with optional input data.
The graph must be compiled before execution can begin.
Supports both streaming and simplified execution modes.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| executorName | `string` | Optional identifier for this execution instance. If empty, a GUID will be generated. |
| input | `InworldBaseData` | Optional input data to provide to the graph. If null, an empty InworldBaseData will be used. |
#### Returns
**Type:** `Awaitable`
**Description:** A task that completes with true if execution started successfully, false otherwise.
---
### StopExecution
Stops the current graph execution if one is running.
Closes all active executions and triggers the OnGraphFinished event.
#### Returns
**Type:** `void`
---
### CleanupRuntimeObjects
Cleans up all runtime objects associated with the current graph.
Resets compilation and execution state to allow for fresh graph loading.
#### Returns
**Type:** `void`
---
### OnGraphCompiled
Event triggered when a graph asset has been successfully compiled and is ready for execution.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| graphAsset | `InworldGraphAsset` | The compiled graph asset. |
---
### OnGraphStarted
Event triggered when graph execution begins.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| graphAsset | `InworldGraphAsset` | The graph asset that started execution. |
---
### OnGraphResult
Event triggered when the graph produces a result during execution.
Provides access to intermediate and final output data from the graph.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| result | `InworldBaseData` | The result data produced by the graph. |
---
### OnGraphFinished
Event triggered when graph execution completes successfully.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| graphAsset | `InworldGraphAsset` | The graph asset that finished execution. |
---
### OnGraphError
Event triggered when an error occurs during graph compilation or execution.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| graphAsset | `InworldGraphAsset` | The graph asset that encountered an error. |
| errorMessage | `string` | The error message describing what went wrong. |
---
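A typical load, compile, and execute flow using the documented members might look like the following sketch. The `MonoBehaviour` wrapper is illustrative, and awaiting `ExecuteGraphAsync` to a `bool` is an assumption based on the return description above.

```csharp
using UnityEngine;

public class GraphRunner : MonoBehaviour
{
    [SerializeField] InworldGraphExecutor m_Executor;
    [SerializeField] InworldGraphAsset m_Graph;

    public async void RunGraph()
    {
        m_Executor.OnGraphResult += result => Debug.Log($"Graph output: {result}");

        m_Executor.LoadData(m_Graph);   // replaces the current graph asset
        if (!m_Executor.Compile())      // must succeed before execution
            return;                     // OnGraphError has already fired with details

        bool started = await m_Executor.ExecuteGraphAsync("demo-run");
        if (!started)
            Debug.LogWarning("Graph execution failed to start");
    }
}
```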
---
#### Runtime Reference > Classes > InworldAudioModule
#### Inworld Audio Manager
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/InworldAudioManager
[Overview](../overview) > Inworld Audio Manager
**Class:** `InworldAudioManager` | **Inherits from:** `MonoBehaviour` | **Requires:** `AudioSource`
Manages audio recording, processing, and playback for the Inworld framework. Coordinates multiple audio modules and handles microphone input, voice detection, and audio streaming. Requires an AudioSource component for audio playback functionality.
## Properties
- [InputBuffer](#inputbuffer)
- [DeviceName](#devicename)
- [RecordingSource](#recordingsource)
- [RecordingClip](#recordingclip)
- [Event](#event)
- [IsPlayerSpeaking](#isplayerspeaking)
- [IsCalibrating](#iscalibrating)
- [Volume](#volume)
- [IsMicRecording](#ismicrecording)
## Methods
- [StartMicrophone](#startmicrophone)
- [StopMicrophone](#stopmicrophone)
- [StartAudioThread](#startaudiothread)
- [StopAudioThread](#stopaudiothread)
- [ResetPointer](#resetpointer)
- [CollectAudio](#collectaudio)
- [StartVoiceDetecting](#startvoicedetecting)
- [StopVoiceDetecting](#stopvoicedetecting)
- [StartCalibrate](#startcalibrate)
- [StopCalibrate](#stopcalibrate)
- [PreProcess](#preprocess)
- [PostProcess](#postprocess)
- [PushAudio](#pushaudio)
- [OnSendAudio](#onsendaudio)
- [TryDeleteModule](#trydeletemodule)
- [AddModule](#addmodule)
- [GetModule](#getmodule)
- [GetUniqueModule](#getuniquemodule)
- [GetModules](#getmodules)
## Reference
The InworldAudioManager component manages all audio-related functionality in the Inworld framework. It coordinates multiple audio modules and handles microphone input, voice detection, and audio streaming.

### Serialized Fields
The following fields are configurable in the Unity Inspector:
- **Audio Modules** (`m_AudioModules`) - List of InworldAudioModule components that handle different aspects of audio processing. Depending on the interfaces they implement, their processing functions are called at different times.
- **Audio Event** (`m_AudioEvent`) - AudioEvent component for handling audio-related events and callbacks
- **Device Name** (`m_DeviceName`) - Name of the microphone device to use for audio recording (empty for default device)
### InputBuffer
Gets or sets the circular buffer used for storing input audio data.
This buffer holds microphone input samples for processing by audio modules.
#### Returns
**Type:** `CircularBuffer`
---
### DeviceName
Gets or sets the name of the microphone device to use for audio recording.
If empty, the system default microphone will be used.
#### Returns
**Type:** `string`
---
### RecordingSource
Gets the AudioSource component attached to this AudioManager.
Creates a new AudioSource if one doesn't exist. Used for audio playback and recording.
#### Returns
**Type:** `AudioSource`
---
### RecordingClip
Get the current AudioClip (Used for microphone input)
#### Returns
**Type:** `AudioClip`
---
### Event
Gets the AudioEvent component for handling audio-related events.
#### Returns
**Type:** `AudioEvent`
---
### IsPlayerSpeaking
Gets or sets whether the player is currently speaking.
When set, triggers appropriate events and notifies audio handlers.
#### Returns
**Type:** `bool`
---
### IsCalibrating
Gets or sets whether the audio system is currently calibrating.
When set, triggers appropriate calibration events.
#### Returns
**Type:** `bool`
---
### Volume
Gets or sets the volume of the recording source.
#### Returns
**Type:** `float`
---
### IsMicRecording
Gets whether the microphone is currently recording.
#### Returns
**Type:** `bool`
---
### StartMicrophone
Start microphone handling.
#### Returns
**Type:** `bool`
**Description:** False if there's no active Microphone Handler. True otherwise.
---
### StopMicrophone
Stop microphone handling.
#### Returns
**Type:** `bool`
**Description:** False if there's no active Microphone Handler. True otherwise.
---
### StartAudioThread
Should be called after the handler has set up buffers and clips for data.
This function will kickstart the coroutine to collect audio, which gets constantly sent to the handler.
#### Returns
**Type:** `void`
---
### StopAudioThread
Stop the audio coroutine for audio collection. Note that this function does not touch the buffers (more like pause than stop).
#### Returns
**Type:** `void`
---
### ResetPointer
Resets the pointer for audio collection.
#### Returns
**Type:** `void`
---
### CollectAudio
Collects audio from all registered audio collection handlers.
#### Returns
**Type:** `void`
---
### StartVoiceDetecting
Starts voice detection for player audio events.
#### Returns
**Type:** `void`
---
### StopVoiceDetecting
Stops voice detection for player audio events.
#### Returns
**Type:** `void`
---
### StartCalibrate
Starts calibration for all registered audio calibration handlers.
#### Returns
**Type:** `void`
---
### StopCalibrate
Stops calibration for all registered audio calibration handlers.
#### Returns
**Type:** `void`
---
### PreProcess
Runs pre-processing on all registered audio processing handlers.
#### Returns
**Type:** `void`
---
### PostProcess
Runs post-processing on all registered audio processing handlers.
#### Returns
**Type:** `void`
---
### PushAudio
Manually push the audio wave data to the server.
#### Returns
**Type:** `IEnumerator`
---
### OnSendAudio
Sends audio data and triggers the audio sent event.
#### Returns
**Type:** `void`
---
### TryDeleteModule
Attempts to delete modules of the specified type from the audio modules list.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| T | `Type` | The type of module to delete. |
#### Returns
**Type:** `bool`
**Description:** True if any modules were deleted, false otherwise.
---
### AddModule
Adds an audio module to the manager's module list.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| module | `InworldAudioModule` | The audio module to add. |
#### Returns
**Type:** `void`
---
### GetModule
Gets the first module of the specified type from the audio modules list.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| T | `Type` | The type of module to retrieve. |
#### Returns
**Type:** `T`
**Description:** The first module of the specified type, or null if none found.
---
### GetUniqueModule
Gets a unique module of the specified type. Throws an exception if no module is found or if multiple modules exist.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| T | `Type` | The type of module to retrieve. |
#### Returns
**Type:** `T`
**Description:** The unique module of the specified type.
---
### GetModules
Gets all modules of the specified type from the audio modules list.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| T | `Type` | The type of modules to retrieve. |
#### Returns
**Type:** `List`
**Description:** A list of all modules of the specified type.
---
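For example, querying and managing modules with the members above might look like this. This sketch assumes the generic `GetModule<T>()` and `TryDeleteModule<T>()` forms implied by the `T` parameters, and an `InworldController.Instance.Audio` accessor as documented on the controller page.

```csharp
using UnityEngine;

// Assumed singleton accessor; Audio is documented on InworldController.
var audio = InworldController.Instance.Audio;

// Fetch the first VAD module, if one is registered.
var vad = audio.GetModule<VoiceActivityDetector>();
if (vad == null)
    Debug.LogWarning("No VoiceActivityDetector module found");

// Remove all modules of a given type; returns true if any were deleted.
bool removed = audio.TryDeleteModule<VoiceActivityDetector>();
```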
---
#### Inworld Audio Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/InworldAudioModule
[Overview](../overview) > Inworld Audio Module
**Class:** `InworldAudioModule` | **Inherits from:** `MonoBehaviour`
Abstract base class for all audio processing modules in the Inworld framework. Provides common functionality for audio modules including coroutine management and audio manager access. All audio modules should inherit from this class to integrate with the audio processing pipeline.
## Properties
- [Priority](#priority)
- [Audio](#audio)
## Methods
- [StartModule](#startmodule)
- [StopModule](#stopmodule)
---
## Audio Processing Pipeline
The InworldAudioModule interfaces are called in a specific order during audio processing:
1. **Pre-Processing**: `OnPreProcessAudio()` is called first
2. **Collection**: `OnCollectAudio()` gathers audio data from the input buffer
3. **Post-Processing**: `OnPostProcessAudio()` processes the collected data
4. **Voice Detection**: Voice detection runs continuously when enabled
5. **Audio Sending**: Audio data is sent when the player is speaking
This pipeline runs continuously in the audio coroutine managed by the InworldAudioManager.
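Under assumed names (only the handler methods and `GetModules` come from this reference; the loop structure and timing are illustrative), the cycle can be sketched as:

```csharp
// Illustrative sketch of the audio coroutine cycle — not the actual
// InworldAudioManager implementation.
IEnumerator AudioCycle()
{
    while (isActiveAndEnabled)
    {
        foreach (IProcessAudioHandler p in GetModules<IProcessAudioHandler>())
            p.OnPreProcessAudio();                    // 1. pre-processing
        foreach (ICollectAudioHandler c in GetModules<ICollectAudioHandler>())
            c.OnCollectAudio();                       // 2. collection
        foreach (IProcessAudioHandler p in GetModules<IProcessAudioHandler>())
            p.OnPostProcessAudio();                   // 3. post-processing
        // 4–5. voice detection and audio sending run via their own handlers
        yield return new WaitForSeconds(0.1f);        // ~0.1 s per cycle
    }
}
```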
## Constants
The base class defines several audio processing constants:
- `k_InputSampleRate` (16000) - Standard sample rate for audio input
- `k_InputChannels` (1) - Number of audio channels (mono)
- `k_InputBufferSecond` (1) - Buffer duration in seconds
- `k_SizeofInt16` (2) - Size of 16-bit integer in bytes
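Taken together, these constants determine the raw size of one second of input audio:

```csharp
// One second of mono, 16 kHz, 16-bit PCM input:
const int k_InputSampleRate   = 16000; // samples per second
const int k_InputChannels     = 1;     // mono
const int k_InputBufferSecond = 1;     // seconds buffered
const int k_SizeofInt16       = 2;     // bytes per sample

int samplesPerBuffer = k_InputSampleRate * k_InputChannels * k_InputBufferSecond; // 16,000 samples
int bytesPerBuffer   = samplesPerBuffer * k_SizeofInt16;                          // 32,000 bytes
```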
---
## Interfaces
The InworldAudioModule system uses several interfaces that define different aspects of audio processing. These interfaces are implemented by child classes and called by the [InworldAudioManager](./InworldAudioManager) at specific times:
### IMicrophoneHandler
Handles microphone recording functionality.
**Methods:**
- `ListMicDevices()` - Lists available microphone devices
- `StartMicrophone()` - Called by [InworldAudioManager.StartMicrophone()](./InworldAudioManager#startmicrophone)
- `StopMicrophone()` - Called by [InworldAudioManager.StopMicrophone()](./InworldAudioManager#stopmicrophone)
- `ChangeInputDevice(string deviceName)` - Changes the input device
- `IsMicRecording` (property) - Accessed by [InworldAudioManager.IsMicRecording](./InworldAudioManager#ismicrecording)
### ICollectAudioHandler
Handles audio data collection from the input buffer.
**Methods:**
- `OnCollectAudio()` - Called by [InworldAudioManager.CollectAudio()](./InworldAudioManager#collectaudio) during the audio coroutine
- `ResetPointer()` - Called by [InworldAudioManager.ResetPointer()](./InworldAudioManager#resetpointer)
### IPlayerAudioEventHandler
Handles voice detection and player audio events.
**Methods:**
- `OnPlayerUpdate()` - Coroutine for continuous player audio updates
- `StartVoiceDetecting()` - Called by [InworldAudioManager.StartVoiceDetecting()](./InworldAudioManager#startvoicedetecting)
- `StopVoiceDetecting()` - Called by [InworldAudioManager.StopVoiceDetecting()](./InworldAudioManager#stopvoicedetecting)
### ICalibrateAudioHandler
Handles audio calibration functionality.
**Methods:**
- `OnStartCalibration()` - Called by [InworldAudioManager.StartCalibrate()](./InworldAudioManager#startcalibrate)
- `OnStopCalibration()` - Called by [InworldAudioManager.StopCalibrate()](./InworldAudioManager#stopcalibrate)
- `OnCalibrate()` - Performs the actual calibration process
### IProcessAudioHandler
Handles audio pre-processing and post-processing.
**Methods:**
- `OnPreProcessAudio()` - Called by [InworldAudioManager.PreProcess()](./InworldAudioManager#preprocess) during the audio coroutine
- `OnPostProcessAudio()` - Called by [InworldAudioManager.PostProcess()](./InworldAudioManager#postprocess) during the audio coroutine
- `ProcessedBuffer` (property) - Buffer containing processed audio data
### ISendAudioHandler
Handles sending audio data to external services.
**Methods:**
- `OnStartSendAudio()` - Called when [InworldAudioManager.IsPlayerSpeaking](./InworldAudioManager#isplayerspeaking) is set to true
- `OnStopSendAudio()` - Called when [InworldAudioManager.IsPlayerSpeaking](./InworldAudioManager#isplayerspeaking) is set to false
- `Samples` (property) - Accessed by [InworldAudioManager.OnSendAudio()](./InworldAudioManager#onsendaudio) and [InworldAudioManager.PushAudio()](./InworldAudioManager#pushaudio)
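As a hedged illustration, a custom processing module would subclass `InworldAudioModule` and implement one of these interfaces. The interface members below come from this page; the return types and everything else are assumptions:

```csharp
// Hypothetical pass-through processor, for illustration only.
public class PassThroughModule : InworldAudioModule, IProcessAudioHandler
{
    public CircularBuffer<short> ProcessedBuffer { get; private set; }

    public bool OnPreProcessAudio()
    {
        // Nothing to prepare before collection in this sketch.
        return true;
    }

    public bool OnPostProcessAudio()
    {
        // A real module would transform newly collected input here
        // (e.g. noise suppression) and write results into ProcessedBuffer.
        return true;
    }
}
```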
## Reference
### Priority
Gets or sets the execution priority of this audio module.
Modules with lower values have higher priority and are processed first.
#### Returns
**Type:** `int`
---
### Audio
Gets the InworldAudioManager instance that coordinates audio processing.
Automatically finds and caches the audio manager in the scene if not already set.
#### Returns
**Type:** `InworldAudioManager`
---
### StartModule
Starts the module's processing coroutine.
Only one coroutine can be active per module at a time.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| moduleCycle | `IEnumerator` | The coroutine to start for this module's processing cycle. |
#### Returns
**Type:** `void`
---
### StopModule
Stops the currently running module coroutine and cleans up resources.
Safe to call even if no coroutine is currently running.
#### Returns
**Type:** `void`
---
---
#### Audio Capture Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/AudioCaptureModule
[Overview](../overview) > Audio Capture Module
**Class:** `AudioCaptureModule` | **Inherits from:** `InworldAudioModule` | **Implements:** `IMicrophoneHandler`
Core module responsible for microphone capture and audio input management in the Inworld framework. This module handles the low-level microphone operations and audio thread management. Only one AudioCaptureModule should be active in the module list at any time.
> **Note:** This module is not available on Unity WebGL builds due to microphone access limitations.
## Properties
- [autoStart](#autostart)
- [IsMicRecording](#ismicrecording)
## Methods
- [StartMicrophone](#startmicrophone)
- [StopMicrophone](#stopmicrophone)
- [ChangeInputDevice](#changeinputdevice)
- [ListMicDevices](#listmicdevices)
## Reference
### autoStart
Determines whether the microphone should automatically start when this module initializes.
When true, the microphone will begin capturing and calibration will start automatically.
#### Returns
**Type:** `bool`
---
### IsMicRecording
Gets whether the microphone is currently recording.
#### Returns
**Type:** `bool`
---
### StartMicrophone
Initializes and starts the microphone capture system using Unity's built-in microphone functionality.
Creates an AudioClip for continuous recording and begins the audio processing thread.
#### Returns
**Type:** `bool`
**Description:** True if the microphone was successfully started and an AudioClip was created; otherwise, false.
---
### StopMicrophone
Stops the microphone input using Unity's built-in microphone API.
#### Returns
**Type:** `bool`
**Description:** True if successfully stopped, false otherwise.
---
### ChangeInputDevice
Changes the microphone input device name.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| deviceName | `string` | The device name to switch to. |
#### Returns
**Type:** `bool`
**Description:** True if the device was successfully changed, false otherwise.
---
### ListMicDevices
Gets the list of available microphone devices.
#### Returns
**Type:** `List<string>`
**Description:** A list of available microphone device names.
---
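A typical device-switching flow using the members above (the `audioManager` reference and logging are illustrative):

```csharp
// Enumerate microphones and switch to a different one if available.
AudioCaptureModule mic = audioManager.GetUniqueModule<AudioCaptureModule>();
List<string> devices = mic.ListMicDevices();

if (devices.Count > 1 && mic.ChangeInputDevice(devices[1]))
    Debug.Log($"Now recording from: {devices[1]}");
```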
## Important Notes
### Single Instance Requirement
Only one AudioCaptureModule should be active in the module list at any time. Having multiple instances can cause conflicts in microphone access and audio processing.
### WebGL Compatibility
This module is not available on Unity WebGL builds due to microphone access limitations. The code is wrapped in `#if !UNITY_WEBGL` preprocessor directives.
### Microphone vs Recording State
The Start/Stop microphone functions control the underlying audio capture system, which is separate from the recording state that controls actual data transmission. Use these functions sparingly and only when necessary for system management.
### Automatic Initialization
When `autoStart` is enabled (default), the module will automatically:
1. Start the microphone when the component initializes
2. Begin calibration if the microphone starts successfully
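When `autoStart` is disabled, the equivalent sequence can be triggered manually through the manager (a minimal sketch using the documented manager methods):

```csharp
// Manual startup when autoStart is off.
audioManager.StartMicrophone();   // begin capturing (IMicrophoneHandler)
audioManager.StartCalibrate();    // then calibrate the input level
```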
---
---
#### Audio Collect Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/AudioCollectModule
[Overview](../overview) > Audio Collect Module
**Class:** `AudioCollectModule` | **Inherits from:** `InworldAudioModule` | **Implements:** `ICollectAudioHandler`
Audio sampling module responsible for collecting audio data from the microphone input buffer. This module interfaces with Unity's microphone system to continuously capture audio frames and populate the input buffer for further processing by other audio modules.
> **Note:** This module is not available on Unity WebGL builds due to microphone access limitations.
## Properties
- [m_AutoReconnect](#mautoreconnect)
## Methods
- [OnCollectAudio](#oncollectaudio)
- [ResetPointer](#resetpointer)
## Reference
### m_AutoReconnect
Determines whether the module should automatically attempt to reconnect the microphone if it becomes disconnected.
When true, the module will automatically restart the microphone if recording stops unexpectedly.
#### Returns
**Type:** `bool`
---
### OnCollectAudio
Collects audio data from the microphone for the current frame and updates the input buffer.
Called by the AudioManager's audio processing coroutine to capture approximately 0.1 seconds of audio data.
Uses Unity's microphone API to retrieve the latest audio samples.
#### Returns
**Type:** `int`
**Description:** The number of audio samples collected, or -1 if collection failed.
---
### ResetPointer
Resets the pointer of the Audio Buffer to the beginning.
#### Returns
**Type:** `void`
---
## Audio Collection Process
The AudioCollectModule performs the following steps during audio collection:
1. **Device Check**: Verifies the microphone device is still recording
2. **Auto-Reconnect**: If enabled and microphone is not recording, attempts to restart it
3. **Position Tracking**: Gets the current microphone position and calculates sample difference
4. **Data Retrieval**: Extracts audio samples from the AudioClip using `GetData()`
5. **Buffer Update**: Enqueues the collected samples into the input buffer
6. **Position Update**: Updates the last position for the next collection cycle
### Sample Collection Details
- **Collection Frequency**: Called approximately every 0.1 seconds by the audio coroutine
- **Sample Size**: Collects the difference between current and last microphone position
- **Buffer Management**: Uses circular buffer to handle continuous audio streaming
- **Error Handling**: Returns -1 if collection fails at any step
## Important Notes
### WebGL Compatibility
This module is not available on Unity WebGL builds due to microphone access limitations. The code is wrapped in `#if !UNITY_WEBGL` preprocessor directives.
### Auto-Reconnect Feature
When `m_AutoReconnect` is enabled (default), the module will automatically attempt to restart the microphone if it detects that recording has stopped. This helps maintain continuous audio capture in case of temporary microphone disconnections.
### Position Tracking
The module uses Unity's `Microphone.GetPosition()` to track the current read position in the audio buffer. It handles buffer wraparound by checking if the current position is less than the last position, indicating the buffer has looped.
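The wraparound check amounts to the following (a hedged reconstruction; the field names are assumptions, while `Microphone.GetPosition()` is Unity's API):

```csharp
// New samples since the last collection, handling circular-buffer wraparound.
int currPos = Microphone.GetPosition(m_DeviceName);
int newSamples = currPos >= m_LastPos
    ? currPos - m_LastPos                        // no wrap: simple difference
    : m_AudioClip.samples - m_LastPos + currPos; // wrapped: tail + head
m_LastPos = currPos;
```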
### Integration with Audio Pipeline
This module is called by the [InworldAudioManager.CollectAudio()](./InworldAudioManager#collectaudio) method during the audio processing pipeline, specifically between pre-processing and post-processing stages.
---
---
#### Audio Dispatch Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/AudioDispatchModule
[Overview](../overview) > Audio Dispatch Module
**Class:** `AudioDispatchModule` | **Inherits from:** `InworldAudioModule` | **Implements:** `ISendAudioHandler`
Audio dispatch module responsible for sending processed audio data to the Inworld service. Manages the transmission of audio samples and handles queuing, debugging, and test mode functionality. This module serves as the final stage in the audio processing pipeline before data reaches the server.
## Properties
- [ShortBufferToSend](#shortbuffertosend)
- [IsReadyToSend](#isreadytosend)
- [Samples](#samples)
## Methods
- [OnStartSendAudio](#onstartsendaudio)
- [OnStopSendAudio](#onstopsendaudio)
## Reference
### ShortBufferToSend
Gets the audio buffer to use for sending data to the Inworld service.
Returns either the processed buffer from an audio processor module or the raw input buffer.
#### Returns
**Type:** `CircularBuffer<short>`
---
### IsReadyToSend
Gets a value indicating whether the audio data is ready to be sent to the Inworld service.
Checks if the Inworld controller is available and properly initialized.
#### Returns
**Type:** `bool`
---
### Samples
Gets all the current audio samples ready for transmission.
#### Returns
**Type:** `List`
---
### OnStartSendAudio
Called when the player becomes allowed to send audio into the buffer.
Implemented in the child class for custom behavior.
#### Returns
**Type:** `void`
---
### OnStopSendAudio
Called when the player is no longer allowed to send audio into the buffer.
Automatically sends any remaining audio data and clears the queue.
#### Returns
**Type:** `void`
---
## Audio Dispatching Process
The AudioDispatchModule performs the following operations:
### Continuous Processing
- Runs a coroutine that executes every 0.1 seconds while the module is active
- Monitors the `Audio.IsPlayerSpeaking` state to determine when to process audio
- Converts buffer data to audio chunks when the player is speaking
### Buffer Management
- Uses `ShortBufferToSend` to determine the appropriate audio source
- Prioritizes processed audio buffer from `IProcessAudioHandler` if available
- Falls back to raw input buffer if no processor is present
- Tracks position to avoid processing the same data multiple times
### Audio Chunk Creation
- Converts audio samples to `AudioChunk` format with 16kHz sample rate
- Uses `InworldVector` for data storage
- Queues audio data for transmission to the Inworld service
### Debug and Test Features
- **Audio Debugging**: When enabled, stores audio data in `m_DebugInput` for analysis
- **Test Mode**: Available for testing audio transmission without actual server communication
## Serialized Fields
The following fields are configurable in the Unity Inspector:
- **m_IsAudioDebugging** (`bool`) - Enables audio debugging mode to store input data for analysis
- **m_TestMode** (`bool`) - Enables test mode for audio transmission testing
## Important Notes
### Integration with Inworld Service
This module serves as the final stage in the audio processing pipeline, responsible for:
- Converting processed audio data into the format expected by the Inworld service
- Managing the transmission queue to ensure reliable data delivery
- Handling the transition between speaking and non-speaking states
### Buffer Priority
The module intelligently selects the audio source:
1. **Processed Buffer**: Uses `IProcessAudioHandler.ProcessedBuffer` if available
2. **Raw Buffer**: Falls back to `Audio.InputBuffer` if no processor is present
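That selection can be sketched as a property (the body is an assumption; `GetModule`, `ProcessedBuffer`, and `Audio.InputBuffer` are documented on this page):

```csharp
// Prefer processed audio; fall back to the raw input buffer.
CircularBuffer<short> ShortBufferToSend
{
    get
    {
        IProcessAudioHandler processor = Audio.GetModule<IProcessAudioHandler>();
        return processor != null ? processor.ProcessedBuffer : Audio.InputBuffer;
    }
}
```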
### Automatic Lifecycle Management
- **OnEnable**: Automatically starts the audio dispatching coroutine
- **OnDisable**: Stops the coroutine and cleans up resources
- **OnStopSendAudio**: Automatically sends remaining data and clears the queue
### Ready State Validation
The `IsReadyToSend` property validates that:
- InworldController instance exists
- STT module is available
- Audio queue contains data
- Module is properly initialized
---
---
#### Voice Activity Detector
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldAudioModule/VoiceActivityDetector
[Overview](../overview) > Voice Activity Detector
**Class:** `VoiceActivityDetector` | **Inherits from:** `PlayerVoiceDetector` | **Implements:** `ICalibrateAudioHandler`
Advanced voice activity detection module that uses the Inworld VAD system for speech detection. Extends PlayerVoiceDetector to provide more sophisticated voice activity analysis using native DLL functions. This detector provides more accurate results than simple volume-based detection.
## Methods
- [DetectPlayerSpeaking](#detectplayerspeaking)
- [GetAudioChunk](#getaudiochunk)
## Reference
### DetectPlayerSpeaking
Detects whether the player is currently speaking using the Inworld VAD (Voice Activity Detection) system.
Uses native DLL functions to analyze audio characteristics beyond simple volume thresholds.
#### Returns
**Type:** `bool`
**Description:** True if voice activity is detected in the current audio data; otherwise, false.
---
### GetAudioChunk
Converts a list of float audio samples into an InworldAudioChunk for processing by the VAD system.
Creates the appropriate data structure required by the native DLL voice activity detection functions.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| audioData | `List<float>` | The audio sample data as normalized float values. |
#### Returns
**Type:** `AudioChunk`
**Description:** An InworldAudioChunk object containing the audio data formatted for VAD processing.
---
## Voice Activity Detection Process
The VoiceActivityDetector performs the following steps for voice detection:
### Prerequisites Check
1. **Controller Validation**: Verifies InworldController instance exists
2. **VAD Module Check**: Ensures VAD module is available
3. **Audio Data Validation**: Confirms audio data is present
### Detection Process
1. **Audio Chunk Creation**: Converts float samples to AudioChunk format
2. **VAD Analysis**: Calls native DLL function `DetectVoiceActivity()`
3. **Result Interpretation**: Returns true if VAD result is >= 0 (positive detection)
### Audio Format Requirements
- **Sample Rate**: 16kHz (16000 Hz)
- **Data Format**: Normalized float values (-1.0 to 1.0)
- **Data Structure**: `InworldVector<float>` containing audio samples
## Inheritance and Interface Implementation
### PlayerVoiceDetector Base Class
This class extends `PlayerVoiceDetector`, which provides:
- Basic voice detection infrastructure
- Audio data collection and management
- Integration with the audio processing pipeline
### ICalibrateAudioHandler Interface
The base class implements `ICalibrateAudioHandler`, providing:
- `OnStartCalibration()` - Called by [InworldAudioManager.StartCalibrate()](./InworldAudioManager#startcalibrate)
- `OnStopCalibration()` - Called by [InworldAudioManager.StopCalibrate()](./InworldAudioManager#stopcalibrate)
- `OnCalibrate()` - Performs calibration-specific operations
## Important Notes
### Native DLL Integration
This detector uses native DLL functions for voice activity detection:
- **Function**: `InworldController.VAD.DetectVoiceActivity(audioChunk)`
- **Return Values**:
- `-1`: Negative detection (no voice activity)
- `0` and above: Positive detection (voice activity present)
### Advanced Detection Capabilities
Unlike simple volume-based detection, this module:
- Analyzes audio characteristics beyond amplitude
- Uses sophisticated algorithms for speech pattern recognition
- Provides more accurate voice activity detection
- Reduces false positives from background noise
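Combining the prerequisites and the return-value convention above, the check reduces to something like this (a sketch; the field name `m_AudioData` and the override are assumptions):

```csharp
protected override bool DetectPlayerSpeaking()
{
    // Prerequisites: controller, VAD module, and audio data must all exist.
    if (!InworldController.Instance || InworldController.VAD == null || m_AudioData.Count == 0)
        return false;

    // -1 means no voice; 0 and above means voice activity is present.
    AudioChunk chunk = GetAudioChunk(m_AudioData);
    return InworldController.VAD.DetectVoiceActivity(chunk) >= 0;
}
```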
### Integration with Audio Pipeline
The VoiceActivityDetector integrates with the audio processing pipeline:
- Receives audio data from the collection modules
- Processes audio through the VAD system
- Updates the `IsPlayerSpeaking` state in InworldAudioManager
- Triggers appropriate events for voice start/stop detection
### Performance Considerations
- Uses native DLL functions for efficient processing
- Requires proper audio data format for accurate detection
- Depends on InworldController and VAD module availability
- Processes audio in real-time during the audio coroutine
---
---
#### Runtime Reference > Classes > InworldFrameworkModule
#### Inworld Framework Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldFrameworkModule
[Overview](../overview) > Inworld Framework Module
**Class:** `InworldFrameworkModule` | **Inherits from:** `MonoBehaviour`
Abstract base class for all Inworld framework modules within Unity; each module governs a single feature of the Inworld SDK. Provides common functionality for module initialization, configuration, and lifecycle management, and coordinates the creation of factories, configurations, interfaces, and nodes.
## Properties
- [ModelType](#modeltype)
- [Interface](#interface)
- [Graph](#graph)
- [Node](#node)
- [Initialized](#initialized)
## Methods
- [CreateFactory](#createfactory)
- [SetupConfig](#setupconfig)
- [Initialize](#initialize)
- [InitializeAsync](#initializeasync)
## Events
- [OnInitialized](#oninitialized)
- [OnTeminated](#onteminated)
- [OnTaskStarted](#ontaskstarted)
- [OnTask](#ontask)
- [OnTaskFinished](#ontaskfinished)
- [OnTaskCancelled](#ontaskcancelled)
## Reference
### ModelType
Gets or sets the model type for this module (Remote or Local).
Determines whether the module uses cloud-based or on-device AI models.
#### Returns
**Type:** `ModelType`
---
### Interface
Gets the active interface instance for this module.
Returns null if the module is not initialized.
#### Returns
**Type:** `InworldInterface`
---
### Graph
Gets the active graph instance associated with this module.
#### Returns
**Type:** `InworldGraph`
---
### Node
Gets the active node instance associated with this module.
#### Returns
**Type:** `InworldNode`
---
### Initialized
Gets or sets the initialization state of this module.
When set to true, triggers OnInitialized event if interface is available.
When set to false, cleans up interface and triggers OnTerminated event.
#### Returns
**Type:** `bool`
---
### CreateFactory
Creates and returns a factory instance specific to this module type.
Must be implemented by derived classes to provide appropriate factory creation.
#### Returns
**Type:** `InworldFactory`
**Description:** A factory instance for creating module-specific objects.
---
### SetupConfig
Sets up and returns a configuration instance specific to this module type.
Must be implemented by derived classes to provide appropriate configuration.
#### Returns
**Type:** `InworldConfig`
**Description:** A configuration instance for module initialization.
---
### Initialize
Initializes the module synchronously by creating factory, configuration, and interface.
Sets up the complete module pipeline required for operation.
#### Returns
**Type:** `bool`
**Description:** True if initialization succeeded, false otherwise.
---
### InitializeAsync
Initializes the module asynchronously by creating factory, configuration, and interface.
Performs interface creation on a background thread for improved performance.
#### Returns
**Type:** `Awaitable`
**Description:** A task that completes with true if initialization succeeded, false otherwise.
---
### OnInitialized
Event triggered when the module has been successfully initialized.
#### Parameters
None
---
### OnTeminated
Event triggered when the module has been terminated or disposed.
#### Parameters
None
---
### OnTaskStarted
Event triggered when a task has started execution.
#### Parameters
None
---
### OnTask
Event triggered during task execution with status updates.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The status message describing current task progress. |
---
### OnTaskFinished
Event triggered when a task has finished execution.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The completion message describing task results. |
---
### OnTaskCancelled
Event triggered when a task has been cancelled.
#### Parameters
None
---
## Serialized Fields
The following fields are configurable in the Unity Inspector:
- **m_ModelType** (`ModelType`) - The model type for this module (default: Remote)
## Important Notes
### Abstract Implementation Requirements
Derived classes must implement:
- `CreateFactory()` - Provide module-specific factory creation. Typically, this sets up an InworldInterface for the module, which directly interacts with the DLL through InworldInterop.
- `SetupConfig()` - Provide module-specific configuration setup
### Model Type Support
The module supports two model types:
- **Remote**: Uses cloud-based AI models
- **Local**: Uses on-device AI models
Note that some features (particularly audio features) are only available for local models.
---
---
#### Inworld AEC Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldAECModule
[Overview](../overview) > Inworld AEC Module
**Class:** `InworldAECModule` | **Inherits from:** `InworldFrameworkModule`
Module for Acoustic Echo Cancellation (AEC) within the Inworld framework. Processes audio streams to remove echo and feedback from microphone input. Uses CPU-based local processing only and does not support remote operation. Essential for clear audio communication in applications with both input and output audio.
> **Important:** This module must be executed with a LocalCPU model. Remote operation is not supported.
## Methods
- [CreateFactory](#createfactory)
- [SetupConfig](#setupconfig)
- [FilterAudio](#filteraudio)
## Reference
### CreateFactory
Creates and returns an AECFactory for this module.
#### Returns
**Type:** `InworldFactory`
**Description:** A factory instance for creating acoustic echo cancellation objects.
---
### SetupConfig
Sets up the configuration for acoustic echo cancellation operations.
Only supports local CPU processing; remote operation is not available.
#### Returns
**Type:** `InworldConfig`
**Description:** An AEC configuration instance for module initialization, or null if remote mode is attempted.
---
### FilterAudio
Filters audio to remove echo and feedback using acoustic echo cancellation.
Processes both near-end (microphone) and far-end (speaker) audio to produce clean output.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| nearend | `AudioChunk` | The microphone audio input that may contain echo. |
| farend | `AudioChunk` | The speaker audio output that creates potential echo. |
#### Returns
**Type:** `AudioChunk`
**Description:** The filtered audio chunk with echo removed, or null if processing failed.
---
---
#### Inworld LLM Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldLLMModule
[Overview](../overview) > Inworld LLM Module
**Class:** `InworldLLMModule` | **Inherits from:** `InworldFrameworkModule`
Module for Large Language Model (LLM) integration in the Inworld framework. Provides text generation capabilities using both remote and local AI models.
## Properties
- [Provider](#provider)
- [ModelName](#modelname)
- [ModelPath](#modelpath)
- [MaxToken](#maxtoken)
- [MaxPromptLength](#maxpromptlength)
- [Repetition](#repetition)
- [Temperature](#temperature)
- [TopP](#topp)
- [Frequency](#frequency)
- [Presence](#presence)
## Methods
- [GenerateText](#generatetext)
- [GenerateTextAsync](#generatetextasync)
- [SetupTextGenerationConfig](#setuptextgenerationconfig)
## Reference

### Provider
Gets or sets the AI model provider (e.g., OpenAI, Anthropic).
Used when connecting to remote LLM services.
#### Returns
**Type:** `string`
**Description:** The provider name as a string.
---
### ModelName
Gets or sets the specific model name to use for text generation.
This should correspond to models available from the configured provider.
#### Returns
**Type:** `string`
**Description:** The model name as a string.
---
### ModelPath
Gets or sets the file path to a local LLM model.
If not set, defaults to the framework's default LLM model path.
#### Returns
**Type:** `string`
**Description:** The full file path to the local model.
---
### MaxToken
Gets or sets the maximum number of tokens the model can generate in a single response.
Value is automatically clamped between 1 and 2500.
#### Returns
**Type:** `int`
**Description:** The maximum token count (1-2500).
---
### MaxPromptLength
Gets or sets the maximum length of the input prompt in tokens.
Value is automatically clamped between 1 and 2500.
#### Returns
**Type:** `int`
**Description:** The maximum prompt length (1-2500).
---
### Repetition
Gets or sets the repetition penalty factor.
Higher values reduce the likelihood of repeating the same content.
Value is automatically clamped between 0 and 1.
#### Returns
**Type:** `float`
**Description:** The repetition penalty (0.0-1.0).
---
### Temperature
Gets or sets the temperature for text generation.
Higher values produce more creative/random output, lower values are more focused.
Value is automatically clamped between 0 and 1.
#### Returns
**Type:** `float`
**Description:** The temperature value (0.0-1.0).
---
### TopP
Gets or sets the top-p (nucleus sampling) parameter.
Controls diversity by only considering tokens with cumulative probability up to this value.
Value is automatically clamped between 0 and 1.
#### Returns
**Type:** `float`
**Description:** The top-p value (0.0-1.0).
---
### Frequency
Gets or sets the frequency penalty factor.
Reduces the likelihood of repeating frequently used tokens.
Value is automatically clamped between 0 and 1.
#### Returns
**Type:** `float`
**Description:** The frequency penalty (0.0-1.0).
---
### Presence
Gets or sets the presence penalty factor.
Encourages the model to talk about new topics by penalizing tokens that have already appeared.
Value is automatically clamped between 0 and 1.
#### Returns
**Type:** `float`
**Description:** The presence penalty (0.0-1.0).
---
### GenerateText
Generates text synchronously using the configured LLM.
This method blocks until the generation is complete.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The input text prompt for generation. |
#### Returns
**Type:** `string`
**Description:** The generated text response, or empty string if generation fails.
---
### GenerateTextAsync
Generates text asynchronously using the configured LLM.
This method performs generation on a background thread for improved performance.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The input text prompt for generation. |
#### Returns
**Type:** `Awaitable`
**Description:** A task that completes with the generated text response, or empty string if generation fails.
---
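A minimal usage sketch for text generation (the component wiring and field name are assumptions; the module members are documented above):

```csharp
public class LLMExample : MonoBehaviour
{
    [SerializeField] InworldLLMModule m_LLM;   // assigned in the Inspector

    async void Start()
    {
        if (!await m_LLM.InitializeAsync())    // create factory, config, interface
            return;

        m_LLM.Temperature = 0.7f;              // clamped to [0, 1] by the module
        string reply = await m_LLM.GenerateTextAsync("Say hello in one sentence.");
        Debug.Log(reply);
    }
}
```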
### SetupTextGenerationConfig
Sets up and returns a text generation configuration instance based on current module settings.
This method configures all text generation parameters including token limits, penalties, and sampling parameters.
#### Returns
**Type:** `TextGenerationConfig`
**Description:** A configuration instance for text generation with current module settings.
---
## Serialized Fields
The following fields are configurable in the Unity Inspector:
### Remote Configuration
- **m_Provider** (`string`) - The AI model provider (e.g., OpenAI, Anthropic)
- **m_ModelName** (`string`) - The specific model name to use for text generation
### Local Configuration
- **m_ModelPath** (`string`) - The file path to a local LLM model
### Text Configuration
- **m_MaxToken** (`int`) - Maximum number of tokens to generate (default: 100, range: 1-2500)
- **m_MaxPromptLength** (`int`) - Maximum length of input prompt in tokens (default: 1000, range: 1-2500)
- **m_TopP** (`float`) - Top-p (nucleus sampling) parameter (default: 0.95, range: 0-1)
- **m_Temperature** (`float`) - Temperature for text generation (default: 0.5, range: 0-1)
### Penalty Configuration
- **m_Repetition** (`float`) - Repetition penalty factor (default: 1.0, range: 0-1)
- **m_Frequency** (`float`) - Frequency penalty factor (default: 0.0, range: 0-1)
- **m_Presence** (`float`) - Presence penalty factor (default: 0.0, range: 0-1)
## Configuration Management
### Remote Configuration
When using remote models, the module creates an `LLMRemoteConfig` with:
- Provider name (if specified)
- Model name (if specified)
- API key from framework utilities
### Local Configuration
When using local models, the module creates an `LLMLocalConfig` with:
- Model path (resolved from StreamingAssets folder)
- Device configuration from framework utilities
### Text Generation Configuration
The `SetupTextGenerationConfig()` method configures:
- Maximum token count
- Repetition penalty
- Top-p and temperature (with GPT-5 compatibility handling)
- Maximum prompt length
- Frequency and presence penalties
## Important Notes
### GPT-5 Compatibility
GPT-5 models receive special handling: the TopP and Temperature parameters are not applied when a GPT-5 model is selected.
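As a minimal usage sketch of the configuration flow described above (the module variable and component lookup are illustrative assumptions; only `SetupTextGenerationConfig` and the Inspector fields are documented):

```csharp
// Hypothetical sketch: build a generation config from the module's
// current Inspector settings (m_MaxToken, m_TopP, m_Temperature, penalties).
var llmModule = GetComponent<InworldLLMModule>(); // assumed component/class name
TextGenerationConfig config = llmModule.SetupTextGenerationConfig();
// For GPT-5 models, TopP and Temperature are skipped per the note above.
```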
---
---
#### Inworld Safety Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldSafetyModule
[Overview](../overview) > Inworld Safety Module
**Class:** `InworldSafetyModule` | **Inherits from:** `InworldFrameworkModule`
Module for content safety checking and moderation within the Inworld framework. Provides real-time content analysis to detect potentially harmful or inappropriate content. Integrates with text embedding services for advanced semantic analysis of user inputs and AI responses.
## Properties
- [SafetyConfig](#safetyconfig)
- [CreationConfig](#creationconfig)
## Methods
- [SetupSafetyThreshold](#setupsafetythreshold)
- [SetupEmbedder](#setupembedder)
- [SetupSafetyConfig](#setupsafetyconfig)
- [IsSafe](#issafe)
## Reference

### SafetyConfig
Gets the current safety configuration instance.
Contains the configured safety thresholds and topic settings for content moderation.
#### Returns
**Type:** `SafetyConfig`
**Description:** The safety configuration instance with current threshold settings.
---
### CreationConfig
Gets the safety checker creation configuration instance.
Contains settings used during the initialization of the safety checker.
#### Returns
**Type:** `SafetyCheckerCreationConfig`
**Description:** The creation configuration instance for safety checker initialization.
---
### SetupSafetyThreshold
Sets up safety thresholds for content moderation.
Replaces current threshold configuration with the provided data.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| safetyData | `List<SafetyThreshold>` | List of safety thresholds to configure for different topic types. |
#### Returns
**Type:** `void`
---
### SetupEmbedder
Sets up the text embedder interface for semantic analysis.
Required for advanced content understanding and safety classification.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| textEmbedderInterface | `TextEmbedderInterface` | The text embedder interface to use for content analysis. |
#### Returns
**Type:** `void`
---
### SetupSafetyConfig
Configures safety settings based on provided threshold data.
Converts SafetyThreshold objects to internal TopicThreshold format.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| safetyThresholds | `List<SafetyThreshold>` | List of safety thresholds to configure. |
#### Returns
**Type:** `void`
---
### IsSafe
Checks if the provided sentence is safe according to current safety thresholds.
Performs real-time content analysis using the configured safety parameters.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| sentence | `string` | The text content to analyze for safety violations. |
#### Returns
**Type:** `bool`
**Description:** True if the content is considered safe, false if it violates safety thresholds.
---
## Serialized Fields
The following fields are configurable in the Unity Inspector:
### Safety Configuration
- **m_SafetyData** (`List<SafetyThreshold>`) - List of safety thresholds for different topic types
### Remote Configuration
- **m_Provider** (`string`) - The AI model provider for safety checking
- **m_ModelName** (`string`) - The specific model name to use for safety analysis
### Local Configuration
- **m_ModelPath** (`string`) - The file path to a local safety model
## Safety Threshold Configuration
The `SafetyThreshold` class defines safety parameters for content moderation:
### SafetyThreshold Properties
- **topic** (`UnsafeTopic`) - The type of unsafe topic to monitor
- **threshold** (`float`) - The threshold value for this topic (typically between 0.0 and 1.0)
### Threshold Behavior
- **Lower values** (closer to 0.0) are more restrictive and will flag more content as unsafe
- **Higher values** (closer to 1.0) are more permissive and will allow more content through
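A minimal sketch of configuring thresholds and moderating input with the methods documented above (the `safetyModule` reference and the `UnsafeTopic.Violence` enum value are illustrative assumptions):

```csharp
// Hypothetical sketch: tighten one topic's threshold, then check user input.
var thresholds = new List<SafetyThreshold>
{
    // Lower threshold = more restrictive (flags more content as unsafe).
    new SafetyThreshold { topic = UnsafeTopic.Violence, threshold = 0.2f }
};
safetyModule.SetupSafetyThreshold(thresholds);

if (!safetyModule.IsSafe(userInput))
{
    // Block or rewrite the message before it reaches the LLM.
}
```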
## Important Notes
The safety module requires a text embedder module. Your InworldController must initialize one before initializing the safety module.
---
---
#### Inworld STT Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldSTTModule
[Overview](../overview) > Inworld STT Module
**Class:** `InworldSTTModule` | **Inherits from:** `InworldFrameworkModule`
Module for Speech-to-Text (STT) functionality within the Inworld framework. Converts audio input into text transcriptions using AI-powered speech recognition. Supports both synchronous and asynchronous speech recognition operations.
## Methods
- [RecognizeSpeech](#recognizespeech)
- [RecognizeSpeechAsync](#recognizespeechasync)
## Reference
### RecognizeSpeech
Performs synchronous speech recognition on the provided audio chunk.
Converts audio data to text using the configured speech recognition model.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| audioChunk | `AudioChunk` | The audio data to transcribe into text. |
#### Returns
**Type:** `string`
**Description:** The transcribed text, or an empty string if recognition fails.
---
### RecognizeSpeechAsync
Performs asynchronous speech recognition on the provided audio chunk.
Converts audio data to text using the configured speech recognition model.
Provides progress notifications through task events during processing.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| audioChunk | `AudioChunk` | The audio data to transcribe into text. |
#### Returns
**Type:** `Awaitable<string>`
**Description:** A task that completes with the transcribed text, or an empty string if recognition fails.
---
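A minimal sketch of the asynchronous path documented above (the `sttModule` and `audioChunk` variables are illustrative assumptions):

```csharp
// Hypothetical sketch: transcribe a captured audio chunk asynchronously.
string transcript = await sttModule.RecognizeSpeechAsync(audioChunk);
if (string.IsNullOrEmpty(transcript))
{
    // Recognition failed, or the chunk contained no recognizable speech.
    return;
}
Debug.Log($"User said: {transcript}");
```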
## Configuration Management
### Remote Configuration
When using remote STT services, the module creates an `STTRemoteConfig` with:
- API key from framework utilities (if available)
### Local Configuration
When using local STT models, the module creates an `STTLocalConfig` with:
- Model path (resolved from StreamingAssets folder)
- Device configuration from framework utilities
---
---
#### Inworld TTS Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldTTSModule
[Overview](../overview) > Inworld TTS Module
**Class:** `InworldTTSModule` | **Inherits from:** `InworldFrameworkModule`
Module for Text-to-Speech (TTS) functionality within the Inworld framework. Converts text input into synthesized speech audio using AI-powered voice synthesis. Supports both synchronous and asynchronous speech synthesis operations with customizable voice settings.
## Properties
- [SynthesisConfig](#synthesisconfig)
- [Voice](#voice)
## Methods
- [TextToSpeech](#texttospeech)
- [TextToSpeechAsync](#texttospeechasync)
- [SetVoice](#setvoice)
## Reference
### SynthesisConfig
Gets the current speech synthesis configuration instance.
Contains speech processing parameters and settings for TTS operations.
#### Returns
**Type:** `SpeechSynthesisConfig`
**Description:** The speech synthesis configuration instance with current settings.
---
### Voice
Gets the current voice profile instance.
Contains the configured speaker identity for speech synthesis.
#### Returns
**Type:** `InworldVoice`
**Description:** The voice profile instance with current speaker settings.
---
### TextToSpeech
Performs synchronous text-to-speech conversion and immediately plays the result.
Converts the provided text to speech audio using the specified speaker voice.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The text content to convert to speech. |
| speakerID | `string` | The voice identifier to use for speech synthesis. |
#### Returns
**Type:** `void`
---
### TextToSpeechAsync
Performs asynchronous text-to-speech conversion and plays the result when complete.
Converts the provided text to speech audio using the specified speaker voice.
Provides progress notifications through task events during processing.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| text | `string` | The text content to convert to speech. |
| speakerID | `string` | The voice identifier to use for speech synthesis. |
#### Returns
**Type:** `void`
---
### SetVoice
Sets the voice profile to use for speech synthesis.
Configures the speaker identity for subsequent TTS operations.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| voiceID | `string` | The voice identifier to set as the current speaker. |
#### Returns
**Type:** `void`
---
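A minimal sketch combining the methods documented above (the `ttsModule` reference and the `"Ashley"` voice ID are illustrative assumptions):

```csharp
// Hypothetical sketch: pick a voice once, then synthesize a reply.
ttsModule.SetVoice("Ashley");                          // assumed voice ID
ttsModule.TextToSpeechAsync("Welcome back!", "Ashley");
// Audio plays through the AudioSource assigned to m_AudioSource.
```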
## Serialized Fields
The following fields are configurable in the Unity Inspector:
- **m_AudioSource** (`AudioSource`) - The Unity AudioSource component used for playing synthesized speech
## Configuration Management
### Remote Configuration
When using remote TTS services, the module creates a `TTSRemoteConfig` with:
- API key from framework utilities (if available)
### Local Configuration
When using local TTS models, the module creates a `TTSLocalConfig` with:
- Model path (resolved from StreamingAssets folder)
- Prompt path (resolved from StreamingAssets folder)
---
---
#### Inworld VAD Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/InworldVADModule
[Overview](../overview) > Inworld VAD Module
**Class:** `InworldVADModule` | **Inherits from:** `InworldFrameworkModule`
Module for Voice Activity Detection (VAD) within the Inworld framework. Analyzes audio input to determine when speech is present versus silence or background noise. Used for optimizing speech processing by filtering out non-speech audio segments.
## Methods
- [DetectVoiceActivity](#detectvoiceactivity)
## Reference
### DetectVoiceActivity
Detects voice activity in the provided audio chunk.
Analyzes the audio data to determine if speech is present based on the configured threshold.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| audioChunk | `AudioChunk` | The audio data to analyze for voice activity. |
#### Returns
**Type:** `int`
**Description:** 1 if voice activity is detected, 0 if no voice activity, or -1 if detection fails.
---
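A minimal sketch of gating speech recognition on the return values documented above (the `vadModule` and `audioChunk` variables are illustrative assumptions):

```csharp
// Hypothetical sketch: skip silent chunks before running STT.
int result = vadModule.DetectVoiceActivity(audioChunk);
if (result == 1)
{
    // Speech detected: forward the chunk to the STT module.
}
else if (result == -1)
{
    // Detection failed: log and decide whether to fall back to STT anyway.
}
```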
## Serialized Fields
The following fields are configurable in the Unity Inspector:
- **m_Threshold** (`float`) - Voice activity detection threshold (default: 0.3, range: 0-1)
## Important Notes
The VAD module only supports local model execution.
---
---
#### Knowledge Module
Source: https://docs.inworld.ai/Unity/runtime/runtime-reference/InworldFrameworkModule/KnowledgeModule
[Overview](../overview) > Knowledge Module
**Class:** `KnowledgeModule` | **Inherits from:** `InworldFrameworkModule`
Module for managing knowledge bases and information retrieval within the Inworld framework. Provides functionality for compiling, storing, and querying knowledge content for AI interactions. Integrates with text embedding services for semantic search and knowledge matching.
## Properties
- [CompileConfig](#compileconfig)
- [MaxCharsPerChunk](#maxcharsperchunk)
- [MaxChunksPerDoc](#maxchunksperdoc)
## Events
- [OnKnowledgeRemoved](#onknowledgeremoved)
- [OnKnowledgeCompiled](#onknowledgecompiled)
- [OnKnowledgeRespond](#onknowledgerespond)
## Methods
- [SetupEmbedder](#setupembedder)
- [RemoveKnowledge](#removeknowledge)
- [CompileKnowledges](#compileknowledges)
- [GetKnowledges](#getknowledges)
## Reference
### CompileConfig
Gets the compile configuration for knowledge processing.
Returns null if the module is not properly configured.
#### Returns
**Type:** `KnowledgeCompileConfig`
**Description:** The knowledge compilation configuration instance, or null if not configured.
---
### MaxCharsPerChunk
Gets or sets the maximum number of characters per knowledge chunk.
Affects how knowledge content is divided during compilation.
#### Returns
**Type:** `int`
**Description:** The maximum characters per chunk setting.
---
### MaxChunksPerDoc
Gets or sets the maximum number of chunks per document.
Limits the granularity of knowledge segmentation during compilation.
#### Returns
**Type:** `int`
**Description:** The maximum chunks per document setting.
---
### OnKnowledgeRemoved
Event triggered when knowledge content has been removed from the knowledge base.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| knowledgeID | `string` | The unique identifier of the removed knowledge. |
---
### OnKnowledgeCompiled
Event triggered when knowledge content has been compiled and chunked.
Provides the knowledge ID and the resulting list of compiled chunks.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| knowledgeID | `string` | The unique identifier of the compiled knowledge. |
| chunks | `List` | The resulting list of compiled knowledge chunks. |
---
### OnKnowledgeRespond
Event triggered when knowledge retrieval responds with relevant information.
Provides the list of knowledge chunks that match the query.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| chunks | `List` | The list of relevant knowledge chunks. |
---
### SetupEmbedder
Sets up the text embedder interface for semantic knowledge processing.
Required for advanced knowledge matching and retrieval operations.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| textEmbedderInterface | `TextEmbedderInterface` | The text embedder interface to use for knowledge processing. |
#### Returns
**Type:** `void`
---
### RemoveKnowledge
Removes knowledge content from the knowledge base.
Triggers OnKnowledgeRemoved event if the removal is successful.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| knowledgeID | `string` | The unique identifier of the knowledge to remove. |
#### Returns
**Type:** `bool`
**Description:** True if the knowledge was successfully removed, false otherwise.
---
### CompileKnowledges
Compiles raw knowledge content into processable chunks.
Breaks down large knowledge documents into smaller, searchable segments.
Triggers OnKnowledgeCompiled event with the resulting chunks.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| knowledgeID | `string` | The unique identifier for this knowledge set. |
| knowledges | `List<string>` | List of raw knowledge strings to compile. |
#### Returns
**Type:** `List`
**Description:** List of compiled knowledge chunks, or null if compilation failed.
---
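A minimal sketch of the compile-then-query flow using the methods documented in this module (the `knowledgeModule`, `loreText`, and `eventHistory` variables are illustrative assumptions):

```csharp
// Hypothetical sketch: compile documents into chunks, then query them.
var docs = new List<string> { loreText, rulesText }; // assumed raw strings
knowledgeModule.CompileKnowledges("world-lore", docs);

// Later, retrieve chunks relevant to the current conversation context.
knowledgeModule.GetKnowledges(new List<string> { "world-lore" }, eventHistory);
// Results also arrive via the OnKnowledgeCompiled / OnKnowledgeRespond events.
```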
### GetKnowledges
Retrieves relevant knowledge content based on provided knowledge IDs and optional event history.
Performs semantic matching to find the most relevant knowledge for the current context.
Triggers OnKnowledgeRespond event with the retrieved knowledge.
#### Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| knowledgeIDs | `List<string>` | List of knowledge identifiers to search within. |
| eventHistory | `List` | Optional conversation history to provide context for knowledge retrieval. |
#### Returns
**Type:** `List