# STT (Speech-to-Text) Node Demo

This demo shows how to use `STTNode` to turn microphone input into text.

## Run the Template

1. Go to `Assets/InworldRuntime/Scenes/Nodes` and play the `STTNode` scene.
   <img src="https://mintcdn.com/inworldai/pDD5vvrZThONehMe/img/unity/framework/STTNNode00.png?fit=max&auto=format&n=pDD5vvrZThONehMe&q=85&s=99f1251567d2e8b6f667eaf1b15a5c7d" alt="STTNode00" width="654" height="696" data-path="img/unity/framework/STTNNode00.png" />
2. Once the graph is compiled, speak into the microphone to generate text.

<img src="https://mintcdn.com/inworldai/pDD5vvrZThONehMe/img/unity/framework/STTNode.gif?s=2f09f75aadd1f304c829ca3a8a22d46f" alt="STTNode01" width="1920" height="1080" data-path="img/unity/framework/STTNode.gif" />

## Understanding the Graph

You can find the graph on the `InworldGraphExecutor` of `STTCanvas`.

<img src="https://mintcdn.com/inworldai/pDD5vvrZThONehMe/img/unity/framework/STTNode02.png?fit=max&auto=format&n=pDD5vvrZThONehMe&q=85&s=e173796fa06cd82de31263a304e7ce4a" alt="STTNode02" width="1371" height="867" data-path="img/unity/framework/STTNode02.png" />

The graph contains a single node, `STTNode`, and no edges.

`STTNode` is both the `StartNode` and the `EndNode`.

<img src="https://mintcdn.com/inworldai/pDD5vvrZThONehMe/img/unity/framework/STTNode03.png?fit=max&auto=format&n=pDD5vvrZThONehMe&q=85&s=d0a1245f52471c709c8e1a1e9aef95e7" alt="STTNode03" width="1014" height="705" data-path="img/unity/framework/STTNode03.png" />

### InworldController

The `InworldController` is also simple; it contains only one primitive module: `STT`.

<img src="https://mintcdn.com/inworldai/pDD5vvrZThONehMe/img/unity/framework/STTNode04.png?fit=max&auto=format&n=pDD5vvrZThONehMe&q=85&s=2f206440d974d65ecfcc018595d8fb67" alt="STTNode04" width="1362" height="570" data-path="img/unity/framework/STTNode04.png" />

<Tip>
  For details about the primitive module, see the [STT Primitive Demo](../primitives/stt).
</Tip>

### InworldAudioManager

`InworldAudioManager` handles audio processing and is also modular.

In this demo, it uses four components:

* **AudioCapturer**: Turns the microphone on and off and manages input devices. Uses Unity's `Microphone` by default, and can be extended via third‑party plugins.
* **AudioCollector**: Collects raw samples from the microphone.
* **PlayerVoiceDetector**: Implements `IPlayerAudioEventHandler` and `ICalibrateAudioHandler` to emit player audio events and decide which timestamped segments to keep from the stream.
* **AudioDispatcher**: Sends the captured microphone data for downstream processing.

<img src="https://mintcdn.com/inworldai/PEMIBdkx0YyDrDSz/img/unity/framework/AudioManager.png?fit=max&auto=format&n=PEMIBdkx0YyDrDSz&q=85&s=d4513a92cab3cb9aaae2101bc80fba07" alt="AudioManager" width="1203" height="546" data-path="img/unity/framework/AudioManager.png" />

### Workflow

**Audio Thread:**

At startup, the microphone calibrates to background noise.

`PlayerVoiceDetector` listens for speech using the signal-to-noise ratio (SNR).

When the SNR exceeds the threshold, `AudioDispatcher` streams audio frames to `InworldAudio`.
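The calibrate-then-threshold idea above can be sketched in plain C#. This is a minimal illustration, not the SDK's `PlayerVoiceDetector`: the class `SnrVoiceGate`, its methods, and the 10 dB threshold used in the note below are all hypothetical, and frames are assumed to be `float[]` buffers of normalized samples.

```c#
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an SNR-based detector; NOT the SDK's PlayerVoiceDetector.
public class SnrVoiceGate
{
    readonly double m_ThresholdDb;
    double m_NoisePower = 1e-9; // small floor so the ratio is defined before calibration

    public SnrVoiceGate(double thresholdDb)
    {
        m_ThresholdDb = thresholdDb;
    }

    // Mean power of one frame: the average of the squared samples.
    static double MeanPower(float[] frame)
    {
        if (frame.Length == 0)
            return 0.0;
        double sum = 0.0;
        foreach (float s in frame)
            sum += (double)s * s;
        return sum / frame.Length;
    }

    // Calibration: average the power of frames captured while nobody is speaking.
    public void Calibrate(IEnumerable<float[]> silentFrames)
    {
        double total = 0.0;
        int count = 0;
        foreach (float[] frame in silentFrames)
        {
            total += MeanPower(frame);
            count++;
        }
        if (count > 0)
            m_NoisePower = Math.Max(total / count, 1e-9);
    }

    // SNR in decibels: 10 * log10(framePower / noisePower).
    public double SnrDb(float[] frame)
    {
        double power = Math.Max(MeanPower(frame), 1e-12);
        return 10.0 * Math.Log10(power / m_NoisePower);
    }

    // A frame counts as speech only when its SNR is above the threshold.
    public bool IsSpeech(float[] frame)
    {
        return SnrDb(frame) > m_ThresholdDb;
    }
}
```

With a noise floor calibrated from samples at amplitude 0.01, a frame at amplitude 0.5 measures roughly 34 dB, so a gate built with a 10 dB threshold would pass it on for dispatch, while another frame at the noise level (0 dB) would be dropped.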

**Main Thread:**

1. When the game starts, `InworldController` initializes its only module, `STTModule`, which creates the `STTInterface`.
2. Next, `InworldGraphExecutor` initializes its graph asset by calling each component’s `CreateRuntime()`. In this case, only `STTNode.CreateRuntime()` is called, using the created `STTInterface` as input.
3. After initialization, the graph calls `Compile()` and returns the executor handle.
4. After compilation, the `OnGraphCompiled` event is invoked. In this demo, `STTNodeTemplate` subscribes to it and enables the UI components. Users can then interact with the graph system.

```c# STTNodeTemplate.cs theme={"system"}
// Enable every UI element once the graph has compiled and is ready for input.
protected override void OnGraphCompiled(InworldGraphAsset obj)
{
    foreach (InworldUIElement element in m_UIElements)
        element.Interactable = true;
}
```

5. When `AudioDispatcher` sends data, `STTNodeTemplate` handles its `onAudioSent` event with `SendAudio()`, which converts the `List<float>` audio data into `InworldAudio` and passes it to the graph.

```c# STTNodeTemplate.cs theme={"system"}
protected override void OnEnable()
{
    base.OnEnable();
    if (!m_Audio)
        return;
    m_Audio.Event.onStartCalibrating.AddListener(()=>Title("Calibrating"));
    m_Audio.Event.onStopCalibrating.AddListener(Calibrated);
    m_Audio.Event.onPlayerStartSpeaking.AddListener(()=>Title("PlayerSpeaking"));
    m_Audio.Event.onPlayerStopSpeaking.AddListener(()=>
    {
        Title("");
        if (m_STTResult)
            m_STTResult.text = "";
    });
    m_Audio.Event.onAudioSent.AddListener(SendAudio);
}

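void SendAudio(List<float> audioData)
{
    // Drop audio until the STT module has finished initializing.
    if (!m_ModuleInitialized)
        return;
    // Copy the samples into an InworldVector<float> for the graph input.
    InworldVector<float> wave = new InworldVector<float>();
    wave.AddRange(audioData);

    // Fire and forget: the result arrives later through OnGraphResult().
    _ = m_InworldGraphExecutor.ExecuteGraphAsync("STT", new InworldAudio(wave, wave.Size));
}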
```

6. `ExecuteGraphAsync()` runs asynchronously. When a result is ready, `OnGraphResult()` is invoked, and `STTNodeTemplate` overrides it to append the recognized text to the UI.

```c# STTNodeTemplate.cs theme={"system"}
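protected override void OnGraphResult(InworldBaseData obj)
{
    // Interpret the graph output as text; skip results that are not valid text.
    InworldText outputStream = new InworldText(obj);
    // Append to the label so consecutive results accumulate into one transcript.
    if (outputStream.IsValid && m_STTResult)
        m_STTResult.text += outputStream;
}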
```
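
The ordering of the main-thread steps (compile, enable the UI, send audio, receive a result) can be modeled without Unity or the SDK. The sketch below is only a simplified, synchronous model: `FakeGraphExecutor` and `FakeSttTemplate` are hypothetical stand-ins, not SDK types, and the "transcription" is a placeholder string rather than real speech recognition.

```c#
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the graph executor; NOT the SDK's InworldGraphExecutor.
public class FakeGraphExecutor
{
    public event Action GraphCompiled;
    public event Action<string> GraphResult;
    public bool IsCompiled { get; private set; }

    // Steps 3 and 4: compile, then notify subscribers.
    public void Compile()
    {
        IsCompiled = true;
        GraphCompiled?.Invoke();
    }

    // Pretends to transcribe: reports the sample count instead of real text.
    public void Execute(float[] samples)
    {
        if (!IsCompiled)
            throw new InvalidOperationException("Compile the graph before executing it.");
        GraphResult?.Invoke("[" + samples.Length + " samples]");
    }
}

// Hypothetical stand-in for the template script that wires audio and UI to the graph.
public class FakeSttTemplate
{
    readonly FakeGraphExecutor m_Executor;
    public bool UiEnabled { get; private set; }
    public string Transcript { get; private set; } = "";

    public FakeSttTemplate(FakeGraphExecutor executor)
    {
        m_Executor = executor;
        m_Executor.GraphCompiled += () => UiEnabled = true;   // step 4: enable the UI
        m_Executor.GraphResult += text => Transcript += text; // step 6: append the result
    }

    // Step 5: forward audio only once the graph is ready.
    public void SendAudio(List<float> audioData)
    {
        if (!UiEnabled)
            return;
        m_Executor.Execute(audioData.ToArray());
    }
}
```

In this model, audio sent before `Compile()` is silently dropped, which mirrors the early return in `SendAudio()` above; after compilation, each call appends one result to `Transcript`.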
