Voice Activity Detection
Player voice activity detection (VAD) is a feature of Inworld's Unreal Engine SDK that lets a game detect when a player begins speaking. That detection can then be used to trigger in-game events.
VAD Options
To enable VAD, select a speech mode under Speech Options in the Inworld Session Component.
Speech Modes
There are three Speech Modes available:
- Default: VAD is disabled. All captured audio data is sent during audio sessions, and VAD blueprint events do not trigger.
- VAD Continuous Audio Send: VAD is enabled. All captured audio data is sent during audio sessions, and VAD blueprint events trigger.
- VAD Conditional Audio Send: VAD is enabled. Only audio data in which player voice is detected is sent during audio sessions, and VAD blueprint events trigger.
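The difference between the three modes can be summarized in a few lines of plain C++. This is only an illustrative sketch: the enum and function names below are hypothetical, are not the SDK's identifiers, and the sketch ignores the pre-detection buffering controlled by the Speech Options.

```cpp
// Hypothetical enum mirroring the three Speech Modes described above.
// These names are illustrative only and are NOT taken from the SDK.
enum class SpeechMode {
    Default,
    VadContinuousAudioSend,
    VadConditionalAudioSend
};

// Returns true if VAD blueprint events would trigger in the given mode.
inline bool VadEventsTrigger(SpeechMode mode) {
    return mode != SpeechMode::Default;
}

// Returns true if a captured audio chunk would be sent during an audio
// session, given whether player voice was detected in that chunk.
inline bool ShouldSendChunk(SpeechMode mode, bool voiceDetected) {
    switch (mode) {
        case SpeechMode::Default:
        case SpeechMode::VadContinuousAudioSend:
            return true;            // all captured audio is sent
        case SpeechMode::VadConditionalAudioSend:
            return voiceDetected;   // only audio with detected voice is sent
    }
    return true; // unreachable for valid enum values
}
```

In other words, the two VAD modes differ only in what is sent; both raise VAD events, while Default raises none.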
Speech Options
In addition, three further Speech Options are available:
- VAD Prob Threshold: The probability threshold the language processing model uses to determine whether VAD should be triggered.
- VAD Buffer Chunks Num: The number of 100 ms audio chunks that are buffered and sent to the server when voice is detected.
- VAD Silence Chunks Num: The number of 100 ms audio chunks of silence required before VAD is considered to have detected silence.
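To make the interaction of the three options concrete, here is a minimal, self-contained C++ sketch of a conditional-send gate. It is not the SDK's implementation. The `VadSettings` and `VadGate` names are hypothetical, and the sketch assumes that a chunk counts as voiced when its probability is at or above the threshold, that silence chunks must be consecutive, and that chunks continue to be sent until the silence count is reached.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical settings struct; fields echo the Speech Options above
// but are NOT the SDK's identifiers.
struct VadSettings {
    float probThreshold;       // VAD Prob Threshold
    std::size_t bufferChunks;  // VAD Buffer Chunks Num
    std::size_t silenceChunks; // VAD Silence Chunks Num
};

// Illustrative gate: decides which 100 ms chunks would be sent.
class VadGate {
public:
    explicit VadGate(VadSettings s) : settings(s) {}

    // Feed one chunk (identified by chunkId) with its voice probability.
    // Returns the ids of the chunks that would be sent as a result.
    std::vector<int> Push(int chunkId, float voiceProb) {
        std::vector<int> toSend;
        // Assumption: "at or above the threshold" counts as voice.
        const bool voiced = voiceProb >= settings.probThreshold;

        if (!speaking) {
            if (voiced) {
                // Voice detected: flush the buffered chunks, then this one.
                speaking = true;
                silentRun = 0;
                toSend.assign(preRoll.begin(), preRoll.end());
                preRoll.clear();
                toSend.push_back(chunkId);
            } else {
                // Keep only the most recent bufferChunks chunks.
                preRoll.push_back(chunkId);
                if (preRoll.size() > settings.bufferChunks) {
                    preRoll.pop_front();
                }
            }
            return toSend;
        }

        // While speaking, every chunk is sent (assumption: including
        // trailing silent chunks until the silence count is reached).
        toSend.push_back(chunkId);
        silentRun = voiced ? 0 : silentRun + 1;
        if (!voiced && silentRun >= settings.silenceChunks) {
            speaking = false; // silence detected
            silentRun = 0;
        }
        return toSend;
    }

    bool IsSpeaking() const { return speaking; }

private:
    VadSettings settings;
    std::deque<int> preRoll;   // chunks buffered before voice is detected
    std::size_t silentRun = 0; // consecutive silent chunks while speaking
    bool speaking = false;
};
```

With a threshold of 0.5, two buffer chunks, and two silence chunks, feeding three quiet chunks followed by a voiced one sends the last two quiet chunks plus the voiced chunk, and two further quiet chunks end the speaking state. A larger buffer value preserves more of the audio just before detection, while a larger silence value makes the gate more tolerant of short pauses.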
Triggering VAD Events
VAD blueprint events can be triggered from either the Inworld Player Component or the Inworld Character Component.