VoiceActivityDetector | Inherits from: PlayerVoiceDetector | Implements: ICalibrateAudioHandler
Advanced voice activity detection module that uses the Inworld VAD system for speech detection. Extends PlayerVoiceDetector to provide more sophisticated voice activity analysis using native DLL functions. This detector provides more accurate results than simple volume-based detection.
Methods
Reference
DetectPlayerSpeaking
Detects whether the player is currently speaking using the Inworld VAD (Voice Activity Detection) system. Uses native DLL functions to analyze audio characteristics beyond simple volume thresholds.
Returns
Type: bool
Description: True if voice activity is detected in the current audio data; otherwise, false.
GetAudioChunk
Converts a list of float audio samples into an InworldAudioChunk for processing by the VAD system. Creates the appropriate data structure required by the native DLL voice activity detection functions.
Parameters
| Parameter | Type | Description |
|---|---|---|
| audioData | List<float> | The audio sample data as normalized float values. |
Returns
Type: AudioChunk
Description: An InworldAudioChunk object containing the audio data formatted for VAD processing.
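The conversion described above can be sketched as follows. This is a minimal illustration in Python, not the SDK's own code: `AudioChunkSketch`, its field names, and `get_audio_chunk` are hypothetical stand-ins for InworldAudioChunk and GetAudioChunk, whose real layout is not shown in this reference.

```python
from dataclasses import dataclass, field
from typing import List

# Sample rate taken from the "Audio Format Requirements" section of this page.
SAMPLE_RATE_HZ = 16000


@dataclass
class AudioChunkSketch:
    """Hypothetical stand-in for InworldAudioChunk (field names are assumptions)."""
    samples: List[float] = field(default_factory=list)
    sample_rate: int = SAMPLE_RATE_HZ


def get_audio_chunk(audio_data: List[float]) -> AudioChunkSketch:
    """Copy normalized float samples into a chunk structure for the VAD."""
    # Copy the list so later changes to the caller's buffer do not alter the chunk.
    return AudioChunkSketch(samples=list(audio_data))
```

Copying the samples is a design choice of this sketch: it keeps the chunk stable if the capture buffer is reused while the VAD is still reading it.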
Voice Activity Detection Process
The VoiceActivityDetector performs the following steps for voice detection:
Prerequisites Check
- Controller Validation: Verifies InworldController instance exists
- VAD Module Check: Ensures VAD module is available
- Audio Data Validation: Confirms audio data is present
Detection Process
- Audio Chunk Creation: Converts float samples to AudioChunk format
- VAD Analysis: Calls the native DLL function DetectVoiceActivity()
- Result Interpretation: Returns true if the VAD result is >= 0 (positive detection)
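The prerequisite checks and detection steps above can be sketched as one function. This is an illustrative Python sketch under stated assumptions, not the SDK implementation: the `vad` callable stands in for the native DetectVoiceActivity() call, passing `None` models a missing controller or VAD module, and `fake_vad` is a placeholder that exists only to exercise the control flow.

```python
from typing import Callable, List, Optional

# Hypothetical type for the native call: takes samples, returns an int result.
VadFunction = Callable[[List[float]], int]


def detect_player_speaking(vad: Optional[VadFunction],
                           audio_data: List[float]) -> bool:
    """Return True only when all prerequisites pass and the VAD result is >= 0."""
    # Prerequisites: controller and VAD module must be available (modeled as None).
    if vad is None:
        return False
    # Prerequisites: audio data must be present.
    if not audio_data:
        return False
    # Detection: build the chunk (here a plain copy) and call the VAD.
    result = vad(list(audio_data))
    # Interpretation: -1 means no voice, 0 and above means voice detected.
    return result >= 0


def fake_vad(samples: List[float]) -> int:
    """Placeholder only; the real VAD analyzes more than amplitude."""
    return 0 if any(abs(s) > 0.1 for s in samples) else -1
```

Failing a prerequisite is treated as "not speaking" in this sketch, which is an assumption; the source does not say how the real detector reports a missing controller or module.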
Audio Format Requirements
- Sample Rate: 16kHz (16000 Hz)
- Data Format: Normalized float values (-1.0 to 1.0)
- Data Structure: InworldVector<float> containing audio samples
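As a rough illustration of these requirements, the Python sketch below converts 16-bit PCM integers to normalized floats and checks a buffer against the stated sample rate and value range. The helper names are hypothetical, and the 16-bit PCM input is an assumption about the capture source, not something this reference specifies.

```python
from typing import List, Sequence

REQUIRED_SAMPLE_RATE_HZ = 16000  # 16 kHz, as listed above


def pcm16_to_normalized(pcm: Sequence[int]) -> List[float]:
    """Scale signed 16-bit PCM values into the -1.0 to 1.0 range."""
    # Dividing by 32768 maps -32768 to -1.0; the clamp guards out-of-range input.
    return [max(-1.0, min(1.0, s / 32768.0)) for s in pcm]


def matches_vad_format(samples: Sequence[float], sample_rate: int) -> bool:
    """Check the sample rate and that every sample is a normalized float."""
    return (sample_rate == REQUIRED_SAMPLE_RATE_HZ
            and all(-1.0 <= s <= 1.0 for s in samples))
```

Audio captured at another rate would need resampling to 16 kHz before it passes this check; resampling is outside the scope of this sketch.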
Inheritance and Interface Implementation
PlayerVoiceDetector Base Class
This class extends PlayerVoiceDetector, which provides:
- Basic voice detection infrastructure
- Audio data collection and management
- Integration with the audio processing pipeline
ICalibrateAudioHandler Interface
The base class implements ICalibrateAudioHandler, providing:
- OnStartCalibration(): Called by InworldAudioManager.StartCalibrate()
- OnStopCalibration(): Called by InworldAudioManager.StopCalibrate()
- OnCalibrate(): Performs calibration-specific operations
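The callback relationship can be sketched as below. This is a Python illustration with hypothetical stand-ins: `CalibrateAudioHandler` mirrors the three documented callbacks, `AudioManagerSketch` models only the two documented InworldAudioManager calls, and `RecordingHandler` just logs the order of invocations. What triggers OnCalibrate() is not stated in this reference, so the sketch leaves that call to the caller.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class CalibrateAudioHandler(ABC):
    """Stand-in for ICalibrateAudioHandler with the three documented callbacks."""

    @abstractmethod
    def on_start_calibration(self) -> None: ...

    @abstractmethod
    def on_stop_calibration(self) -> None: ...

    @abstractmethod
    def on_calibrate(self) -> None: ...


class RecordingHandler(CalibrateAudioHandler):
    """Hypothetical handler that records which callbacks were invoked."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def on_start_calibration(self) -> None:
        self.calls.append("start")

    def on_stop_calibration(self) -> None:
        self.calls.append("stop")

    def on_calibrate(self) -> None:
        self.calls.append("calibrate")


class AudioManagerSketch:
    """Models only StartCalibrate() and StopCalibrate() notifying handlers."""

    def __init__(self, handlers: Iterable[CalibrateAudioHandler]) -> None:
        self.handlers = list(handlers)

    def start_calibrate(self) -> None:
        for handler in self.handlers:
            handler.on_start_calibration()

    def stop_calibrate(self) -> None:
        for handler in self.handlers:
            handler.on_stop_calibration()
```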
Important Notes
Native DLL Integration
This detector uses native DLL functions for voice activity detection:
- Function: InworldController.VAD.DetectVoiceActivity(audioChunk)
- Return Values:
  - -1: Negative detection (no voice activity)
  - 0 and above: Positive detection (voice activity present)
Advanced Detection Capabilities
Unlike simple volume-based detection, this module:
- Analyzes audio characteristics beyond amplitude
- Uses sophisticated algorithms for speech pattern recognition
- Provides more accurate voice activity detection
- Reduces false positives from background noise
Integration with Audio Pipeline
The VoiceActivityDetector integrates with the audio processing pipeline:
- Receives audio data from the collection modules
- Processes audio through the VAD system
- Updates the IsPlayerSpeaking state in InworldAudioManager
- Triggers appropriate events for voice start/stop detection
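One way to picture the state update and start/stop events is simple edge detection on the per-chunk result. The Python sketch below is an assumption about the general pattern, not the SDK's code: `SpeakingStateSketch` and its callbacks are hypothetical, and only the `is_player_speaking` flag corresponds to a name used on this page.

```python
from typing import Callable


class SpeakingStateSketch:
    """Tracks a speaking flag and fires callbacks only when it changes."""

    def __init__(self,
                 on_voice_start: Callable[[], None],
                 on_voice_stop: Callable[[], None]) -> None:
        self.is_player_speaking = False
        self._on_voice_start = on_voice_start
        self._on_voice_stop = on_voice_stop

    def update(self, detected: bool) -> None:
        """Apply one detection result; notify listeners on a transition."""
        if detected == self.is_player_speaking:
            return  # no change, so no event
        self.is_player_speaking = detected
        if detected:
            self._on_voice_start()
        else:
            self._on_voice_stop()
```

Firing events only on transitions means that a long utterance spanning many audio chunks produces one start and one stop notification rather than one per chunk.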
Performance Considerations
- Uses native DLL functions for efficient processing
- Requires proper audio data format for accurate detection
- Depends on InworldController and VAD module availability
- Processes audio in real-time during the audio coroutine