Native Development Kit
The Native Development Kit (NDK) enables you to integrate Inworld AI characters into any application or game engine that is written in C++ or that supports extensions written in C++.
For additional information on Inworld's Native Development Kit, see: Inworld NDK GitHub.
NDK Integration Examples
The following NDK examples cover several of Inworld’s supported platforms.
Console App
The console app is a simple text-based application used for internal testing. It provides a good minimal example of utilizing the NDK.
NOTE: The Unity and Android App provided in the repo may not be up to date. Only the Console App is utilized in automated builds for testing purposes.
Godot SDK
Demo focuses on creating more realistic and believable worlds with characters powered by artificial intelligence.
The Godot SDK is likely the best reference for how one might quickly integrate the NDK into a game engine. It provides simple wrappers around the NDK, and comes with a 2D sample project for quick testing. The implementation is extremely simple, and does not use more advanced features of Inworld.
Inworld-specific Code: inworld-godot-sdk
Unreal SDK
The Unreal SDK is likely the best reference for how one might integrate the NDK into a game engine in-depth. It provides wrappers around the NDK, and comes with a large amount of opinionated implementation that abstracts much of the underlying API away from the end user. The implementation is a bit more complex, and utilizes the more advanced features of Inworld.
Inworld::SdkInfo
struct INWORLD_EXPORT SdkInfo
{
std::string Type;
std::string Subtype;
std::string Version;
std::string OS;
std::string Description;
};
To help identify client applications to Inworld, create an Inworld::SdkInfo and populate it with information about the SDK used.
- Type: “godot”, “unreal” - engine
- Subtype: “4.3”, “5.5.0” - engine version
- Version: “0.2.0”, “1.8.2” - sdk version
- OS: “win64”, “macOS” - platform
- Description: “Inworld: The AI engine for games and media” - user data
This information is for tracking purposes only to help diagnose issues for end-users.
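As a minimal sketch of populating these fields, the example below uses a stand-in struct that mirrors the definition above (without INWORLD_EXPORT) so that it compiles without the NDK headers. The Sketch namespace and the MakeSdkInfo helper are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

namespace Sketch {

// Stand-in that mirrors the SdkInfo struct shown above.
struct SdkInfo
{
    std::string Type;
    std::string Subtype;
    std::string Version;
    std::string OS;
    std::string Description;
};

// Hypothetical helper: fills every field from the caller's build information.
inline SdkInfo MakeSdkInfo(const std::string& Engine,
                           const std::string& EngineVersion,
                           const std::string& SdkVersion,
                           const std::string& Platform)
{
    assert(!Engine.empty());  // an empty engine name would make the info useless
    SdkInfo Info;
    Info.Type = Engine;             // e.g. "godot" or "unreal"
    Info.Subtype = EngineVersion;   // e.g. "4.3"
    Info.Version = SdkVersion;      // e.g. "0.2.0"
    Info.OS = Platform;             // e.g. "win64"
    Info.Description = "Inworld: The AI engine for games and media";
    return Info;
}

} // namespace Sketch
```

With the real NDK headers, the same values would be assigned to an Inworld::SdkInfo and passed to InitClientAsync.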
Inworld::Client
InitClientAsync
Before establishing a connection to Inworld, initialize the Inworld::Client object by calling InitClientAsync.
void Inworld::Client::InitClientAsync(
const SdkInfo& SdkInfo,
std::function<void(ConnectionState)> ConnectionStateCallback,
std::function<void(std::shared_ptr<Inworld::Packet>)> PacketCallback
)
- The first parameter contains the Inworld::SdkInfo.
- The second parameter contains a callback for the client to respond to changes in connection state.
- The third parameter contains a callback for the client to respond to packets sent by Inworld.
Note: The matching call to destroy the client is DestroyClient.
InitSpeechProcessor
In order to pre-process speech data sent to Inworld from the client, call InitSpeechProcessor.
template<class T, class U>
void Inworld::Client::InitSpeechProcessor(const T& Options)
There are 3 types of SpeechProcessors, and they are specified by the type of options sent.
- Inworld::ClientSpeechOptions_Default: Speech data is not altered, VAD is not run.
- Inworld::ClientSpeechOptions_VAD_DetectOnly: Speech data is not altered, VAD is run.
- Inworld::ClientSpeechOptions_VAD_DetectAndFilterAudio: Speech data is filtered by VAD.
NOTE: VAD is currently experimental. It is advised to simply utilize Inworld::ClientSpeechOptions_Default.
NOTE: The matching call to destroy the speech processor is DestroySpeechProcessor.
Inworld::ClientOptions
In order to configure an Inworld Session, call SetOptions.
void Inworld::Client::SetOptions(const ClientOptions& Options)
Inworld::ClientOptions::Capabilities
struct INWORLD_EXPORT Capabilities
{
bool Animations = false;
bool Audio = false;
bool Emotions = false;
bool Interruptions = false;
bool EmotionStreaming = false;
bool SilenceEvents = false;
bool PhonemeInfo = false;
bool Continuation = true;
bool TurnBasedSTT = true;
bool NarratedActions = true;
bool Relations = true;
bool MultiAgent = true;
bool Audio2Face = false;
bool MultiModalActionPlanning = false;
bool PingPongReport = true;
bool PerceivedLatencyReport = true;
bool Logs = false;
bool LogsWarning = true;
bool LogsInfo = true;
bool LogsDebug = false;
bool LogsInternal = false;
};
The capabilities typically remain unchanged. It is advised to consult with Inworld about modifying the client-facing capabilities.
For brevity, explanation for each capability will not be given, as most are self-explanatory.
Inworld::ClientOptions::UserConfig
struct INWORLD_EXPORT UserConfiguration
{
struct PlayerProfile
{
struct PlayerField
{
std::string Id;
std::string Value;
};
std::vector<PlayerField> Fields;
};
PlayerProfile Profile;
std::string Name;
std::string Id;
};
- PlayerProfile: [“age”: “25”, “gender”: “male”] - PlayerProfile fields for the user.
- Name: “Matthew”, “Bob” - Name for characters to refer to the player by.
- Id: “hjd891hj1-dh3219-d32j09” - Unique Id for identification for tracking purposes.
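The field list above can be filled as in the following sketch. It uses a stand-in copy of the struct (without INWORLD_EXPORT) so it compiles on its own; the Sketch namespace and the MakeUserConfig helper are hypothetical, and the profile values are the sample values from the list.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace Sketch {

// Stand-in that mirrors the UserConfiguration struct shown above.
struct UserConfiguration
{
    struct PlayerProfile
    {
        struct PlayerField
        {
            std::string Id;
            std::string Value;
        };
        std::vector<PlayerField> Fields;
    };
    PlayerProfile Profile;
    std::string Name;
    std::string Id;
};

// Hypothetical helper: builds a configuration with two sample profile fields.
inline UserConfiguration MakeUserConfig(const std::string& Name, const std::string& Id)
{
    assert(!Id.empty());  // the Id is what identifies the user for tracking
    UserConfiguration Config;
    Config.Name = Name;
    Config.Id = Id;
    Config.Profile.Fields.push_back({"age", "25"});
    Config.Profile.Fields.push_back({"gender", "male"});
    return Config;
}

} // namespace Sketch
```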
Inworld::ClientOptions
struct INWORLD_EXPORT ClientOptions
{
Capabilities Capabilities;
UserConfiguration UserConfig;
std::string ServerUrl;
std::string Resource;
std::string ApiKey;
std::string ApiSecret;
std::string Base64;
std::string ProjectName;
std::string GameSessionId;
ClientHeaderData Metadata;
};
- ServerUrl: “api-engine.inworld.ai:443” - used for changing the environment, mostly used internally.
- Resource: “workspaces/your-workspace-id”
- ApiKey: “vpjaW9m4D31UmrqaFJNdE1nTtcOZu26s”
- ApiSecret: “oqIYRSze5F8uzSWxg7ed6mrGTqJRvfPoeazV16pweIoScaUfMp6PnDLXj4HTc8Pd”
- Base64: “dnBqYVc5bTREMzFVbXJxYUZKTmRFMW5UdGNPWnUyNnM6b3FJWVJTemU1Rjh1elNXeGc3ZWQ2bXJHVHFKUnZmUG9lYXpWMTZwd2VJb1NjYVVmTXA2UG5ETFhqNEhUYzhQZA==”
Note: It is only required to enter [ApiKey + ApiSecret] or [Base64].
- ProjectName: “MyGameName” - used to identify the project for tracking purposes.
- GameSessionId: “f2332f-f4hn289-h65464” - used to identify a game session for tracking purposes.
- ClientHeaderData: internal use only.
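The sample Base64 value above appears to be the standard Base64 encoding of the sample ApiKey, a colon, and the sample ApiSecret. Assuming that relationship holds in general, the sketch below derives one value from the other two. Base64Encode and MakeBase64Credential are hypothetical helpers, not part of the NDK, and the Base64 value provided with your own Inworld credentials should be preferred when it is available.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace Sketch {

// Standard Base64 (RFC 4648 alphabet, '=' padding).
inline std::string Base64Encode(const std::string& Input)
{
    static const char Alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    // Reads one input byte as an unsigned value, or 0 past the end.
    auto Byte = [&Input](std::size_t Index) -> unsigned {
        return Index < Input.size() ? static_cast<unsigned char>(Input[Index]) : 0u;
    };
    std::string Output;
    for (std::size_t i = 0; i < Input.size(); i += 3)
    {
        // Pack up to three bytes into 24 bits, then emit four 6-bit symbols.
        const unsigned Group = (Byte(i) << 16) | (Byte(i + 1) << 8) | Byte(i + 2);
        const std::size_t Remaining = Input.size() - i;
        Output += Alphabet[(Group >> 18) & 63];
        Output += Alphabet[(Group >> 12) & 63];
        Output += Remaining > 1 ? Alphabet[(Group >> 6) & 63] : '=';
        Output += Remaining > 2 ? Alphabet[Group & 63] : '=';
    }
    return Output;
}

// Assumption: the Base64 field is Base64("ApiKey:ApiSecret").
inline std::string MakeBase64Credential(const std::string& ApiKey, const std::string& ApiSecret)
{
    assert(!ApiKey.empty() && !ApiSecret.empty());
    return Base64Encode(ApiKey + ":" + ApiSecret);
}

} // namespace Sketch
```

Only one of the two forms needs to be set on ClientOptions, so a helper like this is useful mainly when an integration stores the key and secret separately.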
StartClientFrom[SceneId/Save/Token]
Once a client has been configured, it may be started by calling StartClientFromSceneId, StartClientFromSave, or StartClientFromToken.
void Inworld::Client::StartClientFromSceneId(const std::string& SceneId)
void Inworld::Client::StartClientFromSave(const SessionSave& Save)
void Inworld::Client::StartClientFromToken(const SessionToken& Token)
StartClientFromSceneId
Provide the scene ID in the format “workspaces/my-workspace-name/scenes/my-scene-name”.
StartClientFromSave
To save a session, cache the Inworld::SessionSave object from the callback provided to Inworld::Client::SaveSessionStateAsync.
void Inworld::Client::SaveSessionStateAsync(
std::function<void(const SessionSave&, bool)> Callback
)
struct INWORLD_EXPORT SessionSave
{
std::string SceneId;
std::string State;
};
StartClientFromToken
To get a session token, cache the Inworld::SessionToken object provided by Inworld::Client::GetSessionToken.
const SessionToken& Inworld::Client::GetSessionToken() const
struct INWORLD_EXPORT SessionToken
{
std::string SessionId;
std::string Token;
int64_t ExpirationTime;
};
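Before reusing a cached token with StartClientFromToken, an integration may want to check that it has not expired. The sketch below shows one way to do that. It assumes that ExpirationTime is expressed in seconds since the Unix epoch, which this page does not state, so verify the unit against the NDK headers. The struct is a stand-in copy, and IsTokenUsable is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

namespace Sketch {

// Stand-in that mirrors the SessionToken struct shown above.
struct SessionToken
{
    std::string SessionId;
    std::string Token;
    std::int64_t ExpirationTime;
};

// Hypothetical helper. Assumption: ExpirationTime and NowSeconds are both
// seconds since the Unix epoch. The margin avoids starting a session with a
// token that is about to expire.
inline bool IsTokenUsable(const SessionToken& Token,
                          std::int64_t NowSeconds,
                          std::int64_t SafetyMarginSeconds = 30)
{
    assert(SafetyMarginSeconds >= 0);
    if (Token.SessionId.empty() || Token.Token.empty())
    {
        return false;  // nothing was cached yet
    }
    return NowSeconds + SafetyMarginSeconds < Token.ExpirationTime;
}

} // namespace Sketch
```

When the check fails, starting from a scene ID or a save is the fallback.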
NOTE: The matching call to stop the session is StopClient.
Inworld::Packet
Inworld::Packet is the base class for all Inworld events.
Inworld::PacketId contains information about the packet used to associate packets with a given interaction or utterance.
Inworld::Routing contains information about the Source Inworld::Actor and Target Inworld::Actor.
Inworld::Actor contains information about the actor (PLAYER, AGENT, or WORLD), and the Name (agent id if the Actor is an AGENT).
Inworld::PacketVisitor
To process the events that Inworld sends to the callback provided to Inworld::Client::InitClientAsync, create a class that inherits from Inworld::PacketVisitor.
Call Packet->Accept(MyPacketVisitor); for the Inworld::PacketVisitor to process the packet by type.
class INWORLD_EXPORT PacketVisitor
{
public:
virtual void Visit(const TextEvent& Event) { }
virtual void Visit(const VADEvent& Event) { }
virtual void Visit(const DataEvent& Event) { }
virtual void Visit(const AudioDataEvent& Event) { }
virtual void Visit(const A2FHeaderEvent& Event) { }
virtual void Visit(const A2FContentEvent& Event) { }
virtual void Visit(const SilenceEvent& Event) { }
virtual void Visit(const ControlEvent& Event) { }
virtual void Visit(const ControlEventConversationUpdate& Event) { }
virtual void Visit(const ControlEventCurrentSceneStatus& Event) { }
virtual void Visit(const EmotionEvent& Event) { }
virtual void Visit(const CancelResponseEvent& Event) { }
virtual void Visit(const CustomGestureEvent& Event) { }
virtual void Visit(const CustomEvent& Event) { }
virtual void Visit(const ActionEvent& Event) { }
virtual void Visit(const RelationEvent& Event) { }
virtual void Visit(const PingEvent& Event) { }
virtual void Visit(const LogEvent& Event) { }
};
An implementation may utilize the Inworld::Packet Source and Target actor ids to route the event to the desired object that represents the actor.
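The visitor dispatch described above can be sketched with stand-in types. The example models only two of the event types, and every class in the Sketch namespace is a simplified stand-in written for illustration: the GetText accessor, the TranscriptVisitor class, the OnPacket function, and the exact Accept signature are assumptions, so consult the NDK headers for the real declarations.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

namespace Sketch {

class TextEvent;
class ControlEvent;

// Stand-in visitor base: default handlers do nothing, as in the class above.
class PacketVisitor
{
public:
    virtual ~PacketVisitor() = default;
    virtual void Visit(const TextEvent&) { }
    virtual void Visit(const ControlEvent&) { }
};

// Stand-in packet base: each concrete packet calls the matching Visit overload.
class Packet
{
public:
    virtual ~Packet() = default;
    virtual void Accept(PacketVisitor& Visitor) const = 0;
};

class TextEvent : public Packet
{
public:
    explicit TextEvent(std::string Text) : Text_(std::move(Text)) { }
    const std::string& GetText() const { return Text_; }  // hypothetical accessor
    void Accept(PacketVisitor& Visitor) const override { Visitor.Visit(*this); }
private:
    std::string Text_;
};

class ControlEvent : public Packet
{
public:
    void Accept(PacketVisitor& Visitor) const override { Visitor.Visit(*this); }
};

// Application-side visitor: overrides only the events it cares about.
class TranscriptVisitor : public PacketVisitor
{
public:
    void Visit(const TextEvent& Event) override { Lines.push_back(Event.GetText()); }
    void Visit(const ControlEvent&) override { ++ControlEventsSeen; }

    std::vector<std::string> Lines;
    int ControlEventsSeen = 0;
};

// Shape of a packet callback: forward every packet to the visitor.
inline void OnPacket(const std::shared_ptr<Packet>& InPacket, PacketVisitor& Visitor)
{
    assert(InPacket);
    InPacket->Accept(Visitor);
}

} // namespace Sketch
```

Because the base class supplies empty handlers, a visitor only needs to override the event types that the application actually consumes.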
Inworld::ControlEventCurrentSceneStatus
The Inworld::ControlEventCurrentSceneStatus is sent once a session has been established, or when the scene or characters within a scene change.
It is recommended to cache any Inworld::AgentInfo resulting from the Inworld::ControlEventCurrentSceneStatus in order to communicate with Inworld properly.
const std::vector<AgentInfo>& Inworld::ControlEventCurrentSceneStatus::GetAgentInfos() const
Inworld::AgentInfo
struct INWORLD_EXPORT AgentInfo
{
std::string BrainName;
std::string AgentId;
std::string GivenName;
};
- BrainName: “workspaces/my-workspace-name/characters/bob” - resource id of the character.
- AgentId: “gbh73892gbhf-fhbw9efhb-f2h89fjn-fh289n9” - unique id used to specify agent to Inworld.
- GivenName: “Bob” - friendly name that can be used for end-user display.
An implementation may use this event to simply pair objects that represent the actors with their agent ID.
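One possible way to cache the agent infos is a small registry keyed by BrainName, so that game objects configured with a character resource id can look up the AgentId needed for later calls. The sketch below uses a stand-in AgentInfo struct; AgentRegistry and its methods are hypothetical names, not part of the NDK.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

namespace Sketch {

// Stand-in that mirrors the AgentInfo struct shown above.
struct AgentInfo
{
    std::string BrainName;
    std::string AgentId;
    std::string GivenName;
};

// Hypothetical cache, refreshed each time a scene status event arrives.
class AgentRegistry
{
public:
    // Replaces the cached agents with the list from the latest scene status.
    void Update(const std::vector<AgentInfo>& Infos)
    {
        ByBrainName_.clear();
        for (const AgentInfo& Info : Infos)
        {
            ByBrainName_[Info.BrainName] = Info;
        }
    }

    // Returns the agent id for a character, or an empty string if unknown.
    std::string FindAgentId(const std::string& BrainName) const
    {
        assert(!BrainName.empty());
        const auto It = ByBrainName_.find(BrainName);
        return It == ByBrainName_.end() ? std::string() : It->second.AgentId;
    }

    std::size_t Size() const { return ByBrainName_.size(); }

private:
    std::unordered_map<std::string, AgentInfo> ByBrainName_;
};

} // namespace Sketch
```

Clearing the map on every update reflects that the event is sent again when the scene or its characters change, so stale agent ids are not kept.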
Inworld::TextEvent
This event is sent for two reasons:
- The player is talking, and we receive PLAYER > AGENT STT information. In this case, the event will only be considered final once the player stops talking.
- The character is talking, and we receive AGENT > PLAYER information. In this case, the event will always be considered final, as the client will not receive partial utterance results.
Inworld::AudioDataEvent
This event is always paired with AGENT > PLAYER Inworld::TextEvents.
The DataChunk provided by the event represents character speech as 1-channel (mono) WAV audio data at a 24000 Hz sample rate.
The PhonemeInfo provided will contain timestamp-to-phoneme data that can be utilized for lipsync.
Inworld::ControlEvent
This event typically represents INTERACTION_END, and can be used to know when the client should no longer expect additional information about the interaction it represents.
Inworld::CustomEvent
This event represents what we now call “Triggers”. Each trigger contains a Name and a map of Params.
SendText
To send text messages to an agent, call Inworld::Client::SendTextMessage.
Inworld::Client::SendTextMessage(
const std::string& AgentId,
const std::string& Text
)
Audio Sessions
To send audio data to an agent, call Inworld::Client::StartAudioSession, Inworld::Client::SendSoundMessage, and Inworld::Client::StopAudioSession.
StartAudioSession
To start an audio session with an agent, call Inworld::Client::StartAudioSession.
AudioSessionStartPayload
Inworld::AudioSessionStartPayload is used to configure the AudioSession.
struct INWORLD_EXPORT AudioSessionStartPayload
{
enum class MicrophoneMode : uint8_t
{
Unspecified = 0,
OpenMic = 1,
ExpectAudioEnd = 2,
};
enum class UnderstandingMode : uint8_t
{
Unspecified = 0,
Full = 1,
SpeechRecognitionOnly = 2,
};
MicrophoneMode MicMode = MicrophoneMode::Unspecified;
UnderstandingMode UndMode = UnderstandingMode::Unspecified;
};
- MicrophoneMode::OpenMic: Acts as a normal open-ended mic, where Inworld detects end of player speech and will automatically generate a response.
- MicrophoneMode::ExpectAudioEnd: Acts as a ‘push-to-talk’, where Inworld will wait for a call to Inworld::Client::StopAudioSession before generating a response.
- UnderstandingMode::Full: Will generate both PLAYER > AGENT STT and AGENT > PLAYER data.
- UnderstandingMode::SpeechRecognitionOnly: Will only generate PLAYER > AGENT STT data.
void StartAudioSession(
const std::string& AgentId,
const AudioSessionStartPayload& Payload
);
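As a sketch of choosing the two modes described above, the helper below maps two application-level settings to a payload. The struct is a stand-in copy of AudioSessionStartPayload (without INWORLD_EXPORT), and MakeAudioPayload with its two boolean parameters is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

namespace Sketch {

// Stand-in that mirrors the AudioSessionStartPayload struct shown above.
struct AudioSessionStartPayload
{
    enum class MicrophoneMode : std::uint8_t
    {
        Unspecified = 0,
        OpenMic = 1,
        ExpectAudioEnd = 2,
    };
    enum class UnderstandingMode : std::uint8_t
    {
        Unspecified = 0,
        Full = 1,
        SpeechRecognitionOnly = 2,
    };
    MicrophoneMode MicMode = MicrophoneMode::Unspecified;
    UnderstandingMode UndMode = UnderstandingMode::Unspecified;
};

// Hypothetical helper: push-to-talk maps to ExpectAudioEnd, and a
// transcription-only session maps to SpeechRecognitionOnly.
inline AudioSessionStartPayload MakeAudioPayload(bool bPushToTalk, bool bTranscribeOnly)
{
    using Payload = AudioSessionStartPayload;
    Payload Result;
    Result.MicMode = bPushToTalk ? Payload::MicrophoneMode::ExpectAudioEnd
                                 : Payload::MicrophoneMode::OpenMic;
    Result.UndMode = bTranscribeOnly ? Payload::UnderstandingMode::SpeechRecognitionOnly
                                     : Payload::UnderstandingMode::Full;
    // Both modes are always set explicitly by this helper.
    assert(Result.MicMode != Payload::MicrophoneMode::Unspecified);
    return Result;
}

} // namespace Sketch
```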
SendSoundMessage
void SendSoundMessage(
const std::string& AgentId,
const std::string& Data
);
NOTE: This must only be called after a matching Inworld::Client::StartAudioSession.
The Data sent is assumed to be 1-channel (mono) WAV audio data at a 16000 Hz sample rate.
It is recommended to call this function with 0.1s of audio data every 0.1s. This requirement is not strictly enforced, but there are data packet limits to be aware of. Keeping this send rate will ensure this cap is not exceeded.
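To illustrate the recommended pacing, the sketch below splits a captured buffer into 0.1 s pieces that could each be passed to SendSoundMessage. It assumes raw 16-bit PCM samples (2 bytes per sample), which this page does not state, so a 0.1 s chunk at 16000 Hz is 3200 bytes under that assumption. BytesPerChunk and SplitIntoChunks are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace Sketch {

// Size in bytes of one chunk of mono audio.
// Assumption: 16-bit samples, so BytesPerSample defaults to 2.
inline std::size_t BytesPerChunk(std::size_t SampleRate = 16000,
                                 std::size_t BytesPerSample = 2,
                                 std::size_t ChunkMilliseconds = 100)
{
    return SampleRate * BytesPerSample * ChunkMilliseconds / 1000;
}

// Splits a byte buffer into consecutive chunks; the last one may be shorter.
inline std::vector<std::string> SplitIntoChunks(const std::string& Audio,
                                                std::size_t ChunkBytes)
{
    assert(ChunkBytes > 0);  // a zero chunk size would never advance
    std::vector<std::string> Chunks;
    for (std::size_t Offset = 0; Offset < Audio.size(); Offset += ChunkBytes)
    {
        Chunks.push_back(Audio.substr(Offset, ChunkBytes));
    }
    return Chunks;
}

} // namespace Sketch
```

In a live microphone path the capture callback usually delivers audio at its own cadence, so an implementation would more likely accumulate samples and send one chunk each time 0.1 s of data is available.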
StopAudioSession
void StopAudioSession(
const std::string& AgentId
);
SendCustomEvent
To send triggers to an agent, call Inworld::Client::SendCustomEvent.
Inworld::Client::SendCustomEvent(
const std::string& AgentId,
const std::string& Name,
const std::unordered_map<std::string, std::string>& Params
);