
Native Development Kit

The Native Development Kit (NDK) enables you to integrate Inworld AI characters into any application or game engine written in C++ or that supports extensions via C++ code.

For additional information on Inworld's Native Development Kit, see the Inworld NDK GitHub repository.

NDK Integration Examples

The following NDK examples cover several of Inworld’s supported platforms.

Console App

Console App Example

The console app is a simple text-based application used for internal testing. It provides a good minimal example of utilizing the NDK.

NOTE: The Unity and Android apps provided in the repo may not be up to date. Only the Console App is used in automated builds for testing purposes.

Godot SDK

Godot SDK Example

Demo focuses on creating more realistic and believable worlds with characters powered by artificial intelligence.

The Godot SDK is likely the best reference for how one might quickly integrate the NDK into a game engine. It provides simple wrappers around the NDK, and comes with a 2D sample project for quick testing. The implementation is extremely simple, and does not use more advanced features of Inworld.

Inworld-specific Code: inworld-godot-sdk

Unreal SDK

Unreal SDK Example

Demo focuses on creating more realistic and believable worlds with characters powered by artificial intelligence.

The Unreal SDK is likely the best reference for how one might integrate the NDK into a game engine in-depth. It provides wrappers around the NDK, and comes with a large amount of opinionated implementation that abstracts much of the underlying API away from the end user. The implementation is a bit more complex, and utilizes the more advanced features of Inworld.

Inworld::SdkInfo

struct INWORLD_EXPORT SdkInfo
{
std::string Type;
std::string Subtype;
std::string Version;
std::string OS;
std::string Description;
};

To help identify client applications to Inworld, create an Inworld::SdkInfo and populate it with information about the SDK used.

  • Type: “godot”, “unreal” - engine
  • Subtype: “4.3”, “5.5.0” - engine version
  • Version: “0.2.0”, “1.8.2” - SDK version
  • OS: “win64”, “macOS” - platform
  • Description: Inworld: The AI engine for games and media - user data

This information is for tracking purposes only to help diagnose issues for end-users.
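
As a minimal sketch of the fields described above, the following populates an SdkInfo for a Godot integration. The struct here is a local stand-in that mirrors the documented fields so the snippet compiles without the NDK, and MakeGodotSdkInfo is a hypothetical helper name, not part of the NDK.

```cpp
#include <string>

// Local stand-in mirroring the documented SdkInfo fields, so this sketch
// compiles without the NDK. Real code would use Inworld::SdkInfo instead.
struct SdkInfo
{
    std::string Type;
    std::string Subtype;
    std::string Version;
    std::string OS;
    std::string Description;
};

// Hypothetical helper: describes a Godot 4.3 integration on 64-bit Windows,
// using the sample values from the field list.
SdkInfo MakeGodotSdkInfo()
{
    SdkInfo Info;
    Info.Type = "godot";     // engine
    Info.Subtype = "4.3";    // engine version
    Info.Version = "0.2.0";  // SDK version
    Info.OS = "win64";       // platform
    Info.Description = "Inworld: The AI engine for games and media"; // user data
    return Info;
}
```

The resulting object would then be passed as the first argument to InitClientAsync.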

Inworld::Client

InitClientAsync

Before establishing a connection to Inworld, initialize the Inworld::Client object by calling InitClientAsync.

void Inworld::Client::InitClientAsync(
const SdkInfo& SdkInfo,
std::function<void(ConnectionState)> ConnectionStateCallback,
std::function<void(std::shared_ptr<Inworld::Packet>)> PacketCallback
)
  • The first parameter contains the Inworld::SdkInfo.
  • The second parameter contains a callback for the client to respond to changes in connection state.
  • The third parameter contains a callback for the client to respond to packets sent by Inworld.

Note: The matching call to destroy the client is DestroyClient.

InitSpeechProcessor

In order to pre-process speech data sent to Inworld from the client, call InitSpeechProcessor.

template<class T, class U>
void Inworld::Client::InitSpeechProcessor(const T& Options)

There are three types of SpeechProcessor, selected by the type of the options passed.

  • Inworld::ClientSpeechOptions_Default: Speech data is not altered, VAD is not run.
  • Inworld::ClientSpeechOptions_VAD_DetectOnly: Speech data is not altered, VAD is run.
  • Inworld::ClientSpeechOptions_VAD_DetectAndFilterAudio: Speech data is filtered by VAD.

NOTE: VAD is currently experimental. It is advised to simply utilize Inworld::ClientSpeechOptions_Default.

NOTE: The matching call to destroy the speech processor is DestroySpeechProcessor.

Inworld::ClientOptions

In order to configure an Inworld Session, call SetOptions.

void Inworld::Client::SetOptions(const ClientOptions& Options)

Inworld::ClientOptions::Capabilities

 struct INWORLD_EXPORT Capabilities
{
bool Animations = false;
bool Audio = false;
bool Emotions = false;
bool Interruptions = false;
bool EmotionStreaming = false;
bool SilenceEvents = false;
bool PhonemeInfo = false;
bool Continuation = true;
bool TurnBasedSTT = true;
bool NarratedActions = true;
bool Relations = true;
bool MultiAgent = true;
bool Audio2Face = false;
bool MultiModalActionPlanning = false;
bool PingPongReport = true;
bool PerceivedLatencyReport = true;
bool Logs = false;
bool LogsWarning = true;
bool LogsInfo = true;
bool LogsDebug = false;
bool LogsInternal = false;
};

The capabilities typically remain unchanged. It is advised to consult with Inworld about modifying the client-facing capabilities.

For brevity, explanation for each capability will not be given, as most are self-explanatory.

Inworld::ClientOptions::UserConfig

 struct INWORLD_EXPORT UserConfiguration
{
struct PlayerProfile
{
struct PlayerField
{
std::string Id;
std::string Value;
};
std::vector<PlayerField> Fields;
};
PlayerProfile Profile;
std::string Name;
std::string Id;
};
  • PlayerProfile: [“age”: “25”, “gender”: “male”] - PlayerProfile fields for the user.
  • Name: “Matthew”, “Bob” - Name for characters to refer to the player by.
  • Id: “hjd891hj1-dh3219-d32j09” - Unique Id for identification for tracking purposes.
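
The nested profile structure can be filled from a list of id/value pairs. This is a sketch only: the struct is a local stand-in copied from the listing above, and MakeUserConfig is a hypothetical helper.

```cpp
#include <string>
#include <utility>
#include <vector>

// Local stand-in mirroring the documented UserConfiguration layout.
struct UserConfiguration
{
    struct PlayerProfile
    {
        struct PlayerField
        {
            std::string Id;
            std::string Value;
        };
        std::vector<PlayerField> Fields;
    };
    PlayerProfile Profile;
    std::string Name;
    std::string Id;
};

// Hypothetical helper: builds a configuration from a display name, a unique
// user id, and a list of (field id, value) profile entries.
UserConfiguration MakeUserConfig(
    const std::string& Name,
    const std::string& Id,
    const std::vector<std::pair<std::string, std::string>>& ProfileFields)
{
    UserConfiguration Config;
    Config.Name = Name;
    Config.Id = Id;
    for (const auto& Field : ProfileFields)
    {
        // Each pair becomes one PlayerField {Id, Value}.
        Config.Profile.Fields.push_back({Field.first, Field.second});
    }
    return Config;
}
```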

Inworld::ClientOptions

 struct INWORLD_EXPORT ClientOptions
{
Capabilities Capabilities;
UserConfiguration UserConfig;
std::string ServerUrl;
std::string Resource;
std::string ApiKey;
std::string ApiSecret;
std::string Base64;
std::string ProjectName;
std::string GameSessionId;
ClientHeaderData Metadata;
};
  • ServerUrl: “api-engine.inworld.ai:443” - used for changing the environment, mostly used internally.
  • Resource: “workspaces/your-workspace-id”
  • ApiKey: “vpjaW9m4D31UmrqaFJNdE1nTtcOZu26s”
  • ApiSecret: “oqIYRSze5F8uzSWxg7ed6mrGTqJRvfPoeazV16pweIoScaUfMp6PnDLXj4HTc8Pd”
  • Base64: “dnBqYVc5bTREMzFVbXJxYUZKTmRFMW5UdGNPWnUyNnM6b3FJWVJTemU1Rjh1elNXeGc3ZWQ2bXJHVHFKUnZmUG9lYXpWMTZwd2VJb1NjYVVmTXA2UG5ETFhqNEhUYzhQZA==”

Note: It is only required to enter [ApiKey + ApiSecret] or [Base64].

  • ProjectName: “MyGameName” - used to identify the project for tracking purposes.
  • GameSessionId: “f2332f-f4hn289-h65464” - used to identify a game session for tracking purposes.
  • Metadata (ClientHeaderData): internal use only.
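
Decoding the sample Base64 value above yields the sample ApiKey and ApiSecret joined by a colon, which suggests that Base64 is the standard Base64 encoding of "ApiKey:ApiSecret". The sketch below encodes a key pair under that assumption; Base64Encode and MakeBase64Credential are hypothetical helpers written for illustration, not NDK functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 encoder (RFC 4648 alphabet, '=' padding).
std::string Base64Encode(const std::string& Input)
{
    static const char Alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    auto Byte = [&Input](std::size_t I) {
        return static_cast<std::uint32_t>(static_cast<unsigned char>(Input[I]));
    };

    std::string Output;
    Output.reserve(((Input.size() + 2) / 3) * 4);

    std::size_t Index = 0;
    // Encode every complete 3-byte group as 4 output characters.
    for (; Index + 2 < Input.size(); Index += 3)
    {
        const std::uint32_t Chunk =
            (Byte(Index) << 16) | (Byte(Index + 1) << 8) | Byte(Index + 2);
        Output += Alphabet[(Chunk >> 18) & 0x3F];
        Output += Alphabet[(Chunk >> 12) & 0x3F];
        Output += Alphabet[(Chunk >> 6) & 0x3F];
        Output += Alphabet[Chunk & 0x3F];
    }

    // Encode the 1 or 2 leftover bytes and pad to a multiple of 4.
    const std::size_t Remaining = Input.size() - Index;
    if (Remaining == 1)
    {
        const std::uint32_t Chunk = Byte(Index) << 16;
        Output += Alphabet[(Chunk >> 18) & 0x3F];
        Output += Alphabet[(Chunk >> 12) & 0x3F];
        Output += "==";
    }
    else if (Remaining == 2)
    {
        const std::uint32_t Chunk = (Byte(Index) << 16) | (Byte(Index + 1) << 8);
        Output += Alphabet[(Chunk >> 18) & 0x3F];
        Output += Alphabet[(Chunk >> 12) & 0x3F];
        Output += Alphabet[(Chunk >> 6) & 0x3F];
        Output += '=';
    }
    return Output;
}

// Assumption inferred from the sample values: the credential is the
// Base64 encoding of "ApiKey:ApiSecret".
std::string MakeBase64Credential(const std::string& ApiKey, const std::string& ApiSecret)
{
    return Base64Encode(ApiKey + ":" + ApiSecret);
}
```

Applied to the sample ApiKey and ApiSecret, this reproduces the sample Base64 string, so either form of credential should describe the same key pair.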

StartClientFrom[SceneId/Save/Token]

Once a client has been configured, it may be started by calling StartClientFromSceneId, StartClientFromSave, or StartClientFromToken.

void Inworld::Client::StartClientFromSceneId(const std::string& SceneId)
void Inworld::Client::StartClientFromSave(const SessionSave& Save)
void Inworld::Client::StartClientFromToken(const SessionToken& Token)

StartClientFromSceneId

Provide the scene ID in the format “workspaces/my-workspace-name/scenes/my-scene-name”.

StartClientFromSave

To save a session, cache the Inworld::SessionSave object passed to the callback provided to Inworld::Client::SaveSessionStateAsync.

void Inworld::Client::SaveSessionStateAsync(
std::function<void(const SessionSave&, bool)> Callback
)
struct INWORLD_EXPORT SessionSave
{
std::string SceneId;
std::string State;
};
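
One way to cache the save is a small holder object whose method matches the callback signature. This is a sketch under assumptions: SessionSave is a local stand-in mirroring the listing above, SessionSaveCache is a hypothetical class, and the bool argument is assumed to indicate success, which the source does not state.

```cpp
#include <string>

// Local stand-in mirroring the documented SessionSave layout.
struct SessionSave
{
    std::string SceneId;
    std::string State;
};

// Hypothetical helper that keeps the most recent successful save.
class SessionSaveCache
{
public:
    // Shaped like the SaveSessionStateAsync callback.
    // Assumption: the bool reports whether the save succeeded.
    void OnSessionSaved(const SessionSave& Save, bool bSuccess)
    {
        if (bSuccess)
        {
            CachedSave = Save;
            bHasSave = true;
        }
    }

    bool HasSave() const { return bHasSave; }
    const SessionSave& GetSave() const { return CachedSave; }

private:
    SessionSave CachedSave;
    bool bHasSave = false;
};
```

A lambda that forwards to OnSessionSaved could serve as the callback, and the cached object could later be handed to StartClientFromSave.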

StartClientFromToken

To get a session token, cache the Inworld::SessionToken object returned by Inworld::Client::GetSessionToken.

const SessionToken& Inworld::Client::GetSessionToken() const
struct INWORLD_EXPORT SessionToken
{
std::string SessionId;
std::string Token;
int64_t ExpirationTime;
};

NOTE: The matching call to stop the session is StopClient.

Inworld::Packet

Inworld::Packet is the base class for all Inworld events.

Inworld::PacketId contains information about the packet used to associate packets with a given interaction or utterance.

Inworld::Routing contains information about the Source Inworld::Actor and Target Inworld::Actor.

Inworld::Actor contains information about the actor (PLAYER, AGENT, or WORLD), and the Name (agent id if the Actor is an AGENT).

Inworld::PacketVisitor

To process events sent by Inworld, and sent to the callback provided to Inworld::Client::InitClientAsync, simply create a class that inherits from Inworld::PacketVisitor.

Call Packet->Accept(MyPacketVisitor); for the Inworld::PacketVisitor to process the packet by type.

class INWORLD_EXPORT PacketVisitor
{
public:
virtual void Visit(const TextEvent& Event) { }
virtual void Visit(const VADEvent& Event) { }
virtual void Visit(const DataEvent& Event) { }
virtual void Visit(const AudioDataEvent& Event) { }
virtual void Visit(const A2FHeaderEvent& Event) { }
virtual void Visit(const A2FContentEvent& Event) { }
virtual void Visit(const SilenceEvent& Event) { }
virtual void Visit(const ControlEvent& Event) { }
virtual void Visit(const ControlEventConversationUpdate& Event) { }
virtual void Visit(const ControlEventCurrentSceneStatus& Event) { }
virtual void Visit(const EmotionEvent& Event) { }
virtual void Visit(const CancelResponseEvent& Event) { }
virtual void Visit(const CustomGestureEvent& Event) { }
virtual void Visit(const CustomEvent& Event) { }
virtual void Visit(const ActionEvent& Event) { }
virtual void Visit(const RelationEvent& Event) { }
virtual void Visit(const PingEvent& Event) { }
virtual void Visit(const LogEvent& Event) { }
};

An implementation may use the Source and Target actor IDs from the packet's Inworld::Routing to route the event to the object that represents the actor.
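
The Accept/Visit pairing is a double-dispatch pattern. The sketch below illustrates it with heavily simplified local stand-ins: only two event types are modeled, and the Text field, the constructor, and the TranscriptVisitor class are hypothetical and do not reflect the NDK's actual event accessors.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct TextEvent;
struct ControlEvent;

// Simplified stand-in for Inworld::PacketVisitor (two overloads only).
class PacketVisitor
{
public:
    virtual ~PacketVisitor() = default;
    virtual void Visit(const TextEvent&) { }
    virtual void Visit(const ControlEvent&) { }
};

// Simplified stand-in for Inworld::Packet.
struct Packet
{
    virtual ~Packet() = default;
    virtual void Accept(PacketVisitor& Visitor) const = 0;
};

struct TextEvent : Packet
{
    std::string Text; // hypothetical field, for illustration only
    explicit TextEvent(std::string InText) : Text(std::move(InText)) { }
    // Each concrete packet calls the overload matching its own type.
    void Accept(PacketVisitor& Visitor) const override { Visitor.Visit(*this); }
};

struct ControlEvent : Packet
{
    void Accept(PacketVisitor& Visitor) const override { Visitor.Visit(*this); }
};

// Hypothetical visitor: collects text and notes when a control event arrives.
class TranscriptVisitor : public PacketVisitor
{
public:
    void Visit(const TextEvent& Event) override { Lines.push_back(Event.Text); }
    void Visit(const ControlEvent&) override { bSawControlEvent = true; }

    std::vector<std::string> Lines;
    bool bSawControlEvent = false;
};
```

Inside the packet callback, a call such as Packet->Accept(Visitor) then reaches the overload for the packet's concrete type without any manual type checks.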

Inworld::ControlEventCurrentSceneStatus

The Inworld::ControlEventCurrentSceneStatus is sent once a session has been established, or when the scene or characters within a scene change.

It is recommended to cache off any Inworld::AgentInfo resulting from the Inworld::ControlEventCurrentSceneStatus in order to communicate with Inworld properly.

const std::vector<AgentInfo>& Inworld::ControlEventCurrentSceneStatus::GetAgentInfos() const

Inworld::AgentInfo

struct INWORLD_EXPORT AgentInfo
{
std::string BrainName;
std::string AgentId;
std::string GivenName;
};
  • BrainName: “workspaces/my-workspace-name/characters/bob” - resource id of the character.
  • AgentId: “gbh73892gbhf-fhbw9efhb-f2h89fjn-fh289n9” - unique id used to specify agent to Inworld.
  • GivenName: “Bob” - friendly name that can be used for end-user display.

An implementation may use this event to simply pair objects that represent the actors with their agent ID.
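
A minimal sketch of such a pairing keeps a lookup from character resource id to agent id, refreshed whenever a scene status arrives. AgentInfo is a local stand-in mirroring the listing above, and AgentRegistry is a hypothetical class.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Local stand-in mirroring the documented AgentInfo layout.
struct AgentInfo
{
    std::string BrainName;
    std::string AgentId;
    std::string GivenName;
};

// Hypothetical cache mapping a character's BrainName to its current AgentId.
class AgentRegistry
{
public:
    // Replaces the cache with the agents from the latest scene status,
    // e.g. the vector returned by GetAgentInfos().
    void Update(const std::vector<AgentInfo>& AgentInfos)
    {
        AgentIdByBrainName.clear();
        for (const AgentInfo& Info : AgentInfos)
        {
            AgentIdByBrainName[Info.BrainName] = Info.AgentId;
        }
    }

    // Returns the agent id for a character, or an empty string if unknown.
    std::string FindAgentId(const std::string& BrainName) const
    {
        const auto It = AgentIdByBrainName.find(BrainName);
        return It != AgentIdByBrainName.end() ? It->second : std::string();
    }

private:
    std::unordered_map<std::string, std::string> AgentIdByBrainName;
};
```

The looked-up agent id is the value expected by calls that take an AgentId parameter.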

Inworld::TextEvent

This event is sent for two reasons:

    1. The player is talking, and we receive PLAYER > AGENT STT information. In this case, the event will only be considered final once the player stops talking.
    2. The character is talking, and we receive AGENT > PLAYER information. In this case, the event will always be considered final, as the client will not receive partial utterance results.

Inworld::AudioDataEvent

This event is always paired with AGENT > PLAYER Inworld::TextEvents.

The DataChunk provided by the event contains character speech as 1-channel (mono) WAV audio at a 24000 Hz sample rate.

The PhonemeInfo provided contains timestamp-to-phoneme data that can be used for lip sync.

Inworld::ControlEvent

This event typically represents INTERACTION_END, and can be used to know when the client should no longer expect additional information about the interaction it represents.

Inworld::CustomEvent

This event represents what we now call “Triggers”. Each trigger contains a Name and a map of Params.

SendText

To send text messages to an agent, call Inworld::Client::SendTextMessage.

Inworld::Client::SendTextMessage(
const std::string& AgentId,
const std::string& Text
)

Audio Sessions

To send audio data to an agent, call Inworld::Client::StartAudioSession, Inworld::Client::SendSoundMessage, and Inworld::Client::StopAudioSession.

StartAudioSession

To start an audio session with an agent, call Inworld::Client::StartAudioSession.

AudioSessionStartPayload

Inworld::AudioSessionStartPayload is used to configure the AudioSession.

struct INWORLD_EXPORT AudioSessionStartPayload
{
enum class MicrophoneMode : uint8_t
{
Unspecified = 0,
OpenMic = 1,
ExpectAudioEnd = 2,
};
enum class UnderstandingMode : uint8_t
{
Unspecified = 0,
Full = 1,
SpeechRecognitionOnly = 2,
};
MicrophoneMode MicMode = MicrophoneMode::Unspecified;
UnderstandingMode UndMode = UnderstandingMode::Unspecified;
};
  • MicrophoneMode::OpenMic: Acts as a normal open-ended mic, where Inworld detects end of player speech and will automatically generate a response.
  • MicrophoneMode::ExpectAudioEnd: Acts as a ‘push-to-talk’, where Inworld will wait for a call to Inworld::Client::StopAudioSession before generating a response.
  • UnderstandingMode::Full: Will generate both PLAYER > AGENT STT and AGENT > PLAYER data.
  • UnderstandingMode::SpeechRecognitionOnly: Will only generate PLAYER > AGENT STT data.

void StartAudioSession(
const std::string& AgentId,
const AudioSessionStartPayload& Payload
);
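
As a sketch of a push-to-talk configuration, the payload below selects ExpectAudioEnd with full understanding. The struct is a local stand-in copied from the listing above, and MakePushToTalkPayload is a hypothetical helper.

```cpp
#include <cstdint>

// Local stand-in mirroring the documented AudioSessionStartPayload layout.
struct AudioSessionStartPayload
{
    enum class MicrophoneMode : uint8_t
    {
        Unspecified = 0,
        OpenMic = 1,
        ExpectAudioEnd = 2,
    };
    enum class UnderstandingMode : uint8_t
    {
        Unspecified = 0,
        Full = 1,
        SpeechRecognitionOnly = 2,
    };
    MicrophoneMode MicMode = MicrophoneMode::Unspecified;
    UnderstandingMode UndMode = UnderstandingMode::Unspecified;
};

// Hypothetical helper: push-to-talk, so a response is only generated once
// the audio session is stopped, with both recognition and a character reply.
AudioSessionStartPayload MakePushToTalkPayload()
{
    AudioSessionStartPayload Payload;
    Payload.MicMode = AudioSessionStartPayload::MicrophoneMode::ExpectAudioEnd;
    Payload.UndMode = AudioSessionStartPayload::UnderstandingMode::Full;
    return Payload;
}
```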

SendSoundMessage

void SendSoundMessage(
const std::string& AgentId,
const std::string& Data
);

NOTE: This must only be called after a matching Inworld::Client::StartAudioSession.

The Data sent is assumed to be 1-channel (mono) WAV audio at a 16000 Hz sample rate.

It is recommended to call this function with 0.1s of audio data every 0.1s. This requirement is not strictly enforced, but there are data packet limits to be aware of. Keeping this send rate will ensure this cap is not exceeded.
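
The recommended pacing can be sketched as splitting a captured buffer into 0.1 s pieces before sending. This assumes raw 16-bit samples with no header, which the source does not specify; at 16000 Hz that works out to 1600 samples, or 3200 bytes, per chunk. SplitIntoChunks and the constants are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t SampleRate = 16000;   // from the documented input format
constexpr std::size_t BytesPerSample = 2;   // assumption: 16-bit mono samples
// 0.1 s of audio: 1600 samples * 2 bytes = 3200 bytes.
constexpr std::size_t ChunkBytes = SampleRate / 10 * BytesPerSample;

// Splits a buffer of audio bytes into chunks of at most 0.1 s each.
// The final chunk may be shorter than ChunkBytes.
std::vector<std::string> SplitIntoChunks(const std::string& AudioData)
{
    std::vector<std::string> Chunks;
    for (std::size_t Offset = 0; Offset < AudioData.size(); Offset += ChunkBytes)
    {
        // substr clamps the count at the end of the buffer.
        Chunks.push_back(AudioData.substr(Offset, ChunkBytes));
    }
    return Chunks;
}
```

Each chunk would then be passed as the Data argument of SendSoundMessage, roughly one every 0.1 s.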

StopAudioSession

void StopAudioSession(
const std::string& AgentId
);

SendCustomEvent

To send triggers to an agent, call Inworld::Client::SendCustomEvent.

Inworld::Client::SendCustomEvent(
const std::string& AgentId,
const std::string& Name,
const std::unordered_map<std::string, std::string>& Params
);