LLM

Getting Started

To get started with the LLM features of the Inworld Unreal Engine SDK, refer to the sections below.

Obtain Runtime API Key

All requests to the Inworld LLM Service API must include both a Runtime API Key and a unique User Id.

To obtain a Runtime API key, perform the following steps:

  1. Navigate to Inworld Studio.
  2. From within a Studio workspace, navigate to the Integrations tab.
  3. Go to the Runtime API Key section and click Generate new key.
  4. Click Base64 to copy the key for use.

Project Settings

From the Unreal Editor, navigate to: Edit > Project Settings > Plugins > InworldAI - LLM

  • ApiKey - The Base64 authorization signature copied from Studio.
  • User Id - Unique identifier for the end-user associated with the request.
  • Model Id - ID of the model to use. Options include inworld-pixie (small model) and inworld-dragon (large model).

Note: The values set here are project-wide defaults; each can be overridden per Blueprint node.
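
Since these values surface as editable project settings, the underlying pattern is Unreal's UDeveloperSettings. The class and property names below are hypothetical stand-ins for the plugin's real settings class, shown only to illustrate how such defaults are declared and read; GetDefault<T>() is the standard Unreal accessor for project settings.

```cpp
#include "Engine/DeveloperSettings.h"
#include "InworldLLMSettingsSketch.generated.h" // hypothetical generated header

// Hypothetical stand-in for the plugin's settings class -- the real
// class name and property names may differ. UDeveloperSettings is the
// standard base class that surfaces a section under Project Settings.
UCLASS(Config = Engine, DefaultConfig)
class UInworldLLMSettingsSketch : public UDeveloperSettings
{
    GENERATED_BODY()

public:
    UPROPERTY(Config, EditAnywhere, Category = "InworldAI - LLM")
    FString ApiKey;  // Base64 Runtime API key copied from Studio

    UPROPERTY(Config, EditAnywhere, Category = "InworldAI - LLM")
    FString UserId;  // unique end-user identifier sent with each request

    UPROPERTY(Config, EditAnywhere, Category = "InworldAI - LLM")
    FString ModelId; // e.g. "inworld-pixie" or "inworld-dragon"
};

// Reading a project-wide default at runtime; a per-node override
// simply substitutes its own value before the request is made.
FString GetDefaultModelIdSketch()
{
    return GetDefault<UInworldLLMSettingsSketch>()->ModelId;
}
```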

Completion Events

The LLM Service provides the following completion methods.

Each completion event uses a Text Generation Config with the following properties (a code sketch of these settings follows the list):

  • Max Tokens - Maximum number of output tokens to generate.
  • Temperature - Determines randomness of response. Higher values result in greater randomness.
  • Presence Penalty - Positive values penalize new tokens based on whether they appear in the generated text so far, increasing the model's likelihood to talk about new topics.
  • Repetition Penalty - Penalizes new tokens based on whether they appear in the prompt and the generated text so far. Values greater than 1 encourage the model to use new tokens, while values less than 1 encourage the model to repeat tokens. The value must be strictly positive.
  • Frequency Penalty - Positive values penalize new tokens based on their existing frequency in the generated text so far, decreasing the model's likelihood to repeat the same line verbatim.
  • Top P - Use nucleus sampling, where the model only considers tokens within the Top P probability mass. For example, 0.1 means only the tokens comprising the top 10% probability mass are considered. It is generally recommended to alter either temperature or Top P, but not both.
  • Stream - Whether or not to stream responses.
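
For reference, the same configuration can be pictured as a plain struct, as sketched below. The type and field names are illustrative only, not the plugin's verified API; the values show a typical moderately creative, streaming setup.

```cpp
// Hypothetical mirror of the Text Generation Config -- the plugin's
// actual type and field names may differ. Each comment restates what
// the property controls.
struct FTextGenerationConfigSketch
{
    int32 MaxTokens         = 256;   // cap on generated output tokens
    float Temperature       = 0.7f;  // higher values = more random sampling
    float PresencePenalty   = 0.0f;  // > 0 nudges the model toward new topics
    float RepetitionPenalty = 1.05f; // must be > 0; > 1 discourages repeats
    float FrequencyPenalty  = 0.0f;  // > 0 penalizes frequently used tokens
    float TopP              = 1.0f;  // nucleus sampling mass; tune this OR Temperature
    bool  bStream           = true;  // deliver the response incrementally
};
```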

Complete Text

Generate a text response for a provided prompt.

Text - The prompt from which to generate a response.

Text Generation Config - Configuration for generating text (see above).
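
To make the flow concrete, a hedged C++ sketch of a text completion call follows. The service type, method name, and callback shape are assumptions for illustration (the Blueprint node is the documented interface); only the two inputs, the prompt and the config, come from this section.

```cpp
// Hypothetical usage sketch -- UInworldLLMServiceSketch, CompleteText,
// and the callback are illustrative; consult the plugin headers for
// the actual API.
void RunTextCompletionSketch(UInworldLLMServiceSketch* Service)
{
    FTextGenerationConfigSketch Config; // see the config sketch above
    Config.MaxTokens = 128;
    Config.bStream   = false;           // wait for the full response

    const FString Prompt = TEXT("Describe a rainy harbor town in one sentence.");

    // Assumed async call: the completed text arrives via a callback.
    Service->CompleteText(Prompt, Config, [](const FString& Completion)
    {
        UE_LOG(LogTemp, Log, TEXT("LLM response: %s"), *Completion);
    });
}
```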

Complete Chat

Generate a chat completion.

Messages - An array of messages that form the conversation history. Each message includes an author role and the message content.

Text Generation Config - Configuration for generating text (see above).
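
To show what the Messages array holds, here is a similar hedged sketch for a chat completion. The message struct, role strings, and CompleteChat call are illustrative assumptions; the key point is that the history alternates author roles and the model continues the conversation.

```cpp
// Hypothetical message type -- the plugin's actual struct may differ.
struct FChatMessageSketch
{
    FString Role;    // e.g. "system", "user", or "assistant"
    FString Content; // text of the message
};

void RunChatCompletionSketch(UInworldLLMServiceSketch* Service)
{
    TArray<FChatMessageSketch> Messages;
    Messages.Add({ TEXT("system"),    TEXT("You are a terse medieval shopkeeper.") });
    Messages.Add({ TEXT("user"),      TEXT("Do you sell healing potions?") });
    Messages.Add({ TEXT("assistant"), TEXT("Aye. Two silver apiece.") });
    Messages.Add({ TEXT("user"),      TEXT("I'll take three.") });

    FTextGenerationConfigSketch Config; // see the config sketch above

    // Assumed async call mirroring the Complete Chat Blueprint node.
    Service->CompleteChat(Messages, Config, [](const FString& Completion)
    {
        UE_LOG(LogTemp, Log, TEXT("Shopkeeper: %s"), *Completion);
    });
}
```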