Create chat completion
Generate a response for the given chat conversation
model to auto for automatic model selection based on criteria like price, latency, or performance.
For more advanced routing — such as conditional routing, A/B testing across variants, and reusable configurations — create a router and reference it via the model field (e.g., inworld/my-router).
For web-grounded answers, use extra_body.web_search.Authorizations
Your authentication credentials. For Basic authentication, please populate Basic $INWORLD_API_KEY.
Please make sure your API Key has write permissions for the Router API in order to create, update, and delete routers. You can create a key in one command with the Inworld CLI: inworld workspace add-key.
Body
The model to use, which can be:
- A model id (e.g.,
gpt-oss-120b). The best provider is automatically selected by latency, or you can control provider selection viaextra_body.provider. See Models for available models. - A provider-prefixed model id (e.g.,
openai/gpt-5). This specifies the provider and model to use. autofor automatic model selection based on criteria like price, latency, or intelligence- A router, which is specified by
inworld/<router-name>. The routernamemust be prefixed byinworld/.
A list of messages comprising the conversation so far.
If using a router where a prompt is specified, these messages will be appended to the prompt.
Show child attributes
Show child attributes
If true, partial message deltas will be sent as server-sent events.
Sampling temperature between 0 and 2. Higher values make output more random.
0 <= x <= 2Nucleus sampling parameter. Must be greater than 0.
0 < x <= 1Maximum number of tokens to generate.
x >= 1Maximum number of completion tokens to generate.
x >= 1Penalizes tokens based on presence in the text.
-2 <= x <= 2Penalizes tokens based on frequency in the text.
-2 <= x <= 2Random seed for generation.
Up to 4 sequences where the API will stop generating.
Modifies the likelihood of specified tokens appearing in the completion.
Show child attributes
Show child attributes
Controls the amount of reasoning effort the model uses. Note: This parameter is provider/model-specific and may not be supported by all models (e.g., OpenAI models do not support this parameter). This will be overridden if extra_body.reasoning is specified.
none, low, minimal, medium, high, xhigh A unique identifier for the end user. When used with a router, the same user will consistently receive the same variant across requests (sticky routing).
Tool-based web search configuration. The LLM calls a search engine in a tool-calling loop, then synthesizes a grounded answer with url_citation annotations. Works with any LLM that supports tool calling. Mutually exclusive with web_search_options. See Web Search for details.
Show child attributes
Show child attributes
Native web search using the provider's built-in search grounding (no tool loop). Supported by OpenAI (search models only), Anthropic, Google / Vertex AI, and Groq. Mutually exclusive with web_search. See Web Search for details.
Show child attributes
Show child attributes
Output modalities to generate. Defaults to ["text"]. Include "image" to request image generation (e.g., ["text", "image"]). Currently supported for OpenAI and Google image models.
text, image Configuration for image output. Optional when requesting image output via modalities: ["image"].
Show child attributes
Show child attributes
Optional parameters for model routing and optimization.
Show child attributes
Show child attributes
Response
A successful response. Returns either a complete chat completion or streaming chunks.
Unique identifier for the chat completion.
Object type, always 'chat.completion'.
Unix timestamp when the completion was created.
The model that was actually used.
List of chat completion choices.
Show child attributes
Show child attributes
Token usage statistics.
Show child attributes
Show child attributes
Routing metadata providing transparency into model selection decisions.
Show child attributes
Show child attributes
Was this page helpful?