| Mode | Field | How it works | Supported Models |
|---|---|---|---|
| Tool-based | web_search | LLM calls a search engine in a tool-calling loop, then synthesizes a final answer | Any LLM that supports tool calling |
| Native | web_search_options | Provider’s built-in search grounding (no tool loop) | OpenAI (search models only), Anthropic, Google, Vertex AI, Groq |
Only one mode may be set per variant. Setting both will result in an error.
## Tool-based web search
Add a `web_search` object to your router variant. When a request is routed to that variant, the router injects a search tool, lets the LLM call it in a loop, and returns a grounded answer with `url_citation` annotations.
| Parameter | Type | Default | Description |
|---|---|---|---|
| engine | string | `exa` | Options: `exa`, `google`, `native` (uses the model's built-in grounding) |
| max_results | int | 3 | Number of search results returned per step (1–20) |
| max_steps | int | 1 | Maximum tool-call rounds before final synthesis (1–5) |
### Variant configuration
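As a sketch, a variant with tool-based search enabled might look like the following. The surrounding variant structure and model name are illustrative assumptions; only the `web_search` fields come from the parameter table above:

```json
{
  "model": "gpt-4o",
  "web_search": {
    "engine": "exa",
    "max_results": 5,
    "max_steps": 2
  }
}
```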
1. The router injects a search tool and sends the request to the LLM.
2. The LLM calls the search tool with a query.
3. The search engine returns results, which are injected back into the conversation.
4. Steps 2–3 repeat up to `max_steps` times.
5. The LLM synthesizes a final answer with `url_citation` annotations.
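The loop above can be sketched as follows. Note that `call_llm` and `run_search` are hypothetical stand-ins for the model call and the search-engine call, not part of any real API:

```python
def tool_search_loop(messages, call_llm, run_search, max_steps=1):
    """Sketch of the tool-based search loop described above."""
    for _ in range(max_steps):
        reply = call_llm(messages)          # model may request a search
        if reply.get("tool_call") is None:  # no search requested: done
            return reply
        query = reply["tool_call"]["query"]
        results = run_search(query)         # e.g. top-N result snippets
        # Inject the search results back into the conversation.
        messages = messages + [{"role": "tool", "content": results}]
    # Tool-call budget exhausted: ask the model for a final, grounded answer.
    return call_llm(messages)
```

Once the model answers without requesting another search, the loop ends early even if `max_steps` rounds remain.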
## Native web search
Add `web_search_options` to your router variant to use a provider's built-in search grounding. This skips the tool-calling loop entirely; the provider handles search internally.
| Parameter | Type | Default | Description |
|---|---|---|---|
search_context_size | string | "medium" | Amount of search context: "low", "medium", or "high" |
### Variant configuration
Native search is supported by OpenAI (search models only, e.g. `gpt-4o-search-preview`), Anthropic, Google / Vertex AI, and Groq.
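For illustration, a native-search variant might be configured like this. The variant structure and model name are assumptions; only the `web_search_options` object and `search_context_size` field come from this document:

```json
{
  "model": "gpt-4o-search-preview",
  "web_search_options": {
    "search_context_size": "medium"
  }
}
```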
## Per-request web search
You can also pass `web_search` or `web_search_options` directly on a chat completion request instead of configuring it on a variant. The fields and values are the same as above.
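For example, a per-request payload with tool-based search could be assembled like this. The model name is a hypothetical variant name; the `web_search` fields match the variant-level parameters described above:

```python
import json

# Build a chat completion request that enables tool-based web search
# for this request only (no variant-level configuration needed).
payload = {
    "model": "my-router-variant",  # hypothetical variant name
    "messages": [
        {"role": "user", "content": "What changed in the latest release?"}
    ],
    # Same fields and values as the variant-level web_search object.
    "web_search": {"engine": "google", "max_results": 5, "max_steps": 2},
}
body = json.dumps(payload)
```

Passing `web_search_options` instead would select native grounding for the request, subject to the same one-mode-per-request restriction.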
## Citations & streaming
Assistant messages may include OpenAI-style `annotations` (e.g. `type: "url_citation"` with `url`, `title`, `content`).
With `stream: true`, annotations are delivered on the last SSE chunk, alongside `finish_reason`.
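A sketch of pulling citations out of a streamed chunk: the chunk shape below is assumed to follow the OpenAI SSE chunk format, and field names other than `annotations`, `url_citation`, and `finish_reason` are illustrative:

```python
import json

def citations_from_chunk(sse_data: str):
    """Return url_citation payloads from one parsed SSE data line.

    Annotations arrive on the last chunk, alongside finish_reason,
    so earlier chunks yield an empty list.
    """
    chunk = json.loads(sse_data)
    choice = chunk["choices"][0]
    if choice.get("finish_reason") is None:
        return []  # not the final chunk yet
    annotations = choice.get("delta", {}).get("annotations", [])
    return [a["url_citation"] for a in annotations
            if a.get("type") == "url_citation"]
```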