> ## Documentation Index
> Fetch the complete documentation index at: https://docs.inworld.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Web search

> Ground chat completions with web search

You can ground LLM responses with real-time web search results by adding web search configuration to a router variant. Two mutually exclusive modes are available — setting both on the same variant returns an error.

| Mode           | Field                | How it works                                                                      | Supported Models                                                |
| :------------- | :------------------- | :-------------------------------------------------------------------------------- | :-------------------------------------------------------------- |
| **Tool-based** | `web_search`         | LLM calls a search engine in a tool-calling loop, then synthesizes a final answer | Any LLM that supports tool calling                              |
| **Native**     | `web_search_options` | Provider's built-in search grounding (no tool loop)                               | OpenAI (search models only), Anthropic, Google, Vertex AI, Groq |

<Note>
  Only one mode may be set per variant. Setting both will result in an error.
</Note>

## Tool-based web search

Add a `web_search` object to your router variant. When a request is routed to that variant, the router injects a search tool, lets the LLM call it in a loop, and returns a grounded answer with `url_citation` annotations.

| Parameter     | Type     | Default | Description                                                          |
| ------------- | -------- | ------- | -------------------------------------------------------------------- |
| `engine`      | `string` | `exa`   | Options: `exa`, `google`, `native` (uses model's built-in grounding) |
| `max_results` | `int`    | `3`     | Number of search results returned per step (1–20)                    |
| `max_steps`   | `int`    | `1`     | Maximum tool-call rounds before final synthesis (1–5)                |

```json Variant configuration theme={"system"}
{
  "variant_id": "search-grounded",
  "model_id": "openai/gpt-4o",
  "web_search": {
    "engine": "exa",
    "max_results": 5,
    "max_steps": 2
  }
}
```

**How it works:**

1. The router injects a search tool and sends the request to the LLM.
2. The LLM calls the search tool with a query.
3. The search engine returns results, which are injected back into the conversation.
4. Steps 2–3 repeat up to `max_steps` times.
5. The LLM synthesizes a final answer with `url_citation` annotations.

## Native web search

Add `web_search_options` to your router variant to use a provider's built-in search grounding. This skips the tool-calling loop entirely — the provider handles search internally.

| Parameter             | Type     | Default    | Description                                                |
| --------------------- | -------- | ---------- | ---------------------------------------------------------- |
| `search_context_size` | `string` | `"medium"` | Amount of search context: `"low"`, `"medium"`, or `"high"` |

```json Variant configuration theme={"system"}
{
  "variant_id": "native-search",
  "model_id": "openai/gpt-4o-search-preview",
  "web_search_options": {
    "search_context_size": "high"
  }
}
```

Supported providers: **OpenAI** (search models only, e.g. `gpt-4o-search-preview`), **Anthropic**, **Google / Vertex AI**, and **Groq**.

## Per-request web search

You can also pass `web_search` or `web_search_options` directly on a chat completion request instead of configuring it on a variant. The fields and values are the same as above.

<Tip>
  Both fields can be passed at the top level of the request body or inside `extra_body` for OpenAI SDK compatibility.
</Tip>

## Citations & streaming

Assistant messages may include OpenAI-style **`annotations`** (e.g. `type: "url_citation"` with `url`, `title`, `content`).

With **`stream: true`**, annotations are delivered on the **last** SSE chunk, alongside `finish_reason`.
