# Create chat completion

> Generate a response for the given chat conversation

Call hundreds of models from various providers directly through our unified API, or set `model` to `auto` for automatic model selection based on criteria like price, latency, or intelligence.

For more advanced routing — such as conditional routing, A/B testing across variants, and reusable configurations — [**create a router**](/api-reference/routerAPI/routerservice/create-router) and reference it via the `model` field (e.g., `inworld/my-router`).
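
Whichever way the model is chosen, the response reports what actually ran: the top-level `model` field names the model used, and `metadata` lists the attempts. The sketch below summarizes those fields from an already-parsed response body. `summarize_routing` is a hypothetical helper, not part of any SDK, and `route_id` and `variant_id` are read defensively because they appear only in the "Using a router" example response in the spec below.

```python
from typing import Any


def summarize_routing(response: dict[str, Any]) -> dict[str, Any]:
    """Collect the routing-related fields of a parsed chat completion response."""
    metadata = response.get("metadata") or {}
    attempts = metadata.get("attempts") or []
    return {
        "model": response.get("model"),
        "attempts": len(attempts),
        # Models that were tried but did not succeed (empty when the first try worked).
        "failed_models": [a.get("model") for a in attempts if not a.get("success")],
        # Assumption: these two keys are present only for router requests.
        "route_id": metadata.get("route_id"),
        "variant_id": metadata.get("variant_id"),
        "total_duration_ms": metadata.get("total_duration_ms"),
    }


# Trimmed copy of the "Using a router" example response from the spec.
example = {
    "model": "google-ai-studio/gemini-3-flash-preview",
    "metadata": {
        "attempts": [
            {
                "model": "google-ai-studio/gemini-3-flash-preview",
                "success": True,
                "time_to_first_token_ms": 1037,
            }
        ],
        "route_id": "default",
        "variant_id": "gemini",
        "total_duration_ms": 1038,
    },
}

summary = summarize_routing(example)
print(summary)
```

Logging this summary alongside `metadata.generation_id` is one way to correlate a request with the variant that served it.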

For web-grounded answers, use [`extra_body.web_search`](/router/capabilities/web-search).
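
As a minimal sketch, a tool-based web search request body with the configuration placed under `extra_body` could be assembled as follows. `build_web_search_request` is a hypothetical helper; the field names, allowed engines (`exa`, `google`), and default values follow the `WebSearchConfig` schema in the spec below. Nothing is sent over the network here.

```python
import json
from typing import Any

ALLOWED_ENGINES = {"exa", "google"}  # enum from the WebSearchConfig schema


def build_web_search_request(
    question: str,
    model: str = "openai/gpt-4o",
    engine: str = "exa",
    max_results: int = 3,
    max_steps: int = 1,
) -> dict[str, Any]:
    """Build a chat completion body that enables tool-based web search."""
    if engine not in ALLOWED_ENGINES:
        raise ValueError(f"engine must be one of {sorted(ALLOWED_ENGINES)}")
    if max_results < 1 or max_steps < 1:
        raise ValueError("max_results and max_steps must be at least 1")
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "extra_body": {
            "web_search": {
                "engine": engine,
                "max_results": max_results,
                "max_steps": max_steps,
            }
        },
    }


body = build_web_search_request(
    "Latest news on Mars exploration", max_results=5, max_steps=2
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be posted to `/v1/chat/completions` in the same way as the Python samples in the spec, for example with `requests.post(url, json=body, headers=headers)`. Cited sources come back as `url_citation` annotations on the assistant message.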


## OpenAPI

````yaml post /v1/chat/completions
openapi: 3.0.0
info:
  title: Router API
  version: v1
  description: >-
    Intelligent routing and optimization for LLM requests. Automatically selects
    the best model based on your criteria, handles fallbacks, and optimizes for
    cost, latency, or intelligence. Also includes a Router CRUD API for creating
    reusable routing configurations.
servers:
  - url: https://api.inworld.ai
    description: Production server
security:
  - inworld_basic: []
tags:
  - name: Chat Completions
    description: >-
      Main API endpoint for intelligent LLM routing and optimization. Generate
      chat completions with automatic model selection or specific models with
      fallbacks.
  - name: Moderations
    description: >-
      Classify text content for harmful material. Includes OpenAI-compatible
      moderation and conversation-aware chat moderation with AILuminate safety
      signals.
  - name: Router Management
    description: >-
      Service endpoints for creating and managing reusable router
      configurations. Use these to set up routing strategies that can be
      referenced in Chat Completions requests.
paths:
  /v1/chat/completions:
    post:
      tags:
        - Chat Completions
      summary: Create chat completion
      operationId: chatCompletions
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionRequest'
            example:
              model: auto
              messages:
                - role: user
                  content: Hello!
              extra_body:
                models:
                  - openai/gpt-4o-mini
                  - google-ai-studio/gemini-2.0-flash
                sort:
                  - price
      responses:
        '200':
          description: >-
            A successful response. Returns either a complete chat completion or
            streaming chunks.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponse'
              examples:
                Direct model call:
                  summary: Direct model call
                  value:
                    id: chatcmpl-1772347141924
                    object: chat.completion
                    created: 1772347141
                    model: openai/gpt-4o
                    choices:
                      - index: 0
                        message:
                          role: assistant
                          content: Hello! How can I assist you today?
                        finish_reason: stop
                    usage:
                      prompt_tokens: 9
                      completion_tokens: 9
                      total_tokens: 18
                    metadata:
                      attempts:
                        - model: openai/gpt-4o
                          success: true
                          time_to_first_token_ms: 428
                      generation_id: 9b365b38-f09f-96d1-8c77-99b28c8b74bf
                      reasoning: 'Using specified model: ''openai/gpt-4o'' - success'
                      total_duration_ms: 472
                Auto model selection:
                  summary: Auto model selection
                  value:
                    id: chatcmpl-339374ec-d6f8-464b-ab93-fc2dcb14fecd
                    object: chat.completion
                    created: 1767996520
                    model: google-ai-studio/gemini-2.0-flash
                    choices:
                      - index: 0
                        message:
                          role: assistant
                          content: Hello! How can I help you today?
                        finish_reason: stop
                    usage:
                      prompt_tokens: 2
                      completion_tokens: 10
                      total_tokens: 12
                    metadata:
                      attempts:
                        - model: google-ai-studio/gemini-2.0-flash
                          success: true
                          time_to_first_token_ms: 245
                      generation_id: 019bb4a8-5523-789b-a795-12b47052a2d9
                      reasoning: >-
                        Auto-selected 'google-ai-studio/gemini-2.0-flash', from
                        specified models, sorted by lowest price - success
                      total_duration_ms: 312
                Using a router:
                  summary: Using a router
                  value:
                    id: chatcmpl-1772348076987
                    object: chat.completion
                    created: 1772348076
                    model: google-ai-studio/gemini-3-flash-preview
                    choices:
                      - index: 0
                        message:
                          role: assistant
                          content: Hello! How can I help you today?
                        finish_reason: stop
                    usage:
                      prompt_tokens: 3
                      completion_tokens: 98
                      total_tokens: 101
                    metadata:
                      attempts:
                        - model: google-ai-studio/gemini-3-flash-preview
                          success: true
                          time_to_first_token_ms: 1037
                      generation_id: 1360075d-3ca3-9248-897a-5cb78df2ee7d
                      reasoning: >-
                        Using specified model:
                        'google-ai-studio/gemini-3-flash-preview' - success
                      route_id: default
                      variant_id: gemini
                      total_duration_ms: 1038
                Web search (tool-based):
                  summary: Web search (tool-based)
                  value:
                    id: chatcmpl-1774895681858
                    object: chat.completion
                    created: 1774895681
                    model: openai/gpt-4o
                    choices:
                      - index: 0
                        message:
                          role: assistant
                          content: >-
                            Recent developments in Mars exploration have
                            revealed significant discoveries by NASA's
                            Perseverance rover...
                          annotations:
                            - type: url_citation
                              url_citation:
                                content: >-
                                  NASA's Perseverance Rover Finds Ancient River
                                  Delta on Mars...
                                title: >-
                                  NASA's Perseverance Rover Finds Ancient River
                                  Delta on Mars
                                url: https://example.com/mars-exploration
                        finish_reason: stop
                    usage:
                      prompt_tokens: 2829
                      completion_tokens: 384
                      total_tokens: 3213
                    metadata:
                      attempts:
                        - model: openai/gpt-4o
                          success: true
                          time_to_first_token_ms: 4645
                      generation_id: 896d6bb3-0e1b-4fc5-b9fd-3340e60e9033
                      reasoning: 'Using specified model: ''openai/gpt-4o'' - success'
                      total_duration_ms: 4645
        '400':
          description: Bad request
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
        '401':
          description: Unauthorized - Invalid or missing API key
          content:
            text/plain:
              schema:
                type: string
                example: Unauthorized
      x-codeSamples:
        - lang: bash
          label: Direct model call
          source: |-
            curl --location 'https://api.inworld.ai/v1/chat/completions' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "model": "openai/gpt-4o",
              "messages": [{"role": "user", "content": "Hello!"}]
            }'
        - lang: bash
          label: Auto model selection
          source: |-
            curl --location 'https://api.inworld.ai/v1/chat/completions' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "model": "auto",
              "messages": [{"role": "user", "content": "Hello!"}],
              "extra_body": {
                "models": ["openai/gpt-4o-mini", "google-ai-studio/gemini-2.0-flash"],
                "sort": ["price"]
              }
            }'
        - lang: bash
          label: Using a router
          source: |-
            curl --location 'https://api.inworld.ai/v1/chat/completions' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "model": "inworld/my-router",
              "messages": [{"role": "user", "content": "Hello!"}]
            }'
        - lang: bash
          label: Image generation
          source: |-
            curl --location 'https://api.inworld.ai/v1/chat/completions' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "model": "google-ai-studio/gemini-3.1-flash-image-preview",
              "messages": [{"role": "user", "content": "Generate an image of a cat"}],
              "modalities": ["text", "image"],
              "image_config": {
                "aspect_ratio": "16:9"
              }
            }'
        - lang: bash
          label: Web search (tool-based)
          source: |-
            curl --location 'https://api.inworld.ai/v1/chat/completions' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "model": "openai/gpt-4o",
              "messages": [{"role": "user", "content": "Latest news on Mars exploration"}],
              "web_search": {
                "engine": "exa",
                "max_results": 5,
                "max_steps": 2
              }
            }'
        - lang: python
          label: Direct model call
          source: |-
            import os
            import requests

            url = "https://api.inworld.ai/v1/chat/completions"
            headers = {
                "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "openai/gpt-4o",
                "messages": [{"role": "user", "content": "Hello!"}]
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: python
          label: Auto model selection
          source: |-
            import os
            import requests

            url = "https://api.inworld.ai/v1/chat/completions"
            headers = {
                "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "auto",
                "messages": [{"role": "user", "content": "Hello!"}],
                "extra_body": {
                    "models": ["openai/gpt-4o-mini", "google-ai-studio/gemini-2.0-flash"],
                    "sort": ["price"]
                }
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: python
          label: Using a router
          source: |-
            import os
            import requests

            url = "https://api.inworld.ai/v1/chat/completions"
            headers = {
                "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "inworld/my-router",
                "messages": [{"role": "user", "content": "Hello!"}]
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: python
          label: Image generation
          source: |-
            import os
            import requests

            url = "https://api.inworld.ai/v1/chat/completions"
            headers = {
                "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "google-ai-studio/gemini-3.1-flash-image-preview",
                "messages": [{"role": "user", "content": "Generate an image of a cat"}],
                "modalities": ["text", "image"],
                "image_config": {
                    "aspect_ratio": "16:9"
                }
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: python
          label: Web search (tool-based)
          source: |-
            import os
            import requests

            url = "https://api.inworld.ai/v1/chat/completions"
            headers = {
                "Authorization": f"Basic {os.getenv('INWORLD_API_KEY')}",
                "Content-Type": "application/json"
            }
            payload = {
                "model": "openai/gpt-4o",
                "messages": [{"role": "user", "content": "Latest news on Mars exploration"}],
                "web_search": {
                    "engine": "exa",
                    "max_results": 5,
                    "max_steps": 2
                }
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: javascript
          label: Direct model call
          source: |-
            const url = 'https://api.inworld.ai/v1/chat/completions';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                model: 'openai/gpt-4o',
                messages: [{ role: 'user', content: 'Hello!' }],
              }),
            });

            const data = await response.json();
            console.log(data);
        - lang: javascript
          label: Auto model selection
          source: |-
            const url = 'https://api.inworld.ai/v1/chat/completions';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                model: 'auto',
                messages: [{ role: 'user', content: 'Hello!' }],
                extra_body: {
                  models: ['openai/gpt-4o-mini', 'google-ai-studio/gemini-2.0-flash'],
                  sort: ['price'],
                },
              }),
            });

            const data = await response.json();
            console.log(data);
        - lang: javascript
          label: Using a router
          source: |-
            const url = 'https://api.inworld.ai/v1/chat/completions';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                model: 'inworld/my-router',
                messages: [{ role: 'user', content: 'Hello!' }],
              }),
            });

            const data = await response.json();
            console.log(data);
        - lang: javascript
          label: Image generation
          source: |-
            const url = 'https://api.inworld.ai/v1/chat/completions';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                model: 'google-ai-studio/gemini-3.1-flash-image-preview',
                messages: [{ role: 'user', content: 'Generate an image of a cat' }],
                modalities: ['text', 'image'],
                image_config: {
                  aspect_ratio: '16:9',
                },
              }),
            });

            const data = await response.json();
            console.log(data);
        - lang: javascript
          label: Web search (tool-based)
          source: |-
            const url = 'https://api.inworld.ai/v1/chat/completions';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': `Basic ${process.env.INWORLD_API_KEY}`,
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                model: 'openai/gpt-4o',
                messages: [{ role: 'user', content: 'Latest news on Mars exploration' }],
                web_search: {
                  engine: 'exa',
                  max_results: 5,
                  max_steps: 2,
                },
              }),
            });

            const data = await response.json();
            console.log(data);
components:
  schemas:
    ChatCompletionRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: >-
            The model to use, which can be:

            - A model id (e.g., `gpt-oss-120b`). The best provider is
            [automatically
            selected](/router/usage/specific-model#provider-routing) by latency,
            or you can control provider selection via `extra_body.provider`. See
            [Models](/api-reference/modelsAPI/modelservice/list-models) for
            available models.

            - A provider-prefixed model id (e.g., `openai/gpt-5`). This
            specifies the provider and model to use.

            - `auto` for automatic model selection based on criteria like price,
            latency, or intelligence.

            - A router, specified as `inworld/<router-name>`. The router name
            must be prefixed with `inworld/`.
        messages:
          type: array
          description: >-
            A list of messages comprising the conversation so far.


            If using a router where a prompt is specified, these messages will
            be appended to the prompt.
          items:
            $ref: '#/components/schemas/Message'
        stream:
          type: boolean
          description: If true, partial message deltas will be sent as server-sent events.
          default: false
        temperature:
          type: number
          description: >-
            Sampling temperature between 0 and 2. Higher values make output more
            random.
          minimum: 0
          maximum: 2
          default: 1
        top_p:
          type: number
          description: Nucleus sampling parameter. Must be greater than 0 and at most 1.
          minimum: 0
          exclusiveMinimum: true
          maximum: 1
        max_tokens:
          type: integer
          description: Maximum number of tokens to generate.
          minimum: 1
        max_completion_tokens:
          type: integer
          description: Maximum number of completion tokens to generate.
          minimum: 1
        presence_penalty:
          type: number
          description: Penalizes tokens based on presence in the text.
          minimum: -2
          maximum: 2
          default: 0
        frequency_penalty:
          type: number
          description: Penalizes tokens based on frequency in the text.
          minimum: -2
          maximum: 2
          default: 0
        seed:
          type: integer
          description: Random seed for generation.
          format: int32
        stop:
          type: array
          description: Up to 4 sequences where the API will stop generating.
          items:
            type: string
        logit_bias:
          type: array
          items:
            $ref: '#/components/schemas/LogitBias'
          description: >-
            Modifies the likelihood of specified tokens appearing in the
            completion.
        reasoning_effort:
          type: string
          enum:
            - none
            - minimal
            - low
            - medium
            - high
            - xhigh
          description: >-
            Controls the amount of reasoning effort the model uses. Note: This
            parameter is provider/model-specific and may not be supported by all
            models (e.g., OpenAI models do not support this parameter). This
            will be overridden if `extra_body.reasoning` is specified.
        user:
          type: string
          description: >-
            A unique identifier for the end user. When used with a router, the
            same user will consistently receive the same variant across requests
            (sticky routing).
        web_search:
          $ref: '#/components/schemas/WebSearchConfig'
          description: >-
            Tool-based web search configuration. The LLM calls a search engine
            in a tool-calling loop, then synthesizes a grounded answer with
            `url_citation` annotations. Works with any LLM that supports tool
            calling. Mutually exclusive with `web_search_options`. See [Web
            Search](/router/capabilities/web-search) for details.
        web_search_options:
          $ref: '#/components/schemas/WebSearchOptions'
          description: >-
            Native web search using the provider's built-in search grounding (no
            tool loop). Supported by OpenAI (search models only), Anthropic,
            Google / Vertex AI, and Groq. Mutually exclusive with `web_search`.
            See [Web Search](/router/capabilities/web-search) for details.
        modalities:
          type: array
          description: >-
            Output modalities to generate. Defaults to `["text"]`. Include
            `"image"` to request image generation (e.g., `["text", "image"]`).
            Currently supported for OpenAI and Google image models.
          items:
            type: string
            enum:
              - text
              - image
          default:
            - text
        image_config:
          $ref: '#/components/schemas/ImageConfig'
          description: >-
            Configuration for image output. Optional when requesting image
            output via `modalities: ["image"]`.
        extra_body:
          $ref: '#/components/schemas/ExtraBody'
    ChatCompletionResponse:
      type: object
      properties:
        id:
          type: string
          description: Unique identifier for the chat completion.
        object:
          type: string
          description: Object type, always 'chat.completion'.
        created:
          type: integer
          description: Unix timestamp when the completion was created.
        model:
          type: string
          description: The model that was actually used.
        choices:
          type: array
          description: List of chat completion choices.
          items:
            $ref: '#/components/schemas/Choice'
        usage:
          type: object
          description: Token usage statistics.
          properties:
            prompt_tokens:
              type: integer
              description: Tokens in the prompt.
            completion_tokens:
              type: integer
              description: Tokens in the completion.
            total_tokens:
              type: integer
              description: Total tokens used.
        metadata:
          $ref: '#/components/schemas/Metadata'
    ErrorResponse:
      type: object
      description: >-
        Error response format varies: `error` is either a string or an object
        with a `message` field.
      properties:
        error:
          oneOf:
            - type: string
              description: Error message string.
              example: model field is required
            - type: object
              properties:
                message:
                  type: string
                  example: 'Invalid service provider: SERVICE_PROVIDER_INVALID'
    Message:
      type: object
      required:
        - role
        - content
      properties:
        role:
          type: string
          enum:
            - system
            - user
            - assistant
        content:
          type: string
          description: The content of the message.
    LogitBias:
      type: object
      description: Modifies the likelihood of specified tokens appearing in the completion.
      properties:
        token_id:
          type: string
          description: Token ID to apply bias to.
        bias_value:
          type: integer
          format: int32
          description: Bias value to apply to the token.
      required:
        - token_id
        - bias_value
    WebSearchConfig:
      type: object
      description: >-
        Tool-based web search configuration, set either at the top level of the
        request or under `extra_body`. The LLM calls a search engine in a
        tool-calling loop, then synthesizes a grounded answer. See [Web
        search](/router/capabilities/web-search).
      properties:
        engine:
          type: string
          enum:
            - exa
            - google
          default: exa
          description: Search backend. Valid values are `exa` and `google`.
        max_results:
          type: integer
          description: Search results per search call.
          default: 3
          minimum: 1
        max_steps:
          type: integer
          description: Maximum search/refine rounds.
          default: 1
          minimum: 1
    WebSearchOptions:
      type: object
      description: >-
        OpenAI-compatible web search options on the request body. Passed through
        to providers that support native web search. See [Web
        search](/router/capabilities/web-search).
      properties:
        search_context_size:
          type: string
          enum:
            - low
            - medium
            - high
          default: medium
          description: How much web context to retrieve.
        user_location:
          type: object
          description: Approximate user location for search relevance.
          properties:
            type:
              type: string
              enum:
                - approximate
            country:
              type: string
            city:
              type: string
    ImageConfig:
      type: object
      description: >-
        Configuration for image output in chat completions. Optional when
        requesting image output via `modalities: ["image"]`.
      properties:
        aspect_ratio:
          type: string
          description: >-
            Aspect ratio for the generated image (e.g., `1:1`, `16:9`, `9:16`).
            Supported by Google models only.
        image_size:
          type: string
          description: |-
            Size of the generated image.
            - Google: model-specific sizes such as `1K` or `2K`.
            - OpenAI: pixel dimensions as WxH (e.g., `1024x1024`).
        partial_images:
          type: integer
          description: >-
            Number of partial/progressive image previews during streaming. Only
            used with `stream: true`. Defaults to 1 if unset. Supported by
            OpenAI models only.
          format: int32
        'n':
          type: integer
          description: |-
            Number of images to generate.
            - Google: only 1 is supported.
            - OpenAI: 1 to 10 are supported.
          format: int32
          minimum: 1
    ExtraBody:
      type: object
      description: Optional parameters for model routing and optimization.
      properties:
        models:
          type: array
          description: Model identifiers to use as fallbacks or as the candidate pool for auto selection.
          items:
            type: string
        ignore:
          type: array
          description: Providers or models to exclude.
          items:
            type: string
        sort:
          type: array
          description: >-
            The sorting strategy to use for this request. Available sorting
            strategies: `price`, `latency`, `throughput`, `intelligence`,
            `math`, `coding`.
          items:
            type: string
        reasoning:
          $ref: '#/components/schemas/ReasoningConfig'
          description: >-
            Reasoning configuration. If specified, this will override the
            `reasoning_effort` parameter in the request body. Note: This
            parameter is provider/model-specific and may not be supported by all
            models. Unsupported parameters may return errors or be silently
            ignored depending on the provider.
        provider:
          $ref: '#/components/schemas/ProviderConfig'
          description: >-
            Provider routing configuration. Use when `model` is specified
            without a provider prefix (e.g., `gpt-oss-120b`) to control which
            providers are tried and in what order.
        prompt_variables:
          type: object
          description: >-
            Variables for substitution in prompt templates, for example
            `{"name": "John", "topic": "AI"}`. These variables are substituted
            only in prompts specified in a
            [router](/api-reference/routerAPI/routerservice/create-router), not
            in messages sent with each request.
          additionalProperties:
            type: string
        web_search:
          $ref: '#/components/schemas/WebSearchConfig'
          description: >-
            For OpenAI SDK compatibility, pass `web_search` via `extra_body`
            (equivalent to setting it at the top level). Tool-based web search
            configuration. Mutually exclusive with `web_search_options`. See
            [Web Search](/router/capabilities/web-search) for details.
        web_search_options:
          $ref: '#/components/schemas/WebSearchOptions'
          description: >-
            Native web search grounding. Mutually exclusive with `web_search`.
            When using the OpenAI SDK, pass `web_search_options` via
            `extra_body`, which is equivalent to setting it at the top level.
            See [Web Search](/router/capabilities/web-search) for details.
    Choice:
      type: object
      properties:
        index:
          type: integer
        message:
          type: object
          properties:
            role:
              type: string
              description: Always `assistant` for responses.
            content:
              type: string
              nullable: true
              description: The generated content. Null when `tool_calls` is present.
            tool_calls:
              type: array
              description: Tool calls generated by the model (when using tools).
              items:
                type: object
                properties:
                  id:
                    type: string
                  type:
                    type: string
                  function:
                    type: object
                    properties:
                      name:
                        type: string
                      arguments:
                        type: string
        finish_reason:
          type: string
          description: 'Reason for stopping: stop, length, or tool_call.'
    Metadata:
      type: object
      description: Routing metadata providing transparency into model selection decisions.
      properties:
        attempts:
          type: array
          description: >-
            List of model attempts, including both successful and failed
            attempts.
          items:
            $ref: '#/components/schemas/Attempt'
        generation_id:
          type: string
          description: Unique identifier for tracing this request in the Inworld Portal.
        reasoning:
          type: string
          description: >-
            Human-readable explanation of why a model was selected based on the
            routing strategy.
        total_duration_ms:
          type: integer
          description: Total request duration in milliseconds.
    ReasoningConfig:
      type: object
      description: >-
        Reasoning configuration for models that support chain-of-thought
        reasoning. Provides a unified interface across different providers
        (OpenAI, Anthropic, Google, Groq, etc.).
      properties:
        effort:
          type: string
          enum:
            - unspecified
            - none
            - minimal
            - low
            - medium
            - high
            - xhigh
          description: >-
            Controls the reasoning effort level. The server defaults to
            `medium` if `effort` is not specified. `none` disables reasoning
            entirely. As an approximate share of max completion tokens,
            `minimal` uses ~10%, `low` ~20%, `medium` ~50%, `high` ~80%, and
            `xhigh` ~95%.
        max_tokens:
          type: integer
          format: int32
          description: >-
            Maximum number of tokens to use for reasoning (an
            Anthropic/Google-style control). Takes precedence over `effort`
            when specified. For providers that only support effort levels,
            this value is converted to the appropriate level.
        exclude:
          type: boolean
          description: >-
            Whether to exclude reasoning tokens from the response. When true,
            the model still uses reasoning internally but doesn't return it.
            Default is false (reasoning is included in response if available).
    ProviderConfig:
      type: object
      description: >-
        Configuration for provider-level routing when using a model without a
        provider prefix.
      properties:
        order:
          type: array
          items:
            type: string
          description: >-
            Explicit list of providers to try, in order. Example: ["groq",
            "fireworks"]. When specified, providers are tried in this exact
            order (`sort` criteria will be ignored).
        allow_fallbacks:
          type: boolean
          description: >-
            Whether to allow falling back to the next provider if the current
            one fails. Defaults to true.
          default: true
    Attempt:
      type: object
      description: Details of a model attempt during routing.
      properties:
        model:
          type: string
          description: The model identifier that was attempted.
        success:
          type: boolean
          description: Whether this attempt succeeded.
        time_to_first_token_ms:
          type: integer
          description: Time to receive the first token in milliseconds.
  securitySchemes:
    inworld_basic:
      type: apiKey
      in: header
      name: Authorization
      description: >-
        Your [authentication](../../../api-reference/introduction) credentials.
        For Basic authentication, set the `Authorization` header to
        `Basic $INWORLD_API_KEY`.


        Make sure your API key has [write permissions for the Router
        API](/api-reference/introduction#getting-an-api-key) in order to create,
        update, and delete routers.

````
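
The optional request fields described in the schema above (`reasoning`, `provider`, `prompt_variables`, `web_search`, `web_search_options`) can be combined in a single request body. The following is a minimal Python sketch, not an official client: the helper names `build_chat_request` and `summarize_attempts` are hypothetical, the sample metadata is made up for illustration, and placing the optional fields at the top level of the JSON body is an assumption based on the `extra_body` note in the schema. It only assembles and checks the payload locally; no request is sent.

```python
import json

# Effort values copied from the ReasoningConfig enum in the schema.
ALLOWED_EFFORTS = {"unspecified", "none", "minimal", "low", "medium", "high", "xhigh"}


def build_chat_request(model, messages, *, reasoning=None, provider=None,
                       prompt_variables=None, web_search=None,
                       web_search_options=None):
    """Assemble a chat completion JSON body (hypothetical helper).

    Assumption: optional fields sit at the top level of the request body.
    """
    # The schema states these two options are mutually exclusive.
    if web_search is not None and web_search_options is not None:
        raise ValueError("web_search and web_search_options are mutually exclusive")

    # Reject effort values that are not in the documented enum.
    if reasoning is not None:
        effort = reasoning.get("effort")
        if effort is not None and effort not in ALLOWED_EFFORTS:
            raise ValueError(f"unknown reasoning effort: {effort!r}")

    # prompt_variables is declared as an object with string values.
    if prompt_variables is not None:
        for key, value in prompt_variables.items():
            if not isinstance(value, str):
                raise ValueError(f"prompt variable {key!r} must be a string")

    body = {"model": model, "messages": messages}
    optional = {
        "reasoning": reasoning,
        "provider": provider,
        "prompt_variables": prompt_variables,
        "web_search": web_search,
        "web_search_options": web_search_options,
    }
    # Only include the optional fields the caller actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


def summarize_attempts(metadata):
    """Return (model, success) pairs from a Metadata object (hypothetical helper)."""
    return [(a.get("model"), a.get("success")) for a in metadata.get("attempts", [])]


body = build_chat_request(
    "gpt-oss-120b",  # model without a provider prefix, so `provider` applies
    [{"role": "user", "content": "Hello"}],
    reasoning={"effort": "low", "exclude": True},
    provider={"order": ["groq", "fireworks"], "allow_fallbacks": True},
)

# Header shape taken from the security scheme; substitute your own key.
headers = {
    "Authorization": "Basic <INWORLD_API_KEY>",
    "Content-Type": "application/json",
}

print(json.dumps(body, indent=2))

# Made-up Metadata object, shaped like the Metadata and Attempt schemas.
sample_metadata = {
    "attempts": [
        {"model": "model-a", "success": False},
        {"model": "model-b", "success": True, "time_to_first_token_ms": 120},
    ],
    "total_duration_ms": 900,
}
print(summarize_attempts(sample_metadata))
```

Checking the mutual-exclusion and enum constraints on the client side is a design choice, not something the API requires. The schema notes that unsupported parameters may return errors or be silently ignored depending on the provider, so a local check can surface some mistakes before a request is sent.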