# Clone a voice

> Clone a voice from audio samples.

<Tip>
  **Short URL Path:** `/workspaces/{workspace}` is no longer required in the path. When it is omitted, the workspace is derived from your API key. The previous full path, `/voices/v1/workspaces/{workspace}/voices:clone`, continues to be supported.
</Tip>
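
As a minimal sketch of the two path forms described in the tip, the helper below builds either URL. The function name `clone_voice_url` and the `workspace` argument are illustrative assumptions, not part of the API; the host and paths are the ones given on this page.

```python
from typing import Optional
from urllib.parse import quote

BASE_URL = "https://api.inworld.ai"  # server URL from the OpenAPI spec on this page


def clone_voice_url(workspace: Optional[str] = None) -> str:
    """Return the clone endpoint URL (hypothetical helper).

    With no workspace, the short path is used and the workspace is
    derived from the API key. With a workspace, the previous full path
    is used, which remains supported.
    """
    if workspace is None:
        return f"{BASE_URL}/voices/v1/voices:clone"
    # Percent-encode the workspace ID so it is safe as a single path segment.
    return f"{BASE_URL}/voices/v1/workspaces/{quote(workspace, safe='')}/voices:clone"


print(clone_voice_url())
print(clone_voice_url("your_workspace_id"))
```

Both URLs accept the same request body, so the short form is usually the simpler choice for new code.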

<RequestField body="langCode" type="string">
  * LANGUAGE\_CODE\_UNSPECIFIED: Unspecified.
  * EN\_US: English - US.
  * ZH\_CN: Chinese - China.
  * KO\_KR: Korean - Korea.
  * JA\_JP: Japanese - Japan.
  * RU\_RU: Russian - Russia.
  * AUTO: Auto-detect language (an additional option supported for voice cloning).
  * IT\_IT: Italian - Italy.
  * ES\_ES: Spanish - Spain.
  * PT\_BR: Portuguese - Brazil.
  * DE\_DE: German - Germany.
  * FR\_FR: French - France.
  * AR\_SA: Arabic - Saudi Arabia.
  * PL\_PL: Polish - Poland.
  * NL\_NL: Dutch - Netherlands.
  * HI\_IN: Hindi - India.
  * HE\_IL: Hebrew - Israel.
</RequestField>
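
Because `langCode` must be one of the enum values listed above, it can help to validate it client-side before sending a request. The sketch below is an assumption about how a client might do this: `to_lang_code` is a hypothetical helper that converts a tag such as `en-US` to the upper-snake form and rejects anything outside the list. `LANGUAGE_CODE_UNSPECIFIED` is deliberately left out, since the OpenAPI enum on this page omits it.

```python
# Values copied from the langCode enum on this page.
SUPPORTED_LANG_CODES = frozenset({
    "EN_US", "ZH_CN", "KO_KR", "JA_JP", "RU_RU", "AUTO", "IT_IT", "ES_ES",
    "PT_BR", "DE_DE", "FR_FR", "AR_SA", "PL_PL", "NL_NL", "HI_IN", "HE_IL",
})


def to_lang_code(tag: str) -> str:
    """Normalize a tag like 'en-US' or 'pt_br' to upper-snake form.

    Hypothetical client-side convenience; the clone endpoint itself is
    only documented to accept the enum values.
    """
    candidate = tag.strip().replace("-", "_").upper()
    if candidate not in SUPPORTED_LANG_CODES:
        raise ValueError(f"Unsupported langCode: {tag!r}")
    return candidate


print(to_lang_code("en-US"))
print(to_lang_code("auto"))
```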


## OpenAPI

````yaml post /voices/v1/voices:clone
openapi: 3.0.0
info:
  title: Inworld Text-to-Speech API
  version: v1
  contact:
    name: Inworld AI
    url: https://inworld.ai
    email: support@inworld.ai
servers:
  - url: https://api.inworld.ai
security:
  - inworld_basic: []
tags:
  - name: VoiceService
paths:
  /voices/v1/voices:clone:
    post:
      tags:
        - VoiceService
      summary: Clone a voice
      description: Clone a voice from audio samples.
      operationId: VoiceService_CloneVoice
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/VoiceServiceCloneVoiceBody'
            examples:
              example:
                summary: Clone voice request example
                description: >-
                  Replace `voiceSamples[0].audioData` with real base64-encoded
                  WAV/MP3 bytes.
                value:
                  displayName: my_voice_clone_demo
                  langCode: EN_US
                  voiceSamples:
                    - audioData: <base64-audio-data>
                      transcription: Hello! This is a short voice sample for cloning.
                  description: Voice clone created from provided WAV sample.
                  tags:
                    - demo
                    - clone
                  audioProcessingConfig:
                    removeBackgroundNoise: true
        required: true
      responses:
        '200':
          description: A successful response.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/v1CloneVoiceResponse'
              examples:
                success:
                  summary: Successful voice clone
                  value:
                    audioSamplesValidated:
                      - audioData: <base64-wav-bytes>
                        errors: []
                        langCode: EN_US
                        transcription: Hello! This is a short voice sample for cloning.
                        warnings: []
                    voice:
                      voiceId: your_workspace_id__my_voice_clone_demo_20260218_223134z
                      langCode: EN_US
                      displayName: my_voice_clone_demo
                      description: Voice clone created from provided WAV sample.
                      tags:
                        - demo
                        - clone
                      name: >-
                        workspaces/your_workspace_id/voices/my_voice_clone_demo_20260218_223134z
                      source: IVC
        default:
          description: An unexpected error response.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/rpcStatus'
              examples:
                insufficientAccessLevel:
                  summary: API key has insufficient access level (read-only key)
                  value:
                    code: 7
                    message: >-
                      API key has insufficient access level for 'voices' in
                      workspace 'your_workspace_id'. Current access level in API
                      key: read (r). Required access level: write (rw).
                    details: []
      x-codeSamples:
        - lang: bash
          label: cURL
          source: |-
            curl --location 'https://api.inworld.ai/voices/v1/voices:clone' \
            --header "Authorization: Basic $INWORLD_API_KEY" \
            --header 'Content-Type: application/json' \
            --data '{
              "displayName": "my_voice_clone_demo",
              "langCode": "EN_US",
              "voiceSamples": [
                {
                  "audioData": "<base64-audio-data>",
                  "transcription": "Hello! This is a short voice sample for cloning."
                }
              ],
              "description": "Voice clone created from provided WAV sample.",
              "tags": ["demo", "clone"],
              "audioProcessingConfig": {
                "removeBackgroundNoise": true
              }
            }'
        - lang: python
          label: Python
          source: |-
            import requests

            url = "https://api.inworld.ai/voices/v1/voices:clone"
            headers = {
                "Authorization": "Basic <api-key>",
                "Content-Type": "application/json"
            }
            payload = {
                "displayName": "my_voice_clone_demo",
                "langCode": "EN_US",
                "voiceSamples": [
                    {
                        "audioData": "<base64-audio-data>",
                        "transcription": "Hello! This is a short voice sample for cloning."
                    }
                ],
                "description": "Voice clone created from provided WAV sample.",
                "tags": ["demo", "clone"],
                "audioProcessingConfig": {
                    "removeBackgroundNoise": True
                }
            }

            response = requests.post(url, json=payload, headers=headers)
            print(response.json())
        - lang: javascript
          label: JavaScript
          source: |-
            const url = 'https://api.inworld.ai/voices/v1/voices:clone';

            const response = await fetch(url, {
              method: 'POST',
              headers: {
                'Authorization': 'Basic <api-key>',
                'Content-Type': 'application/json',
              },
              body: JSON.stringify({
                displayName: 'my_voice_clone_demo',
                langCode: 'EN_US',
                voiceSamples: [
                  {
                    audioData: '<base64-audio-data>',
                    transcription: 'Hello! This is a short voice sample for cloning.',
                  },
                ],
                description: 'Voice clone created from provided WAV sample.',
                tags: ['demo', 'clone'],
                audioProcessingConfig: {
                  removeBackgroundNoise: true,
                },
              }),
            });

            const data = await response.json();
            console.log(data);
components:
  schemas:
    VoiceServiceCloneVoiceBody:
      type: object
      properties:
        displayName:
          type: string
          description: >-
            The human-readable name shown anywhere the voice is listed or
            selected. Keep it short and distinctive so users can find it easily.
        langCode:
          $ref: '#/components/schemas/inworldttslanguage_codesLanguageCode'
        voiceSamples:
          type: array
          items:
            $ref: '#/components/schemas/v1VoiceSample'
          description: >-
            Voice samples used for cloning. For best results, provide clear
            audio and avoid speaking in multiple languages, whispering, or
            making non-verbal sounds like coughing. Instant voice cloning works
            best with a 10-15 second audio clip; longer clips are cut off at
            15 seconds, which can affect quality. See [Voice Cloning Best
            Practices](https://docs.inworld.ai/tts/best-practices/voice-cloning)
            for guidance on how to generate a high-quality voice clone.
        description:
          type: string
          description: >-
            Longer blurb that explains the voice's tone, accent, use cases, or
            other relevant attributes. Helpful for search and selection.
        tags:
          type: array
          items:
            type: string
          description: >-
            Flat list of labels used for filtering, grouping and discovery.
            Examples could include gender, age or use case.
        audioProcessingConfig:
          $ref: '#/components/schemas/v1AudioProcessingConfig'
      description: Request message for CloneVoice custom method.
      required:
        - displayName
        - langCode
        - voiceSamples
    v1CloneVoiceResponse:
      type: object
      properties:
        voice:
          $ref: '#/components/schemas/inworldvoicev1Voice'
        audioSamplesValidated:
          type: array
          items:
            $ref: '#/components/schemas/v1AudioSampleValidated'
          description: The list of validated samples.
      description: Response message for CloneVoice custom method.
    rpcStatus:
      type: object
      properties:
        code:
          type: integer
          format: int32
          description: >-
            The status code, which should be an enum value of
            [google.rpc.Code][google.rpc.Code].
        message:
          type: string
          description: >-
            A developer-facing error message, which should be in English. Any
            user-facing error message should be localized and sent in the
            [google.rpc.Status.details][google.rpc.Status.details] field, or
            localized by the client.
        details:
          type: array
          items:
            $ref: '#/components/schemas/protobufAny'
          description: >-
            A list of messages that carry the error details. There is a common
            set of message types for APIs to use.
      description: >-
        The `Status` type defines a logical error model that is suitable for
        different programming environments, including REST APIs and RPC APIs.
        It is used by [gRPC](https://github.com/grpc). Each `Status` message
        contains three pieces of data: error code, error message, and error
        details.


        You can find out more about this error model and how to work with it in
        the [API Design Guide](https://cloud.google.com/apis/design/errors).
    inworldttslanguage_codesLanguageCode:
      type: string
      enum:
        - EN_US
        - ZH_CN
        - KO_KR
        - JA_JP
        - RU_RU
        - AUTO
        - IT_IT
        - ES_ES
        - PT_BR
        - DE_DE
        - FR_FR
        - AR_SA
        - PL_PL
        - NL_NL
        - HI_IN
        - HE_IL
      title: >-
        An extended language list for the voice clone feature. The structure is
        the same as the common LanguageCode. It is defined in a dedicated file
        so that other features that use the common language definition are not
        affected.
    v1VoiceSample:
      type: object
      properties:
        audioData:
          type: string
          format: byte
          description: >-
            Binary audio data for the sample (base64-encoded in JSON). Supports
            WAV and MP3 formats.
        transcription:
          type: string
          description: >-
            Optional user-provided transcription of the audio sample. If one is
            not provided, the transcription will be generated automatically.
      description: >-
        A single voice sample consisting of audio data and an optional
        transcription.
      required:
        - audioData
    v1AudioProcessingConfig:
      type: object
      properties:
        removeBackgroundNoise:
          type: boolean
          description: >-
            Whether to remove background noise from the samples. If true, an
            audio isolation model will be used to clean the samples. Note: This
            can degrade quality if samples are already clean.
      description: Audio processing config for voice cloning.
    inworldvoicev1Voice:
      type: object
      properties:
        voiceId:
          type: string
          description: >-
            Voice ID. SYSTEM voices use a simple name (e.g. `Alex`); IVC voices
            are workspace-prefixed (`{workspace}__{voice}`).
          readOnly: true
        langCode:
          $ref: '#/components/schemas/inworldttslanguage_codesLanguageCode'
          description: >-
            Primary language of the voice in upper-snake format (e.g. `EN_US`).
            Note that when filtering via `lang_code`, you can pass BCP-47
            (`en-US`), underscore form (`en_US`), or a language prefix (`en`) —
            but the response always returns upper-snake.
        displayName:
          type: string
          description: >-
            The human-readable name shown anywhere the voice is listed or
            selected.
        description:
          type: string
          description: >-
            Longer blurb that explains the voice's tone, accent, use cases, or
            other relevant attributes.
        tags:
          type: array
          items:
            type: string
          description: >-
            Free-form labels for filtering, grouping, and discovery (e.g.
            `british`, `calm`).
        name:
          type: string
          description: 'Resource name. Format: `workspaces/{workspace}/voices/{voice}`.'
        source:
          type: string
          enum:
            - SYSTEM
            - IVC
            - PVC
          description: >-
            Origin of the voice:


            - `SYSTEM`: Built-in voice provided by Inworld, visible to all
            workspaces.

            - `IVC`: Voice cloned from audio or created via Voice Design — owned
            by your workspace only.

            - `PVC`: Professional Voice Clone.
        gender:
          type: string
          enum:
            - male
            - female
            - neutral
            - ''
          description: >-
            Voice gender (`male`, `female`, `neutral`). Empty string if
            unspecified. Voices with no gender are excluded when filtering with
            an explicit `gender =` predicate.
        ageGroup:
          type: string
          enum:
            - young
            - middle_aged
            - elderly
            - ''
          description: >-
            Age group of the voice (`young`, `middle_aged`, `elderly`). Empty
            string if unspecified.
        categories:
          type: array
          items:
            type: string
            enum:
              - companions
              - enterprise
              - education_training
              - developer_assistants
              - healthcare
              - interactive_media
          description: >-
            Use-case categories the voice belongs to. Filterable with the `:`
            (has) operator.


            Supported values: `companions`, `enterprise`, `education_training`,
            `developer_assistants`, `healthcare`, `interactive_media`.
        promptLanguages:
          type: array
          items:
            type: string
          description: >-
            Languages the voice can handle, in BCP-47 format (e.g. `en-US`). May
            differ from `langCode` for multilingual voices.
      description: Voice resource representing a voice configuration.
      example:
        name: >-
          workspaces/your_workspace_id/voices/my_voice_clone_demo_20260218_223134z
        voiceId: your_workspace_id__my_voice_clone_demo_20260218_223134z
        langCode: EN_US
        displayName: John
        description: Cloned voice for narrations.
        tags:
          - demo
          - clone
        categories: []
        source: IVC
        gender: ''
        ageGroup: ''
        promptLanguages:
          - en-US
    v1AudioSampleValidated:
      type: object
      properties:
        langCode:
          $ref: '#/components/schemas/inworldttslanguage_codesLanguageCode'
        warnings:
          type: array
          items:
            $ref: '#/components/schemas/AudioSampleValidatedWarning'
          description: The list of detected warnings for this sample.
        errors:
          type: array
          items:
            $ref: '#/components/schemas/AudioSampleValidatedError'
          description: The list of detected errors for this sample.
        transcription:
          type: string
          description: Transcription of the processed audio sample.
        audioData:
          type: string
          format: byte
          description: >-
            The processed audio data (base64-encoded). This is the audio after
            processing (e.g., background noise removal if enabled), not the
            originally uploaded audio.
      description: The details about sample validation.
    protobufAny:
      type: object
      properties:
        '@type':
          type: string
          description: >-
            A URL/resource name that uniquely identifies the type of the
            serialized protocol buffer message. This string must contain at
            least one "/" character. The last segment of the URL's path must
            represent the fully qualified name of the type (as in
            `path/google.protobuf.Duration`). The name should be in a canonical
            form (e.g., leading "." is not accepted).


            In practice, teams usually precompile into the binary all types that
            they expect it to use in the context of Any. However, for URLs which
            use the scheme `http`, `https`, or no scheme, one can optionally set
            up a type server that maps type URLs to message definitions as
            follows:


            * If no scheme is provided, `https` is assumed.

            * An HTTP GET on the URL must yield a [google.protobuf.Type][]
              value in binary format, or produce an error.
            * Applications are allowed to cache lookup results based on the
              URL, or have them precompiled into a binary to avoid any
              lookup. Therefore, binary compatibility needs to be preserved
              on changes to types. (Use versioned type names to manage
              breaking changes.)

            Note: this functionality is not currently available in the official
            protobuf release, and it is not used for type URLs beginning with
            type.googleapis.com. As of May 2023, there are no widely used type
            server implementations and no plans to implement one.


            Schemes other than `http`, `https` (or the empty scheme) might be
            used with implementation specific semantics.
      additionalProperties: {}
      description: >-
        `Any` contains an arbitrary serialized protocol buffer message along
        with a URL that describes the type of the serialized message.


        Protobuf library provides support to pack/unpack Any values in the form
        of utility functions or additional generated methods of the Any type.


        Example 1: Pack and unpack a message in C++.

            Foo foo = ...;
            Any any;
            any.PackFrom(foo);
            ...
            if (any.UnpackTo(&foo)) {
              ...
            }

        Example 2: Pack and unpack a message in Java.

            Foo foo = ...;
            Any any = Any.pack(foo);
            ...
            if (any.is(Foo.class)) {
              foo = any.unpack(Foo.class);
            }
            // or ...
            if (any.isSameTypeAs(Foo.getDefaultInstance())) {
              foo = any.unpack(Foo.getDefaultInstance());
            }

        Example 3: Pack and unpack a message in Python.

            foo = Foo(...)
            any = Any()
            any.Pack(foo)
            ...
            if any.Is(Foo.DESCRIPTOR):
              any.Unpack(foo)
              ...

        Example 4: Pack and unpack a message in Go.

             foo := &pb.Foo{...}
             any, err := anypb.New(foo)
             if err != nil {
               ...
             }
             ...
             foo := &pb.Foo{}
             if err := any.UnmarshalTo(foo); err != nil {
               ...
             }

        The pack methods provided by protobuf library will by default use
        'type.googleapis.com/full.type.name' as the type URL and the unpack
        methods only use the fully qualified type name after the last '/'
        in the type URL, for example "foo.bar.com/x/y.z" will yield type
        name "y.z".


        JSON

        ====

        The JSON representation of an `Any` value uses the regular
        representation of the deserialized, embedded message, with an
        additional field `@type` which contains the type URL. Example:

            package google.profile;
            message Person {
              string first_name = 1;
              string last_name = 2;
            }

            {
              "@type": "type.googleapis.com/google.profile.Person",
              "firstName": <string>,
              "lastName": <string>
            }

        If the embedded message type is well-known and has a custom JSON
        representation, that representation will be embedded adding a field
        `value` which holds the custom JSON in addition to the `@type`
        field. Example (for message [google.protobuf.Duration][]):

            {
              "@type": "type.googleapis.com/google.protobuf.Duration",
              "value": "1.212s"
            }
    AudioSampleValidatedWarning:
      type: object
      properties:
        text:
          type: string
          description: The warning message.
      description: >-
        Warning detected during audio sample validation. It informs the user
        that the voice cloning result may not be ideal.
    AudioSampleValidatedError:
      type: object
      properties:
        text:
          type: string
          description: The error message.
      description: >-
        Error detected during audio sample validation. It informs the user why
        a sample is not valid and therefore not considered for voice cloning.
  securitySchemes:
    inworld_basic:
      type: apiKey
      in: header
      name: Authorization
      description: >-
        Your [API key](../../../api-reference/introduction). Read permissions
        are required for GET endpoints. Write permissions are required for POST,
        PATCH, and DELETE endpoints.


        For Basic authentication, set the `Authorization` header to `Basic $INWORLD_API_KEY`. You can create a key in one command with the [Inworld CLI](../../../tts/resources/inworld-cli): `inworld workspace add-key`.

````
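
The request and response schemas above can be exercised without a network call. The following is a minimal sketch, assuming a WAV sample held in memory: it checks the clip length against the 15-second cutoff mentioned in the `voiceSamples` description, base64-encodes the bytes into a request body, and flattens the `warnings` and `errors` from `audioSamplesValidated`. The helper names (`wav_duration_seconds`, `build_clone_payload`, `collect_sample_issues`) are hypothetical, and MP3 input is not covered because the standard-library `wave` module only reads WAV.

```python
import base64
import io
import wave

MAX_CLIP_SECONDS = 15.0  # cutoff stated in the voiceSamples description


def wav_duration_seconds(wav_bytes):
    """Return the duration in seconds of a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def build_clone_payload(wav_bytes, display_name, lang_code="EN_US",
                        transcription=None, remove_background_noise=False):
    """Build a JSON-serializable request body for the clone endpoint."""
    # audioData is a `byte` field, so it travels as base64 text in JSON.
    sample = {"audioData": base64.b64encode(wav_bytes).decode("ascii")}
    if transcription:
        # Optional; the docs say it is generated automatically when omitted.
        sample["transcription"] = transcription
    return {
        "displayName": display_name,
        "langCode": lang_code,
        "voiceSamples": [sample],
        "audioProcessingConfig": {
            "removeBackgroundNoise": remove_background_noise,
        },
    }


def collect_sample_issues(response_body):
    """Flatten per-sample warnings and errors into readable strings."""
    issues = []
    samples = response_body.get("audioSamplesValidated", [])
    for index, sample in enumerate(samples):
        for warning in sample.get("warnings", []):
            issues.append(f"sample {index} warning: {warning.get('text', '')}")
        for error in sample.get("errors", []):
            issues.append(f"sample {index} error: {error.get('text', '')}")
    return issues
```

A caller might compare `wav_duration_seconds(data)` with `MAX_CLIP_SECONDS` before uploading, pass the payload to `requests.post(url, json=payload, headers=headers)` as in the Python sample above, and then run `collect_sample_issues(response.json())` to see why a sample was flagged or rejected.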