Voice Cloning

Inworld’s text-to-speech models offer best-in-class voice cloning capabilities, enabling developers to create distinct, personalized voices for their experiences. There are three ways to clone a voice:

Instant Voice Cloning - Clone a voice in minutes, from as little as 3 seconds of audio. The longer the sample (up to the current 15-second limit), the better the speaker similarity. Also known as zero-shot cloning. Available to all users through Portal.
Voice Cloning via API - Instant voice cloning via API. Useful for workflow automation or enabling your users to clone their own voices.
Professional Voice Cloning Beta - Produces a voice clone that’s more similar to the original speaker and more stable than Instant Voice Cloning, using a minimum of 10 minutes of audio. Available through Portal.

Don’t have audio samples? Use Voice Design to create a voice from a text description instead.

You can also clone a voice in one command with the Inworld CLI: inworld tts clone sample.wav --name "My Voice"
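If you have several samples to clone, the CLI command above can be wrapped in a small script. The following is a minimal sketch that only assumes the `inworld tts clone <file> --name <name>` form shown above; the helper names `build_clone_command` and `clone_all` are hypothetical, and the sketch defaults to a dry run so nothing is executed unless you ask for it.

```python
import subprocess
from pathlib import Path


def build_clone_command(sample: Path, name: str) -> list[str]:
    """Return the argv list for `inworld tts clone <sample> --name <name>`."""
    return ["inworld", "tts", "clone", str(sample), "--name", name]


def clone_all(samples: dict[str, Path], dry_run: bool = True) -> list[list[str]]:
    """Build one clone command per voice name and optionally run each one."""
    commands = [build_clone_command(path, name) for name, path in samples.items()]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(cmd, check=True)
    return commands
```

Passing the arguments as a list (rather than a single shell string) means voice names containing spaces, such as "My Voice", need no extra quoting.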

Instant Voice Cloning

Go to Inworld Portal

In Portal, select TTS Playground from the left-hand side panel. In the TTS Playground, click Create Voice and select Clone.

Upload or record audio samples

Name your voice and select the language, which should match the audio samples. Select Other if you’re cloning in an experimental language. You can either upload or record audio:

Upload: Drag and drop or browse to upload 1 audio file. Accepted formats: wav, mp3, webm. Maximum file size is 4MB. Audio samples longer than 15 seconds are automatically trimmed to 15 seconds; we’re working on supporting longer samples.
Record: Click “Record audio” and record your audio. You can use the suggested scripts to help guide your recording, or use your own script. For best results, record in a quiet place to minimize background noise, avoid mic noise, and speak with a variety of emotions to capture the full range of the voice.
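The upload limits above can be checked locally before you open Portal. This is a rough sketch using only the Python standard library, not an official tool: the function names are hypothetical, "4MB" is assumed to mean 4 MiB, the 3-second figure from the overview is treated as a minimum, and duration is only measured for WAV files because the standard library cannot read mp3 or webm.

```python
import wave
from pathlib import Path

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".webm"}
MAX_FILE_BYTES = 4 * 1024 * 1024  # assumption: "4MB" is read as 4 MiB
MIN_SECONDS = 3.0                 # assumption: shortest useful sample
MAX_SECONDS = 15.0                # longer samples are trimmed on upload


def wav_duration_seconds(path: Path) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_sample(path: Path) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    suffix = path.suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
    if path.stat().st_size > MAX_FILE_BYTES:
        problems.append("file is larger than 4MB")
    if suffix == ".wav":
        seconds = wav_duration_seconds(path)
        if seconds < MIN_SECONDS:
            problems.append(f"sample is shorter than {MIN_SECONDS:.0f} seconds")
        elif seconds > MAX_SECONDS:
            problems.append(f"sample will be trimmed to {MAX_SECONDS:.0f} seconds")
    return problems
```

For example, `check_sample(Path("sample.wav"))` returns an empty list when the file passes every check the sketch knows about.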

Enable “Remove background noise” if you wish to remove background noise from your audio. Confirm you have the rights to clone the voice, then click “Continue”.

Check out our Voice Cloning Best Practices for helpful tips and tricks to improve the quality of your voice clones.

Test your cloned voice

Once voice cloning completes, you’ll see the “Try your cloned voice” interface. Enter text in the input field and press play to hear your cloned voice. You can test different phrases to ensure the voice sounds as expected. If the voice doesn’t sound quite right, you can delete the voice and start over, create another voice, or test it in the TTS Playground for more advanced testing options.

The number of cloned voices you can store depends on your subscription plan. See the Pricing page for per-plan limits.

Use your cloned voice via API

To use the cloned voice via API, copy the voice ID for your cloned voice in TTS Playground. Use that value for the voiceId when making an API call. See our Quickstart to learn how to make your first API call.
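As an illustration of where the copied ID goes, the sketch below builds a JSON request body. Only the `voiceId` field name comes from this guide; the `text` field and the helper name `build_tts_request` are assumptions made for the example, and the sketch deliberately stops short of sending anything. Refer to the Quickstart for the actual endpoint, authentication, and full request shape.

```python
import json


def build_tts_request(text: str, voice_id: str) -> str:
    """Serialize a request body that selects the cloned voice by its ID."""
    cleaned_id = voice_id.strip()
    if not cleaned_id:
        raise ValueError("voice_id must be the ID copied from TTS Playground")
    # "voiceId" is the field named in this guide; "text" is an assumed field name.
    return json.dumps({"text": text, "voiceId": cleaned_id})
```

Stripping whitespace guards against a stray space or newline picked up when copying the ID from the browser.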

Instant voice cloning may not perform well for less common voices, such as children’s voices or unique accents. For those use cases, we recommend professional voice cloning.

Professional Voice Cloning Beta

Professional Voice Cloning produces a voice clone that’s more similar to the original speaker and more stable than Instant Voice Cloning, in exchange for more audio and an asynchronous training step instead of an instant result.

Professional Voice Cloning is currently only available through Portal. It is not yet supported via the Voice Cloning API.

Choose Create Voice

In Portal, select TTS Playground from the left-hand side panel. In the TTS Playground, click Create Voice.

Select Professional Clone

On the Create Voice page, select Professional Clone.

Name your voice

Enter a Voice name. Description and tags are optional.

Only English is currently supported for Professional Voice Cloning.

Upload audio samples and start training

Click Add Audio to upload your audio samples.

You can upload multiple audio files, up to a combined limit of 5GB.
Uploaded audio must total at least 10 minutes combined. More clean, high-quality audio generally produces a better clone.
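Before uploading, you may want to confirm that a set of files meets both limits above. The following is a minimal sketch under stated assumptions: the `AudioFile` record and `check_training_set` function are hypothetical, "5GB" is read as 5 GiB, and you supply each file's size and duration yourself (for example, from your audio editor).

```python
from dataclasses import dataclass

MIN_TOTAL_SECONDS = 10 * 60      # at least 10 minutes of audio combined
MAX_TOTAL_BYTES = 5 * 1024**3    # assumption: "5GB" is read as 5 GiB


@dataclass
class AudioFile:
    name: str
    size_bytes: int
    duration_seconds: float


def check_training_set(files: list[AudioFile]) -> list[str]:
    """Return problems with the combined upload; an empty list means it fits."""
    problems = []
    total_seconds = sum(f.duration_seconds for f in files)
    total_bytes = sum(f.size_bytes for f in files)
    if total_seconds < MIN_TOTAL_SECONDS:
        missing = MIN_TOTAL_SECONDS - total_seconds
        problems.append(f"need {missing:.0f} more seconds of audio")
    if total_bytes > MAX_TOTAL_BYTES:
        problems.append("combined size exceeds 5GB")
    return problems
```

Two 4-minute files, for instance, total 8 minutes, so the check reports that 120 more seconds are needed.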

Enable Remove background noise if you wish to remove background noise from your audio. Confirm you have the rights to clone the voice, then click Start Training.

Check out our Voice Cloning Best Practices for tips on recording and preparing audio for a professional voice clone.

Wait for training to complete

Once training starts, you can leave the page — training continues in the background and can take several minutes for longer samples.

Training status is displayed on the Voices page, where your request appears under Professional clone requests along with its current status.

Use your voice from My Voices

Once training finishes, the voice moves to My voices, ready to use in the TTS Playground or via API. To use it via API, copy the voice ID and use that value for the voiceId when making an API call. See our Quickstart to learn how to make your first API call.

Voice Cloning API Reference And Examples

If you want to automate voice cloning (for example, to support creator onboarding at scale), use the Voice Cloning API. This API currently supports Instant Voice Cloning only; Professional Voice Cloning is available through Portal.

API reference: Clone a voice
Node.js SDK: cloneVoice() — clone a voice in a few lines with the @inworld/tts SDK
Python example: example_voice_clone.py
JavaScript example: example_voice_clone.js

Voice cloning has lower rate limits than regular speech synthesis. For details, see Rate limits.
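Because cloning calls are rate limited more tightly, a batch job may need to retry with increasing delays. The sketch below shows generic exponential backoff and is not taken from the Inworld SDKs: `RateLimitedError` is a hypothetical exception that your own client code would raise when the API reports a rate limit, and `operation` stands in for whatever function performs the clone request.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitedError(Exception):
    """Hypothetical error raised by your client code on a rate-limit response."""


def call_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, doubling the wait after each rate-limited attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimitedError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry schedule can be tested without actually waiting.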

Next Steps

Looking for more tips and tricks? Check out the resources below to get started!

Voice Cloning Best Practices

Learn best practices for producing high-quality voice clones.

Speech Generation Best Practices

Learn best practices for synthesizing high-quality speech.

API Examples

Explore Python and JavaScript code examples for TTS integration.
