Some words need a specific pronunciation, especially uncommon ones such as company names, brand names, nicknames, geographic locations, medical terms, or legal terms that may not appear in the model’s training data. Custom pronunciation lets you control precisely how these words are spoken.

How to Use

Inworld TTS supports inline IPA phoneme notation for custom pronunciation. Write the pronunciation in the International Phonetic Alphabet (IPA) and wrap it in slashes (/ /). For example:
  • Suppose you are building an AI travel agent, and it is recommending the destination Crete, which is pronounced /kriːt/ (“kreet”) in English.
  • You can ensure the correct pronunciation by passing it inline: Your interests are a perfect match for a honeymoon in /kriːt/.
The model speaks the IPA pronunciation wherever it appears inline in your text. If the text is generated by an LLM, replace the original spelling with the slash-wrapped IPA transcription before passing the text to the TTS model.
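
The replacement step for LLM-generated text can be sketched as a small preprocessing function. This is a minimal illustration, not part of the Inworld API: the PRONUNCIATIONS lexicon and the apply_pronunciations helper are hypothetical names, and the sketch assumes case-sensitive, whole-word matching is what you want.

```python
import re

# Hypothetical lexicon: original spelling -> IPA transcription (no slashes).
PRONUNCIATIONS = {
    "Crete": "kriːt",
}

def apply_pronunciations(text, lexicon=PRONUNCIATIONS):
    """Replace whole-word matches of lexicon keys with slash-wrapped IPA."""
    if not lexicon:
        return text
    # Try longer keys first so multi-word entries win over shorter ones.
    keys = sorted(lexicon, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(k) for k in keys) + r")(?!\w)"
    )
    return pattern.sub(lambda m: "/" + lexicon[m.group(1)] + "/", text)

# Text as it might come back from an LLM, before it is sent to TTS.
llm_output = "Your interests are a perfect match for a honeymoon in Crete."
tts_input = apply_pronunciations(llm_output)
```

The word-boundary checks keep the substitution from firing inside longer words such as "Concrete". Matching is case-sensitive here, so add separate entries if a word can appear in several casings.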

Finding the Right IPA Phonemes

If you are unsure of the correct phonemes, there are several ways to find them:
  • Ask an LLM such as ChatGPT. For example:
    “What are the IPA phonemes for the word Crete, pronounced like ‘kreet’?”
  • Use reference websites: Resources such as Vocabulary.com’s IPA Pronunciation Guide provide tables of symbols with example words.
Once you have the correct phonemes, you can embed them directly into your TTS request: Your adventure in /kriːt/ begins today.
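
Transcriptions copied from an LLM or a reference site may arrive wrapped in slashes, square brackets, or stray whitespace. The sketch below normalizes them to the slash-wrapped inline form before they are embedded in the request text. The to_inline_ipa helper is a hypothetical name, and the sketch does not check that the symbols themselves are valid IPA.

```python
def to_inline_ipa(transcription: str) -> str:
    """Normalize an IPA transcription to the slash-wrapped inline form."""
    # Drop surrounding whitespace, then any leading/trailing "/", "[" or "]".
    core = transcription.strip().strip("/[]").strip()
    if not core:
        raise ValueError("empty IPA transcription")
    return f"/{core}/"

# Works whether the source gave "kriːt", "/kriːt/", or "[kriːt]".
text = f"Your adventure in {to_inline_ipa('[kriːt]')} begins today."
```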