# Long Text Input

> Synthesize speech from text longer than 2,000 characters by chunking and stitching audio

The TTS API accepts up to **2,000 characters** per request. For longer content such as articles, book chapters, or scripts, split the text into chunks, synthesize each one, and stitch the resulting audio back together.

We provide ready-to-run scripts in **[Python](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_tts_long_input.py)** and **[JavaScript](https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_tts_long_input_compressed.js)** that handle this entire pipeline for you.

## How It Works

<Steps>
  <Step title="Chunk the text">
    The input is split into segments under the 2,000-character API limit. The chunking algorithm looks for natural break points in the following priority order:

    1. Paragraph breaks (`\n\n`)
    2. Line breaks (`\n`)
    3. Sentence endings (`.` `!` `?`)
    4. Last space (fallback)

    This ensures audio segments end at natural pauses, producing smooth-sounding output.
  </Step>

  <Step title="Synthesize each chunk">
    Chunks are sent to the TTS API in parallel, with concurrency capped (default: 2 concurrent requests) to speed up synthesis while staying within API rate limits. Rate-limited requests are retried automatically.
  </Step>

  <Step title="Stitch the audio">
    The individual audio responses are combined into a single output file. The Python script produces a **WAV** file with configurable silence between segments, while the JavaScript script produces an **MP3** file and uses **ffmpeg** to merge segments with correct duration metadata.
  </Step>
</Steps>
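The break-point priority from the chunking step can be sketched as follows. This is a minimal illustration, not the code from the linked scripts: the function names `find_break` and `chunk_text` are hypothetical, and it assumes that break points are only considered at or after `MIN_CHUNK_SIZE` characters and that a sentence ending must be followed by whitespace.

```python
MIN_CHUNK_SIZE = 500    # defaults taken from the Configuration table
MAX_CHUNK_SIZE = 1900


def find_break(window: str, min_size: int) -> int:
    """Return the index just past the best break point in `window` (hypothetical helper)."""
    # Priority 1 and 2: last paragraph break, then last line break.
    for sep in ("\n\n", "\n"):
        idx = window.rfind(sep, min_size)
        if idx != -1:
            return idx + len(sep)
    # Priority 3: last sentence ending. Requiring trailing whitespace is an
    # assumption that avoids splitting inside numbers such as "3.14".
    for i in range(len(window) - 2, min_size - 1, -1):
        if window[i] in ".!?" and window[i + 1].isspace():
            return i + 1
    # Priority 4: last space.
    idx = window.rfind(" ", min_size)
    if idx != -1:
        return idx + 1
    # No break point at all: hard cut at the window boundary.
    return len(window)


def chunk_text(text: str, min_size: int = MIN_CHUNK_SIZE,
               max_size: int = MAX_CHUNK_SIZE) -> list:
    """Split `text` into chunks of at most `max_size` characters."""
    chunks = []
    rest = text.strip()
    while len(rest) > max_size:
        cut = find_break(rest[:max_size], min_size)
        chunks.append(rest[:cut].strip())
        rest = rest[cut:].lstrip()
    if rest:
        chunks.append(rest)
    return chunks
```

Because a paragraph break is checked first, a chunk ends at the last `\n\n` in the window even when a sentence ending appears later in the same window.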

## Configuration

Both scripts share the same tunable parameters:

| Parameter                 | Default | Description                                                        |
| :------------------------ | :------ | :----------------------------------------------------------------- |
| `MIN_CHUNK_SIZE`          | 500     | Minimum characters before looking for a break point                |
| `MAX_CHUNK_SIZE`          | 1,900   | Maximum chunk size (stays under the 2,000-char API limit)          |
| `MAX_CONCURRENT_REQUESTS` | 2       | Parallel API requests (increase with caution to avoid rate limits) |
| `MAX_RETRIES`             | 3       | Retry attempts for rate-limited requests with exponential backoff  |

## Running the Scripts

### Prerequisites

* An **Inworld API key** set as the `INWORLD_API_KEY` environment variable
* A text file with your long-form content
* **Python 3** (for the Python script) or **Node.js** (for the JavaScript script)
* **ffmpeg** (optional, for the JavaScript script; used to write correct MP3 duration metadata)

### Python

```bash theme={"system"}
export INWORLD_API_KEY=your_api_key_here
pip install requests python-dotenv
python example_tts_long_input.py
```

The script reads the input text file, chunks it, synthesizes all chunks with the Inworld TTS API, and saves the combined audio as a **WAV** file. It also prints a splice report listing the exact timestamps where chunks were joined, which helps when checking audio quality at the joins.
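
The stitching and splice report can be illustrated with Python's standard `wave` module. This is a sketch under stated assumptions rather than the script's actual implementation: `stitch_wav` and `silence_ms` are hypothetical names, each segment is assumed to be a complete WAV file held in memory, all segments are assumed to share the same format, and the zero-byte silence is only correct for signed PCM such as 16-bit audio.

```python
import io
import wave


def stitch_wav(segments, silence_ms=200):
    """Join WAV byte strings with silence in between.

    Returns (combined WAV bytes, list of splice times in seconds), where each
    splice time marks the point at which the next segment begins.
    """
    if not segments:
        raise ValueError("no segments to stitch")
    fmt = None          # (channels, sample width in bytes, frame rate)
    pieces = []
    splices = []
    total_frames = 0
    for i, data in enumerate(segments):
        with wave.open(io.BytesIO(data), "rb") as w:
            seg_fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
            if fmt is None:
                fmt = seg_fmt
            elif seg_fmt != fmt:
                raise ValueError("segments must share channels, width, and rate")
            if i > 0:
                # Insert silence (zero samples; assumes signed PCM).
                gap = int(fmt[2] * silence_ms / 1000)
                pieces.append(b"\x00" * (gap * fmt[0] * fmt[1]))
                total_frames += gap
                splices.append(total_frames / fmt[2])
            n = w.getnframes()
            pieces.append(w.readframes(n))
            total_frames += n
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setnchannels(fmt[0])
        w.setsampwidth(fmt[1])
        w.setframerate(fmt[2])
        w.writeframes(b"".join(pieces))
    return out.getvalue(), splices
```

Listening to the output around each returned timestamp is a quick way to check whether a join sounds abrupt.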

### JavaScript

```bash theme={"system"}
export INWORLD_API_KEY=your_api_key_here
node example_tts_long_input_compressed.js
```

The script follows the same chunking and synthesis pipeline, outputting a compressed **MP3** file. When **ffmpeg** is available, it merges segments with correct duration metadata. Otherwise, it falls back to raw concatenation.
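
The ffmpeg-or-fallback decision can be sketched as below. The example is written in Python rather than JavaScript and is not the linked script's code: `merge_mp3` is a hypothetical name, and using ffmpeg's concat demuxer with stream copy is an assumption about how the merge could be done.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import List, Optional


def merge_mp3(segments: List[bytes], out_path: str,
              ffmpeg: Optional[str] = None) -> str:
    """Write merged MP3 data to `out_path`; return "ffmpeg" or "raw".

    `ffmpeg` is the path to the ffmpeg binary, or None to force the fallback.
    """
    if ffmpeg is None:
        # Fallback: raw byte concatenation. Playable in most players, but the
        # reported duration may be wrong.
        Path(out_path).write_bytes(b"".join(segments))
        return "raw"
    with tempfile.TemporaryDirectory() as tmp:
        lines = []
        for i, data in enumerate(segments):
            seg = Path(tmp) / f"seg_{i:04d}.mp3"
            seg.write_bytes(data)
            lines.append(f"file '{seg.as_posix()}'")
        list_file = Path(tmp) / "segments.txt"
        list_file.write_text("\n".join(lines) + "\n")
        # Concat demuxer with stream copy: no re-encoding, new container metadata.
        subprocess.run(
            [ffmpeg, "-y", "-f", "concat", "-safe", "0",
             "-i", str(list_file), "-c", "copy", out_path],
            check=True, capture_output=True,
        )
    return "ffmpeg"
```

A caller would typically pass `ffmpeg=shutil.which("ffmpeg")`, which evaluates to `None` when ffmpeg is not on the `PATH` and so selects the raw concatenation path.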

## Code Examples

<CardGroup cols={2}>
  <Card title="Python" icon="python" href="https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/python/example_tts_long_input.py">
    WAV output with splice report and configurable silence between segments
  </Card>

  <Card title="JavaScript" icon="js" href="https://github.com/inworld-ai/inworld-api-examples/blob/main/tts/js/example_tts_long_input_compressed.js">
    Compressed MP3 output with ffmpeg-based segment merging
  </Card>
</CardGroup>

## Next Steps

<CardGroup cols={3}>
  <Card title="Synthesize Speech" icon="waveform-lines" href="/tts/synthesize-speech">
    Learn about the standard (non-streaming) synthesis API.
  </Card>

  <Card title="Streaming API" icon="bolt" href="/tts/synthesize-speech-streaming">
    Use streaming for real-time playback of shorter content.
  </Card>

  <Card title="Latency Best Practices" icon="circle-check" href="/tts/best-practices/latency">
    Optimize time-to-first-audio for real-time use cases.
  </Card>
</CardGroup>
