How It Works
Chunk the text
The input is split into segments under the 2,000-character API limit. The chunking algorithm looks for natural break points in the following priority order:
- Paragraph breaks (`\n\n`)
- Line breaks (`\n`)
- Sentence endings (`.`, `!`, `?`)
- Last space (fallback)
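The priority order above can be sketched roughly as follows. This is a minimal illustration, not the scripts' actual code: the function names `find_break` and `chunk_text` are hypothetical, and the choice to take the last matching break inside each window and to trim surrounding whitespace is an assumption.

```python
import re

MIN_CHUNK_SIZE = 500    # defaults taken from the Configuration table
MAX_CHUNK_SIZE = 1900


def find_break(window: str, min_size: int) -> int:
    """Return the cut index for `window`, preferring natural break points."""
    # 1 and 2: last paragraph break, then last line break, at or after min_size.
    for sep in ("\n\n", "\n"):
        idx = window.rfind(sep, min_size)
        if idx != -1:
            return idx + len(sep)
    # 3: last sentence ending (., ! or ?) past min_size.
    ends = [m.end() for m in re.finditer(r"[.!?]", window) if m.end() > min_size]
    if ends:
        return ends[-1]
    # 4: last space as a fallback.
    idx = window.rfind(" ", min_size)
    if idx != -1:
        return idx + 1
    # No break point at all: hard cut at the window edge.
    return len(window)


def chunk_text(text: str, min_size: int = MIN_CHUNK_SIZE,
               max_size: int = MAX_CHUNK_SIZE) -> list[str]:
    """Split `text` into chunks of at most `max_size` characters."""
    chunks = []
    pos = 0
    while pos < len(text):
        if len(text) - pos <= max_size:
            cut = len(text) - pos          # the remainder fits in one chunk
        else:
            cut = find_break(text[pos:pos + max_size], min_size)
        piece = text[pos:pos + cut].strip()
        if piece:
            chunks.append(piece)
        pos += cut
    return chunks
```

For example, `chunk_text(open("book.txt").read())` would return a list of strings, each short enough to send in a single request (`book.txt` is a placeholder file name).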
Synthesize each chunk
Each chunk is sent to the TTS API with automatic retries for rate-limited requests. Chunks are processed in parallel (default: 2 concurrent requests), which speeds up synthesis while staying within API rate limits.
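One way to combine bounded concurrency with retries is a semaphore around each request plus exponential backoff, sketched below. The actual TTS call is deliberately left out: `synthesize` is a caller-supplied coroutine, and `RateLimitError`, `synthesize_all`, and `base_delay` are hypothetical names used only for illustration.

```python
import asyncio

MAX_CONCURRENT_REQUESTS = 2   # defaults taken from the Configuration table
MAX_RETRIES = 3


class RateLimitError(Exception):
    """Hypothetical error the supplied `synthesize` raises when rate limited."""


async def synthesize_all(chunks, synthesize,
                         max_concurrent=MAX_CONCURRENT_REQUESTS,
                         max_retries=MAX_RETRIES, base_delay=1.0):
    """Run `synthesize(chunk)` for every chunk; results keep the input order."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(chunk):
        # At most `max_concurrent` workers are inside this block at once.
        async with semaphore:
            for attempt in range(max_retries + 1):
                try:
                    return await synthesize(chunk)
                except RateLimitError:
                    if attempt == max_retries:
                        raise              # retries exhausted
                    # Exponential backoff: base_delay, then 2x, 4x, ...
                    await asyncio.sleep(base_delay * 2 ** attempt)

    # gather() returns results in the order the awaitables were passed in,
    # so the audio segments can be concatenated directly.
    return await asyncio.gather(*(worker(c) for c in chunks))
```

In this sketch a worker keeps its semaphore slot while it backs off, so a rate-limited chunk also slows the overall request rate. Releasing the slot during the sleep would be an equally valid design.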
Configuration
Both scripts share the same tunable parameters:

| Parameter | Default | Description |
|---|---|---|
| `MIN_CHUNK_SIZE` | 500 | Minimum characters before looking for a break point |
| `MAX_CHUNK_SIZE` | 1,900 | Maximum chunk size (stays under the 2,000-char API limit) |
| `MAX_CONCURRENT_REQUESTS` | 2 | Parallel API requests (increase with caution to avoid rate limits) |
| `MAX_RETRIES` | 3 | Retry attempts for rate-limited requests with exponential backoff |
Running the Scripts
Prerequisites
- An Inworld API key set as the `INWORLD_API_KEY` environment variable
- A text file with your long-form content
- Python 3 (for the Python script) or Node.js (for the JavaScript script)
- ffmpeg (optional, for the JS script — produces correct MP3 duration metadata)
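Before running either script, the prerequisites above can be checked with a few lines of Python. This is only a convenience sketch: `check_prerequisites` and `ffmpeg_available` are hypothetical helpers, not part of the scripts.

```python
import os
import shutil
from pathlib import Path


def check_prerequisites(text_path, env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("INWORLD_API_KEY"):
        problems.append("INWORLD_API_KEY is not set")
    if not Path(text_path).is_file():
        problems.append(f"text file not found: {text_path}")
    return problems


def ffmpeg_available():
    """ffmpeg is optional and only matters for the JavaScript script."""
    return shutil.which("ffmpeg") is not None
```

Calling `check_prerequisites("input.txt")` (with `input.txt` standing in for your own file) returns an empty list when the API key and the file are both in place.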
Python
JavaScript
Code Examples
Python
WAV output with splice report and configurable silence between segments
JavaScript
Compressed MP3 output with ffmpeg-based segment merging