Timestamp alignment lets you retrieve timing information that matches the generated audio, which is useful for experiences like word highlighting, karaoke‑style captions, and lipsync.
Set the timestampType request parameter to control granularity:
WORD: Return timestamps for each word
CHARACTER: Return timestamps for each character or punctuation mark
The alignment data is returned in the field that matches the requested type:
WORD: timestampInfo.wordAlignment with words, wordStartTimeSeconds, wordEndTimeSeconds
CHARACTER: timestampInfo.characterAlignment with characters, characterStartTimeSeconds, characterEndTimeSeconds
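As a rough illustration of consuming WORD alignment for word highlighting, the Python sketch below pairs each word with its start and end time. It assumes that timestampInfo sits at the top level of the decoded JSON response and that words, wordStartTimeSeconds, and wordEndTimeSeconds are parallel arrays of equal length; neither assumption is confirmed here. The helper names word_timings and word_at and the sample payload are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

WordTiming = Tuple[str, float, float]  # (word, start_seconds, end_seconds)


def word_timings(response: Dict) -> List[WordTiming]:
    """Pair each word with its start and end time.

    Assumes the three alignment fields are parallel arrays.
    """
    alignment = response["timestampInfo"]["wordAlignment"]
    words = alignment["words"]
    starts = alignment["wordStartTimeSeconds"]
    ends = alignment["wordEndTimeSeconds"]
    if not (len(words) == len(starts) == len(ends)):
        raise ValueError("alignment arrays differ in length")
    return list(zip(words, starts, ends))


def word_at(timings: List[WordTiming], t: float) -> Optional[str]:
    """Return the word being spoken at playback time t, or None in a gap."""
    for word, start, end in timings:
        if start <= t < end:
            return word
    return None


# Hypothetical payload, made up for illustration; not real API output.
sample_response = {
    "timestampInfo": {
        "wordAlignment": {
            "words": ["Hello", "world"],
            "wordStartTimeSeconds": [0.0, 0.5],
            "wordEndTimeSeconds": [0.4, 0.9],
        }
    }
}

timings = word_timings(sample_response)
current_word = word_at(timings, 0.6)  # word to highlight at 0.6 s of playback
```

A highlighting UI could call word_at with the audio player's current position on each frame. The same pattern would apply to CHARACTER alignment by swapping in the characterAlignment field names.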
Timestamp alignment currently supports English only; other languages are experimental.
Enabling timestamp alignment slightly increases latency; internal experiments show an average ~100 ms increase.