This page demonstrates the voice cloning quality of our Thai LoRA fine-tune of IndexTTS2. The model was trained on Thai speech data and supports zero-shot voice cloning for Thai text, while retaining the original model's English and Chinese capabilities.
For each speaker, we show the reference audio (e.g. `TH001.wav`) alongside generated audio for the same text and for a different text.

All generated audio uses seed=42 for reproducibility.
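
The page does not show how the seed is applied, so the following is only a minimal sketch of one common way to fix seeds before running inference. `seed_everything` is a hypothetical helper, not part of IndexTTS2, and it assumes the pipeline draws randomness from Python's `random`, NumPy, and PyTorch; the NumPy and PyTorch steps are skipped if those packages are not installed.

```python
import random


def seed_everything(seed: int = 42) -> None:
    """Seed common RNG sources (hypothetical helper, not an IndexTTS2 API)."""
    random.seed(seed)
    try:
        import numpy as np  # optional: only seeded if NumPy is installed

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional: only seeded if PyTorch is installed

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


# Re-seeding with the same value reproduces the same random draws.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

Calling the helper once before each generation is what would make repeated runs comparable. Some GPU kernels may still be nondeterministic even with fixed seeds, so exact bit-for-bit reproduction is not guaranteed.
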
| Speaker | Gender | Reference | Same Text | Different Text |
|---|---|---|---|---|

| Speaker | Gender | Reference | Same Text | Different Text | EN Generated | EN Groundtruth |
|---|---|---|---|---|---|---|