IndexTTS2 Thai LoRA

About

This page demonstrates the voice cloning quality of our Thai LoRA fine-tune of IndexTTS2. The model was trained on Thai speech data and supports zero-shot voice cloning for Thai text, while retaining the original model's English and Chinese capabilities.

For each speaker, we show:

Reference: the original voice clip used as the speaker prompt (TH001.wav)
Same Text: speech synthesized from the reference clip's own transcript (reconstruction test)
Different Text: speech synthesized from a fixed Thai sentence: "ปัญญาประดิษฐ์กำลังเปลี่ยนแปลงวิถีชีวิตและการทำงานของเรา"
EN Generated: English speech synthesized with the Thai voice reference as the prompt (cross-lingual test, bilingual speakers only)
EN Groundtruth: the speaker's actual English recording, for comparison (bilingual speakers only)

All generated audio uses seed=42 for reproducibility.
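
The per-speaker layout described above can be summarized as a small sample plan. The following is only an illustrative sketch, not the code used to build this page: the `Sample` class and the `plan_samples` function are hypothetical names, the speaker ranges are taken from the section headings (1–12 Thai-only, 13–24 bilingual), and the assumption that the English test text matches the ground-truth recording's transcript is ours. No TTS call is made; the sketch only lists which clips each speaker gets and which of them are generated with seed=42.

```python
from dataclasses import dataclass
from typing import List, Optional

SEED = 42  # the page states that all generated audio uses seed=42
FIXED_THAI_SENTENCE = "ปัญญาประดิษฐ์กำลังเปลี่ยนแปลงวิถีชีวิตและการทำงานของเรา"


@dataclass
class Sample:
    """One audio clip shown for a speaker (hypothetical structure)."""
    label: str            # column label used on the page
    generated: bool       # True for TTS output, False for a real recording
    text: Optional[str]   # text spoken in the clip, if known
    seed: Optional[int]   # seed for generated clips, None for recordings


def plan_samples(speaker_id: int,
                 ref_transcript: str,
                 en_text: Optional[str] = None) -> List[Sample]:
    """List the clips for one speaker, following the page's description."""
    # Assumption from the section headings: speakers 13-24 are bilingual.
    bilingual = 13 <= speaker_id <= 24
    samples = [
        Sample("Reference", False, ref_transcript, None),
        Sample("Same Text", True, ref_transcript, SEED),
        Sample("Different Text", True, FIXED_THAI_SENTENCE, SEED),
    ]
    if bilingual:
        # Assumption: the English test text is the transcript of the
        # speaker's real English recording.
        samples.append(Sample("EN Generated", True, en_text, SEED))
        samples.append(Sample("EN Groundtruth", False, en_text, None))
    return samples


if __name__ == "__main__":
    for s in plan_samples(13, "สวัสดีครับ", "Hello there"):
        print(s.label, "generated" if s.generated else "recording", s.seed)
```

Under these assumptions, a Thai-only speaker gets three clips and a bilingual speaker gets five, with the seed attached only to the generated ones.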

Zero-Shot Voice Cloning for Thai Speech

Thai-Only Speakers (1–12)

Bilingual Speakers (13–24)