TTS languages beyond the 8 supported in releases

sve · April 15, 2021, 10:00am

The coverage of languages for TTS (Larynx) is quite impressive. Has anyone tried other languages, especially the big ones (by number of native speakers) like Mandarin?

synesthesiam · April 15, 2021, 1:20pm

The two limiting factors for language support in Larynx are:

1. Support for the language in gruut (text-to-phoneme conversion)
2. Publicly available audio data
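
To make the first point concrete, here is a minimal sketch of lexicon-based text-to-phoneme conversion. It is a toy illustration only and does not use gruut's API: `TOY_LEXICON` and `text_to_phonemes` are hypothetical names, and the pronunciations are made up for the example. It shows why a language without lexicon coverage cannot be phonemized.

```python
# Toy lexicon: hypothetical entries for illustration, not taken from gruut.
TOY_LEXICON = {
    "hello": ["h", "ə", "l", "oʊ"],
    "world": ["w", "ɜ", "l", "d"],
}


def text_to_phonemes(text, lexicon):
    """Return (phonemes, missing_words) for whitespace-separated text."""
    phonemes = []
    missing = []
    for word in text.lower().split():
        pronunciation = lexicon.get(word)
        if pronunciation is None:
            # No entry: this word cannot be converted without language support.
            missing.append(word)
        else:
            phonemes.extend(pronunciation)
    return phonemes, missing


phonemes, missing = text_to_phonemes("hello 你好", TOY_LEXICON)
print(phonemes)
print(missing)
```

Words that land in `missing` are the ones that need new language support before they can be synthesized.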

It's not required, but voice quality also improves dramatically if I first train a Kaldi ASR model and then use it to force-align the audio data with the phonemes of its transcription.
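
As a rough illustration of what forced alignment produces, the sketch below parses CTM-style lines (`utterance channel start duration token`, the layout Kaldi uses for time-marked output) into per-phoneme durations. How Larynx actually consumes alignments isn't described in this thread, so `parse_ctm`, `durations_in_frames`, the sample lines, and the 10 ms frame shift are all assumptions for the example.

```python
from collections import defaultdict


def parse_ctm(lines):
    """Group CTM-style lines into per-utterance (phoneme, start, duration) lists."""
    alignments = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        utt_id, _channel, start, duration, phoneme = parts[:5]
        alignments[utt_id].append((phoneme, float(start), float(duration)))
    return dict(alignments)


def durations_in_frames(alignment, frame_shift=0.01):
    """Convert each phoneme's duration in seconds to a whole number of frames.

    frame_shift is an assumed value (10 ms), not one given in the thread.
    """
    return [
        (phoneme, round(duration / frame_shift))
        for phoneme, _start, duration in alignment
    ]


# Made-up sample alignment for a single utterance.
sample = [
    "utt1 1 0.00 0.12 n",
    "utt1 1 0.12 0.20 i",
]
alignment = parse_ctm(sample)["utt1"]
print(durations_in_frames(alignment))
```

Per-phoneme timings like these tell a TTS model which stretch of audio belongs to which phoneme, which is plausibly where the quality gain comes from.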

For Mandarin, there’s plenty of audio data available. I “just” need to add Mandarin support to gruut.

Does anyone understand Mandarin well enough to help?

sve · April 26, 2021, 2:09pm

For a Kaldi ASR model, I can recommend the multi_cn recipe, but training takes over 20 days on an older GPU like a GTX 1660.

I will try to encourage some native speakers of Mandarin to help, but no promises …