Great, thank you! I’m thinking we should keep some of the sentences with English words to get a bit of the foreign phoneme coverage. Do you think those English words are common enough?
This must have been a way of getting the phoneme pair /iː o/ (wie is /w iː/). The /iː/ is from something like “anal[y]se” too. I see maybe 10 examples in the lexicon where this sound comes at the end of a word (like zei), and no cases where the pair occurs in a single word.
Sure, the coverage file contains all of the missing pairs and example words. If the example words all look unusual or foreign, it probably means the pair can be safely ignored.
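To make the idea of a coverage report concrete, here is a minimal sketch of how missing phoneme pairs could be found. This is not gruut's actual code or file format; `phoneme_pairs`, `missing_pairs` and the toy data are made up for illustration.

```python
from collections import Counter


def phoneme_pairs(phonemes):
    """Return the adjacent phoneme pairs of one phonemized sentence."""
    return list(zip(phonemes, phonemes[1:]))


def missing_pairs(sentences, wanted_pairs):
    """List the wanted pairs that never occur in any sentence."""
    seen = Counter()
    for phonemes in sentences:
        seen.update(phoneme_pairs(phonemes))
    return sorted(pair for pair in wanted_pairs if seen[pair] == 0)


# Toy data (hypothetical): two phonemized sentences and three pairs we care about.
sentences = [["w", "iː", "o", "k"], ["z", "ɛi"]]
wanted = {("iː", "o"), ("ɛi", "o"), ("w", "iː")}

print(missing_pairs(sentences, wanted))
```

Any pair reported here whose only example words look foreign is a candidate for ignoring.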
Yes, maybe I was a bit too strict in weeding out all English words. Words like “Halloween”, “liken”, “showen”, “shop” and “bye” are probably common enough to include.
I’m using GlowTTS and the Multiband MelGAN vocoder. I had to re-train from scratch this morning because of a mistake, so the vocoder isn’t sounding so great yet.
I’ll update the examples after it’s had a chance to train overnight.
Definitely! The first step will be adding support for French to gruut. I expect this to be done by next week or so.
The recordings can be done with anything as long as you have WAV files and transcripts. With a little work, gruut will help select a small set of sentences from a large corpus (books, Wikipedia, etc.) that are maximally useful.
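One way to think about "maximally useful" is a greedy pick: keep choosing the sentence that adds the most phoneme pairs not yet covered. The sketch below assumes that approach; it is not necessarily what gruut does, and `select_sentences` and the toy candidates are hypothetical.

```python
def select_sentences(candidates, max_sentences):
    """Greedily pick sentences that add the most uncovered phoneme pairs.

    candidates maps sentence text to its list of phonemes.
    Returns the chosen sentences (in pick order) and the set of covered pairs.
    """
    remaining = {text: set(zip(ph, ph[1:])) for text, ph in candidates.items()}
    covered = set()
    chosen = []
    while remaining and len(chosen) < max_sentences:
        # Sentence with the largest number of new pairs.
        best = max(remaining, key=lambda text: len(remaining[text] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # nothing left adds coverage
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Toy candidates (hypothetical phonemizations).
candidates = {
    "a": ["p", "a", "t"],
    "b": ["p", "a"],
    "c": ["t", "o", "p", "a", "t", "i"],
}
chosen, covered = select_sentences(candidates, max_sentences=2)
print(chosen, len(covered))
```

Greedy selection is not guaranteed to be optimal, but it is simple and usually gets close for this kind of coverage problem.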
Training just needs a CUDA-enabled GPU that’s supported by PyTorch >= 1.5. I’m using a GTX 1060 6GB.
This is the rdh Dutch voice speaking an (accented) English sentence! Because I use IPA for both English and Dutch phonemes, I created a small mapping file that approximates the 14 or so English phonemes that are “missing” from Dutch. I had to guess on some, but it works as a proof of concept.
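Roughly, the mapping idea looks like the sketch below. The table entries are illustrative guesses only (not the contents of the real mapping file), and `EN_TO_NL`, `NL_INVENTORY` and `map_phonemes` are hypothetical names.

```python
# Hypothetical approximations of English phonemes that Dutch lacks.
# These are guesses for illustration and would need review by bilingual speakers.
EN_TO_NL = {
    "θ": ["t"],  # "th" in "thing"
    "ð": ["d"],  # "th" in "the"
    "æ": ["ɛ"],
    "ɹ": ["r"],
}

# Tiny stand-in for the Dutch voice's phoneme inventory.
NL_INVENTORY = {"t", "d", "ɛ", "r", "ɪ", "ŋ", "ə"}


def map_phonemes(phonemes, mapping, inventory):
    """Replace phonemes outside the target inventory with their approximations."""
    result = []
    for phoneme in phonemes:
        if phoneme in inventory:
            result.append(phoneme)
        elif phoneme in mapping:
            result.extend(mapping[phoneme])
        else:
            raise KeyError(f"No approximation for phoneme {phoneme!r}")
    return result


# English "thing" /θ ɪ ŋ/ rendered with Dutch phonemes.
print(map_phonemes(["θ", "ɪ", "ŋ"], EN_TO_NL, NL_INVENTORY))
```

Mapping to a list rather than a single symbol leaves room for approximating one phoneme with two.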
So this means we could re-use some of the voices for other languages, until we get native speakers in that language. To get it right, though, I will need help from people who speak both languages. For example, I don’t really know how Dutch folks pronounce the “th” sounds from “thing” and “the”.
For things like numbers and dates you should probably preprocess your text with something like Lingua Franca to convert them to words that are pronounceable by the TTS.
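As a rough picture of what such a preprocessing step does, here is a stdlib-only stand-in that spells out small integers before the text reaches the TTS. It does not use Lingua Franca's API; `number_to_words` and `spell_out_numbers` are hypothetical helpers that only handle 0 to 99 in English, and a real library would cover far more (large numbers, dates, other languages).

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 99 in English."""
    if not 0 <= n < 100:
        raise ValueError("only 0-99 supported in this sketch")
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def spell_out_numbers(text):
    """Replace digit runs below 100 with words; leave larger numbers alone."""
    def replace(match):
        value = int(match.group(0))
        return number_to_words(value) if value < 100 else match.group(0)

    return re.sub(r"\d+", replace, text)


print(spell_out_numbers("I have 42 apples and 7 pears"))
```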
There are some undocumented features I’m still experimenting with, but I agree that in general a separate library should be used. Some of the features that are in there but disabled for now:
- Currency recognition
  - “$100.12” (sort of works now)
- Number types
  - “1_ordinal” becomes “first” in English
  - “1902_year” becomes “nineteen oh two” in English
- Alternative pronunciations
  - “read_1” and “read_2” are pronounced like “red” and “reed” respectively
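For illustration, the underscore tags in those examples could be parsed along these lines. This is only a sketch of the notation as shown above, not gruut's actual parser; `split_tag` and `describe` are hypothetical, and the rule that a numeric suffix means "alternative pronunciation" is an assumption drawn from the examples.

```python
def split_tag(token):
    """Split 'base_tag' at the last underscore; return (token, None) if untagged."""
    base, sep, tag = token.rpartition("_")
    if not sep or not base or not tag:
        return token, None
    return base, tag


def describe(token):
    """Classify a token as plain, a numbered pronunciation, or a typed number."""
    base, tag = split_tag(token)
    if tag is None:
        return ("plain", base, None)
    if tag.isdigit():
        # Assumed: a numeric suffix selects the n-th pronunciation, as in "read_2".
        return ("pronunciation", base, int(tag))
    # Assumed: any other suffix names a number type, as in "1902_year".
    return ("number_type", base, tag)


for token in ["1_ordinal", "1902_year", "read_1", "read_2", "hello"]:
    print(token, "->", describe(token))
```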
I also have the ability to list abbreviations for a language that are automatically expanded. I’ve got a list for English, like mr -> mister, but I don’t know any for Dutch.
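A per-language abbreviation list could be applied roughly like this. It is a sketch, not gruut's implementation: `expand_abbreviations` is a hypothetical helper, and only the mr -> mister entry comes from the post above (the "dr" entry is added for illustration).

```python
import re

# "mr" is from the post; "dr" is an illustrative addition.
ABBREVIATIONS = {"mr": "mister", "dr": "doctor"}


def expand_abbreviations(text, table):
    """Expand known abbreviations, with or without a trailing period."""
    def replace(match):
        word = match.group(1).lower()
        # Unknown words are returned unchanged, including any trailing period.
        return table.get(word, match.group(0))

    return re.sub(r"\b([A-Za-z]+)\b\.?", replace, text)


print(expand_abbreviations("Mr. Smith met Dr Jones.", ABBREVIATIONS))
```

A Dutch list would just be another dictionary passed in as `table`.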
Happy to help and expand on those lists.
If I remember correctly, Mycroft has something like a collaborative system on their website.
(translate.mycroft.ai)
I suppose we could do something similar with just a GitHub directory per language, plus documentation on what is needed to complete a language.