Noob here and I just started delving into this fascinating rabbit hole of speech. Is there a way to get Rhasspy to pronounce ordinal numbers? I’m currently using the default Larynx and trying to get it to pronounce dates. I can get HA to return ordinal strings like ‘1st’ and ‘2nd’, but Larynx doesn’t appear to understand those. It just attempts to pronounce the st, which is ironically similar to it blowing a raspberry at me. Is there a way to train custom words or at least ordinal pronunciations?
Is it more of a “THPPTPHTPHPHHPH” than a “PFFT”? “Pffthweep” often needs checking, but it’s likely that ordinals were missing from the dataset it was trained on?
I would say more of a ‘pffst’ I suppose. You can put ‘1st’, ‘2nd’, etc. in the speak line as a test and it does the same thing. I don’t know nearly enough about this stuff to say for sure, but I’m guessing it wasn’t in the dataset?
Would it be possible to add it to the default training dataset? Should I raise a feature request for that?
If I remember right, Larynx is a refactored WaveGlow (PyTorch).
I think it’s purely down to the dataset: see DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2 in NVIDIA/DeepLearningExamples on GitHub.
It runs on ONNX but doesn’t seem to be quantised: see rhasspy/larynx on GitHub (“End to end text to speech system using gruut and onnx”).
Says custom lexicons are not supported
but you can (you can use
I haven’t looked into what that means. It’s probably Gruut (rhasspy/gruut on GitHub: “A tokenizer, text cleaner, and phonemizer for many human languages”), since that is the tokenizer, text cleaner, and IPA phonemizer with SSML support, and it doesn’t seem to be handling ordinals.
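As a stopgap until ordinals are handled upstream, one option is to expand them into words before the text ever reaches Larynx. Here is a minimal Python sketch, assuming you can preprocess the string yourself (for example in whatever script builds the sentence). `ordinal_word` and `expand_ordinals` are hypothetical helper names, not part of Rhasspy, Larynx, or Gruut, and the sketch only covers 1 to 99 in English, which is enough for days of the month.

```python
import re

# Ordinal words for 1-19; index 0 is unused.
_SMALL = [
    "", "first", "second", "third", "fourth", "fifth", "sixth", "seventh",
    "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth",
    "fourteenth", "fifteenth", "sixteenth", "seventeenth", "eighteenth",
    "nineteenth",
]
# Tens as (cardinal, ordinal) pairs, e.g. "twenty" / "twentieth".
_TENS = {
    2: ("twenty", "twentieth"), 3: ("thirty", "thirtieth"),
    4: ("forty", "fortieth"), 5: ("fifty", "fiftieth"),
    6: ("sixty", "sixtieth"), 7: ("seventy", "seventieth"),
    8: ("eighty", "eightieth"), 9: ("ninety", "ninetieth"),
}

# One or two digits followed by an ordinal suffix, as a whole word.
_ORDINAL_RE = re.compile(r"\b(\d{1,2})(st|nd|rd|th)\b", re.IGNORECASE)


def ordinal_word(n: int) -> str:
    """Return the English ordinal word for 1 <= n <= 99 (hypothetical helper)."""
    if not 1 <= n <= 99:
        raise ValueError("only 1..99 are supported in this sketch")
    if n < 20:
        return _SMALL[n]
    tens, ones = divmod(n, 10)
    cardinal, ordinal = _TENS[tens]
    # A space rather than a hyphen, so the TTS sees two plain words.
    return ordinal if ones == 0 else f"{cardinal} {_SMALL[ones]}"


def expand_ordinals(text: str) -> str:
    """Replace tokens like '1st' or '22nd' with words; leave anything else alone."""
    def _replace(match: re.Match) -> str:
        n = int(match.group(1))
        # Out-of-range values such as '0th' are left untouched.
        return ordinal_word(n) if 1 <= n <= 99 else match.group(0)
    return _ORDINAL_RE.sub(_replace, text)


print(expand_ordinals("Today is January 1st and the bin goes out on the 22nd."))
```

Note that the suffix is not validated against the number (so “1nd” would also be converted), and three-digit ordinals like “100th” pass through unchanged. If your sentence is built somewhere other than Python, the same lookup-table idea should carry over.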