Pronouncing ordinal numbers

weavage · March 21, 2023, 4:53pm

Noob here and I just started delving into this fascinating rabbit hole of speech. Is there a way to get Rhasspy to pronounce ordinal numbers? I’m currently using the default Larynx and trying to get it to pronounce dates. I can get HA to return ordinal strings like ‘1st’ and ‘2nd’, but Larynx doesn’t appear to understand those. It just attempts to pronounce the st, which is ironically similar to it blowing a raspberry at me. Is there a way to train custom words or at least ordinal pronunciations?

rolyan_trauts · March 21, 2023, 7:01pm

Is it more of a “THPPTPHTPHPHHPH” than a “PFFT”? “Pffthweep” often needs checking, but it's likely that ordinals were missing from the dataset it was trained on?

weavage · March 21, 2023, 8:51pm

I would say more of a ‘pffst’ I suppose. You can put ‘1st’, ‘2nd’, etc. in the speak line as a test and it does the same thing. I don’t know nearly enough about this stuff to say for sure, but I’m guessing it wasn’t in the dataset?

Would it be possible to add it to the default training dataset? Should I raise a feature request for that?

rolyan_trauts · March 22, 2023, 6:08am

If I remember right, Larynx is a refactored WaveGlow: WaveGlow | PyTorch

I think it's purely down to the dataset: DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2 at master · NVIDIA/DeepLearningExamples · GitHub

It runs on ONNX but doesn’t seem to be quantised: GitHub - rhasspy/larynx: End to end text to speech system using gruut and onnx

The README says custom lexicons are not supported, but there is a workaround (“you can use , however”).

I haven’t looked into what that means. It's probably down to Gruut (GitHub - rhasspy/gruut: A tokenizer, text cleaner, and phonemizer for many human languages.), as that is the tokenizer, text cleaner, and IPA phonemizer that Larynx uses. It supports SSML, but it doesn’t seem to be handling ordinals.
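
Until ordinals are handled upstream, one possible workaround is to spell them out before the text ever reaches Larynx, so the TTS only sees plain words. Below is a minimal Python sketch of that idea. It does not use any Rhasspy, Larynx, or Gruut API; the function names `ordinal_word` and `expand_ordinals` are made up for illustration, and it only covers 1 to 31 (enough for days of the month), leaving anything else unchanged.

```python
import re

# Ordinal words for 1..19; index 0 is unused padding.
_SMALL = [
    "", "first", "second", "third", "fourth", "fifth", "sixth", "seventh",
    "eighth", "ninth", "tenth", "eleventh", "twelfth", "thirteenth",
    "fourteenth", "fifteenth", "sixteenth", "seventeenth", "eighteenth",
    "nineteenth",
]
_TENS_CARDINAL = {2: "twenty", 3: "thirty"}
_TENS_ORDINAL = {2: "twentieth", 3: "thirtieth"}

# Matches digits followed by an ordinal suffix, e.g. "1st", "22nd", "30th".
_ORDINAL_RE = re.compile(r"\b(\d+)(st|nd|rd|th)\b", re.IGNORECASE)


def ordinal_word(n):
    """Return the spoken form of n for 1..31, or None if out of range."""
    if 1 <= n <= 19:
        return _SMALL[n]
    if 20 <= n <= 31:
        tens, ones = divmod(n, 10)
        if ones == 0:
            return _TENS_ORDINAL[tens]
        # A space instead of a hyphen, in case the TTS mishandles hyphens.
        return f"{_TENS_CARDINAL[tens]} {_SMALL[ones]}"
    return None


def expand_ordinals(text):
    """Replace ordinals like '21st' with words; leave unsupported ones as-is."""
    def _replace(match):
        word = ordinal_word(int(match.group(1)))
        return word if word is not None else match.group(0)

    return _ORDINAL_RE.sub(_replace, text)


print(expand_ordinals("Today is March 21st"))
```

You would run something like this on the sentence (for example in whatever script or automation builds the date string) and send the result to the speak line instead of the raw ‘21st’.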