Getting into Rhasspy

TheLexoPlexx · May 20, 2020, 9:14pm

Hello there,

This is my first post, and definitely not the last one. I have been trying to get Rhasspy to work for a few weeks now, but I am not very happy with the results so far. That is definitely not Rhasspy’s fault but far more my own.

My journey started with installing the original synesthesiam/rhasspy package from Docker on a Raspberry Pi with a Matrix Voice. I quickly noticed, however, that speech recognition was terrible the way I had configured it. (This might also be because I changed the profile to “de” later on instead of on first start.)

As my next step, I decided the fault lay in the “speech recognition” part, since everything worked superbly with English commands but not with German ones. I stumbled upon AASHISHAG/deepspeech-german and tried to install that. I learned a lot about Linux on the way, so it was definitely not useless, but I ended up cloning the current DeepSpeech and building my own German language model based on various given corpora and… version 0.7.1 of DeepSpeech.

By that time I had also set up a spare PC with Ubuntu and a GTX 970. While training the German language model, I installed Rhasspy on that PC, deciding it might be a good idea to use the Raspberry Pi (or maybe later, multiple Raspberry Pis) as the endpoint and the PC as the server.

After two days I finally figured out how to transfer audio between the two machines using MQTT, and I also moved to the current rhasspy-voltron v2.5.0.

However… I am still not happy.
One: I cannot use the newly built DeepSpeech model because of a possible bug (or misconfiguration), which I described in the issue I created over here. Yes, it is not even 12 hours old and I do not expect any answers yet, but it is probably a fault on my end, so I don’t want to waste any development time of this awesome project.
Two: The current rhasspy-asr-deepspeech-hermes uses version 0.6.1 of DeepSpeech, and I do not know how compatible that is with a model built on 0.7.1. I haven’t tried it yet, though.
Three: What are your hardware setups? What configurations are you running, for which language and with which model?