Hi everyone! This is a “preliminary” release of 2.5.10, meaning I’ve created a rhasspy/rhasspy:2.5.10 Docker tag and uploaded new Debian packages, but I haven’t made these the “latest” release yet (the docs are not updated yet either). Due to time constraints, I haven’t been able to run all my usual tests, but I wanted to get something out to everyone.
Thank you, just tried it with a manual update on my HomeAssistant Addon.
First thing I noticed is that I am getting an error regarding WaveNet:
File "/usr/lib/rhasspy/rhasspy-tts-wavenet-hermes/rhasspytts_wavenet_hermes/__init__.py", line 15, in <module>
from google.cloud import texttospeech
ModuleNotFoundError: No module named 'google.cloud'
Everything else seems to work. I am currently playing around with the Kaldi confidence scores in German. Very nice to have this, although I realized that they vary widely even when the same sentence is spoken the same way. Is it possible to also see the confidence of the individual words? Currently the only thing I found that changes is the likelihood of the full utterance in the ASR/textCaptured.
I have only just started experimenting with rhasspy to extend my home automation system and had just got the 2.5.9 release working well with a pair of Pi4Bs. I had noticed that Larynx gave the best sounding TTS output but that it was horribly slow (around 15 seconds to generate audio for a “The time is…” sentence), so I was really looking forward to giving this a try!
Both Pi machines are running rhasspy as docker images and I use the MQTT server which runs on my Home Assistant server for both HA (also on docker) and rhasspy.
Anyway, I updated my docker installations with 2.5.10 and am glad to report that the increase in speed for Larynx is considerable. The first sentence took 5 seconds to deliver, but subsequent delivery is almost immediate.
On my NUC J5005 the Larynx component works well, although even with low quality mode it takes several seconds to create the sound file. I would be interested to find out what is required to create another German voice. Are there some predefined sentences to record to help in creating a new voice option?
Just tested it again. And yes, the first sentence takes several seconds. The following sentences take about 1 second. High Quality adds 0.5 to 1 second to that. But I don’t really hear a difference in it anyway.
This may need some tuning. I’m using the Minimum Bayes Risk from Kaldi, but I’m not entirely sure the best way to report it.
I must have gotten interrupted implementing this. The confidences are produced during transcription, they’re just not being passed up the layers into textCaptured or the NLU intent. I’ve created a bug report here to remember: https://github.com/rhasspy/rhasspy/issues/207
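As a rough illustration of what passing per-word confidences up alongside the utterance likelihood could look like, here is a minimal Python sketch. The `likelihood` field is the one mentioned above for ASR/textCaptured; the `tokens` list, the `WordConfidence` class, and both helper functions are hypothetical names made up for this example, not the actual Hermes schema or Rhasspy code.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class WordConfidence:
    """One recognized word with its confidence (hypothetical structure)."""
    word: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def build_text_captured(words: List[WordConfidence], likelihood: float) -> str:
    """Build a textCaptured-style JSON payload that also carries word scores."""
    payload = {
        "text": " ".join(w.word for w in words),
        "likelihood": likelihood,               # utterance-level score
        "tokens": [asdict(w) for w in words],   # hypothetical per-word field
    }
    return json.dumps(payload, ensure_ascii=False)


def low_confidence_words(payload_json: str, threshold: float = 0.5) -> List[str]:
    """Return the words whose confidence falls below the threshold."""
    payload = json.loads(payload_json)
    return [
        token["word"]
        for token in payload.get("tokens", [])
        if token["confidence"] < threshold
    ]


if __name__ == "__main__":
    # Made-up scores, purely for demonstration.
    words = [
        WordConfidence("schalte", 0.93),
        WordConfidence("das", 0.88),
        WordConfidence("licht", 0.41),
        WordConfidence("an", 0.97),
    ]
    message = build_text_captured(words, likelihood=0.80)
    print(message)
    print(low_confidence_words(message))
```

With something along these lines, a consumer could reject or re-prompt on a single doubtful word instead of relying only on the score for the whole utterance.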
Ah, thank you. I was debugging a problem that ended up being with pip and forgot to turn this back on.
Yes! Thanks to volunteers like @RaspiManu, we have a set of German phrases to read. Anyone who’s interested, please PM me and I’ll send you a link.
Larynx delays loading the TTS/vocoder models until it’s called the first time, so this is what you’re seeing. I might be able to add an option to preload the voice if this is an issue for people.
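To make the behaviour described above concrete, here is a small Python sketch of lazy loading with an optional preload switch. It is not Larynx code: `LazyVoice`, `fake_loader`, and the `preload` flag are hypothetical names, and the "model" is just a function that sleeps to imitate the load time.

```python
import time


class LazyVoice:
    """Loads a (pretend) TTS model on first use, or at startup if preload=True."""

    def __init__(self, name, loader, preload=False):
        self.name = name
        self._loader = loader  # callable that performs the expensive load
        self._model = None
        if preload:
            self._ensure_loaded()  # pay the loading cost up front

    def _ensure_loaded(self):
        if self._model is None:
            self._model = self._loader(self.name)
        return self._model

    def say(self, text):
        model = self._ensure_loaded()  # slow only on the very first call
        return model(text)


def fake_loader(name):
    """Stand-in for loading TTS/vocoder weights; sleeps to mimic the delay."""
    time.sleep(0.2)
    return lambda text: f"<audio voice={name} text={text!r}>"


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    voice = LazyVoice("demo-voice", fake_loader)
    _, first = timed(voice.say, "The time is noon")
    _, second = timed(voice.say, "The time is one")
    print(f"first call: {first:.2f}s, second call: {second:.2f}s")
```

The first `say` call includes the load time and later calls do not, which matches the pattern people report in this thread. Constructing the object with `preload=True` moves that cost to startup.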
I tried to update Rhasspy on my raspberry pi 3 and it takes a very long time to install dependencies with pip.
When running /home/pi/rhasspy/.venv/bin/python -m pip install "/home/pi/rhasspy" I have this warning displayed:
INFO: pip is looking at multiple versions of importlib-metadata to determine which version is compatible with other requirements. This could take a while.
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking
INFO: pip is looking at multiple versions of hyperframe to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of hpack to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of h2 to determine which version is compatible with other requirements. This could take a while.
One core of the raspberry is at 100% and it takes hours to find the right version.
It’s probably a pip problem more than a rhasspy one, but maybe it should be possible to fix the version in rhasspy requirement file.
You need to force the pip version to <= 20.2.4 for it to work now. They completely ruined pip with the new dependency resolver. I can’t get anything to install now if it has more than one dependency.
If you’re installing from source, try exporting PIP_VERSION="pip<=20.2.4" before make install.
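If you are not sure whether a given virtual environment is affected, a quick check along these lines may help. This is only a sketch: `parse_version` and `needs_downgrade` are hypothetical helpers written for this post, and the only facts assumed are the 20.2.4 pin suggested above and that pip 20.3 is where the new resolver became the default.

```python
import re

PIN = (20, 2, 4)  # highest pip version suggested in this thread


def parse_version(text):
    """Turn '20.2.4' or 'pip 21.0.1 from ...' into a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    parts = [int(piece) for piece in match.group(1).split(".")]
    parts += [0] * (3 - len(parts))  # pad '20.3' to (20, 3, 0)
    return tuple(parts)


def needs_downgrade(version_text, pin=PIN):
    """True if the given pip version is newer than the pinned one."""
    return parse_version(version_text) > pin


if __name__ == "__main__":
    # Sample strings in the format printed by `python -m pip --version`.
    samples = [
        "pip 20.2.4 from /some/path (python 3.7)",
        "pip 20.3 from /some/path (python 3.7)",
        "pip 21.0.1 from /some/path (python 3.7)",
    ]
    for sample in samples:
        verdict = "downgrade" if needs_downgrade(sample) else "ok"
        print(f"{sample.split(' from ')[0]}: {verdict}")
```

The input string can be copied from the output of `/home/pi/rhasspy/.venv/bin/python -m pip --version` for the environment in question.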
It is. I think romkabouter was giving me this tip in another thread.
Basically you just copy the hassio addon repository to your addons/local folder.
Then in the dockerfile you put
Because, as with TensorFlow and all tensor-based math, 64-bit gives a 2-3x speed increase, which means 32-bit is now really only aimed at microcontrollers. It really is at least 2-3x with TensorFlow, and I presume ONNX is very similar.
NEON SIMD is highly optimised in all NN engines, and with ARMv8 the 128-bit NEON registers for float math mean a 2-3x perf increase in real terms. That is absolutely huge, so they don’t see ARMv7 as viable, or at least not worth much mention. Why would you?
I did some benchmarks with everything exactly the same, just Pi OS 64-bit vs 32-bit, and depending on the model, like for like, the perf increase from 32-bit to 64-bit is 2-3x with the wider data bus.