Rhasspy 2.5.8 Released

synesthesiam · November 19, 2020, 10:27pm

Hi everyone

With the holidays coming up, it seems like a good time to push out a new release. Unlike 2.5.7, there are quite a few new things in 2.5.8 to go over.

Thanks to everyone who contributed, and to the many community members who are helping us build a great voice assistant for everyone. As always, please open GitHub issues so we can squash those bugs.

Larynx TTS

This release finally incorporates the Larynx text to speech system, which is a fork of MozillaTTS. The goal of this TTS system is to provide high quality voices for as many languages as possible, replacing the need for Google Wavenet.

Once it gets warmed up, Larynx runs well on x86_64 systems (NUC, etc.), and OK on a Pi 4. I wouldn’t recommend trying to use it on a Pi 3 or 2. It uses PyTorch on the CPU, so there may be room for improvement with a GPU in the future.

Out of the box, I have voices for Dutch, German, French, Spanish, and Russian. Many more are currently in progress, including English, Swedish, Portuguese, and Vietnamese.

New Kaldi STT Models

In line with the Master Plan, I’ve trained up Kaldi speech to text models for Italian, Spanish, French, and Russian. You can use these now in Rhasspy by selecting Kaldi in the appropriate profile.

More languages are coming as I locate public speech data. There are also several efforts underway to crowd-source this data from the Rhasspy community and other places. If you know of a good dataset or would like to volunteer, please let me know!

Volume Everywhere

Many users have asked for the ability to adjust Rhasspy’s output volume, so I’ve made an effort to add this in a way that (I think) makes the most sense.

In the Settings page, you can now independently set the volumes of:

The audio output service (aplay)
The text to speech service
The dialogue feedback sounds (beeps)

On the main web UI page, there is also a handy “Set Volume” button. If you leave the site ID text box next to it blank, it will change the volume on whatever system you’re using. But you can also put specific site IDs in the box and change the volumes of multiple satellites at once (this uses a new MQTT message).

Lastly, there’s a new /api/set-volume HTTP endpoint for setting the volume programmatically. It also takes a ?siteId=site1,site2,.. parameter if you want to target multiple site IDs. Oh, and /api/text-to-speech now has a ?volume=0.5 parameter if you want just one utterance to be quiet.
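As a rough sketch of how the two endpoints above might be called from Python: the host and port (localhost:12101), the helper names, and sending the volume or text as the POST body are assumptions here, so check them against your own setup before relying on this.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the Rhasspy web server is reachable at this address.
BASE_URL = "http://localhost:12101"


def set_volume_request(volume, site_ids=()):
    """Build a POST for /api/set-volume (volume in the body is an assumption)."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0 and 1")
    url = BASE_URL + "/api/set-volume"
    if site_ids:
        # safe="," keeps the comma-separated site list unescaped
        url += "?" + urlencode({"siteId": ",".join(site_ids)}, safe=",")
    return Request(url, data=str(volume).encode("utf-8"), method="POST")


def quiet_speech_request(text, volume=0.5):
    """Build a POST for /api/text-to-speech that affects only this utterance."""
    url = BASE_URL + "/api/text-to-speech?" + urlencode({"volume": volume})
    return Request(url, data=text.encode("utf-8"), method="POST")
```

Passing either request to urllib.request.urlopen() would send it. Building the request separately makes it easy to inspect the URL first.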

Complete Changelog

Added

Russian Kaldi profile and Larynx TTS voice
Spanish Kaldi profile and Larynx TTS voice
French Kaldi profile and Larynx TTS voice
Italian Kaldi profile
German Larynx TTS voice
Volume scale (0-1) for feedback sounds and TTS
rhasspy/asr/setVolume MQTT message and /api/set-volume HTTP endpoint
rhasspy/asr/recordingFinished MQTT message sent immediately after silence detection
Satellite site ids to intent handling settings in web UI
Group separator for co-located satellites (dialogue.group_separator)
num2words support for Swedish (thanks Bostrom!)

Fixed

Argument list for sound output command system (jrouly)
Expand environment variables in TLS ca_certs
spn silence phone in Swedish profile
Use callback API in PyAudio to avoid buffer overrun
HTTP API JSON should not be forced to ASCII

Changed

Default Kaldi language model type is now text FST instead of arpa

romkabouter · November 20, 2020, 9:38am

Great work! Hope to try it soon.

Platup · November 20, 2020, 6:55pm

Loaded it up on my server and satellites and no issues so far. Awesome work! Thank you.

I did however notice one of the new features to set the volume of the “beeps” doesn’t seem to be working.
Setting the aplay volume on the Satellite seems to affect the beeps and tts, but changing the volume of the “Sounds” on the Satellite (even down to .1) doesn’t seem to make an audible difference.

Speaking of the beeps, is there a way to simply disable some or all? And if the Wake WAV is disabled, will the delay be shorter before it begins listening for the command?

synesthesiam · November 20, 2020, 8:30pm

Hmmmm, I’ll take a look. Thanks for the feedback.

If you delete the file name in the web UI, it should stop playing that WAV file. There should be a shorter delay too, since there’s no worry of the mic picking up the beeps as speech.

Thargor · November 20, 2020, 8:54pm

Thank you very much for the new version!

I tried Larynx TTS (de-thorsten) on my server (Synology Intel NAS) with a satellite setup, but I always get a timeout error:

[ERROR:2020-11-20 20:49:28,093] rhasspyserver_hermes: 
Traceback (most recent call last):
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
    return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 1282, in api_train
    result = await core.train()
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 461, in train
    timeout_seconds=self.training_timeout_seconds,
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 971, in publish_wait
    result_awaitable, timeout=timeout_seconds
  File "/usr/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

How can I get more information about what is not working? Is there more debug info somewhere?

Thank you!

synesthesiam · November 20, 2020, 8:57pm

You’re welcome! Do you see any messages from rhasspytts_larynx_hermes in the log? It can take some time for MozillaTTS to load the model; you should see a message that it successfully created a synthesizer.
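For context, the traceback above ends in asyncio.wait_for, which raises a timeout error when the awaited reply does not arrive in time. The following is only a minimal sketch of that pattern, not Rhasspy's actual code; slow_service and wait_for_reply are hypothetical names.

```python
import asyncio


async def slow_service(delay_seconds):
    """Hypothetical stand-in for a service that takes a while to answer."""
    await asyncio.sleep(delay_seconds)
    return "ready"


async def wait_for_reply(delay_seconds, timeout_seconds):
    """Wait for the service, giving up once the timeout expires."""
    try:
        return await asyncio.wait_for(
            slow_service(delay_seconds), timeout=timeout_seconds
        )
    except asyncio.TimeoutError:
        # The service may still be loading (or may have crashed); the caller
        # only learns that no reply arrived within the time limit.
        return None
```

A timeout on its own does not say why the reply never came, which is why the log of the service itself is the next place to look.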

Thargor · November 21, 2020, 7:48am

Unfortunately I don’t see such a log.
Only:

[DEBUG:2020-11-20 23:00:00,771] rhasspyprofile.download: Skipping tts/larynx/de/thorsten/vocoder/config.json (/profiles/de/tts/larynx/de/thorsten/vocoder/config.json)
[DEBUG:2020-11-20 23:00:00,770] rhasspyprofile.download: Skipping tts/larynx/de/thorsten/vocoder/checkpoint_500000.pth.tar (/profiles/de/tts/larynx/de/thorsten/vocoder/checkpoint_500000.pth.tar)
[DEBUG:2020-11-20 23:00:00,768] rhasspyprofile.download: Skipping tts/larynx/de/thorsten/scale_stats.npy (/profiles/de/tts/larynx/de/thorsten/scale_stats.npy)
[DEBUG:2020-11-20 23:00:00,767] rhasspyprofile.download: Skipping tts/larynx/de/thorsten/config.json (/profiles/de/tts/larynx/de/thorsten/config.json)
[DEBUG:2020-11-20 23:00:00,766] rhasspyprofile.download: Skipping tts/larynx/de/thorsten/checkpoint_380000.pth.tar (/profiles/de/tts/larynx/de/thorsten/checkpoint_380000.pth.tar)
[DEBUG:2020-11-20 23:00:00,764] rhasspyprofile.download: text_to_speech.system larynx larynx = True

synesthesiam · November 22, 2020, 6:58pm

OK, do you see files in your profile under the tts/larynx directory?

Thargor · November 22, 2020, 11:02pm

Yes:

/de/tts/larynx$ ls -Ra
.:
. .. cache de

./cache:
. ..

./de:
. .. thorsten

./de/thorsten:
. .. checkpoint_380000.pth.tar config.json scale_stats.npy vocoder

./de/thorsten/vocoder:
. .. checkpoint_500000.pth.tar config.json

joshward9182 · November 24, 2020, 4:38pm

@synesthesiam I have my Rhasspy server running as a Home Assistant Add-On.

There is currently no option to update this from 2.5.7.2 in HA.

Does it typically take a while to filter through to HA?

romkabouter · November 24, 2020, 9:51pm

It is already available, but it has been renamed to Rhasspy Assistant, which removes the 2.4 version.
You can safely install that and remove Rhasspy 2.5, but first make a copy of the configuration.

Do not worry, your profiles folder on the share will not be deleted.

After uninstalling Rhasspy Assistant 2.5 and installing the new Rhasspy Assistant (pointing to 2.5.8), the 2.5 entry will be gone when you reload the add-ons.

nordeep · November 25, 2020, 7:32am

@synesthesiam Congratulations! Thank you for your work!

It seems something is broken when downloading the Kaldi base_dictionary.txt:
Can’t download https://raw.githubusercontent.com/rhasspy/ru_kaldi-rhasspy/raw/master/base_dictionary.txt.gz - 404: Not Found

synesthesiam · November 25, 2020, 2:56pm

You’re welcome!

Ah, I see what happened here. I’ll get a fix pushed out for this soon.

synesthesiam · November 25, 2020, 3:12pm

@Thargor, silly question: have you tried completely restarting Rhasspy? For some reason, the voice didn’t work for me until I did this.

But I do have this in my console log:

[DEBUG] {'de-thorsten': {'model_path': PosixPath('/home/hansenm/.config/rhasspy/profiles/de/tts/larynx/de/thorsten/checkpoint_380000.pth.tar'), 'config_path': PosixPath('/home/hansenm/.config/rhasspy/profiles/de/tts/larynx/de/thorsten/config.json'), 'vocoder_path': PosixPath('/home/hansenm/.config/rhasspy/profiles/de/tts/larynx/de/thorsten/vocoder/checkpoint_500000.pth.tar'), 'vocoder_config_path': PosixPath('/home/hansenm/.config/rhasspy/profiles/de/tts/larynx/de/thorsten/vocoder/config.json')}}
[DEBUG] Creating Larynx synthesizer (de-thorsten)...
[INFO] Created synthesizer for de-thorsten

Thargor · November 25, 2020, 3:45pm

Rhasspy runs on a Synology NAS inside Docker. You are right, the Docker log shows:

If I try to say “Das ist ein Test” from the satellite web frontend, I get the following log:

It seems there is a problem with an “illegal instruction”, and the process crashed.
Any idea what could be the reason?

synesthesiam · November 25, 2020, 4:35pm

OK, I’m guessing that the CPU in your Synology does not support AVX instructions. I’m using the official PyTorch CPU wheel for x86_64, which is probably compiled for AVX.

I may try and compile my own non-AVX wheel. It sucks that the moment you step into PyTorch/Tensorflow land, it suddenly really matters which year your CPU was made or what tier it is.
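For anyone who wants to check their own machine, here is a small sketch that looks for the avx flag in /proc/cpuinfo-style text. It assumes a Linux x86 system where that file has a "flags" line; the function names are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def supports_avx(cpuinfo_text):
    """Return True if the exact flag "avx" is present."""
    return "avx" in cpu_flags(cpuinfo_text)
```

On Linux, passing the contents of /proc/cpuinfo to supports_avx() gives a quick yes or no before installing a wheel that may need AVX.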

Thargor · November 25, 2020, 5:35pm

Yes, you are right, it is a Celeron J3455 without AVX. At least this mystery is solved.

KiboOst · November 26, 2020, 4:20pm

Many thanks for that!

I’ve integrated it into the Jeedom plugin, and it works perfectly!
We now have to set the device volume to 100% when the Raspberry Pi starts:
amixer -c 0 set Playback 100%

TotalSpaceshipguy · November 26, 2020, 6:58pm

Hi

First post here so apologies if I’ve messed something up.
I’ve just updated the deb package from 2.5.7 to 2.5.8. When I run “rhasspy -p en” I now get this error…

Starting up…
Using virtual environment at /usr/lib/rhasspy/.venv
python3: error while loading shared libraries: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory

On the latest Ubuntu (20.10), Python is 3.8. I’ve tried symlinking the 3.8 lib to that file, but then I just get more errors…
symbol lookup error: python3: undefined symbol: _Py_UnixMain

Is this something anyone else has come across or has a workaround for?

synesthesiam · November 26, 2020, 7:06pm

Hi @TotalSpaceshipguy, thanks for posting!

Can you try apt-get installing libpython3.7 and see if that fixes it? I tried a different method for packaging the .deb files this time around, and I may have missed a requirement.