Rhasspy 2.5.10 Released

The official release of 2.5.10 is here! Thanks to everyone for the testing and feedback :slight_smile:

I’ve added a few bug fixes since the preliminary release, but the main new features since 2.5.9 are still:

  • ASR support for Swedish (sv)
  • New Larynx that’s faster and has a ton of new voices
    • Hint: switch to “Low Quality” on a Pi 4 or below for a big speed-up
    • This version should also work with older x86_64 CPUs (no AVX)
  • Kaldi ASR now has confidence values for words and sentences
  • Dialogue manager now has a minimum ASR confidence threshold (speech_to_text.<system>.min_confidence where <system> is kaldi, deepspeech, etc.)
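
As a rough illustration of the threshold setting, here is a small Python sketch. The key path `speech_to_text.<system>.min_confidence` comes from the notes above; the value `0.5`, the helper name `passes_threshold`, and the fallback of `0.0` when the key is missing are assumptions made for this example, not Rhasspy's documented behavior.

```python
import json

# Profile fragment using the key path from the release notes.
# The value 0.5 is an arbitrary example, not a recommended default.
profile_fragment = {"speech_to_text": {"kaldi": {"min_confidence": 0.5}}}


def passes_threshold(profile, system, confidence):
    """Hypothetical helper: True if an ASR confidence meets the threshold.

    Falls back to 0.0 (accept everything) when the key is absent;
    that fallback is an assumption of this sketch.
    """
    min_confidence = (
        profile.get("speech_to_text", {}).get(system, {}).get("min_confidence", 0.0)
    )
    return confidence >= min_confidence


# Show what the fragment would look like in a JSON profile.
print(json.dumps(profile_fragment, indent=2))
```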

Since the preliminary release, I’ve also added:

  • ASR confidences show up now in asr/textCaptured and nlu/intent messages (see asrConfidence and asrTokens).
  • Custom entities from /api/listen-for-command now show up in the recognized nlu/intent (see customEntities)
  • TTS timeouts are based on the text length and dialogue.say_chars_per_second
  • Sound paths in the dialogue manager may be directories, in which case a random WAV file is chosen each time
  • Only one satellite within a group should start recording if more than one detects the wake word at the same time
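
For anyone consuming these messages over MQTT, a minimal sketch of reading the new confidence field might look like the following. The payload is hand-written and hypothetical: only the `asrConfidence` field name is taken from the notes above, the other fields and all values are illustrative, and `is_confident` is a made-up helper name.

```python
import json

# Hypothetical nlu/intent payload. Only the asrConfidence field name comes
# from the release notes; everything else here is illustrative.
example_payload = json.dumps(
    {
        "input": "turn on the light",
        "intent": {"intentName": "LightOn"},
        "asrConfidence": 0.92,
    }
)


def is_confident(payload, threshold=0.8):
    """Return True if the message carries an asrConfidence >= threshold.

    Messages without the field (for example from older versions) are
    treated as not confident; that policy is an assumption of this sketch.
    """
    message = json.loads(payload)
    confidence = message.get("asrConfidence")
    return confidence is not None and confidence >= threshold


print(is_confident(example_payload))
```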

Added

  • New version of Larynx with improved performance and 35 voices (20 English, 1 German, 3 French, 2 Spanish, 3 Dutch, 2 Italian, 1 Swedish, 3 Russian)
  • Kaldi ASR model for Swedish (sv)
  • Confidence and word timings for Kaldi ASR
  • Minimum ASR confidence threshold for dialogue manager
  • Detect AVX support and warn for Larynx, DeepSpeech, and Precise in Web UI
  • Handle spaces in converter arguments with word!(converter, …)
  • rhasspy-tts-cli-hermes TTS commands may be Jinja2 templates (--use-jinja2)
  • Support for MaryTTS effects (jasonhildebrand)
  • customData added to hermes/nlu/query message
  • customData is copied by NLU services from query to intent/intentNotRecognized
  • lang property added for wake, speech_to_text, and intent profile sections
  • Wake, ASR, NLU services all set lang properties if null
  • Profile now has “parent” setting, allowing one profile to load settings from another
  • Dialogue manager sound paths may be directories, from which a random WAV will be chosen each time (thanks plafue)
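
The directory behavior for sound paths can be pictured with a short Python sketch. This is not the dialogue manager's actual code; `resolve_sound_path` is a hypothetical helper, and returning `None` for a directory without WAV files is an assumed policy.

```python
import random
from pathlib import Path


def resolve_sound_path(sound_path):
    """Return a WAV file for a sound setting that may be a file or a directory.

    Hypothetical helper: if sound_path is a directory, pick one of its
    .wav files at random; otherwise return the path unchanged.
    """
    path = Path(sound_path)
    if path.is_dir():
        wav_files = sorted(p for p in path.iterdir() if p.suffix.lower() == ".wav")
        if not wav_files:
            return None  # no WAV files to play (assumed behavior)
        return random.choice(wav_files)
    return path
```

Calling the helper once per playback is what gives a different sound each time.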

Fixed

  • Remote HTTP service sets site_id of satellite for ASR/NLU endpoints
  • DeepSpeech token output (was letters, now words)
  • Multiple values in custom converters are sent as a list on stdin
  • Don’t show restart/shutdown button if “sudo” isn’t available (Docker, Hass.io)
  • Added missing espeak phonemes for some profiles
  • MaryTTS voice test in Web UI
  • Remove dialogue session from site cache on end
  • Don’t throw error about system not configured if message is intent for satellite (schnopsi)
  • Custom entities from /api/listen-for-command are passed through to NLU intent
  • Slots inside sub-directories will properly show up in the web interface
  • Use locks in dialogue manager to prevent multiple group satellite sessions during audio playback
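
To make the custom converter fix in the list above more concrete, here is a hedged Python sketch of a converter that copes with both a single value and a list. It assumes a converter reads one JSON value on stdin and writes one JSON value on stdout; the joining rule in `convert` and the function names are purely illustrative.

```python
import io
import json


def convert(value):
    """Hypothetical conversion: join multiple values, pass a single value through."""
    if isinstance(value, list):
        return " ".join(str(v) for v in value)
    return value


def run(stdin, stdout):
    """Read one JSON value from stdin and write the converted JSON value to stdout."""
    value = json.load(stdin)
    json.dump(convert(value), stdout)


# Demonstration with in-memory streams standing in for real stdin/stdout.
demo_out = io.StringIO()
run(io.StringIO('["living", "room"]'), demo_out)
print(demo_out.getvalue())
```

A real converter script would call `run(sys.stdin, sys.stdout)` instead of using the in-memory streams.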

Changed

  • /api/listen-for-command uses a proper wake workflow now (requires dialogue manager)
  • Show absolute paths for custom models (precise, snowboy, porcupine) in Web UI
  • TTS timeouts are computed using text length (dialogue.say_chars_per_second)
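
One plausible reading of the text-length timeout is a simple division, sketched below in Python. The exact formula used by the dialogue manager is not given in these notes, so treat this as an assumption; `tts_timeout_seconds` and `extra_seconds` are hypothetical names, and only `chars_per_second` corresponds to the `dialogue.say_chars_per_second` setting.

```python
def tts_timeout_seconds(text, chars_per_second, extra_seconds=0.0):
    """Estimate how long to wait for TTS to finish speaking `text`.

    chars_per_second plays the role of dialogue.say_chars_per_second.
    extra_seconds is a hypothetical safety margin, not a Rhasspy setting.
    """
    if chars_per_second <= 0:
        raise ValueError("chars_per_second must be positive")
    return len(text) / chars_per_second + extra_seconds
```

Under this reading, a longer sentence gets a proportionally longer timeout instead of one fixed value for every utterance.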

Great work,

Question: there are two add-on repos:


and

I use the first, but it might be a good idea to deprecate that and start using the latter.
The first is up to date with 2.5.10; the latter is a couple of versions behind.

What do you think, @synesthesiam?

Sounds great, a couple of questions…

As a noob, how do we get the update?

Is it a ‘breaking’ update? i.e. if applied to a working setup, will things stop working and require days of fiddling about :slight_smile:

I’ve updated both to the most recent version. I’ll eventually deprecate the one under my Github user name. Maybe a few more versions :wink:

I do my best not to break things, but it’s really hard to test all of the possible configurations :confused:

All past versions are kept at least, so worst case you can revert to your working version.

Thanks, how are the updates applied?

It depends on how you installed Rhasspy. If you’re using Docker, just follow: https://rhasspy.readthedocs.io/en/latest/installation/#updating

Thanks for the link, updating now…

I just followed the link, thanks

I just updated my Rhasspy to 2.5.10 and tried to switch to Larynx (French).
I’m having trouble downloading the additional files:

DownloadFailedException: (‘https://github.com/rhasspy/gruut/releases/download/v0.9.0/fr-fr.tar.gz’, ‘File size mismatch (got 9991077 byte(s), expected 9983999)’)

Thanks! I’ll upload a fix today :+1:

Is there any documentation on how to use Larynx? I selected it from the GUI, but testing fails:

TtsException: [ONNXRuntimeError] : 3 : NO_SUCHFILE : Load model from /home/pi/.config/rhasspy/profiles/en/tts/larynx/en-us/kathleen-glow_tts/generator.onnx failed:Load model /home/pi/.config/rhasspy/profiles/en/tts/larynx/en-us/kathleen-glow_tts/generator.onnx failed. File doesn't exist

I need to put in the documentation that setting up Larynx involves:

  1. Select Larynx for TTS
  2. Save settings and restart Rhasspy
  3. Go to the top of the page and Download the required voice files
  4. Enjoy!

I had already followed these steps on my Firefox/Linux PC before you posted them, and step 3 never showed up (nothing at the top of the page).
I redid them from my Android phone and got a prompt at the top asking me to download files, which I did. Then a “training …” message appeared and stayed there.
I looked at the log, and here is what I see (I’m on an RPi 3B+):

[ERROR:2021-04-16 18:49:29,508] rhasspyserver_hermes:
Traceback (most recent call last):
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
    return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 840, in api_wake_words
    hotwords = await core.get_hotwords()
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 910, in get_hotwords
    handle_finished(), messages, message_types
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 994, in publish_wait
    result_awaitable, timeout=timeout_seconds
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
[ERROR:2021-04-16 18:49:29,552] rhasspyserver_hermes:
Traceback (most recent call last):
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
    return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 824, in api_speakers
    speakers = await core.get_speakers()
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 881, in get_speakers
    handle_finished(), messages, message_types
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 994, in publish_wait
    result_awaitable, timeout=timeout_seconds
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
[ERROR:2021-04-16 18:49:29,589] rhasspyserver_hermes:
Traceback (most recent call last):
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
    return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 789, in api_microphones
    microphones = await core.get_microphones()
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 848, in get_microphones
    handle_finished(), messages, message_types
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 994, in publish_wait
    result_awaitable, timeout=timeout_seconds
  File "/usr/lib/rhasspy/usr/local/lib/python3.7/asyncio/tasks.py", line 449, in wait_for
    raise futures.TimeoutError()
concurrent.futures._base.TimeoutError

Hi,
I’m working with a MATRIX Creator. Since the update, Rhasspy has had a hard time keeping track of the sound card: every time I reboot or power cycle, I have to go in on start-up and manually point Rhasspy at the microphone. It also keeps creating new hardware: it starts with device 3, then adds 4 and 5 after start-up. Any ideas?

I am able to run Larynx properly on an RPi 4B. The first time a new sentence is spoken, it takes some time before it is played (the logs show that synthesis is going on). However, if the same sentence is spoken again, playback is immediate (I assume it plays from a cache). Is there a way to load all the sentences up front? Also, any dynamic sentence will still have a delay. Is there any way to overcome this?

Setting it to medium or low quality helps quite a bit on a Raspberry Pi. Try both and see which you prefer.

I do plan to add this. It’s tricky because the speech models may not have been downloaded yet when the TTS service starts. So what I’ll probably end up doing is attempting to load them, but falling back to waiting for the first TTS request if that fails.
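
A rough sketch of that preload-with-fallback idea, in Python, could look like this. It is not Larynx or Rhasspy code: `LazyVoice` and the `loader` callable are hypothetical, and treating a missing voice as `FileNotFoundError` is an assumption of the example.

```python
class LazyVoice:
    """Holds a TTS model that is preloaded if possible, else loaded on first use."""

    def __init__(self, loader):
        # loader: hypothetical zero-argument callable that returns a loaded model
        # and raises FileNotFoundError if the voice files are not downloaded yet.
        self._loader = loader
        self._model = None

    def try_preload(self):
        """Attempt to load at service start; return False if files are missing."""
        try:
            self._model = self._loader()
            return True
        except FileNotFoundError:
            return False

    def get(self):
        """Return the model, loading it now if the preload did not succeed."""
        if self._model is None:
            self._model = self._loader()
        return self._model
```

The service would call `try_preload()` once at start-up and `get()` on every TTS request, so the first request only pays the loading cost when the preload failed.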

Thanks. I checked with the different quality settings:
High Quality: 6-7 seconds
Medium: 2-3 seconds (most of the time it is 2 seconds)
Low: 2 seconds
Hardware: RPi 4B 4GB, JBL speaker connected via jack

Also, let me know how I can contribute, if you can point me to specific developer docs on this.
I have been using Rhasspy for quite some time now. I love it and appreciate all the effort. Larynx seems to me by far the best bet, and refining it will be an exciting journey!

This cake [v2.5.10] is great. So delicious and moist!

The new Larynx voices are a whole different level of quality compared to the old ones.

Thanks @VoxAbsurdis :slight_smile:

Love the username, btw.