Rhasspy 2.5 Pre-Release

rickmini · March 28, 2020, 5:44pm

Sorry I’m late getting back to you. When I said it didn’t run, I did so based on the web interface not being able to connect. I rebooted, then ran docker ps and found 3 instances of the Docker image listed. After closing all of them and trying again, I had strange results, not worth relating, so I went back to my external MQTT server, and everything works great. So I am satisfied to use this solution. I am now spending my time modifying my application that formerly ran on Snips to run on Rhasspy. Ver 2.5 allows me to do this for the first time, since I make extensive use of Hermes MQTT and the program flow that it provides. As of right now, I was able to compile about 95% of it. The port depends upon internal Hermes systems like its dialogue flow: “hermes/dialogueManager/sessionStarted” and …ended as well as how the intents are parsed for slot values. I also use other Hermes messages to determine how to make robot functions fire.
I have found a few things that work on Snips but not on Rhasspy.
I am in the process of documenting this now. When I am finished (probably before Apr 4) I will provide this to you. What is the best way to do that?
Also, in previous versions I did not use Kaldi. I am using it now, and it seems much more accurate and faster (not sure about the speed, since I’m now on a Pi 4). Is this your experience as well?
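
For anyone porting a Snips application in a similar way, here is a minimal sketch of following the dialogue flow through the two session topics mentioned above. It assumes the payloads are JSON objects with sessionId and siteId fields; the SessionTracker class and its method names are hypothetical, and the MQTT client wiring is left out so the logic runs on its own.

```python
import json

SESSION_STARTED = "hermes/dialogueManager/sessionStarted"
SESSION_ENDED = "hermes/dialogueManager/sessionEnded"


class SessionTracker:
    """Hypothetical helper that remembers which dialogue sessions are open."""

    def __init__(self):
        self.open_sessions = {}  # sessionId -> siteId

    def handle(self, topic, payload):
        """Feed one MQTT message: topic string plus raw payload (bytes or str)."""
        if topic not in (SESSION_STARTED, SESSION_ENDED):
            return  # not a session message, ignore it
        message = json.loads(payload)
        session_id = message.get("sessionId")
        if topic == SESSION_STARTED:
            # Assumption: fall back to "default" when no siteId is given.
            self.open_sessions[session_id] = message.get("siteId", "default")
        else:
            self.open_sessions.pop(session_id, None)
```

With an MQTT client library, handle() would typically be called from the message callback with the topic and payload of each received message.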

kaykoch · March 28, 2020, 6:47pm

My experiences with 2.5:

Master

MaryTTS: Docker on Ubuntu
MQTT: Docker on Ubuntu (mosquitto)
Master: Docker on Ubuntu

–> Works with satellites

2.4.16, zero, env, ReSpeaker2, API
2.4.18, PI3, Docker, ReSpeaker2, API
2.5, PI3, Docker, ReSpeaker2, mqtt

Satellites:

2.5-pre, zero, env, ReSpeaker2

All satellites with UDP Audio Port :12202 and Output siteId: wohnzimmer

Satellite reacts to the “Wake Up” button:

Microphone works, intents are recognized, beep sound OK, answer from Node-RED

Satellite reacts to the spoken wake word:

Beep sound OK, nothing more; after 30 sec: beep

[DEBUG:2020-03-28 19:09:13,878] rhasspyspeakers_cli_hermes: Publishing 63 bytes(s) to hermes/audioServer/schlafzimmer/playFinished

Then MQTT messages are sent for 30 sec:

  "hermes/audioServer/schlafzimmer/audioFrame"

Then:

 hermes/asr/stopListening
 hermes/dialogueManager/sessionEnded
 hermes/asr/textCaptured (no text inside)
 hermes/hotword/toggleOff

Then beep sound and

 hermes/hotword/toggleOn
 hermes/asr/toggleOn

AFTER 30 SECONDS:

[DEBUG:2020-03-28 19:09:41,964] rhasspymicrophone_cli_hermes: <- AsrStopListening(siteId='schlafzimmer', sessionId='schlafzimmer-default-2b6e8623-f681-4f12-bab6-fbc91b6f0d9e')
[DEBUG:2020-03-28 19:09:41,982] rhasspyserver_hermes: <- AsrTextCaptured(text='', likelihood=0, seconds=0, siteId='schlafzimmer', sessionId='schlafzimmer-default-2b6e8623-f681-4f12-bab6-fbc91b6f0d9e', wakewordId='')
[DEBUG:2020-03-28 19:09:41,985] rhasspymicrophone_cli_hermes: Enable UDP output
[DEBUG:2020-03-28 19:09:42,000] rhasspyserver_hermes: Playing WAV /root/.config/rhasspy/profiles/de/wav/answer.wav
[DEBUG:2020-03-28 19:09:42,007] rhasspyserver_hermes: -> HotwordToggleOff(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,015] rhasspyserver_hermes: Publishing 26 bytes(s) to hermes/hotword/toggleOff
[DEBUG:2020-03-28 19:09:42,029] rhasspywake_snowboy_hermes: <- HotwordToggleOff(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,034] rhasspyserver_hermes: -> AsrToggleOff(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,039] rhasspywake_snowboy_hermes: Disabled
[DEBUG:2020-03-28 19:09:42,042] rhasspyserver_hermes: Publishing 26 bytes(s) to hermes/asr/toggleOff
[DEBUG:2020-03-28 19:09:42,058] rhasspyserver_hermes: Subscribed to hermes/audioServer/schlafzimmer/playFinished
[DEBUG:2020-03-28 19:09:42,062] rhasspyserver_hermes: -> AudioPlayBytes(48940 byte(s))
[DEBUG:2020-03-28 19:09:42,122] rhasspyspeakers_cli_hermes: <- AudioPlayBytes(48940 byte(s))
[DEBUG:2020-03-28 19:09:42,143] rhasspyspeakers_cli_hermes: ['aplay', '-q', '-t', 'wav']
[DEBUG:2020-03-28 19:09:42,187] rhasspyserver_hermes: <- AsrAudioCaptured(44 byte(s))
[DEBUG:2020-03-28 19:09:42,234] rhasspyserver_hermes: -> HotwordToggleOn(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,254] rhasspyserver_hermes: Publishing 26 bytes(s) to hermes/hotword/toggleOn
[DEBUG:2020-03-28 19:09:42,266] rhasspywake_snowboy_hermes: <- HotwordToggleOn(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,276] rhasspywake_snowboy_hermes: Enabled
[DEBUG:2020-03-28 19:09:42,272] rhasspyserver_hermes: -> AsrToggleOn(siteId='schlafzimmer')
[DEBUG:2020-03-28 19:09:42,284] rhasspyserver_hermes: Publishing 26 bytes(s) to hermes/asr/toggleOn
[DEBUG:2020-03-28 19:09:43,497] rhasspyspeakers_cli_hermes: -> AudioPlayFinished(id='1a9e4f7c-9cb5-482d-bd4e-6b5216b1db3b', sessionId='')
[DEBUG:2020-03-28 19:09:43,502] rhasspyspeakers_cli_hermes: Publishing 63 bytes(s) to hermes/audioServer/schlafzimmer/playFinished
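
To measure gaps like the roughly 28 seconds between the first playFinished message (19:09:13,878) and AsrStopListening (19:09:41,964), the log lines can be parsed mechanically. This is only a sketch: the pattern is derived from the excerpt above rather than from any documented log format, and the function names are made up for illustration.

```python
import re
from datetime import datetime

# Pattern inferred from the excerpt: [LEVEL:date time,millis] service: message
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]+):"
    r"(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"(?P<service>\w+): (?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict with level, time, service and message, or None if no match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%Y-%m-%d %H:%M:%S,%f")
    return entry


def elapsed_seconds(first_line, second_line):
    """Seconds between two log lines (negative if the second one is earlier)."""
    first = parse_log_line(first_line)
    second = parse_log_line(second_line)
    return (second["time"] - first["time"]).total_seconds()
```

Because several services write to the same log, timestamps are not always strictly increasing (see the Enabled and AsrToggleOn lines), so sorting by time before comparing is a reasonable precaution.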

2.5-pre, zero, Docker, ReSpeaker2

Satellite reacts to the “Wake Up” button (with wake word disabled):

Microphone works, intents are recognized, beep sound OK, answer from Node-RED

Satellite does NOT react to the spoken wake word:

2020-03-28 18:36:37,498 INFO spawned: 'wake_word' with pid 408
2020-03-28 18:36:38,514 INFO success: wake_word entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-03-28 18:36:47,289 INFO exited: wake_word (exit status 1; not expected)
2020-03-28 18:36:48,319 INFO spawned: 'wake_word' with pid 417

(this repeats endlessly)

2.5-pre, PI3, docker, JABRA 410 (This machine works with 2.4.18)

Satellite reacts to the “Wake Up” button and the spoken wake word:

Microphone works, intents are recognized, no beep sound, no answer, no speaker
I tried all possible devices. No DEBUG information. The speaker is simply not working.
Stopping the 2.5 Docker container and starting 2.4.18 --> everything works

I hope someone can make use of this information.

frkos · March 29, 2020, 8:05am

I’ve just tried 2.5 and… I can’t use it =((
It’s super fast, but it recognizes only 1 of 10 phrases, while the previous version works well.

I understand it’s because of Kaldi, so is it possible to tune it somehow? I’m OK with a 2-3 sec delay if I get a good recognition rate.

romkabouter · March 29, 2020, 8:06am

I see some unexpected messages popping up when wakeword is used

Rhasspy 2.5 is running in Docker on an i3 NUC.
I have the Matrix Voice streamer as mic input. The hotword gets detected (snowboy), but then I see:

1 message on hermes/hotword/default/detected -> correct
2 messages on hermes/hotword/toggleOff -> expecting just 1
followed within a second by hermes/hotword/toggleOn -> unexpected

This causes the leds to turn green and straight away to blue (toggleOff / toggleOn)
The speech recognition is working properly.

The image below shows: wake word detected, say nothing, and wait for the timeout.
The double hotword toggles can be seen clearly.
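
A small sketch of how such double toggles could be flagged automatically from a captured message list. The function name and the one-second window are assumptions made for illustration, and the sample timings in the comment are invented, not taken from this setup.

```python
def find_repeated_messages(events, window=1.0):
    """Return (topic, first_time, repeat_time) for topics repeated within `window` seconds.

    `events` is an iterable of (timestamp_in_seconds, topic) pairs.
    """
    last_seen = {}
    repeats = []
    for timestamp, topic in sorted(events):
        previous = last_seen.get(topic)
        if previous is not None and timestamp - previous <= window:
            repeats.append((topic, previous, timestamp))
        last_seen[topic] = timestamp
    return repeats


# Invented example timings that mirror the sequence described above.
captured = [
    (0.00, "hermes/hotword/default/detected"),
    (0.05, "hermes/hotword/toggleOff"),
    (0.10, "hermes/hotword/toggleOff"),
    (0.60, "hermes/hotword/toggleOn"),
]
print(find_repeated_messages(captured))
```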

geoffrey · March 29, 2020, 3:12pm

I tracked down the issue a bit more and it seems to be a permissions problem with regards to /dev/stdout that is used by supervisor.

When I disable the following line in rhasspy-satellite/rhasspy-supervisor/rhasspysupervisor/__init__.py, then I am able to start it as a service.

print("stdout_logfile=/dev/stdout", file=out_file)

Would you be able to check which user and group in your setup own /dev/stdout? This is my output:

pi@raspberrypi:~ $ ls -al /dev/std*
lrwxrwxrwx 1 root root 15 Feb 14  2019 /dev/stderr -> /proc/self/fd/2
lrwxrwxrwx 1 root root 15 Feb 14  2019 /dev/stdin -> /proc/self/fd/0
lrwxrwxrwx 1 root root 15 Feb 14  2019 /dev/stdout -> /proc/self/fd/1
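
The same check can be scripted so it is easy to run as the user the service runs under. This is a rough sketch for Unix systems only; describe_path is a made-up helper name, and the writable flag reflects the current process, which is what matters when supervisor tries to open /dev/stdout as a log file.

```python
import grp
import os
import pwd


def describe_path(path):
    """Collect owner, group, mode and link target for a path (Unix only)."""
    info = os.lstat(path)  # lstat: look at the link itself, not its target
    try:
        owner = pwd.getpwuid(info.st_uid).pw_name
    except KeyError:
        owner = str(info.st_uid)  # uid without a passwd entry
    try:
        group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        group = str(info.st_gid)
    return {
        "path": path,
        "target": os.readlink(path) if os.path.islink(path) else None,
        "owner": owner,
        "group": group,
        "mode": oct(info.st_mode & 0o777),
        "writable": os.access(path, os.W_OK),  # follows the link
    }


if __name__ == "__main__":
    for name in ("/dev/stdin", "/dev/stdout", "/dev/stderr"):
        if os.path.lexists(name):
            print(describe_path(name))
```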

Martin_Maier · March 29, 2020, 5:02pm

Hi @geoffrey, I’m a little bit astonished by your problems and by the way you are trying to track down these errors. I can show you how the complete installation should work on a Pi Zero, as I have done it this way many times:

Flashing raspbian-buster-lite onto an SD card using balenaEtcher

After first boot apt update and upgrade

Configuring wlan, ssh, keyboard, timezone …

Installing samba to get file access to the pi

Install HermesLedControl https://github.com/project-alice-assistant/HermesLedControl which also provides the drivers for the Respeaker 2 Hat.

I also install snips-satellite (sudo apt install mosquitto mosquitto-clients snips-satellite snips-hotword-model-heysnipsv4 -y) because for me it is the best wake word at the moment.

After this, install the snips-satellite venv … test it (should work) … install the service …

That’s it. In this way I’ve installed many Pi Zero satellites, which all work pretty well.

kaykoch · March 29, 2020, 5:46pm

Update:

Zeros are working

I tried again today with newer Docker images.
I installed the newest Docker image from 27.3.20.
With that I could get Rhasspy working on all Zeros with ReSpeaker.
HOORAY

Jabra USB Problem

Wrong sound card number used in Rhasspy

But the problem with the USB speaker remains. Still no sound.
By default, I deactivate the onboard sound card on the Pi and the Zeros:

 dtparam=audio=off

This worked on all satellites except the one with the Jabra. All the others have a ReSpeaker2.
aplay recognized them as CARD 0 (aplay -l).
The USB Jabra was recognized as CARD 1, even when I deactivated the onboard card.
I activated the onboard sound card and plugged in a speaker.
Rhasspy works. It always works, even if I change the device to the USB speaker.
So I think there is a fault in the latest Rhasspy image.

The latest Rhasspy 2.5 image ALWAYS uses CARD 0, whatever device you select.
I solved this problem by changing the index number (WORKAROUND):
!! USB is the id of the card (USB [Jabra SPEAK 410 USB]) !!

Create a file: /etc/modprobe.d/alsa-base.conf

insert:

options snd_usb_audio index=0
options snd_usb_audio id="USB"

reboot

After that, aplay -l gives:

 Karte 0: USB [Jabra SPEAK 410 USB], Gerät 0: USB Audio [USB Audio]

Now, also the last one is working.
HOORAY
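
To confirm which index a card ended up with after the reboot, the aplay -l output can be parsed. This is a sketch under the assumption that card lines look like the one above, in either English ("card") or German ("Karte"); list_cards is a hypothetical helper, not part of Rhasspy.

```python
import re

# Matches lines such as: "Karte 0: USB [Jabra SPEAK 410 USB], Gerät 0: ..."
CARD_LINE = re.compile(
    r"^\s*(?:card|Karte) (?P<index>\d+): (?P<id>\S+) \[(?P<name>[^\]]*)\]"
)


def list_cards(aplay_output):
    """Map each ALSA card id to its index, based on `aplay -l` text output."""
    cards = {}
    for line in aplay_output.splitlines():
        match = CARD_LINE.match(line)
        if match:
            cards[match["id"]] = int(match["index"])
    return cards


# To use it on a live system (not run here):
#   import subprocess
#   output = subprocess.run(["aplay", "-l"], capture_output=True, text=True).stdout
#   print(list_cards(output))
```

If the workaround took effect, the entry for the id USB should map to 0.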

MaryTTS

Wrong locale given in Rhasspy

For those who use MaryTTS:
Check the locale code with

 http://IP-ADRESS:59125/locales

It gives you the right one.
For a German voice it is de.
IT IS NOT THE ONE GIVEN IN RHASSPY (de-DE)
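
A sketch of doing this check from a script. It assumes the /locales endpoint returns a whitespace-separated list of locale codes as plain text; both function names are made up, and the fallback from de-DE to de simply mirrors the observation above.

```python
from urllib.request import urlopen


def fetch_locales(base_url):
    """Fetch the locale list from a MaryTTS server (response format is an assumption)."""
    with urlopen(base_url.rstrip("/") + "/locales", timeout=5) as response:
        return response.read().decode("utf-8").split()


def pick_locale(wanted, available):
    """Return the server locale matching `wanted`, falling back to the bare language code."""
    normalized = wanted.replace("-", "_")
    for candidate in (wanted, normalized, normalized.split("_")[0]):
        if candidate in available:
            return candidate
    return None


# Example with a hand-written locale list instead of a live server:
print(pick_locale("de-DE", ["de", "en_US"]))
```

On a real setup, the second argument would come from fetch_locales("http://IP-ADRESS:59125") with the server address filled in.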

Something more

I’m using two wake words on different satellites, so I want different voices.
This works if you use a standard voice, which is controlled by the master, and a special one, which is controlled by the special satellite itself. You have to point the TTS to the master for the standard satellites and point the special one directly to MaryTTS.
IMPORTANT: Only the standard satellite names should be in the master’s Satellite siteIds in TTS.
If you leave the special one in there as well, you get the answer twice.

So this was a good weekend.
@synesthesiam

Please look into the sound card problem
and change the locale for MaryTTS

And thanks for all your great work.

geoffrey · March 29, 2020, 8:42pm

Thank you for the description of how you set it up.

It’s not that much of an issue for me to work through, as e.g. the new Docker image for the satellite does work. It’s just that if I encounter these things, others might as well, so I dug a little deeper.

frkos · March 30, 2020, 7:16am

Guys, does anyone have issues with speech recognition using Kaldi?
Especially if you have PS3eye… I’m stuck with this

When I play the last voice command through the interface, I hear my voice clearly…

jrb5665 · March 30, 2020, 1:59pm

Hi @Martin_Maier,
I experience some of the issues that @geoffrey reported and have not been able to run it successfully as a service. I haven’t tried his workaround for starting the services. I haven’t been able to add much to the discussion so far and was wondering if I would have to wait for this to become more mature.

It’s interesting that you have had no problems.

I asked you a while back about your setup and even blew my setup away a couple of times to make sure I followed what you did properly. The only difference I think I have is that I am trying to use snowboy, not the Snips hotword, but I don’t think that explains the issues with running it as a service.

I haven’t tried the docker image yet

hawkeye217 · March 30, 2020, 3:46pm

I have a PS3eye and sometimes recognition has been sketchy (with both Kaldi and Pocketsphinx), but I think that’s mainly because I have the gain turned up - the few issues I’ve been having are with silence detection and the recording not terminating if I’m far away from the mic.

Though I have noticed that Kaldi sometimes doesn’t recognize me as well as Pocketsphinx does…

Is silence detection adjustable in 2.5 as it was in 2.4?

EDIT: Another question: should webrtcvad parameters be adjusted on the master or on the satellite? And are those parameters adjustable and functional as they were in 2.4?

Martin_Maier · March 30, 2020, 4:52pm

Hi jrb5665, I did a fresh installation with only the rhasspy-satellite venv. The test with 'bin/rhasspy-satellite --profile en' ran OK. Then I installed the service … started it … and got some errors from the MQTT connection. So I started the test again … changed MQTT to external (ip = satellite-ip, port = 1883) … stopped the test … started the service again … and now it runs. It seems there are still some problems with the internal MQTT, but of course this is a satellite, so its MQTT connection will always be external … and so this should not be a problem.

geoffrey · March 30, 2020, 7:50pm

The errors I receive are not restricted to not being able to start mosquitto; that’s just what I noticed first.

In the end, none of Rhasspy’s services are able to start through supervisor. So even after connecting to my main external broker, the next service fails to launch.

Thank you for testing along. Together we will squash all the bugs

jrb5665 · March 30, 2020, 9:52pm

Mine is also a satellite and always set to external

synesthesiam · March 31, 2020, 2:19pm

Just a heads up: lots of bug fixes will be coming soon, hopefully by the end of today. The move to Python 3.7 and a newer version of the quart web framework broke quite a bit more than I expected. But it has made the code ultimately simpler and easier to maintain!

I’m adding more and more unit tests, so bugs should stay squashed for good

As one example, the websocket API was completely broken outside of a web browser because the quart guy decided that the HTTP Origin header was now a requirement, despite being explicitly optional in the spec. Node-RED doesn’t set this header, so I have to monkey patch quart to fix the problem

geoffrey · March 31, 2020, 2:35pm

Something else I noticed in Node-RED when connecting to the external MQTT broker is that the messages are not recognized as JSON objects but rather as plain text.

I have to set the Output property of the mqtt in node to a parsed JSON object instead of the default auto-detect (string or buffer).
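
The same situation shows up in any MQTT client: the payload arrives as raw bytes and has to be parsed explicitly. A minimal Python sketch of that step, where decode_payload is a hypothetical helper name:

```python
import json


def decode_payload(payload):
    """Return parsed JSON if the payload is valid JSON, otherwise the decoded text."""
    if isinstance(payload, (bytes, bytearray)):
        text = payload.decode("utf-8")
    else:
        text = payload
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text  # not JSON, hand back the plain string
```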

Your previous post made me think of it: I came across this when I had issues with the websocket connection and tried changing to MQTT.

Thank you for the continued effort and heads up!

thomas_cologne · March 31, 2020, 8:19pm

Not a bug, but maybe a proposal for one of the next versions:

Sometimes it’s hard to hit the correct position in the upper bar. Would it be possible to enlarge the link from just the red marked zone to the blue marked one?
Does that make sense?
As I said, just a small remark.

synesthesiam · March 31, 2020, 9:11pm

I agree. I’m always hoping someone will really want to take the web stuff off my hands…

fastjack · April 7, 2020, 12:37pm

3 posts were split to a new topic: Rhasspy 2.5: Porcupine library issue

NathanC · April 2, 2020, 7:46am

Hey! I just joined the forum-- I set up rhasspy ~ 2 weeks ago (used to use Snips).

This project is super cool, and I just tried the pre-release and it works like a charm (although I need to use the rhasspy dialog manager, hermes has some issues). Excited to leverage the new expanded hermes spec-compliance in my python intent handler.

One thing I noticed is that audio frames are getting streamed over MQTT even w/o the wake word being detected. What do you think about some (potentially default) setting to only stream the frames after the wakeword is detected? Otherwise mic data is going to be going over the queue 24/7 which seems privacy impacting.
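
To make the idea concrete, here is a rough sketch of the gating logic. It is not an existing Rhasspy setting: AudioGate is a hypothetical class, the topic layouts are assumed to be hermes/hotword/<wakewordId>/detected and hermes/audioServer/<siteId>/audioFrame, and it only makes sense where the wake word is detected locally, since the wake word service itself needs the audio before detection.

```python
import json


def _site_id(payload):
    """Read siteId from a JSON payload, defaulting to "default" (assumption)."""
    return json.loads(payload or b"{}").get("siteId", "default")


class AudioGate:
    """Hypothetical filter: forward audio frames only while a site is listening."""

    def __init__(self):
        self.listening_sites = set()

    def should_forward(self, topic, payload=b""):
        parts = topic.split("/")
        if len(parts) == 4 and parts[:2] == ["hermes", "hotword"] and parts[3] == "detected":
            self.listening_sites.add(_site_id(payload))  # wake word heard: open the gate
            return True
        if topic == "hermes/dialogueManager/sessionEnded":
            self.listening_sites.discard(_site_id(payload))  # session over: close it
            return True
        if len(parts) == 4 and parts[:2] == ["hermes", "audioServer"] and parts[3] == "audioFrame":
            return parts[2] in self.listening_sites
        return True  # everything else passes through untouched
```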

I’m going to look into contributing to this project later! Also, my background is in Comp Sci but my current focus is AppSec-- feel free to ping me if you ever want to bounce around security questions or do any threat modeling.

side note: the new web UI for the pre-release is a lot smoother, especially the settings section! great job