Rhasspy 2.5: Wake word and audio feedback issue

j3mu5 · April 3, 2020, 6:33pm

Hello, everybody,

I hope you and those close to you are well and everyone is healthy.
Thanks for the great work on Rhasspy 2.5.

Some things already work pretty well in my docker-based 2.5.0-pre:
MQTT, Audio Recording, Intent Handling, WakeWord (Snowboy), STT (Kaldi), Intent Recognition, TTS (PicoTTS and MaryTTS - @kaykoch thanks for the tip!).

But I don’t get any audio feedback at the moment.
However, I can see in the log that the spoken text and the Snowboy wake word are recognized:
[DEBUG:2020-04-03 18:22:54,387] rhasspyserver_hermes: <- HotwordDetected(modelId='33_hey_pico_tone_mod_full_2x', modelVersion='', modelType='personal', currentSensitivity=0.9, siteId='default', sessionId='', sendAudioCaptured=None)

But with Porcupine or Precise I don’t get that far. I can select the WakeWord files stored in the respective folders, save and restart rhasspy or train - but in the log I do not see any confirmation that the respective Wake Word was recognized.

Is this to be expected in the current stage of 2.5.0-pre, or should Porcupine or Precise already work?

Andrew49 · April 3, 2020, 6:48pm

Hi @j3mu5,
Porcupine is working for me with Rhasspy 2.5-pre in Docker. (Also Snowboy but it gave me a lot of false positives. I haven’t tried Precise.)
It took a while to get to that stage, I found that Home Assistant (v. 106.6 running in a different container on same machine - Intel NUC) was interfering and preventing the mic from working with Rhasspy. Don’t know if you have anything like this? I also found that the volume setting for the mic took a while to find the “sweet spot” where it picks up the voice, but not so loud that distortion occurs. You can listen to your recordings to judge the sound quality. Another thing, I only got it working reasonably well when I switched from the internal MQTT broker to my external one. And finally, Rhasspy still seems to lose the plot sometimes and I need to restart the Docker container (not the button in Rhasspy) to get it listening again, but in between it works nicely and links with Home Assistant to carry out intents and return speech.
Hope these things help you. Good luck!

geoffrey · April 3, 2020, 8:14pm

I had the same issue: I no longer heard the sounds play when a wakeword was detected, the message was recorded, or the intent was not recognized, using the 2.5.0-pre-arm32v6 Docker image on a Raspberry Pi Zero.

The log did, however, show the wakeword being detected and the intents being recognized.

I just updated to the most recent version of the 2.5 Docker image for the server and now the sound issue is resolved.

HTH

hawkeye217 · April 4, 2020, 12:11am

Yep, fixed for me too. But now the wake sound itself appears in the recording of the command… Wonder if that was an intentional or unintentional side effect of moving the sounds to the dialogue system.

j3mu5 · April 4, 2020, 9:02am

I did a new installation of Rhasspy via docker-compose today (April 4) on a freshly installed Raspberry Pi 4. I will use this one from now on for experimenting with 2.5.

I found the get-rhasspy script in the documentation. Is this necessary for the restart button in Rhasspy to work as it should? So I would have to restart Rhasspy either employing the script or manually in bash via “docker-compose restart rhasspy”. Is that correct?
I am a bit confused, because I now have a “docker-compose.yaml” outside the Rhasspy profile folder and additionally I find a “docker-compose.yml” in the de profile folder. In the “docker-compose.yaml” outside the Rhasspy profile folder I have a stack of containers and defined that Rhasspy will not start until MaryTTS is started.
Which docker-compose should I use?

On the new pi the output of sound works fine now. By copying the Rhasspy profile from the new installation to the old pi the sound output worked there, too! Either copying the profile or updating the docker image on the old Pi solved the sound problem. Thank you @geoffrey @synesthesiam

j3mu5 · April 4, 2020, 2:26pm

Porcupine and Snowboy both work now - I think the trick was to first restart the Docker container with the “Restart” button in the web interface and then restart the whole Docker container.
Or maybe it’s the current image of 2.5.0-pre. Nevertheless, it works!

Unfortunately Precise does not trigger yet. What I have tried so far:
Set a precise model from here or my self-trained one (both files *.pb and *.pb.params).

Investigate the start of Rhasspy with supervisord.log: Snowboy, Porcupine and Precise show no error message here; for all 3 I see something like this (unless I refer to a nonexistent wake word - then there are error messages instead):
2020-04-04 14:05:37,039 INFO spawned: 'wake_word' with pid 159

Follow start with “docker-compose up” or ws: Again, I don’t see a specific error message for Precise. It looks like precise is started.

I tried different wake words with a sensitivity of 0.5 or 0.9 and a trigger level of 3. Under no circumstances will the wakeword be triggered.

Is the sensitivity for Precise, like for Snowboy, a value between 0 and 1, where higher = more sensitive?

What is the trigger level? I have found an explanation here (which I think refers to precise), but I don’t understand it completely. It sounds like: Lower trigger level, more sensitive.

Others also use trigger level = 2.
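For anyone else puzzling over the same question, here is a simplified sketch of how a trigger level can interact with sensitivity. It is an illustration of the general idea only, not Precise's actual code: the function name `detect` and the exact counting rule are assumptions. Each audio chunk gets a probability; a chunk counts as "active" when the probability exceeds 1 - sensitivity, and the detector fires once the activation counter exceeds the trigger level. Under that model, a lower trigger level does mean fewer active chunks are needed, i.e. more sensitive.

```python
def detect(probabilities, sensitivity=0.5, trigger_level=3):
    """Return the index of the chunk where the wake word fires, or None.

    Hypothetical, simplified model (not the real Precise implementation):
    - a chunk is "active" when its probability exceeds (1 - sensitivity)
    - each active chunk raises the activation counter by one
    - each inactive chunk lowers it by one (never below zero)
    - the detector fires when the counter exceeds trigger_level
    """
    threshold = 1.0 - sensitivity
    activation = 0
    for index, prob in enumerate(probabilities):
        if prob > threshold:
            activation += 1
            if activation > trigger_level:
                return index
        elif activation > 0:
            activation -= 1
    return None


# Made-up per-chunk probabilities, purely for illustration.
probs = [0.1, 0.9, 0.9, 0.9, 0.2, 0.9, 0.9]
print(detect(probs, sensitivity=0.5, trigger_level=3))
print(detect(probs, sensitivity=0.5, trigger_level=2))
```

With these made-up numbers, trigger level 2 fires earlier than trigger level 3, and a very low sensitivity (threshold close to 1) never fires at all.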

Martin_Maier · April 4, 2020, 2:26pm

Hi all,
today I installed the latest Docker 2.5.0-pre on my test master (Ubuntu Server) and the latest rhasspy-satellite venv on a test Pi Zero with a ReSpeaker 2. This combination is now nearly unusable: most of the time, after wakeword detection (indicated by LED and beep), an error beep and error LED follow immediately, with no possibility to speak a command. This happens in approximately 8 out of 10 tries. If, after the wake word, only the listen LED turns on without a beep, the command is recognized and the OK LED comes on without a beep. The same device combination works pretty well with the older software (from 14 days ago), except for some minor issues.

hawkeye217 · April 4, 2020, 2:53pm

Just recently, @synesthesiam committed some changes to 2.5-pre that moved the sound handling to the dialogue system. I’m assuming this is why the wake sound now appears in the command recordings and is resulting in the unusable behavior for you. With my setup it still seems to work fairly reliably, though. I’m sure he will figure out a solution soon!

Martin_Maier · April 5, 2020, 7:46am

Hi @hawkeye217,
thanks for the reply. First I tried to disable the audio playback, which didn’t work (I deleted the sound entry in profile.json on the satellite, but the sound still plays). After setting the sound level to 0, the satellite works well again.

hawkeye217 · April 5, 2020, 12:52pm

Yes, I also tried removing the WAV file line from my profile, and it still played the sound as well. You should create an issue on GitHub so that @synesthesiam will see it.

synesthesiam · April 5, 2020, 1:50pm

Setting the path to the WAV file to an empty string should disable the sound. I’m still working on getting the timing right here; unfortunately, I don’t see this issue on any of my test systems (Pi zero/2/3/desktop).
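A minimal sketch of that change, assuming the profile keeps its sound paths under a `sounds` section with `wake`, `recorded` and `error` keys (these key names are an assumption, so check your own profile.json). Deleting the entry was reported earlier in the thread not to work, so the sketch writes explicit empty strings instead of removing keys. The helper name `disable_sounds` is made up for illustration.

```python
import copy
import json


def disable_sounds(profile, names=("wake", "recorded", "error")):
    """Return a copy of the profile dict with the given sound paths set to "".

    The "sounds" section and the key names are assumptions about the
    profile layout; adjust them to match your own profile.json.
    """
    updated = copy.deepcopy(profile)
    sounds = updated.setdefault("sounds", {})
    for name in names:
        # An empty string (rather than a missing key) is what should
        # disable the sound, per the post above.
        sounds[name] = ""
    return updated


# Example with an in-memory profile; the path below is a placeholder.
example = {"sounds": {"wake": "/path/to/wake.wav"}, "mqtt": {"enabled": True}}
print(json.dumps(disable_sounds(example, names=("wake",)), indent=2))
```

To apply it to a real profile, load the file with `json.load`, pass the dict through the helper, and write it back with `json.dump`, then restart Rhasspy.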

The dialogue manager should be waiting for the wake sound to finish playing before having the ASR start listening. What makes this complicated is that there may not be an audio playing service running (and hence no finished message), so I have it time out after the duration of the WAV file. I may need to add an option to delay recording by some number of seconds, since this approach doesn’t account for the time it takes to transmit the play message and start playing the WAV file.
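The waiting logic described here can be sketched roughly as follows. This is an illustration under stated assumptions, not Rhasspy's actual code: the function names and the `extra_delay` parameter are hypothetical, and the "finished" message is modelled as a `threading.Event` that some MQTT callback would set.

```python
import io
import threading
import wave


def wav_duration_seconds(wav_bytes):
    """Compute the playing time of a WAV file held in memory."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def wait_for_playback(finished, wav_bytes, extra_delay=0.0):
    """Block until playback is reported finished or the timeout elapses.

    finished:    threading.Event set when a play-finished message arrives
    wav_bytes:   the WAV data that was sent to the audio player
    extra_delay: hypothetical allowance for message transport and
                 player start-up time, in seconds

    Returns True if the finished message arrived, False on timeout
    (for example, when no audio playing service is running).
    """
    timeout = wav_duration_seconds(wav_bytes) + extra_delay
    return finished.wait(timeout)
```

Only after `wait_for_playback` returns would the ASR be told to start listening. If the timeout covers just the WAV duration, any latency in delivering the play message can still let the tail of the sound leak into the recording, which is where an extra delay would help.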

Ah, the joys of distributed systems…

hawkeye217 · April 5, 2020, 2:05pm

Thanks for all your hard work @synesthesiam. If it helps, here’s a log capture from my satellite where I end up with the wake sound in the command recording. In this specific instance, it did what @Martin_Maier was experiencing - not only was the wake sound in the recording, but it immediately gave me the error sound and I wasn’t able to speak a command at all.

[DEBUG:2020-04-05 14:02:25,714] rhasspywake_snowboy_hermes: Wake word detected: computer_activate (siteId=musicpi4)
[DEBUG:2020-04-05 14:02:25,731] rhasspywake_snowboy_hermes: -> HotwordDetected(modelId='computer_activate', modelVersion='', modelType='personal', currentSensitivity=0.55, siteId='musicpi4', sessionId='', sendAudioCaptured=None)
[DEBUG:2020-04-05 14:02:25,744] rhasspywake_snowboy_hermes: Publishing 171 bytes(s) to hermes/hotword/computer_activate/detected
[DEBUG:2020-04-05 14:02:25,810] rhasspyserver_hermes: <- HotwordDetected(modelId='computer_activate', modelVersion='', modelType='personal', currentSensitivity=0.55, siteId='musicpi4', sessionId='', sendAudioCaptured=None)
[DEBUG:2020-04-05 14:02:25,855] rhasspyspeakers_cli_hermes: <- AudioPlayBytes(83948 byte(s))
[WARNING:2020-04-05 14:02:25,879] rhasspyserver_hermes: Dialogue management is disabled. ASR will NOT be automatically enabled.
[DEBUG:2020-04-05 14:02:25,893] rhasspyspeakers_cli_hermes: ['aplay', '-q', '-t', 'wav']
[DEBUG:2020-04-05 14:02:25,872] rhasspywake_snowboy_hermes: <- HotwordToggleOff(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:25,926] rhasspywake_snowboy_hermes: Disabled
[DEBUG:2020-04-05 14:02:26,270] rhasspywake_snowboy_hermes: <- HotwordToggleOn(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:26,287] rhasspywake_snowboy_hermes: Enabled
[DEBUG:2020-04-05 14:02:26,325] rhasspywake_snowboy_hermes: <- HotwordToggleOff(siteId='musicpi4', reason='dialogueSession')
[DEBUG:2020-04-05 14:02:26,320] rhasspymicrophone_cli_hermes: <- AsrStartListening(siteId='musicpi4', sessionId='musicpi4-computer_activate-94d8fc23-974e-44c6-a18e-5d37d1cb643b', stopOnSilence=True, sendAudioCaptured=True, wakewordId='computer_activate')
[DEBUG:2020-04-05 14:02:26,340] rhasspymicrophone_cli_hermes: Disable UDP output
[DEBUG:2020-04-05 14:02:26,359] rhasspywake_snowboy_hermes: Disabled
[DEBUG:2020-04-05 14:02:26,589] rhasspyspeakers_cli_hermes: -> AudioPlayFinished(id='ec8aa8db-8e9f-4f8f-a524-189b187e49ae', sessionId='')
[DEBUG:2020-04-05 14:02:26,613] rhasspyspeakers_cli_hermes: Publishing 63 bytes(s) to hermes/audioServer/musicpi4/playFinished
[DEBUG:2020-04-05 14:02:27,572] rhasspymicrophone_cli_hermes: <- AsrStopListening(siteId='musicpi4', sessionId='musicpi4-computer_activate-94d8fc23-974e-44c6-a18e-5d37d1cb643b')
[DEBUG:2020-04-05 14:02:27,695] rhasspymicrophone_cli_hermes: Enable UDP output
[DEBUG:2020-04-05 14:02:27,745] rhasspywake_snowboy_hermes: <- HotwordToggleOn(siteId='musicpi4', reason='dialogueSession')
[DEBUG:2020-04-05 14:02:27,738] rhasspyspeakers_cli_hermes: <- AudioPlayBytes(119908 byte(s))
[DEBUG:2020-04-05 14:02:27,847] rhasspyspeakers_cli_hermes: ['aplay', '-q', '-t', 'wav']
[DEBUG:2020-04-05 14:02:27,831] rhasspywake_snowboy_hermes: Enabled
[DEBUG:2020-04-05 14:02:27,930] rhasspywake_snowboy_hermes: <- HotwordToggleOff(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:27,871] rhasspyserver_hermes: <- NluIntentNotRecognized(input='', id='', siteId='musicpi4', sessionId='musicpi4-computer_activate-94d8fc23-974e-44c6-a18e-5d37d1cb643b')
[DEBUG:2020-04-05 14:02:28,045] rhasspywake_snowboy_hermes: <- HotwordToggleOn(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:28,124] rhasspyserver_hermes: Sent 98 char(s) to websocket
[DEBUG:2020-04-05 14:02:28,343] rhasspywake_snowboy_hermes: Disabled
[DEBUG:2020-04-05 14:02:28,439] rhasspywake_snowboy_hermes: Enabled
[DEBUG:2020-04-05 14:02:28,460] rhasspywake_snowboy_hermes: <- HotwordToggleOff(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:28,540] rhasspywake_snowboy_hermes: <- HotwordToggleOn(siteId='musicpi4', reason='playAudio')
[DEBUG:2020-04-05 14:02:28,608] rhasspywake_snowboy_hermes: Disabled
[DEBUG:2020-04-05 14:02:28,657] rhasspywake_snowboy_hermes: Enabled
[DEBUG:2020-04-05 14:02:29,345] rhasspyspeakers_cli_hermes: -> AudioPlayFinished(id='41a8b490-f7e0-4cf3-9296-ba71e4a7f860', sessionId='')
[DEBUG:2020-04-05 14:02:29,363] rhasspyspeakers_cli_hermes: Publishing 63 bytes(s) to hermes/audioServer/musicpi4/playFinished
[DEBUG:2020-04-05 14:02:29,398] rhasspyspeakers_cli_hermes: <- AudioPlayBytes(155492 byte(s))
[DEBUG:2020-04-05 14:02:29,435] rhasspyspeakers_cli_hermes: ['aplay', '-q', '-t', 'wav']
[DEBUG:2020-04-05 14:02:30,717] rhasspyspeakers_cli_hermes: -> AudioPlayFinished(id='73156c31-4d9e-4b02-9f9b-a8348a9f47b3', sessionId='')
[DEBUG:2020-04-05 14:02:30,733] rhasspyspeakers_cli_hermes: Publishing 63 bytes(s) to hermes/audioServer/musicpi4/playFinished
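To make the timing in a capture like this easier to read, here is a small sketch that parses log lines in this format and reports how long before the first AudioPlayFinished the AsrStartListening message was logged. The helper names are made up, and it assumes all timestamps come from clocks that agree with each other. The two sample lines are copied from the capture above.

```python
import re
from datetime import datetime

# Matches lines like: [DEBUG:2020-04-05 14:02:26,320] service_name: message
LINE_RE = re.compile(
    r"^\[(\w+):(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] (\S+): (.*)$"
)

SAMPLE = [
    "[DEBUG:2020-04-05 14:02:26,320] rhasspymicrophone_cli_hermes: <- AsrStartListening(siteId='musicpi4', sessionId='musicpi4-computer_activate-94d8fc23-974e-44c6-a18e-5d37d1cb643b', stopOnSilence=True, sendAudioCaptured=True, wakewordId='computer_activate')",
    "[DEBUG:2020-04-05 14:02:26,589] rhasspyspeakers_cli_hermes: -> AudioPlayFinished(id='ec8aa8db-8e9f-4f8f-a524-189b187e49ae', sessionId='')",
]


def parse_line(line):
    """Return (timestamp, service, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    _level, stamp, service, message = match.groups()
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f"), service, message


def first_time(lines, marker):
    """Timestamp of the first line whose message contains marker, else None."""
    for line in lines:
        parsed = parse_line(line)
        if parsed and marker in parsed[2]:
            return parsed[0]
    return None


def listening_lead(lines):
    """Seconds by which AsrStartListening precedes the first AudioPlayFinished."""
    start = first_time(lines, "AsrStartListening")
    finished = first_time(lines, "AudioPlayFinished")
    if start is None or finished is None:
        return None
    return (finished - start).total_seconds()


print(listening_lead(SAMPLE))
```

A positive result means the microphone was told to start listening while the wake sound was still playing, which would be consistent with the sound ending up in the command recording.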

synesthesiam · April 6, 2020, 7:32pm

Hi @hawkeye217, I see from your log that you have dialogue management disabled. What are you using to start/stop the ASR, etc.?

hawkeye217 · April 6, 2020, 10:48pm

Yep, this is the log from the satellite where I have dialogue management disabled per the Getting Started Guide for a shared MQTT broker. Dialogue management is enabled on the master, also per the guide. Should it be different?