Listener doesn't stop after done talking

dblanc28 · August 27, 2020, 6:36pm

Hello,

I am currently running version 2.4.2 on a Raspberry Pi Zero as a satellite, and all is going well except for one thing. I speak the wake word and it wakes; I speak the command and it hears me, but after I'm done talking it continues to listen until it times out after 30 seconds. After the 30 seconds it processes what I said correctly. It just seems like it can't tell when I've stopped talking.

I actually have two different setups doing the same thing: one Pi Zero using the Google AIY kit and another Pi Zero using the 2-mic seeed-voicecard.

If anyone has seen this before or knows of a solution, that would be great.

Thank you

koan · August 27, 2020, 7:11pm

The 2.4 release isn’t maintained anymore. Can you migrate to Rhasspy 2.5?

dblanc28 · August 27, 2020, 7:15pm

Thanks, Koan. I have done some testing and playing around with 2.5 and ran into some problems with it on the Pi Zero. I found that 2.5 was more resource-intensive, and with the limited computing power the Pi Zero has, I felt it was better to stick with 2.4.

Thanks for your comment!

coxtor · October 15, 2020, 10:43am

Hi, I am on 2.5.6 and am having similar issues. The wake word works quite well using Snowboy. However, after issuing a command it never stops listening until I say a random word really loud after a short period of silence. I am using the ps2 eye cam. In prior versions I had no issues.

coxtor · October 15, 2020, 10:49am

So what I have figured out from the logs is that the time between the confirmation sound (which signals that audio recording is active) and “Receiving audio” is approximately 4 seconds, which is what creates my issue. All of this runs on an ASRock J3455-ITX CPU, which has more than enough resources.

coxtor · October 15, 2020, 10:53am

A simple, hacky solution could be to play the sound only once the device is actually receiving audio. That would at least create less confusion. Is there a way to implement this?

fastjack · October 15, 2020, 5:38pm

This looks like an issue with the silence detection component (VAD) configuration. Maybe you have some background noises that prevent Rhasspy from detecting the end of utterance. Try looking in earlier topics of this forum. I remember some folks tinkered with the parameters to improve end of utterance detection.
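To illustrate why background noise can keep a listener from ever stopping, here is a toy energy-threshold end-of-utterance detector. It is only a sketch and not Rhasspy's actual VAD; the function names, the `threshold` of 500 and the `silence_frames` count are all made up for the example, and it assumes 16-bit little-endian mono PCM frames.

```python
import math
import struct


def constant_frame(amplitude: int, samples: int = 160) -> bytes:
    """Build a test frame of 16-bit little-endian mono PCM at a fixed level."""
    return struct.pack("<%dh" % samples, *([amplitude] * samples))


def rms(frame: bytes) -> float:
    """Root-mean-square level of a frame of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def find_end_of_utterance(frames, threshold=500.0, silence_frames=10):
    """Return the index of the frame where the utterance is considered over.

    The utterance ends once speech has been heard and `silence_frames`
    consecutive frames fall below `threshold`. Returns None if that never
    happens, which corresponds to listening until an external timeout.
    """
    heard_speech = False
    quiet_run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            quiet_run = 0
        elif heard_speech:
            quiet_run += 1
            if quiet_run >= silence_frames:
                return index
    return None
```

With a quiet room (for example `constant_frame(50)` after the speech frames) the function returns an index shortly after the speech ends. If the background level stays above the threshold (for example `constant_frame(800)`), it returns `None`, which is the "keeps listening until it times out" behaviour described above.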

coxtor · October 16, 2020, 8:05am

Hi, thanks for your suggestion. However, I don’t think that is the issue. Here is a section of my log output; you can see that it simply takes over two seconds until the device is ready to record audio after the sound has played.

    rhasspy    | [DEBUG:2020-10-16 08:01:22,450] rhasspyasr_pocketsphinx_hermes: Receiving audio
    rhasspy    | [DEBUG:2020-10-16 08:01:31,223] rhasspywake_porcupine_hermes: -> HotwordDetected(model_id='/profiles/de/porcupine/blueberry_linux.ppn', model_version='', model_type='personal', current_sensitivity=0.5, site_id='default', session_id=None, send_audio_captured=None, lang=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,223] rhasspywake_porcupine_hermes: Publishing 210 bytes(s) to hermes/hotword/blueberry_linux/detected
    rhasspy    | [DEBUG:2020-10-16 08:01:31,226] rhasspyserver_hermes: <- HotwordDetected(model_id='/profiles/de/porcupine/blueberry_linux.ppn', model_version='', model_type='personal', current_sensitivity=0.5, site_id='default', session_id=None, send_audio_captured=None, lang=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,227] rhasspydialogue_hermes: <- HotwordDetected(model_id='/profiles/de/porcupine/blueberry_linux.ppn', model_version='', model_type='personal', current_sensitivity=0.5, site_id='default', session_id=None, send_audio_captured=None, lang=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,227] rhasspydialogue_hermes: Playing WAV /profiles/de/sounds/beep_hi_when.wav
    rhasspy    | [DEBUG:2020-10-16 08:01:31,231] rhasspydialogue_hermes: -> HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,231] rhasspydialogue_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOff
    rhasspy    | [DEBUG:2020-10-16 08:01:31,232] rhasspydialogue_hermes: -> AsrToggleOff(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,232] rhasspydialogue_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOff
    rhasspy    | [DEBUG:2020-10-16 08:01:31,233] rhasspywake_porcupine_hermes: <- HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,233] rhasspydialogue_hermes: -> AudioPlayBytes(165932 byte(s)) to hermes/audioServer/default/playBytes/ad03a606-6b7f-42dc-8a0d-0c602cf173a3
    rhasspy    | [DEBUG:2020-10-16 08:01:31,233] rhasspywake_porcupine_hermes: Disabled
    rhasspy    | [DEBUG:2020-10-16 08:01:31,235] rhasspydialogue_hermes: Waiting for playFinished (timeout=2.130816326530612)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,237] rhasspyasr_pocketsphinx_hermes: <- AsrToggleOff(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:31,238] rhasspyasr_pocketsphinx_hermes: Disabled
    rhasspy    | [DEBUG:2020-10-16 08:01:33,201] rhasspydialogue_hermes: <- AudioPlayFinished(id='ad03a606-6b7f-42dc-8a0d-0c602cf173a3', session_id='')
    rhasspy    | [DEBUG:2020-10-16 08:01:33,201] rhasspytts_wavenet_hermes: <- AudioPlayFinished(id='ad03a606-6b7f-42dc-8a0d-0c602cf173a3', session_id='')
    rhasspy    | [DEBUG:2020-10-16 08:01:33,204] rhasspydialogue_hermes: -> HotwordToggleOn(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,204] rhasspydialogue_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOn
    rhasspy    | [DEBUG:2020-10-16 08:01:33,206] rhasspydialogue_hermes: -> AsrToggleOn(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,207] rhasspydialogue_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOn
    rhasspy    | [DEBUG:2020-10-16 08:01:33,208] rhasspydialogue_hermes: Starting new session (id=default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,210] rhasspydialogue_hermes: -> DialogueSessionStarted(session_id='default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434', site_id='default', custom_data='blueberry_linux', lang=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,210] rhasspywake_porcupine_hermes: <- HotwordToggleOn(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,210] rhasspywake_porcupine_hermes: Enabled
    rhasspy    | [DEBUG:2020-10-16 08:01:33,211] rhasspydialogue_hermes: Publishing 145 bytes(s) to hermes/dialogueManager/sessionStarted
    rhasspy    | [DEBUG:2020-10-16 08:01:33,213] rhasspydialogue_hermes: -> HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.DIALOGUE_SESSION: 'dialogueSession'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,214] rhasspydialogue_hermes: Publishing 50 bytes(s) to hermes/hotword/toggleOff
    rhasspy    | [DEBUG:2020-10-16 08:01:33,213] rhasspyasr_pocketsphinx_hermes: <- AsrToggleOn(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,214] rhasspyasr_pocketsphinx_hermes: Enabled
    rhasspy    | [DEBUG:2020-10-16 08:01:33,215] rhasspydialogue_hermes: Listening for session default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434
    rhasspy    | [DEBUG:2020-10-16 08:01:33,216] rhasspydialogue_hermes: -> AsrStartListening(site_id='default', session_id='default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id='blueberry_linux', intent_filter=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,216] rhasspydialogue_hermes: Publishing 217 bytes(s) to hermes/asr/startListening
    rhasspy    | [DEBUG:2020-10-16 08:01:33,219] rhasspywake_porcupine_hermes: <- HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.DIALOGUE_SESSION: 'dialogueSession'>)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,220] rhasspywake_porcupine_hermes: Disabled
    rhasspy    | [DEBUG:2020-10-16 08:01:33,220] rhasspyasr_pocketsphinx_hermes: <- AsrStartListening(site_id='default', session_id='default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id='blueberry_linux', intent_filter=None)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,221] rhasspyasr_pocketsphinx_hermes: Starting listening (session_id=default-blueberry_linux-241c713d-d64d-43a5-a590-dda59794e434)
    rhasspy    | [DEBUG:2020-10-16 08:01:33,264] rhasspyasr_pocketsphinx_hermes: Receiving audio
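
The gaps in a log like this are easier to judge when computed rather than read off by eye. Below is a small sketch that parses the `[LEVEL:date time,millis]` prefix seen in the lines above and returns the time between two lines. The helper names `parse_time` and `seconds_between` are invented for this example.

```python
import re
from datetime import datetime

# Matches the "[DEBUG:2020-10-16 08:01:31,227]" prefix used in the log above.
TIMESTAMP = re.compile(r"\[\w+:(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\]")


def parse_time(line: str) -> datetime:
    """Extract the timestamp from one log line."""
    match = TIMESTAMP.search(line)
    if match is None:
        raise ValueError("no timestamp found in line: %r" % line)
    stamp, millis = match.groups()
    base = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return base.replace(microsecond=int(millis) * 1000)


def seconds_between(first_line: str, second_line: str) -> float:
    """Elapsed seconds from the first log line to the second."""
    return (parse_time(second_line) - parse_time(first_line)).total_seconds()
```

Applied to the log above, the gap from the `Playing WAV` line (08:01:31,227) to the final `Receiving audio` line (08:01:33,264) is about 2.04 seconds, and the gap from `AudioPlayBytes` (08:01:31,233) to `AudioPlayFinished` (08:01:33,201) is about 1.97 seconds, so almost all of the delay falls inside the sound playback itself.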

fastjack · October 16, 2020, 9:09am

Looking at the logs you provided I would suspect the culprit is either the beep_hi_when.wav audio file or its playback.

Waiting for playFinished (timeout=2.130816326530612)

This might indicate that Rhasspy estimates the length of the audio file to be around 2 seconds. I do not see the AudioPlayFinished message being published in the logs (only its reception). Is the audio output service on another device?

Is the beep_hi_when.wav a custom sound?
How long is it?
What sample rate and bit depth?
Is it played successfully?
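
The length, sample rate and bit depth questions can be answered with Python's standard `wave` module. This is a minimal sketch; `describe_wav` and `silent_wav` are names invented here, and `wave` only reads uncompressed PCM WAV files.

```python
import io
import wave


def describe_wav(source):
    """Return basic properties of a PCM WAV file.

    `source` is a path given as a str, or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate,
        }


def silent_wav(sample_rate_hz: int, seconds: int) -> io.BytesIO:
    """Build an in-memory 16-bit mono WAV of silence, for trying the function out."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16 bit
        out.setframerate(sample_rate_hz)
        out.writeframes(b"\x00\x00" * sample_rate_hz * seconds)
    buffer.seek(0)
    return buffer
```

For the file in this thread, the call would be something like `describe_wav("/profiles/de/sounds/beep_hi_when.wav")`, run wherever that path is visible (inside the container, if Rhasspy runs in Docker).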

coxtor · October 16, 2020, 9:38am

Awesome, thanks. It didn’t occur to me that this could be the issue. The default audio works fine.
I use MQTT Hermes to play the sound remotely. The file I use is 1 sec long and has a bit rate of 705 kbps at 51000 Hz mono. So I guess I have to say goodbye to that sound…
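
The reported numbers can be cross-checked with a little arithmetic. The sketch below rests on assumptions that are not stated in the thread: uncompressed 16-bit PCM, a plain 44-byte WAV header, and helper names (`pcm_bit_rate`, `pcm_duration`) invented for this example.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return sample_rate_hz * bit_depth * channels


def pcm_duration(payload_bytes: int, sample_rate_hz: int, bit_depth: int,
                 channels: int = 1, header_bytes: int = 44) -> float:
    """Playing time in seconds of a PCM WAV payload.

    header_bytes=44 assumes a minimal WAV header with no extra chunks.
    """
    byte_rate = sample_rate_hz * (bit_depth // 8) * channels
    return (payload_bytes - header_bytes) / byte_rate


# 705 kbps matches 44100 Hz, 16 bit, mono (705600 bit/s), not 51000 Hz.
print(pcm_bit_rate(44100, 16))
print(pcm_bit_rate(51000, 16))

# 165932 is the AudioPlayBytes payload size from the log above.
print(pcm_duration(165932, 44100, 16))
```

Under those assumptions, 705 kbps corresponds to 44100 Hz rather than 51000 Hz, and the 165932-byte payload from the log works out to roughly 1.88 seconds of audio. That is 0.25 seconds less than the `timeout=2.130816326530612` in the log, which would fit a timeout of "estimated duration plus a small margin". The margin is inferred from the numbers only, not confirmed from Rhasspy's source. It also suggests the file may be closer to 2 seconds long than 1, which is worth verifying before giving up on the sound; trimming it might be enough.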