Rhasspy eats first word after wakeword

ulno · March 9, 2020, 1:42pm

I have 2.4.19 installed on two X230 laptops and an Intel NUC, with various microphones, on Manjaro and Ubuntu, running in a venv. I am using kaldi/openfst and experience a slightly annoying problem with both porcupine and snowboy as the wakeword listener. Commands like “turn all lights on” or “make reading area 50 percent” work well. However, “tell me a joke” or “set reading area to 50 percent” are shortened to “me a joke” and “reading area to 50 percent” in about 80% of my tests (snowboy seems to perform a tiny bit better). Recording directly from the web interface (holding the button) always detects the first word.

I played with the various vad settings (0, 1, 2, 3), but things only seem to get worse the more aggressive the vad is. Pausing after the first word, as in “set … reading area to 50 percent”, makes correct recognition more likely.

Any ideas what I could change/test?

frkos · March 9, 2020, 2:18pm

Hi @ulno
Did you try changing the value of throwaway_buffers?

https://rhasspy.readthedocs.io/en/latest/command-listener/#webrtcvad

ulno · March 9, 2020, 6:24pm

I looked at them and played with them when using porcupine. I tried values smaller than the default but did not see any improvement. I will now give it another go with snowboy.

Do I understand correctly that I have to decrease these buffers to throw less away, or do I have to increase them if more should be recorded before the actual command activity?
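
For a sense of scale: the audio dropped at the start of a recording is roughly throwaway_buffers times the duration of one chunk, so a lower value throws less away. The sketch below is only an estimate under assumptions that are not stated in this thread: 16 kHz, 16-bit mono audio, a chunk_size of 960 bytes, and a default throwaway_buffers of 10. The function name discarded_seconds is made up for illustration and is not part of Rhasspy.

```python
def discarded_seconds(throwaway_buffers, chunk_size_bytes=960,
                      sample_rate=16000, sample_width_bytes=2):
    """Estimate how much audio is dropped before recording really starts.

    All default values are assumptions (16 kHz, 16-bit mono, 960-byte chunks).
    """
    # One chunk holds chunk_size_bytes / sample_width_bytes samples.
    samples_per_chunk = chunk_size_bytes / sample_width_bytes
    # Duration of the dropped audio = dropped samples / samples per second.
    return throwaway_buffers * samples_per_chunk / sample_rate


if __name__ == "__main__":
    for n in (10, 5, 1, 0):
        ms = discarded_seconds(n) * 1000
        print(f"throwaway_buffers={n}: about {ms:.0f} ms discarded")
```

Under those assumptions, 10 buffers drop about 300 ms, which could plausibly cover a short first word such as “set” or “tell”, while 1 buffer drops about 30 ms.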

ulno · March 9, 2020, 6:32pm

Setting throwaway_buffers to 1 seems to take care of it with both snowboy and porcupine (on my desktop). I still need to test on the other PCs.

ulno · March 9, 2020, 9:42pm

Yep, works on everything slightly modern.
Can’t use my Intel NUC with a Celeron though, as it cannot run openfst, which is required for kaldi.

leoben7 · May 6, 2020, 4:02pm

I had this problem as well…

At the moment I am using a not very good microphone, so I don’t know whether that plays a role, but I listened to the recorded audio (in the web GUI) and the first part really did seem to be missing.

I also had to set speech_buffers to 0 for webrtcvad to resolve my problem.
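
For anyone who wants to apply both values from this thread in one go, here is a minimal sketch that sets throwaway_buffers to 1 and speech_buffers to 0 in a profile loaded as a Python dict. It assumes the settings live under a "command" / "webrtcvad" section of the profile JSON, as the linked documentation page suggests; that layout and the helper name apply_vad_overrides are assumptions, so check them against your own profile before relying on this.

```python
import json
from copy import deepcopy


def apply_vad_overrides(profile, throwaway_buffers=1, speech_buffers=0):
    """Return a copy of the profile dict with the two webrtcvad values set.

    Assumes the settings sit under profile["command"]["webrtcvad"];
    missing sections are created, other keys are left untouched.
    """
    updated = deepcopy(profile)
    vad = updated.setdefault("command", {}).setdefault("webrtcvad", {})
    vad["throwaway_buffers"] = throwaway_buffers
    vad["speech_buffers"] = speech_buffers
    return updated


if __name__ == "__main__":
    # Hypothetical profile fragment, used only to show the resulting shape.
    example = {"command": {"system": "webrtcvad", "webrtcvad": {"vad_mode": 0}}}
    print(json.dumps(apply_vad_overrides(example), indent=2))
```

The function works on a copy, so the original dict stays unchanged and the result can be inspected before it is written back to the profile file.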