I have 2.4.19 installed on two x230 laptops and an Intel NUC, with various microphones, on Manjaro and Ubuntu, running in a venv. I am using Kaldi/OpenFST and have a slightly annoying problem with both Porcupine and Snowboy as the wake word listener. Commands like “turn all lights on” or “make reading area 50 percent” work well. However, “tell me a joke” or “set reading area to 50 percent” are shortened to “me a joke” and “reading area to 50 percent” in about 80% of my tests with either wake word system (Snowboy seems to perform a tiny bit better). Recording directly from the web interface (holding the button) always detects the first word.
I tried the various VAD modes (0, 1, 2, 3), but the more aggressive the VAD, the worse things seem to get. Pausing after the first word, as in “set … reading area to 50 percent”, makes correct recognition more likely.
I looked at them and experimented with them while using Porcupine. I tried values smaller than the default but did not see any improvement. I will now give it another go with Snowboy.
Do I understand correctly that I have to decrease these buffers so that less is thrown away, or do I have to increase them if more should be recorded before the actual command activity?
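To make the question concrete, here is a toy model of how such buffers could cut off the first word. It is only a sketch and not the actual listener code: the function `kept_frames`, the parameter name `throwaway_buffers`, and the made-up VAD decisions are all assumptions for illustration. In this model, frames dropped right after the wake word and frames consumed while waiting for enough speech never reach the recording, so smaller values lose less of the start.

```python
from typing import List, Sequence


def kept_frames(
    speech_flags: Sequence[bool],
    throwaway_buffers: int,
    speech_buffers: int,
) -> List[int]:
    """Return the indices of the frames that end up in the recording.

    speech_flags[i] is a hypothetical VAD decision for audio frame i.
    Both buffer parameters are counted in frames (illustrative model only).
    """
    kept: List[int] = []
    consecutive_speech = 0
    recording = False
    for index, is_speech in enumerate(speech_flags):
        if index < throwaway_buffers:
            continue  # dropped unconditionally right after the wake word
        if recording:
            kept.append(index)
            continue
        # Count speech frames in a row; a non-speech frame resets the count.
        consecutive_speech = consecutive_speech + 1 if is_speech else 0
        # Recording starts only after speech_buffers speech frames were consumed.
        if consecutive_speech > speech_buffers:
            recording = True
            kept.append(index)
    return kept


# The speaker starts talking immediately after the wake word.
utterance = [True] * 20

# Index of the first frame that is kept with larger buffers: 15
print(kept_frames(utterance, throwaway_buffers=10, speech_buffers=5)[0])

# With both buffers at 0, the recording starts at frame 0
print(kept_frames(utterance, throwaway_buffers=0, speech_buffers=0)[0])
```

If the real listener behaves anything like this, it would also explain why pausing after the first word helps: the pause simply moves the real command past the frames that get discarded.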
Anyway, at the moment I am using a not very good microphone, so I don’t know whether that could be the cause, but I listened to the recorded audio (in the web GUI) and it really did seem to be missing the first part.
I also had to set speech_buffers to 0 for webrtcvad to resolve my problem.
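In the profile this might look roughly like the fragment below. The nesting under `command` and `webrtcvad` is an assumption about the profile layout, so please check it against your own profile before copying it.

```json
{
  "command": {
    "webrtcvad": {
      "speech_buffers": 0
    }
  }
}
```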