Speaking without pause between wakeword and sentence

KiboOst · January 11, 2020, 10:01am

Hi,

Just a question: with Google Assistant you can say “Ok Google, turn on the light” in one go.

Why, with Snips, Rhasspy, etc., do we have to trigger the wakeword, wait for listening to start, and finally talk?

Why can’t the ASR receive the entire stream and, once the wakeword is recognized, decode what was said immediately after it and stop when silence is detected?
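To make the idea concrete, here is a minimal sketch of that flow in Python: read one continuous stream, skip everything up to the wakeword, keep the frames that follow, and stop after a run of silent frames. Everything in it is an assumption for illustration and is not how Snips or Rhasspy work: `is_wakeword` is a hypothetical per-frame callback (real detectors are stateful and look at many frames), silence is guessed from a crude energy threshold, and the audio is assumed to be 16-bit little-endian mono PCM.

```python
import struct
from typing import Callable, Iterable, List


def frame_energy(frame: bytes) -> float:
    """Mean absolute amplitude of a 16-bit little-endian mono PCM frame."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[: 2 * count])
    return sum(abs(s) for s in samples) / count


def capture_command(
    frames: Iterable[bytes],
    is_wakeword: Callable[[bytes], bool],  # hypothetical detector callback
    silence_threshold: float = 500.0,      # assumed value, tune per microphone
    max_silent_frames: int = 3,            # silent frames that end the command
) -> List[bytes]:
    """Return the frames spoken right after the wakeword, without trailing silence."""
    command: List[bytes] = []
    awake = False
    silent = 0
    for frame in frames:
        if not awake:
            # Still waiting: the frame that triggers the wakeword is not kept.
            awake = is_wakeword(frame)
            continue
        command.append(frame)
        if frame_energy(frame) < silence_threshold:
            silent += 1
            if silent >= max_silent_frames:
                break
        else:
            silent = 0
    # Drop the silent frames collected at the end.
    return command[: len(command) - silent]
```

The returned frames are what would be handed to the ASR. Note that this naive version gives up immediately if the speaker pauses right after the wakeword, so a real implementation would probably need a grace period before counting silence.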

I guess it’s not doable, otherwise it would already work, but why can Google do it and not open source assistants?

synesthesiam · January 12, 2020, 12:52am

It is actually doable, just a bit harder. The Python SpeechRecognition library does this with snowboy, for example.

You can add this as an enhancement request on GitHub, and we can look into it.

romkabouter · January 13, 2020, 6:57am

I think Snips could do that as well.
Internally, they use some kind of rewind feature in Hermes, but I do not know any details.

koan · January 13, 2020, 7:51am

Indeed, Snips did it by adding extra metadata in the audio messages, which I bumped into when I was developing hermes-audio-server:

[…] the rewind and replay is what allow us to reduce drastically the necessary gap between hotword detection and asr start of decoding […]

Since then, Snips has documented the format in their source code.

But I haven’t looked at how the wakeword and ASR components use this metadata. I don’t speak Rust (yet), and I don’t think Snips has open-sourced the code needed to investigate this.
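For what it’s worth, the general rewind-and-replay idea can be sketched in a few lines of Python. This is only a guess at the principle, not the Snips metadata format or implementation: the `RewindBuffer` class and its frame index are hypothetical, with the index standing in for whatever metadata the real audio messages carry. The buffer keeps the most recent frames, so an ASR that starts late can ask for everything after the frame where the wakeword ended.

```python
from collections import deque
from typing import Deque, List, Tuple


class RewindBuffer:
    """Keeps the most recent audio frames, each tagged with a running index."""

    def __init__(self, max_frames: int) -> None:
        # Oldest frames are dropped automatically once max_frames is reached.
        self._frames: Deque[Tuple[int, bytes]] = deque(maxlen=max_frames)
        self._next_index = 0

    def push(self, frame: bytes) -> int:
        """Store a frame and return the index assigned to it."""
        index = self._next_index
        self._frames.append((index, frame))
        self._next_index += 1
        return index

    def replay_from(self, start_index: int) -> List[bytes]:
        """Return the buffered frames whose index is >= start_index, in order."""
        return [frame for index, frame in self._frames if index >= start_index]
```

If the wakeword detector reports that the wakeword ended at frame `n`, the ASR would start from `replay_from(n + 1)` and then continue with live frames, so nothing said right after the wakeword is lost, as long as the buffer is large enough to cover the start-up delay.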

fastjack · January 13, 2020, 8:54am

I think Google (and probably Amazon) do the same thing with their devices. Someone posted about recordings containing words spoken before the “Ok Google” wakeword.

romkabouter · January 13, 2020, 9:29am

Yes, that is most probably the reason.

nYce · November 23, 2020, 10:42pm

I would also love this feature.
I would love to contribute, so if someone is adding this feature, I’m willing to help.