I’m not quite sure that confidence levels are being fully honoured. Here’s an example.
I just came home from collecting a package and my daughter asked what was in the box. I replied “It’s another small computer”. Even though my microphone is in the living room, some six metres away from the front door, my wake word (“computer”) was detected.
As ASR systems do, it then guessed what the sounds that followed might mean, and the result was apparently considered good enough to send the recognised intent off to Home Assistant! In the log, I can see the following result:
[DEBUG:2021-04-06 15:22:41,309] rhasspyasr_kaldi_hermes: Transcription result: Transcription(text='toggle kodi', likelihood=0, transcribe_seconds=0.3336428640031954, wav_seconds=1.024, tokens=[TranscriptionToken(token='toggle', start_time=0.0, end_time=0.0, likelihood=0.425189), TranscriptionToken(token='kodi', start_time=0.0, end_time=1.02, likelihood=0.675266)])
When I then actually said “toggle kodi” to turn my TV off again, the result was:
[DEBUG:2021-04-06 15:28:38,781] rhasspyasr_kaldi_hermes: Transcription result: Transcription(text='toggle kodi', likelihood=0.99822817, transcribe_seconds=1.3164074919986888, wav_seconds=1.856, tokens=[TranscriptionToken(token='toggle', start_time=0.0, end_time=0.81, likelihood=1.0), TranscriptionToken(token='kodi', start_time=0.81, end_time=1.86, likelihood=1.0)])
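Even though the overall likelihood of the first transcription was 0, the intent was still passed on. It seems that the likelihoods of the individual words (roughly 0.43 and 0.68), rather than the overall value, were what actually determined the final score.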
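To make concrete what I would expect to happen, here is a minimal Python sketch. It is not Rhasspy code and does not use any Rhasspy API: the regular expressions, the names `parse_transcription` and `should_accept`, and the 0.5 threshold are all my own assumptions, based only on the two log lines above.

```python
import re

# Overall likelihood: the first "likelihood=" right after the transcription text.
OVERALL_RE = re.compile(
    r"Transcription\(text='(?P<text>[^']*)', likelihood=(?P<likelihood>[0-9.eE+-]+)"
)
# Per-word likelihood: the "likelihood=" that closes each TranscriptionToken(...).
TOKEN_RE = re.compile(
    r"TranscriptionToken\(token='(?P<token>[^']*)'.*?likelihood=(?P<likelihood>[0-9.eE+-]+)\)"
)

# The two results copied from my log (timestamp prefix removed).
FALSE_TRIGGER = (
    "Transcription(text='toggle kodi', likelihood=0, transcribe_seconds=0.3336428640031954, "
    "wav_seconds=1.024, tokens=[TranscriptionToken(token='toggle', start_time=0.0, end_time=0.0, "
    "likelihood=0.425189), TranscriptionToken(token='kodi', start_time=0.0, end_time=1.02, "
    "likelihood=0.675266)])"
)
REAL_COMMAND = (
    "Transcription(text='toggle kodi', likelihood=0.99822817, transcribe_seconds=1.3164074919986888, "
    "wav_seconds=1.856, tokens=[TranscriptionToken(token='toggle', start_time=0.0, end_time=0.81, "
    "likelihood=1.0), TranscriptionToken(token='kodi', start_time=0.81, end_time=1.86, "
    "likelihood=1.0)])"
)


def parse_transcription(line):
    """Return (text, overall_likelihood, [(token, likelihood), ...]) from a log line."""
    overall = OVERALL_RE.search(line)
    if overall is None:
        raise ValueError("no transcription result found in line")
    tokens = [(m["token"], float(m["likelihood"])) for m in TOKEN_RE.finditer(line)]
    return overall["text"], float(overall["likelihood"]), tokens


def should_accept(line, min_confidence=0.5):
    """Hypothetical gate: accept only if the OVERALL likelihood reaches the threshold."""
    _text, overall, _tokens = parse_transcription(line)
    return overall >= min_confidence


if __name__ == "__main__":
    for name, line in (("false trigger", FALSE_TRIGGER), ("real command", REAL_COMMAND)):
        text, overall, tokens = parse_transcription(line)
        print(name, repr(text), overall, tokens, "accept:", should_accept(line))
```

With a gate like this on the overall likelihood, the first result (likelihood 0) would be dropped and the second (likelihood 0.998) would go through, which is the behaviour I was expecting.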
What can I tweak to ensure random nonsense does not get turned into a valid intent?