Minimum confidence in Kaldi

Does tweaking this help cure false triggers? I’m trying to find a way to stop a mis-heard command from triggering a random intent. It’s a bit of a pain as it usually turns the TV or lights off :)

It does. Although in my short experience with this, I found that most false positives come from complicated sentence structures with lots of optional words.
I have now set my minimum confidence to 0.9 and so far it works relatively well. I did have to change some sentences that were consistently causing trouble.

It’s a bit vague - I have an intent, “tell me the weather”.

I am monitoring the result in Node-RED and if I say “tell me the” it has a confidence of 0.99165548, so almost perfect.

I’ll try upping to 0.9999999

ASR confidence has nothing to do with intent confidence (but it could).

If you say “tell me the” and Kaldi recognizes “tell me the”, then the ASR confidence will be close to 1, but the intent confidence will probably be far lower because the word “weather” is missing.

Increasing the confidence threshold to 0.99999 will only hurt by increasing false negatives.

How many intents do you have in your dataset? How many different words? I think increasing the size of the dataset could help (more intents, more words, more possibilities).

Interesting.
I’m just trying to find a way to reduce or stop it sending random intents when it mis-hears a command. There must be a way :)

Currently 10 intents listed

If it’s the short sentences, I’m not sure there is a fix, as you don’t want to have to reel off a paragraph just to get the time - there are not many ways to make “whats the time” longer really :)

I have been using “tell me the [time]{action}”, which makes the slot come back empty unless “time” is found. This means I can detect an empty first slot in Node-RED and abort. It seems to work, but it means you must remember that it is always the first slot, and a single-word intent like “skip” for skipping a media track will not work at all.

tell me the [time]{action} explains it all.

By using [time] you make the word “time” optional. Your intent will be recognized if you say just “tell me the”, hence the high confidence score.

I do not think this is the correct way to go.

If your intent is GetTheTime, then the word “time” is pretty important (probably the most important in the utterance). If you want some kind of “all-in-one” intent (which I don’t advise), put your actions in an actions slot and use ($actions){action}.

Hope this helps.

Yes, but what it does do is spit out an empty {slot} if “time” is not recognised - that empty slot can be trapped in Node-RED.

Is there no bracketing method for words that “must” be found?

Every word that is not between [ and ] must be found for the intent to match (if using fsticuffs, not fuzzywuzzy).
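
To make that rule concrete, here is a minimal Python sketch that expands `[optional]` words into every sentence a strict matcher would accept. It is not Rhasspy's own code: `expand_optionals` is a hypothetical helper, and it deliberately ignores tags, alternatives, slots, and nested brackets.

```python
import itertools
import re
from typing import List


def expand_optionals(template: str) -> List[str]:
    """List every sentence a template allows, treating [word] as optional.

    Simplified illustration only: no {tags}, (a | b) alternatives,
    $slots, or nested brackets.
    """
    # Split into plain text and bracketed groups, keeping the brackets.
    parts = re.split(r"(\[[^\]]*\])", template)
    choices = []
    for part in parts:
        if part.startswith("[") and part.endswith("]"):
            # An optional group may be absent ("") or present.
            choices.append(["", part[1:-1].strip()])
        else:
            # Plain text is mandatory, so there is only one choice.
            choices.append([part.strip()])
    return [
        " ".join(piece for piece in combo if piece)
        for combo in itertools.product(*choices)
    ]


for sentence in expand_optionals("tell me the [time]"):
    print(sentence)
```

Under this simplified view, “tell me the” on its own is a complete, valid sentence for that template, which is why it can score so highly.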

Can you post your sentences.ini?

That makes sense, maybe I am chasing two different things then. I’ll take the optionals out and start afresh.

[GetTime]
what [time] {time} is it
tell me the [time] {time}
whats the [time] {time}

[GetWeather]
tell me the [weather] {weather}
whats the [weather] {weather}

[GetDate]
whats the [date] {date} [today]
what [date] {date} is it [today]
tell me the [date] {date}

[GetTemperature]
whats the [temperature] {temperature} (inside | outside) {area}
how [(hot | cold)] {hotcold} is it (inside | outside) {area}

[SkipMusicTrack]
skip [track] {track}

[ChangeLightState]
turn (on | off) {state} [the] ($lights) {name}
turn [the] ($lights) {name} (on | off) {state}

[ChangeLightColor]
set [the] ($lights) {name} [to] ($colours) {color}
make [the] ($lights) {name} ($colours) {color}

[ChangePlexState]
turn [plex] {plex} (on | off) {state}

[SetTimer]
minutes = (1){min} minute | (2..59){min} minutes
seconds = (1){sec} second | (2..59){sec} seconds
set [a] timer for <minutes>
set [a] timer for (1){min} and (a half){sec:30!int} minutes
set [a] timer for <seconds>
set [a] timer for <minutes> [and] <seconds>

[CancelTimer]
stop {action} [the] [timer] {timer}
cancel {action} [the] [timer] {timer}

Yep. You seem to confuse intents with slots. In your intent handler (Node RED?) you need to check the recognized intent to trigger actions. The slots represent the variable part of the intent (a duration, a temperature, a location, etc.).

[GetTime]
what time is it
tell me the time
whats the time

[GetWeather]
tell me the weather
whats the weather

[GetDate]
whats the date [today]
what date is it [today]
tell me the date

[GetTemperature]
whats the temperature (inside | outside) {area}
how [(hot | cold)] is it (inside | outside) {area}

[SkipMusicTrack]
skip [track]
...
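
As a rough illustration of that split, the sketch below picks the action from the intent name and uses slots only for the variable part. It is written in Python rather than as a Node-RED flow, and `handle_intent` and the action strings are hypothetical names chosen for the example.

```python
from typing import Dict, Optional


def handle_intent(intent_name: str, slots: Dict[str, str]) -> Optional[str]:
    """Choose an action from the intent name; slots carry only the variable part."""
    if intent_name == "GetTime":
        return "say the time"
    if intent_name == "GetWeather":
        return "say the weather"
    if intent_name == "GetTemperature":
        # "area" is the only variable part of this intent.
        area = slots.get("area")
        if area not in ("inside", "outside"):
            return None  # slot missing or unexpected: do nothing
        return f"say the {area} temperature"
    if intent_name == "SkipMusicTrack":
        return "skip track"
    return None  # unknown intent: do nothing


print(handle_intent("GetTemperature", {"area": "outside"}))
```

With this layout a slot-less intent such as SkipMusicTrack works as well, because nothing depends on a first slot being filled.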

Yes, in Node-RED I first had it checking for the [GetTime] intent name, but on a false trigger it would spit out a random intent. Mostly harmless, but sometimes it turned the TV off etc :)

So I added the variable slots - Node-RED now checks slot(0) first and aborts if it is empty. If it contains something, no matter what, it goes on to check the intent name and act accordingly.

It’s the random actions that were causing the grief.

I am going back to standard and taking out the slot-checker in Node-RED, let’s see what happens…

Since 2.5.10 the minimum confidence setting greatly reduces false positives and randomly recognized intents.

Previously, just silence or gibberish would result in any of my defined intents being triggered.
Now, when you check the ASR confidence in these cases, it is below 0.5 most of the time. These can now be filtered out with the minimum confidence score.

So yes, give it another try. Should be much improved.

Currently set on 0.5 - you feel that should be a little higher maybe??

I can only recommend trying it out. If you are using MQTT, listen to the messages and check the score. Try saying several real sentences and maybe some gibberish, then choose the level that you feel gives you the best result.
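
One way to collect those scores is to pull the interesting fields out of each message as it arrives. The Python sketch below only parses a JSON payload. The field names `input`, `intent.intentName`, and `asrConfidence` are assumptions about the message layout, so check them against the messages you actually see. `summarize_message` is a hypothetical helper.

```python
import json
from typing import Optional, Tuple


def summarize_message(payload: str) -> Tuple[str, str, Optional[float]]:
    """Return (recognized text, intent name, ASR confidence) from a JSON payload.

    The keys used here are assumed, not confirmed; adjust them to match
    the messages your own setup publishes.
    """
    message = json.loads(payload)
    text = message.get("input", "")
    intent_name = message.get("intent", {}).get("intentName", "")
    score = message.get("asrConfidence")  # None if the key is absent
    return text, intent_name, score


example = '{"input": "whats the time", "intent": {"intentName": "GetTime"}, "asrConfidence": 1.0}'
print(summarize_message(example))
```

Logging one such tuple per utterance gives a list of scores for real sentences and for gibberish, which can then be compared side by side.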

Playing about and monitoring in Node-RED, I get some confidence figures…

“whats the time” = 1, intent passed = [GetTime]
“whats the (cough)” = 0.9811175, intent passed = [GetTime]
“whats the dog” = 0.774761, intent passed = [GetDate]
“tell me the time” = 1, intent passed = [GetTime]
“tell me the dog” = 0.9142964, intent passed = [GetDate]
“tell me the (cough)” = 0.594773, intent passed = [GetTime]
“whats the temperature outside” = 0.99392109, intent passed = [GetTemperature], slot = outside
“whats the temperature jelly” = 0.833085, intent passed = [GetTemperature], slot = outside

So yes, the numbers do vary but it’s going to be tight to split. Probably a confidence of 0.99 maybe??
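
A quick way to test a candidate threshold is to replay the figures above against it. The Python sketch below does that. The confidence values are the ones listed in this post, while the `wanted` flag (whether the utterance was a real, complete command) is an interpretation added for the example, and `evaluate` is a hypothetical helper.

```python
# (utterance, reported confidence, wanted: was this a real, complete command?)
MEASUREMENTS = [
    ("whats the time", 1.0, True),
    ("whats the (cough)", 0.9811175, False),
    ("whats the dog", 0.774761, False),
    ("tell me the time", 1.0, True),
    ("tell me the dog", 0.9142964, False),
    ("tell me the (cough)", 0.594773, False),
    ("whats the temperature outside", 0.99392109, True),
    ("whats the temperature jelly", 0.833085, False),
]


def evaluate(threshold, measurements=MEASUREMENTS):
    """Return (false accepts, false rejects) for a minimum confidence threshold."""
    false_accepts = [u for u, c, wanted in measurements if c >= threshold and not wanted]
    false_rejects = [u for u, c, wanted in measurements if c < threshold and wanted]
    return false_accepts, false_rejects


for threshold in (0.5, 0.9, 0.99):
    accepts, rejects = evaluate(threshold)
    print(threshold, "false accepts:", accepts, "false rejects:", rejects)
```

On these eight samples, 0.99 is the only one of the three thresholds that lets no unwanted utterance through without rejecting a real command. Eight samples is a small basis, though, so this is worth repeating with more utterances.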

A value of 0.99 helps, but the more complex stuff like timers fails 100% of the time, as the highest score I seem to get is around 0.76 for some reason??

[SetTimer]
minutes = (1){min} minute | (2..59){min} minutes
seconds = (1){sec} second | (2..59){sec} seconds
set [a] timer for <minutes>
set [a] timer for (1){min} and (a half){sec:30!int} minutes
set [a] timer for <seconds>
set [a] timer for <minutes> [and] <seconds>

[CancelTimer]
stop {action} [the] [timer] {timer}
cancel {action} [the] [timer] {timer}

What NLU system are you using? I’d advise fsticuffs.

Yes it is set on fsticuffs

I wonder if I should normalize the Minimum Bayes Risk “confidence” value in Kaldi by sentence length or something? It’s not really clear from the docs how it’s supposed to be used.
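
For anyone curious what normalizing by sentence length could look like, here is one possible sketch in Python. It assumes, purely for illustration, that the sentence score behaves like a product of per-word scores, and takes the n-th root for an n-word sentence. That is not how Kaldi defines the value, and `per_word_confidence` is a hypothetical helper, not an existing Rhasspy or Kaldi function.

```python
def per_word_confidence(confidence: float, sentence: str) -> float:
    """Length-normalize a sentence confidence via a geometric mean per word.

    Assumption for illustration only: the sentence score is treated as a
    product of per-word scores, which may not hold for the real value.
    """
    words = sentence.split()
    if not words or confidence <= 0.0:
        return 0.0  # nothing recognized, or no usable score
    return confidence ** (1.0 / len(words))


print(per_word_confidence(0.25, "set timer"))
print(per_word_confidence(0.76, "set a timer for five minutes"))
```

Under that assumption a long timer sentence is no longer penalized just for having more words, so one threshold could be shared between short and long commands.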

That sounds good but it’s way above my head :)
