Is there a way to set up a 'default' or 'fallback' intent?

fenyce_m · June 25, 2020, 9:05am

Hello!

I am using intent matching with fuzzywuzzy and I am loving it!

However, if the user says words which are not part of any of the intents in the sentences.ini file, the system forces a match with one of the existing intents and returns it (with lower confidence than 1, but not always). The result is that random words trigger an intent.

Is there a way to set a ‘default’ intent which is returned when no other intent is matched?

frkos · June 28, 2020, 6:57pm

Hi @fenyce_m
It’s a dirty trick, but here is my solution

fenyce_m · June 28, 2020, 7:32pm

Thanks @frkos!

Definitely a bit dirty but good to know it works for you as it would have taken quite a bit of time to test it myself!

Do you find the intent matching is slower with so many words in the [DUMMY] intent? Which recogniser and handler are you using? I am using Kaldi and fuzzywuzzy. However, I have also found that Fsticuffs is less aggressive in that respect and doesn’t force the intent as fuzzywuzzy does.

frkos · June 28, 2020, 7:56pm

I’m using kaldi and fsticuffs.
I see no performance degradation. It’s super fast

synesthesiam · June 29, 2020, 2:46pm

Fuzzywuzzy goes out of its way to force an intent. It’s very fuzzy

There are some things that we could do, though, to improve things. The dialogue manager in Rhasspy has support for enabling/disabling intents during different turns in a conversation (session). If you have yes/no response intents, they probably aren’t meant to start a session. We should have some easy way to specify which intents are “on” by default. Others are enabled explicitly with continueSession messages.

Another idea is a modification to rhasspy-fuzzywuzzy that lets you set confidence thresholds per intent. For yes/no intents then, you could set the threshold to 1 to require an exact match. Everything else will be subject to the global threshold you configure in the web UI.
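The per-intent threshold idea could be sketched roughly as below. This is only an illustration of the logic, not code from rhasspy-fuzzywuzzy: the names `GLOBAL_THRESHOLD`, `INTENT_THRESHOLDS` and `accept_intent` are hypothetical, and the confidence is assumed to be a number between 0 and 1.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the global threshold configured in the web UI.
GLOBAL_THRESHOLD = 0.5

# Hypothetical per-intent overrides; 1.0 demands an exact match.
INTENT_THRESHOLDS: Dict[str, float] = {"Yes": 1.0, "No": 1.0}


def accept_intent(intent_name: str, confidence: float) -> Optional[str]:
    """Return the intent name if its confidence clears its threshold, else None."""
    # Fall back to the global threshold when the intent has no override.
    threshold = INTENT_THRESHOLDS.get(intent_name, GLOBAL_THRESHOLD)
    return intent_name if confidence >= threshold else None
```

With these values, a "Yes" match at 0.9 confidence would be rejected, while any other intent at 0.7 would still be accepted under the global threshold.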

I will say too that fsticuffs is usually the better choice if you’re just speaking to Rhasspy. It tends to only accept sentences that can be spoken (according to sentences.ini). Fuzzywuzzy works better if you also want to accept text chat, where you might have typos or misspellings.

fastjack · June 29, 2020, 2:59pm

Handling the “none” intent (when speaking an utterance that is not even close to what was defined) is rather complicated.

I’ve experimented with noise generation for the ASR/NLU part but that did not work out the way I expected (ASR accuracy went down and NLU results started acting funny).

I’ve also noticed that Snips struggled with the recognition of the “none” intent (the NLU tries very hard to match an existing intent).

@synesthesiam Having an enabled/disabled by default property for each intent would be cool!

DanielW · June 29, 2020, 6:42pm

Yes, this is very much needed. Sometimes you want to use very generic terms like Yes, No, Stop in a “skill” and they should be off by default.

synesthesiam · June 29, 2020, 7:01pm

It’s technically possible to have the enable/disable per intent right now, but it might not behave how people expect.

Training Rhasspy creates both a speech and intent model. It’s easy to toggle parts of the intent model, but the speech model has to be re-created for each combination of intents (Kaldi is an exception).

So while you could disable the yes/no intent, Rhasspy will still pick up words/sentences from that intent. I’ve experimented a little with having the ASR system pick up the continueSession messages and re-generate speech models for each new combination of intents. This could be slow on a Pi if the user has a lot of intents, so I haven’t pushed it yet.

Any thoughts?

DanielW · June 29, 2020, 7:17pm

How? I know that I can provide a list of intents to filter for in the continueSession, but they have to exist (be enabled) in Rhasspy.

Regarding the speech model always containing all intents, even disabled ones: is that really such a big issue? I find Rhasspy usable even in open transcription mode.

Think about it this way: if continueSession were used to filter for YES and NO as the only possible intents, and you generated a new speech model containing only YES and NO, the user could say something else altogether and it would match either YES or NO “at random”. In practice you would want to prompt the user with “Please answer with yes or no” if they answer with something else. So having a speech model with more stuff in it would be better in this case.

fastjack · June 29, 2020, 7:47pm

I think the enabled/disabled state (and the intent ecosystem) is only for the NLU part.

The NLU system should score all the intents, filter on the intentFilter list (or the enabled ones if not provided) and return the one with the best score. Below a threshold you should get intentNotRecognized. I have no idea how to handle the « none » intent in the current setup though.
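The score, filter, threshold flow described above might look something like this minimal sketch. The function name `pick_intent`, its parameters and the default threshold of 0.5 are assumptions made for illustration; this is not how any Rhasspy NLU service is actually written.

```python
from typing import Dict, Iterable, Optional

INTENT_NOT_RECOGNIZED = "intentNotRecognized"


def pick_intent(
    scores: Dict[str, float],
    enabled: Iterable[str],
    intent_filter: Optional[Iterable[str]] = None,
    threshold: float = 0.5,  # hypothetical default
) -> str:
    """Pick the best-scoring allowed intent, or report that none was recognized."""
    # An explicit intentFilter wins; otherwise use the intents enabled by default.
    allowed = set(intent_filter) if intent_filter is not None else set(enabled)
    candidates = {name: s for name, s in scores.items() if name in allowed}
    if not candidates:
        return INTENT_NOT_RECOGNIZED
    best = max(candidates, key=candidates.get)
    return best if candidates[best] >= threshold else INTENT_NOT_RECOGNIZED
```

Note that the filter is applied before the best score is chosen, so a disabled intent with a high score cannot crowd out an enabled one.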

The NLU service should be the one responsible for the enabling/disabling of the intents via MQTT topics so it can be changed dynamically.

I do not think it is easy (nor advisable) to manipulate the language graph at runtime (though it has been done with Kaldi).

My 2c

synesthesiam · June 29, 2020, 7:59pm

The hermes/dialogueManager/configure message will set the default intent filter for new sessions in the dialogue manager. After that, you can override it within a session using the intentFilter property of startSession or continueSession.
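As a rough sketch, the two messages could be built like this. The payload field names (`siteId`, `intents`, `intentId`, `enable`, `sessionId`, `text`, `intentFilter`) follow the Hermes protocol as I understand it and should be checked against the Rhasspy reference; the helper function names and the "default" site ID are made up for the example. Publishing the resulting topic and payload is left to whatever MQTT client you use.

```python
import json
from typing import Dict, List, Tuple


def configure_message(enabled: Dict[str, bool], site_id: str = "default") -> Tuple[str, str]:
    """Build (topic, payload) that sets which intents are on for new sessions."""
    payload = {
        "siteId": site_id,
        "intents": [
            {"intentId": name, "enable": on} for name, on in enabled.items()
        ],
    }
    return "hermes/dialogueManager/configure", json.dumps(payload)


def continue_session_message(
    session_id: str, text: str, intent_filter: List[str]
) -> Tuple[str, str]:
    """Build (topic, payload) that overrides the filter inside a running session."""
    payload = {"sessionId": session_id, "text": text, "intentFilter": intent_filter}
    return "hermes/dialogueManager/continueSession", json.dumps(payload)
```

For example, `configure_message({"Yes": False, "No": False})` would turn the yes/no intents off by default, and `continue_session_message(session_id, "Are you sure?", ["Yes", "No"])` would allow only those two for the next turn.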

I don’t think it’s a big issue for most people (and it seems @fastjack agrees). Maybe the simplest way forward is to add a part to the web UI where you can specify which intents are enabled up front and just let the NLU system take care of it from there?

koan · June 29, 2020, 8:18pm

I think we should also be able to enable/disable specific intents by default in the sentences.ini file. This way app developers can specify this in the sentences.ini file they distribute so the user doesn’t have to configure this in the UI.

synesthesiam · June 29, 2020, 8:21pm

How do you think that should be specified?

koan · June 29, 2020, 8:31pm

As it’s an ini file, we’re bound by the syntax ConfigParser understands. My first thought was to (mis)use comments for this purpose, but the ConfigParser._read() method ignores comments, so unless we want to override this method, this doesn’t work.
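A quick way to see the comment problem with the standard library parser is shown below. The parser options used here (`allow_no_value=True`, `=` as the only delimiter, case-preserving keys) are my assumption about a reasonable setup for a sentences.ini style file, not necessarily what Rhasspy uses, and `read_sentences` is a hypothetical helper.

```python
import configparser

SENTENCES = """
[MyIntent]
# disabled
turn on the light
"""


def read_sentences(text):
    """Parse an ini-style sentences file into {intent: [sentences]}."""
    # allow_no_value lets a bare sentence act as a key without a value.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=["="])
    parser.optionxform = str  # keep the original casing of each sentence
    parser.read_string(text)
    return {section: list(parser[section]) for section in parser.sections()}


print(read_sentences(SENTENCES))
# → {'MyIntent': ['turn on the light']}
```

The `# disabled` line never reaches the caller, so a comment cannot carry per-intent settings without changing how the file is read.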

I’m not sure about another way to specify this, maybe a special value that won’t be interpreted as a sentence?

synesthesiam · June 29, 2020, 9:03pm

I’d thought about adding metadata to an intent with an @ syntax, something like:

[MyIntent]
@foo
a sentence for the intent @bar

Then we would have @foo assigned as metadata to MyIntent and @bar as metadata for the first (and only) sentence.

We could then have some known metadata tags, like @disabled, etc.
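To make the proposal concrete, here is one way such @ tags might be read. The feature is only an idea at this point, so everything in this sketch is hypothetical: the function `parse_metadata`, the output layout, and the rule that a line consisting only of tags applies to the whole intent.

```python
import configparser
import re

TAG = re.compile(r"@(\w+)")

EXAMPLE = """
[MyIntent]
@foo
a sentence for the intent @bar
"""


def parse_metadata(text):
    """Split @tags into intent-level tags and per-sentence tags."""
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=["="])
    parser.optionxform = str  # keep the original casing of each line
    parser.read_string(text)
    result = {}
    for section in parser.sections():
        intent_tags, sentences = [], []
        for line in parser[section]:
            tags = TAG.findall(line)
            body = TAG.sub("", line).strip()
            if body:
                # A sentence, possibly with its own tags.
                sentences.append((body, tags))
            else:
                # A line made up only of tags belongs to the intent itself.
                intent_tags.extend(tags)
        result[section] = {"tags": intent_tags, "sentences": sentences}
    return result
```

For the example above this gives `foo` as a tag on MyIntent and `bar` as a tag on its single sentence; a known tag such as `disabled` could then be looked up in the intent-level list.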

koan · June 29, 2020, 9:34pm

This looks nice, yes.

itsMattShull · January 7, 2021, 3:49pm

Is there an update on implementation of this feature?

synesthesiam · January 9, 2021, 2:15am

Not yet, sorry. If you’re using Fsticuffs, then you’ll at least get an “intent not recognized” message that you can handle in NodeRED or something.

rickmini · January 16, 2021, 1:05pm

I just updated to 2.58, and I am struggling to get an intent recognizer to work reliably. I am running on a Raspi 4 - 4meg. with a full upgrade to the latest Raspi as of 1/15.
The system is fast and very responsive, but if I cough, mumble, or say jabberwocky, I always get a recognized intent.
For example, if I say jabberwocky I get an intent of roboServo, and if I cough I get a response of roboQuery:

[roboServo]
srv_num = ( 0…32 ){num}
srv_degs = ( 0…180 ){degs}
servo <srv_num> [to] <srv_degs>

[roboQuery]
query_cmd = ( time | date | name | address | distance | information ){cmd}
tell me [what] [is][the] <query_cmd>
tell me [who] [is] [the] <query_cmd>
tell me [the] <query_cmd>
tell me [your] <query_cmd>
tell me [when] [is][the] <query_cmd>

Earlier in this post you mentioned that fuzzywuzzy was “real fuzzy” and that fsticuffs will only respond to my sentences. In this example I have fsticuffs selected and the [] fuzzy check box unchecked (I have tried both checked and unchecked).

I must be doing something wrong, because all ambient noises and any sentences give me (seemingly) random intent results. I hardly ever get an “intent not recognized” result.

I can also listen to my voice command using the Play button on the web interface and it is perfectly clear, so that can’t be the issue. (I am using the Respeaker USB 4 mic.)

What exactly does the fuzzy checkbox do?
Is it my sentence structure that is the issue here? (See the roboServo sentence above.)

Any advice would be appreciated.

RaspiManu · January 17, 2021, 3:11pm

Sorry that I will not solve your problem, but I just wanted to give you a hint about the sentences:

You could use tell me [( what | who | your | when )] [is] [the] <query_cmd> to have all the sentences for this intent in one line and keep things more compact, or did you do it intentionally?