Conversation to fill in slots

CrankyCoder · December 13, 2021, 10:26pm

I have been going through the forums and reading docs on the dialogue management and continuing sessions.

I was wondering about two things.

I noticed that a lot of the dialogue management stuff seems to be remotely triggered rather than wake word triggered. Is that correct, or am I misunderstanding? (Examples of dialogue continuing from wake word detection would be fantastic!)
Is there a way to format the sentences and slots so that an intent is identified even without the slots being filled in? Example below.

Let’s say I want to do something like booking a flight (this is my go-to example lol).

I would like to be able to say “book me a flight” and have the intent recognized. BUT before that intent can be handled, it has slots that need to be filled:

TravelFrom (where you flying from)
TravelTo (where you flying to)
TravelWhen (when do you want to fly)

I know there are more things (number of people flying, which airline, etc.), but for this example let’s keep it easy with those 3.

So the idea is that “Book me a flight” is a recognized intent. When the intent goes to the intent handler, how would you show that the slots are empty? How would you format the sentences?

I was thinking that if the intent of “Book a flight” was recognized, but the expected slots (don’t know how to indicate expected slots either) would be empty, then the dialogue manager would respond and do the continueSession and prompt for the missing info.

But I am unsure how to do this, or really where to get started. I can do Python development, so I’m not afraid of taking that approach; I’m just not sure how to format the sentences/slots/requirements.

Thanks!

rejoe2 · December 14, 2021, 12:01pm

Not really sure if this helps:
First, you may use wake word detection to start some automations, but that’s not really how you hold a dialogue with Rhasspy. At least in my personal view, the wake word will (in most cases) just open the STT “ears”.

Sentences may include optional stuff. E.g. I have an intent (excerpt)

[de.fhem:SetOnOff]
rooms=([(im|in dem|auf dem|in der|auf der)] $de.fhem.Room{Room})
openclose=((hoch|auf){Value:on} | (zu|runter){Value:off})
[...]
<cmddrive> [<den>] $de.fhem.Device-blind{Device} [<rooms>] <openclose>

matching e.g. “mach den rollladen auf” (“set the blind open”) or “mach die jalousie im esszimmer runter” (“set the venetian blind in the dining room close”) with slots “Device”, “Value” (on or off) and (optional) “Room”.
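
On the handler side, the optional part simply shows up as a slot that is either present or absent. A minimal Python sketch of reading the slots, assuming the recognized intent arrives as a Hermes-style JSON message (a `slots` list whose entries carry `slotName` and `value.value`); the example payload is hand-written for illustration, not captured from a real system:

```python
import json


def slots_to_dict(intent_payload: str) -> dict:
    """Flatten the 'slots' list of an intent message into {slotName: value}."""
    message = json.loads(intent_payload)
    return {
        slot["slotName"]: slot["value"]["value"]
        for slot in message.get("slots", [])
    }


# Hand-written example for "mach den rollladen auf" (no room was spoken).
example = json.dumps({
    "input": "mach den rollladen auf",
    "intent": {"intentName": "de.fhem:SetOnOff", "confidenceScore": 1.0},
    "siteId": "default",
    "sessionId": "example-session",
    "slots": [
        {"slotName": "Device", "value": {"kind": "Unknown", "value": "rollladen"}},
        {"slotName": "Value", "value": {"kind": "Unknown", "value": "on"}},
    ],
})

slots = slots_to_dict(example)
room = slots.get("Room")  # None here, because the optional room was not spoken
```

The intent is still recognized without the room; the handler just has to treat `Room` as possibly missing.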
The rest of the analysis is then done on the controller side (FHEM). Either enough data is provided (“rollladen” may or may not be unique on its own, but is unique in combination with the provided Room), in which case the action is executed (and, at least at the moment, the session is closed). Or several options remain, in which case the dialogue is kept open (that is FHEM’s job) and additional info is requested. To allow for short answers, this is handled by separate intents like these:

[de.fhem:ChoiceRoom]
nimm [das Gerät] ( aus ( dem | der ) | im | den | die ) $de.fhem.MainRooms{Room}

[de.fhem:ChoiceDevice]
ich hätte gerne [das Gerät] $de.fhem.Aliases{Device}

and the possible/expected answers will be spoken by Rhasspy.
All slots starting with “$de.fhem.” are automatically filled from the FHEM side, so “Aliases” means something like the most used name to identify a device, and “MainRooms” are a smaller subset of “Rooms”.
When a request for additional info is initiated, most other intents get deactivated.
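
Applied to the flight example from the first post, the same pattern could look roughly like the sketch below. It assumes the Hermes `continueSession` message (published on `hermes/dialogueManager/continueSession`) with its `sessionId`, `text` and `intentFilter` fields, where `intentFilter` lists the intents that stay eligible for the answer. The answer intent names and prompts are hypothetical, and actually publishing the payload with an MQTT client is left out:

```python
import json

REQUIRED_SLOTS = ("TravelFrom", "TravelTo", "TravelWhen")

# Hypothetical prompt and answer intent for each required slot.
PROMPTS = {
    "TravelFrom": ("Where are you flying from?", "ChoiceTravelFrom"),
    "TravelTo": ("Where are you flying to?", "ChoiceTravelTo"),
    "TravelWhen": ("When do you want to fly?", "ChoiceTravelWhen"),
}


def missing_slots(slots: dict) -> list:
    """Return the required slot names that are absent or empty, in order."""
    return [name for name in REQUIRED_SLOTS if not slots.get(name)]


def continue_session_payload(session_id: str, slots: dict):
    """Build a continueSession payload for the first missing slot.

    Returns None when all required slots are filled.
    """
    missing = missing_slots(slots)
    if not missing:
        return None
    question, answer_intent = PROMPTS[missing[0]]
    return json.dumps({
        "sessionId": session_id,
        "text": question,
        # Only these intents remain eligible while the question is open.
        "intentFilter": [answer_intent, "CancelAction"],
    })


payload = continue_session_payload("example-session", {"TravelFrom": "Berlin"})
```

The handler would have to remember the slots collected so far per `sessionId` and repeat this until `continue_session_payload` returns `None`.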

In other words: IMO you will have to do a lot of coding on the controller side to get your flight booked.

Note: if there are too many possibilities, no request is returned, just a message saying that not enough info was provided…
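
The decision described above (execute, ask back, or give up) can be sketched in Python as follows. This is not the FHEM code; the device list and the `max_choices` threshold are assumptions made up for illustration:

```python
def find_candidates(devices: list, name: str, room=None) -> list:
    """Return all devices matching the spoken name and, if given, the room."""
    return [
        d for d in devices
        if d["name"] == name and (room is None or d["room"] == room)
    ]


def decide(candidates: list, max_choices: int = 3) -> str:
    """Choose the next step from the number of remaining candidates."""
    count = len(candidates)
    if count == 0:
        return "not_found"
    if count == 1:
        return "execute"   # unique match: act and close the session
    if count <= max_choices:
        return "ask"       # keep the dialogue open and request more info
    return "too_many"      # just report that not enough info was given


# Hypothetical installation with two blinds sharing the same name.
devices = [
    {"name": "rollladen", "room": "esszimmer"},
    {"name": "rollladen", "room": "wohnzimmer"},
    {"name": "jalousie", "room": "esszimmer"},
]
```

With this list, “rollladen” alone leads to a question about the room, while “rollladen” plus “esszimmer” can be executed directly.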

CrankyCoder · December 15, 2021, 2:44am

OK, so it sounds like in my case I would maybe write a Python app, set that as my intent handler, and pass some stuff off to Home Assistant.

But for my “flight” example, I would continue the session. You mention disabling the other intents. I saw something on the internet about intentFilter. Is that what you are referring to? Is intentFilter an “include” filter or an “exclude” filter? (I’m not at my computer, so it’s not as easy to check.)

It definitely seems like writing some sort of middleware is needed if someone is doing home automation and wants to do something like what I’m thinking of.

I saw the Home Intent project, but I don’t think it’s doing dialogue-type stuff yet.

rejoe2 · December 15, 2021, 8:32am

Don’t know about any video on intentFilter. My own steps with respect to dialogue management can be found here: “lost in dualogues”. At the start of the journey, I just wanted an option to prevent direct actions in FHEM and request a confirmation first (switching my stereo on and off all the time wasn’t that funny).
So basically there are 2 dialogue-specific intents in sentences.ini (I’d now recommend using just one) called ConfirmAction and CancelAction:

[de.fhem:ConfirmAction]
( ja mach | tu es | ist ok | aber gerne doch ){Mode:OK}
( lieber doch nicht ){Mode}
( Geh zurück | bleib da ) zurück{Mode:Back}
( Mach | geh ) weiter{Mode:Next}

[de.fhem:CancelAction]
(lass es | nein | abbrechen | abbruch ){Mode:Cancel}

As you can see, CancelAction is not really needed any more, as the “Mode” content is sufficient to decide on the “app” side how to proceed. I’d recommend kicking it out, because every additional intent is one more that you have to disable when it is not needed or wanted. “CancelAction” holds the shortest possible answers in my entire set of sentences. (At the moment, the “Back” and “Next” options are not really used.)
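
Deciding on the “app” side from the “Mode” slot alone could look like this minimal Python sketch. The step names are made up for illustration, and treating any unknown value (such as the untagged “lieber doch nicht” line) as a decline is an assumption, not necessarily what the FHEM code does:

```python
# Map the value of the "Mode" slot to the handler's next step.
MODE_STEPS = {
    "OK": "execute",
    "Cancel": "abort",
    "Back": "previous",
    "Next": "next",
}


def next_step(mode) -> str:
    """Return the next step for a Mode value.

    Assumption: a missing or unknown value counts as "abort".
    """
    return MODE_STEPS.get(mode, "abort")
```

With a single intent carrying the “Mode” slot, only one intent has to be enabled and disabled around the confirmation question.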

Basically, when FHEM/the “app” starts, it disables its own dialogue-specific intents (the “Choice” intents and the two generic ones for cancel/confirm) for all siteIds, and does so again whenever it detects that the filters are no longer set. This may happen when Rhasspy is restarted at some later point.

When in some sort of dialogue mode, the needed intents are activated for the single siteId the dialogue is relevant for; see also my short summary of the restrictions here.
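
For a Python handler, the startup step could be sketched like this. It assumes the Hermes `hermes/dialogueManager/configure` message with a `siteId` and an `intents` list of `intentId`/`enable` pairs, as I understand the protocol Rhasspy follows; the siteIds are hypothetical, and publishing the payloads over MQTT is left out:

```python
import json

# Intents that should only be eligible while a dialogue is open.
DIALOGUE_INTENTS = [
    "de.fhem:ChoiceRoom",
    "de.fhem:ChoiceDevice",
    "de.fhem:ConfirmAction",
    "de.fhem:CancelAction",
]


def configure_payload(site_id: str, intents: list, enable: bool) -> str:
    """Build a payload that enables or disables the given intents for one site."""
    return json.dumps({
        "siteId": site_id,
        "intents": [{"intentId": name, "enable": enable} for name in intents],
    })


# Hypothetical siteIds; one message per site at startup.
startup_messages = [
    configure_payload(site, DIALOGUE_INTENTS, enable=False)
    for site in ("kitchen", "livingroom")
]
```

The same messages would have to be sent again after a Rhasspy restart, since the default filter is then gone.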

So the FHEM plugin acts somewhat like a “middleware”:

It provides options to label (e.g.) all actuators and sensors in the FHEM installation, for instance to give these items “speakable” names (and to record the groups or rooms they are part of, what type they are, and so on),
collects that internally, so the entire structure is visible to the user/system admin,
derives the data to build (quite a few) slots in Rhasspy,
listens to the MQTT traffic to do actions when requested.
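
The “derive slots from labels” step can be illustrated with a small Python sketch. The device records and their field names are invented for the example and do not reflect FHEM’s internal data model; how the result is pushed to Rhasspy (for example via its HTTP API) is left out:

```python
from collections import defaultdict


def build_slots(devices: list, prefix: str = "de.fhem.") -> dict:
    """Collect speakable names into slot lists keyed by slot name."""
    slots = defaultdict(set)
    for device in devices:
        slots[prefix + "Device"].add(device["name"])
        # One extra slot per device type, e.g. "de.fhem.Device-blind".
        slots[prefix + "Device-" + device["type"]].add(device["name"])
        slots[prefix + "Room"].add(device["room"])
    return {name: sorted(values) for name, values in slots.items()}


# Hypothetical labelled devices.
labelled_devices = [
    {"name": "rollladen", "room": "esszimmer", "type": "blind"},
    {"name": "jalousie", "room": "esszimmer", "type": "blind"},
    {"name": "stereo", "room": "wohnzimmer", "type": "media"},
]

slot_lists = build_slots(labelled_devices)
```

Because the slots are generated from the labels, the sentences themselves can stay generic and be shared between installations.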

You may find the code here, but most likely it is rather “hard stuff” to follow, because (besides being written in Perl) it relies heavily on other parts of the FHEM code, and quite a lot of the logic for handling intents is unfortunately very complex (a McCabe score of up to 60).

There are a couple of FHEM users using this method, as the code itself doesn’t contain anything user-specific; that’s all done by adding the mentioned labels to actuators and sensors.

Most likely, a more extended write-up in the “lost” thread would be helpful for others, but that’s really hard work, as one has to look at all the involved parts in parallel.