Dear community and Rhasspy developers,
This time I would like to propose a feature that could make Rhasspy far superior, in both results and usability, to any other software in this field.
By way of introduction, let me explain where the problem comes from. We all know that direct voice commands are the stone age of home automation. At this stage we all want to talk to our assistants as if they were living human beings. To achieve this we usually build dialogues, so that the lights or the climate can be set in the course of a conversation. The need for this comes from a simple fact: we are human, our memory is not perfect, and none of us can keep every available command in our heads, nor can our family members know which settings are available.
So normally I start a conversation like this:
- Rhasspy, let’s do the lighting.
- OK, where do you want me to set the lighting? No room was mentioned.
- What rooms are there on the first floor?
- There are rooms r1, r2… ri found on the first floor
- What room am I in?
- The name of your room is…
And so on. It is easy to teach a voice assistant to report the location, the lights available in the room, and the scenes available for those lights, so that it can tell you what your options are and you can decide what to do. So far Rhasspy does well!
But here comes the pitfall: it does not set a context for the incoming request. Let me explain. Suppose you have a lighting scene named ‘warm’, and you also want to set the temperature to make the room warm. When you give your command, every now and then Rhasspy will return the wrong intent: heating when you are talking about lighting, and lighting when you are talking about heating.
SOLUTION:
To prevent this, you need to introduce a ‘context setting command’: in your sentences file, among your other sentences, you have one (or several) marked with a flag that tells Rhasspy these are context setters. For example:
[context_setter]#context
(Set the context. let’s do the):doing (heating | lighting | climate | music)
So, when this command reaches Rhasspy, it remembers the exact outcome. Suppose you asked about heating; it then remembers “doing heating”.
Then, as long as the context is set, it automatically prepends “doing heating” to every recognised text and passes this modified string to the intent recognition module. For example, the command “make it warm in the livingroom” would be passed to the intent recogniser as “doing heating, make it warm in the livingroom”. This way you would never receive a wrong intent back!
Then it is enough to say “release the context” (or any other phrase you have defined for this) for Rhasspy to stop applying the context.
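To make the set / prepend / release cycle concrete, here is a rough Python sketch of how I imagine it. All names in it (ContextFilter, SETTER_PREFIXES, RELEASE_PHRASE and so on) are made up for illustration and are not part of Rhasspy; the setter and release phrases are simply the ones from my example.

```python
from typing import Optional

# Hypothetical names throughout; none of this is an existing Rhasspy API.
CONTEXT_TOPICS = {"heating", "lighting", "climate", "music"}
SETTER_PREFIXES = ("set the context", "let's do the")
RELEASE_PHRASE = "release the context"


class ContextFilter:
    """Sits between speech-to-text and intent recognition."""

    def __init__(self) -> None:
        self.context: Optional[str] = None  # e.g. "doing heating"

    def process(self, text: str) -> Optional[str]:
        """Return the text to hand to intent recognition, or None if the
        utterance was consumed as a context command."""
        normalized = text.lower().strip()

        # Releasing the context clears the remembered prefix.
        if normalized == RELEASE_PHRASE:
            self.context = None
            return None

        # A context setter stores "doing <topic>" for later utterances.
        for prefix in SETTER_PREFIXES:
            if normalized.startswith(prefix):
                topic = normalized[len(prefix):].strip()
                if topic in CONTEXT_TOPICS:
                    self.context = f"doing {topic}"
                    return None

        # Any other utterance gets the active context prepended, if any.
        if self.context is not None:
            return f"{self.context}, {text}"
        return text
```

With this, saying "let's do the heating" followed by "make it warm in the livingroom" would hand "doing heating, make it warm in the livingroom" to intent recognition, and after "release the context" the text is passed through unchanged.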
In case you have forgotten to release the context, Rhasspy would return a JSON with the intent “no intent recognised for that context”, so you know that the context is still set and you have to release it.
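The fallback could look roughly like the sketch below. The function recognize_with_context and the recognize callable are placeholders of mine standing in for whatever the real recogniser is, and the JSON layout (an "intent" object with a "name", plus "text" and "context" fields) is only an assumption for illustration, not Rhasspy's actual output format.

```python
import json
from typing import Callable, Optional


def recognize_with_context(
    text: str,
    context: Optional[str],
    recognize: Callable[[str], Optional[dict]],
) -> str:
    """Prefix the text with the active context, run the recogniser, and
    return a JSON string. `recognize` is a hypothetical stand-in that
    returns None when nothing matches."""
    query = f"{context}, {text}" if context else text
    result = recognize(query)
    if result is None and context:
        # Nothing matched while a context is active: say so explicitly,
        # so the user knows the context still has to be released.
        result = {
            "intent": {"name": "no intent recognised for that context"},
            "context": context,
            "text": text,
        }
    elif result is None:
        # No context set: plain "nothing recognised" result (assumed shape).
        result = {"intent": {"name": ""}, "text": text}
    return json.dumps(result)
```

The point of the separate intent name is that a dialogue script can react to it, for example by reminding the user which context is still active.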
And one more important thing: it remembers the context for EVERY SATELLITE, so it would return the contexted ‘heating’ intents to the livingroom and the contexted ‘climate’ intents to the bedroom.
This feature is simple and fast to implement, since it needs no coding beyond what is already in Rhasspy - you only need to recompose a bit what you already have. It could make Rhasspy stand out from the many voice assistants out there, because it would make building dialogues with 99.(9)% accuracy so easy.
Please feel free to comment on my idea, and hopefully we can see this implemented as early as the next release!
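Keeping one context per satellite could be as simple as a dictionary keyed by some satellite identifier. This is only a sketch under that assumption; the class SatelliteContexts and the site_id values are invented for the example.

```python
from typing import Dict, Optional


class SatelliteContexts:
    """Keeps one active context per satellite, keyed by a site identifier.
    Hypothetical helper, not an existing Rhasspy class."""

    def __init__(self) -> None:
        self._contexts: Dict[str, str] = {}

    def set(self, site_id: str, topic: str) -> None:
        # Remember e.g. "doing heating" for this satellite only.
        self._contexts[site_id] = f"doing {topic}"

    def release(self, site_id: str) -> None:
        # Releasing an unset context is harmless.
        self._contexts.pop(site_id, None)

    def apply(self, site_id: str, text: str) -> str:
        # Prepend this satellite's context, if it has one.
        context: Optional[str] = self._contexts.get(site_id)
        return f"{context}, {text}" if context else text
```

So the same sentence "make it warm" would be prefixed with "doing heating" when it comes from the livingroom satellite and with "doing climate" when it comes from the bedroom, while a satellite with no context set is left alone.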
Thanks for your time!