Possible to have Rhasspy get secondary input after first intent recognition?

Hello!

Before I dive deep into the API itself to do something manual (if possible), I wanted to ask if there’s a way for Rhasspy to prompt for secondary input. Imagine implementing something like an echo intent.

me: Hey Porcupine
rhasspy:
me: echo what I say back to me
rhasspy:
me: Some random sentence that hasn’t been trained or anything, just a random English sentence.
rhasspy: “Some random sentence that hasn’t been trained or anything, just a random English sentence.” (via text to speech)

What I’m looking for is a built-in way to do the above, if it exists. I’m currently sending intents to Home Assistant, if that helps (or hurts) my chances of using something out of the box.

Hi Mike and welcome.

I have seen this discussed here previously, but have not got involved personally - however hopefully I can point you in the right direction before other more knowledgeable users respond.

My understanding is that Rhasspy is really a toolkit of voice assistant services, which you can use with your own app or several existing home automation projects (including Home Assistant). I believe that Home Assistant handles simple intents, and does not have the capability for extended dialog “out of the box” … which does not mean that it’s impossible, just requires a different approach :wink:

I believe the Rhasspy Dialogue Manager (see Services - Rhasspy) is the part of most interest here, so I suggest you search for the keywords “dialog” and “session” to find discussions which might help.

Thank you! I’ll dive into the API and see what’s possible.

Might want to check my recent post out. Not fully working but getting there.

@JoeSherman Thank you for directing me to the post. I’ll check it out.
I was able to figure something out on my own. Continuing the dialogue session was straightforward! I just sent the correct JSON to the Hermes topic on MQTT as an intent response from Home Assistant. That allowed the Rhasspy satellite to re-prompt for more input.
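For anyone following along, here is a rough sketch of the kind of message described above. It only builds the topic and JSON body for `hermes/dialogueManager/continueSession`. The field names (`sessionId`, `text`, `sendIntentNotRecognized`) follow the Rhasspy reference docs as I understand them; the function name, the session ID, and the prompt text are made up for illustration, and how you publish the message (Home Assistant's `mqtt.publish` service, or any other MQTT client) is left out.

```python
import json

# Hermes topic that asks the dialogue manager to keep the session open
# and listen for another voice command.
CONTINUE_TOPIC = "hermes/dialogueManager/continueSession"


def build_continue_session(session_id, prompt, send_intent_not_recognized=False):
    """Return (topic, json_payload) for a continueSession message.

    build_continue_session is a hypothetical helper, not part of Rhasspy.
    session_id must be the sessionId taken from the intent message that
    is being answered.
    """
    payload = {
        "sessionId": session_id,
        # Spoken via text to speech before Rhasspy listens again.
        "text": prompt,
        # Assumption from the docs: when true, a failed recognition is
        # published as intentNotRecognized for the handler to deal with.
        "sendIntentNotRecognized": send_intent_not_recognized,
    }
    return CONTINUE_TOPIC, json.dumps(payload)


# Example with a made-up session ID.
topic, body = build_continue_session("abc-123", "What should I repeat?")
print(topic)
print(body)
```

The session ID has to come from the incoming intent message, otherwise the dialogue manager has no session to continue.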

HOWEVER I think a blocker for me is that I had assumed Rhasspy could just take arbitrary non-trained speech and turn it into text. That does not seem possible from what I’ve noticed.

Is there any way to have Rhasspy understand speech that hasn’t been directly trained?

@Mike1

That was part of my post. I have been successful in getting Rhasspy to understand anything I say. The key is the Vosk STT server with the full English model. The responses are not recognized as valid intents, so currently my setup still plays the unrecognized intent sound. But the response is handled correctly by my Hermes app. Currently all it does is repeat back to you what you say and ask you to say something again. You end up in an endless conversation with Rhasspy where it repeats the last thing you said and then asks you to say something else. If you say “No” it stops the intent handling. It certainly isn’t a useful skill yet, but it does work in my setup.
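This is not the actual app from that post, just a minimal sketch of the echo loop it describes, assuming the free-form transcript arrives in the `input` field of a `hermes/dialogueManager/intentNotRecognized` message. The function name and the reply wording are invented, and the MQTT subscribe/publish plumbing is omitted so the logic can be read on its own.

```python
import json

CONTINUE_TOPIC = "hermes/dialogueManager/continueSession"
END_TOPIC = "hermes/dialogueManager/endSession"


def handle_not_recognized(message_json):
    """Decide what to publish in reply to an intentNotRecognized message.

    handle_not_recognized is a hypothetical helper. It returns a
    (topic, json_payload) pair for the caller to publish over MQTT.
    """
    msg = json.loads(message_json)
    session_id = msg["sessionId"]
    # "input" is assumed to hold the raw speech-to-text transcript.
    heard = (msg.get("input") or "").strip()

    # Saying "no" ends the conversation instead of looping again.
    if heard.lower() == "no":
        reply = {"sessionId": session_id, "text": "Okay, stopping."}
        return END_TOPIC, json.dumps(reply)

    reply = {
        "sessionId": session_id,
        "text": f"You said: {heard}. Say something else.",
        # Ask to keep receiving unrecognized input for this session.
        "sendIntentNotRecognized": True,
    }
    return CONTINUE_TOPIC, json.dumps(reply)


# Example with made-up messages.
incoming = json.dumps({"sessionId": "abc-123", "input": "the sky is blue"})
print(handle_not_recognized(incoming))
print(handle_not_recognized(json.dumps({"sessionId": "abc-123", "input": "No"})))
```

As far as I can tell, the session has to be started or continued with `sendIntentNotRecognized` set to true for these messages to reach the app at all, which is why the reply sets it again on every turn.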

Wow answering my future question from the past! I should have checked out your code before commenting. I’ll dive into it now! Thank you!

Hopefully this is a related enough question to ask here - is there a way to have Rhasspy ask for an “authorization” (any predetermined word or code) after recognizing the intent but before sending it to HA? I’d like to use this for certain actions that I don’t want anyone to be able to trigger and also add a step in the process to avoid accidental intent handling since the actions could cause problems.

Oooh, I can see a use for this … but it would be easy for someone to overhear and remember the authorisation code :frowning: Maybe it would be better for Rhasspy to recognise whose voice issued the command ?

Unfortunately, neither of these is available in Rhasspy AFAIK.

Honestly I really only need it as a second-step safety option and not for real security purposes. Turning on a heater is something I want to have any type of confirmation for. I’ve already had it turn on once when I was just talking near the microphone.

Ahhh, my misunderstanding. I think false positives will always be a problem, though some modules are better than others.

My approach is to get Rhasspy to give a voice confirmation of its actions, like “Bedroom heater turned on”. A bit tedious to then turn off any false positives, and I expect in time the false positives will go from me wondering “how did it get that from what was actually said ?” to being just annoying.

My understanding is that the limitation isn’t Rhasspy (which is basically just a toolkit of voice functions), but the layer behind it. Home automation and Rhasspy have been oriented towards a simple command and response structure - but to create more of a conversation would require developing a layer which fits between. There have been several threads here about using Rhasspy for conversations; there are probably some worth following up.

You can use the hermes/dialogueManager/continueSession topic for that.

https://rhasspy.readthedocs.io/en/latest/reference/#dialogue-manager

Here is a search:
https://community.rhasspy.org/search?q=hermes%2FdialogueManager%2FcontinueSession

There might be something in there to make it work for you :slight_smile:
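To make the suggestion a bit more concrete, here is a rough sketch of a confirmation step built on `continueSession`. It assumes a small handler sits between Rhasspy and Home Assistant, that `intentFilter` and `customData` behave as the reference docs describe (limiting the next recognition to the listed intents, and passing the data along to later messages in the session), and that a confirmation intent has been trained. The intent name `ConfirmAction`, the function names, and the prompt text are all hypothetical.

```python
import json

CONTINUE_TOPIC = "hermes/dialogueManager/continueSession"


def ask_for_confirmation(intent_json, confirm_intent="ConfirmAction"):
    """Hold back a sensitive intent and ask for a confirmation phrase.

    Returns (topic, json_payload) to publish. The original intent is
    stashed in customData so it can be recovered after confirmation.
    """
    msg = json.loads(intent_json)
    pending = {
        "intentName": msg["intent"]["intentName"],
        "slots": msg.get("slots", []),
    }
    payload = {
        "sessionId": msg["sessionId"],
        "text": "Please say the confirmation phrase.",
        # Only the (hypothetical) confirmation intent is accepted next.
        "intentFilter": [confirm_intent],
        # customData is a string field, so the pending intent is JSON-encoded.
        "customData": json.dumps(pending),
    }
    return CONTINUE_TOPIC, json.dumps(payload)


def resolve_confirmation(intent_json, confirm_intent="ConfirmAction"):
    """Return the pending intent if this message confirms it, else None."""
    msg = json.loads(intent_json)
    if msg["intent"]["intentName"] != confirm_intent or not msg.get("customData"):
        return None
    return json.loads(msg["customData"])


# Example with made-up messages.
heater = json.dumps({"sessionId": "abc-123",
                     "intent": {"intentName": "TurnOnHeater"}, "slots": []})
topic, body = ask_for_confirmation(heater)
confirm = json.dumps({"sessionId": "abc-123",
                      "intent": {"intentName": "ConfirmAction"},
                      "customData": json.loads(body)["customData"]})
print(resolve_confirmation(confirm))
```

Only once `resolve_confirmation` returns the pending intent would the handler forward it to Home Assistant. A second spoken step like this guards against accidental triggers, not against someone who has overheard the phrase.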

Thanks…I’ll check those out! I do need something - my short term fix was to add the phrase “authorization code XXXX” to the beginning of the sentence for Rhasspy to learn after separating the intent and handling for these specific actions. I thought that would eliminate false positives but it triggered something yesterday when a YouTube video was on and there didn’t seem to be anything close to that very complicated series of words spoken in the video!