Advice on using Rhasspy offline combined with online services

Hi,
I’m very impressed and grateful, thanks.
What I’d like to do is use Rhasspy and Node-RED for basic use cases,
but for features like a todo list I think (as far as I understand the possibilities) I need an online STT service:
“porcupine” -> “add todo” -> start listening, or pipe the recording to an online service.
How have you realized this kind of scenario?

Define an intent and, when that intent is detected, just call your todo list API. That’s how I would do it. Which todo list exactly are you using?

Hi, thanks for your reply!
It’s more of a general question: I want to keep Rhasspy for privacy
while still being able to do “porcupine” -> “ask (google | Alexa)” -> “whatever” (no privacy).
This morning I found the following thread, and without a deeper look it seems to be a solution:

Does rhasspy Support two STT Engines now?

Hmm, I don’t recall Rhasspy having a fallback STT engine.

Assuming that you want to recognize a wide variety of words (which makes sense for a todo list), you would probably have to resort to an online STT service.

I’d recommend using hermes/dialogueManager/continueSession after recognizing the intent “add todo” to ask the user which item he wants to add to the todo list.
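A minimal sketch of what such a continueSession message could look like. The helper name `build_continue_session` and the intent name `TodoItem` are hypothetical, chosen only for illustration; `sessionId`, `text`, and `intentFilter` are fields of the Hermes continueSession payload. Publishing the result over MQTT is left to whatever client you already use.

```python
import json


def build_continue_session(session_id, question, intent_filter=None):
    """Return (topic, JSON payload) for a Hermes continueSession message."""
    payload = {
        "sessionId": session_id,  # taken from the intent message that was just received
        "text": question,         # spoken to the user before listening again
    }
    if intent_filter:
        # Restrict the next recognition to these intents (names are hypothetical here).
        payload["intentFilter"] = list(intent_filter)
    return "hermes/dialogueManager/continueSession", json.dumps(payload)


topic, body = build_continue_session(
    "session-123", "What should I add to the list?", ["TodoItem"]
)
```

The follow-up question is answered by the same offline STT engine, so this only helps if the item names are part of the trained vocabulary.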

This gets a lot more tricky if you want to use two different STT engines, because you won’t be able to use the dialogue API in this case (as Rhasspy does not offer an option to change the STT engine on-the-fly; correct me if I’m wrong).

I will have a look at it, thank you.
Just to be clearer: I want Google Assistant as a subsystem that does not listen all the time.

I did exactly that by using the intent filter property of hermes/dialogueManager/continueSession.

When receiving an askGoogle intent, send a continueSession message with a dummy intent name (one that does not exist) in the intent filter and the sendIntentNotRecognized property enabled.

Use the subsequent hermes/asr/startListening and hermes/asr/stopListening to listen for hermes/audioServer/<siteId>/audioFrame and push the audio chunk to Google Assistant SDK.

When receiving audio data from Google Assistant SDK use hermes/audioServer/<siteId>/playBytesStream to play it on the correct siteId.

You can even handle multi turn dialogue with the same technique.

@synesthesiam I had to add a little magic to my dialogue manager to pause the session timeout when the audioServer of the siteId of the session is playing audio.

Works nicely for me.
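The gating described above could be sketched roughly like this. It is only an illustration under stated assumptions: the class `SessionAudioGate` and its `arm()` method are hypothetical names, the messages are fed in by hand rather than by a real MQTT client, and frames are buffered instead of being streamed to the Google Assistant SDK as they arrive.

```python
import json


class SessionAudioGate:
    """Collects audioFrame payloads between ASR start/stop, but only when armed."""

    def __init__(self, site_id):
        self.site_id = site_id
        self.frame_topic = f"hermes/audioServer/{site_id}/audioFrame"
        self.armed = False      # set after the askGoogle intent was handled
        self.listening = False  # True between startListening and stopListening
        self.frames = []

    def arm(self):
        """Call right after sending continueSession for an askGoogle intent."""
        self.armed = True

    def handle(self, topic, payload):
        """Feed every MQTT message here; returns the captured frames on stop, else None."""
        if topic == "hermes/asr/startListening":
            if self.armed and json.loads(payload).get("siteId") == self.site_id:
                self.listening = True
                self.frames = []
        elif topic == "hermes/asr/stopListening":
            if self.listening and json.loads(payload).get("siteId") == self.site_id:
                self.listening = False
                self.armed = False
                captured, self.frames = self.frames, []
                return captured
        elif topic == self.frame_topic and self.listening:
            self.frames.append(payload)  # raw bytes of one audio chunk
        return None
```

Because the gate is disarmed again on stopListening, nothing is forwarded to the cloud unless an askGoogle intent explicitly armed it first.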


Sounds great,
so first I changed the “Dialogue Management” setting to Hermes MQTT.
Is https://docs.snips.ai/reference/hermes the right documentation?
Do I need Node-RED or something comparable?
Does Google listen all the time in your setup?
Rhasspy, Hermes, and even the Google SDK are pretty new to me, so
I need time to figure out what you describe.
Many thanks

If this is not already the case, I strongly advise updating to Rhasspy 2.5, because the Hermes protocol is a first-class citizen from this version onward.

Rhasspy follows what is detailed in that documentation so you should find what you need in there. There is also a complete Rhasspy Hermes reference here:
https://rhasspy.readthedocs.io/en/latest/reference/

In my setup Google does not listen. My skill listens for audio chunks between the ASR start/stop listening topics and pushes them to the Google SDK. So I control exactly what audio data I send to the cloud, and when.
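As a rough sketch of the "push the chunks" step: this assumes each audioFrame payload is a small, complete WAV file (header included), as the Rhasspy reference describes, and that the cloud side wants raw PCM samples. The helper name `wav_chunk_to_pcm` is hypothetical, and the call into the Google SDK itself is deliberately left out.

```python
import io
import wave


def wav_chunk_to_pcm(wav_bytes):
    """Return (raw PCM bytes, sample rate, sample width in bytes, channels) for one WAV chunk."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        pcm = wav_file.readframes(wav_file.getnframes())
        return (
            pcm,
            wav_file.getframerate(),
            wav_file.getsampwidth(),
            wav_file.getnchannels(),
        )
```

Checking the returned sample rate and width before forwarding is worthwhile, since the receiving service may expect one specific format.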

Figuring things out is the best part :blush: Enjoy

Hope this helps.

@fastjack OK, that helps a lot to get a first understanding.
I will explore, try it out, and come back with smarter questions.
Thank you very much for your time.

Technically everything is running. I can send and receive all Hermes messages.
I receive hermes/intent/askGoogle and publish hermes/dialogueManager/continueSession
with:
{"sendIntentNotRecognized": true, "intentName": "googleSDK"}

Use the subsequent hermes/asr/startListening and hermes/asr/stopListening to listen for hermes/audioServer/<siteId>/audioFrame and push the audio chunk to Google Assistant SDK.

I’m using Node-RED and am not sure how the flow should work.
Should continueSession and startListening be sent in parallel?
Should I set a custom siteId to be able to separate the hermes/audioServer/CUSTOM/playBytesStream events?
May I ask for a screenshot of your flow :slight_smile: ?

Unfortunately I’m not using Node-Red (I made my own skill system) so I cannot help you with the flow design. :confused:

@fastjack Do you use “Rhasspy” or “Hermes MQTT” for “Dialogue Management”?

Actually I’m using my own homemade system (since I’m not a python dev, I wanted something I can more easily tinker with) but it also uses the Hermes protocol so I’m positive it can also be done with Rhasspy (that I follow very closely).

Using Hermes MQTT means you have snips-dialogue running somewhere (not a good idea, as it is now defunct).

I suggest using the Rhasspy dialogue management, as it is a maintained one-to-one port of the Snips one.

@synesthesiam Is there still a need for the Hermes-MQTT dialogue management? Maybe it can be removed from the list? The Rhasspy-dialogue one should be enough?


I’d like to keep it in case someone wants to create an entirely new dialogue service. All of the services (wake word, STT, etc.) have a “Hermes MQTT” option that really means “the user knows there’s a service out there to handle this, just listen for MQTT messages”. Maybe it could be better named?

I have made some progress.

Listen for the askGoogle intent:

hermes/intent/askGoogle

Then send continueSession:

hermes/dialogueManager/continueSession
{
    sessionId: payload.sessionId,
    intentFilter: ["GoogleSDK"],
    sendIntentNotRecognized: true,
    customData: {
        action: "GoogleSDK"
    }
}

On continueSession, send startListening:

hermes/asr/startListening
{
    sessionId: msg.payload.sessionId,
    siteId: msg.payload.siteId,
    wakewordId: msg.payload.wakewordId,
    stopOnSilence: true,
    sendAudioCaptured: true
}

Now I can say whatever I like and it gets recorded, so I am able to send it to the
Google Assistant SDK and process its response.

But hermes/nlu/query is still published even with intentFilter: ["GoogleSDK"].
What am I doing wrong here?

And what would be the best event to listen for in order to send the recorded speech?
hermes/dialogueManager/sessionEnded takes a while to arrive, and the payload of other events
such as rhasspy/asr/default/default/audioCaptured does not include the intent, so
I don’t see how to filter them the way I filter the initial hermes/intent/askGoogle.

Thanks for reading, and for any suggestions.

hermes/hotword/porcupine/detected {"modelId": "/usr/lib/rhasspy/rhasspy/rhasspywake_porcupine_hermes/porcupine/resources/keyword_files/raspberrypi/porcupine.ppn", "modelVersion": "", "modelType": "personal", "currentSensitivity": 0.5, "siteId": "default", "sessionId": null, "sendAudioCaptured": null, "lang": null}
hermes/hotword/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/audioServer/default/playFinished {"id": "a6ae1f74-dd02-4180-9be4-ad7d68c304cf", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/dialogueManager/sessionStarted {"sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "siteId": "default", "customData": "porcupine", "lang": null}
hermes/hotword/toggleOff {"siteId": "default", "reason": "dialogueSession"}
hermes/asr/startListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "lang": null, "stopOnSilence": true, "sendAudioCaptured": true, "wakewordId": "porcupine", "intentFilter": null}
hermes/asr/textCaptured {"text": "ask google", "likelihood": 1, "seconds": 1.3692768319997413, "siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "wakewordId": null, "asrTokens": null, "lang": null}
hermes/hotword/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/audioServer/default/playFinished {"id": "ccaad002-717a-40ed-8120-64078105f22b", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/stopListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/hotword/toggleOn {"siteId": "default", "reason": "dialogueSession"}
hermes/nlu/query {"input": "ask google", "siteId": "default", "id": null, "intentFilter": null, "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "wakewordId": "porcupine", "lang": null}
hermes/nlu/intentParsed {"input": "ask google", "intent": {"intentName": "askGoogle", "confidenceScore": 1.0}, "siteId": "default", "id": null, "slots": [], "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/intent/askGoogle {"input": "ask google", "intent": {"intentName": "askGoogle", "confidenceScore": 1.0}, "siteId": "default", "id": null, "slots": [], "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "customData": null, "asrTokens": [[{"value": "ask", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 3, "time": null}, {"value": "google", "confidence": 1.0, "rangeStart": 4, "rangeEnd": 10, "time": null}]], "asrConfidence": null, "rawInput": "ask google", "wakewordId": "porcupine", "lang": null}
hermes/dialogueManager/continueSession {"sessionId":"default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000","intentFilter":["GoogleSDK"],"sendIntentNotRecognized":true,"customData":{"action":"GoogleSDK"}}
hermes/asr/stopListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/hotword/toggleOff {"siteId": "default", "reason": "dialogueSession"}
hermes/asr/startListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "lang": null, "stopOnSilence": true, "sendAudioCaptured": true, "wakewordId": null, "intentFilter": null}
hermes/asr/textCaptured {"text": "what time is it to the", "likelihood": 1, "seconds": 1.818475637001029, "siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "wakewordId": null, "asrTokens": null, "lang": null}
hermes/hotword/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/audioServer/default/playFinished {"id": "706423be-751b-426f-873d-153b04f8f20f", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/stopListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/hotword/toggleOn {"siteId": "default", "reason": "dialogueSession"}
hermes/nlu/query {"input": "what time is it to the", "siteId": "default", "id": null, "intentFilter": ["GoogleSDK"], "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "wakewordId": "porcupine", "lang": null}
hermes/nlu/intentNotRecognized {"input": "what time is it to the", "siteId": "default", "id": null, "customData": null, "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/hotword/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "default", "reason": "playAudio"}
hermes/audioServer/default/playFinished {"id": "18dfc890-f0ae-4a03-91cc-25f8e4368f0d", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "default", "reason": "playAudio"}
hermes/dialogueManager/intentNotRecognized {"sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "siteId": "default", "input": "what time is it to the", "customData": {"action": "GoogleSDK"}}
hermes/asr/stopListening {"siteId": "default", "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000"}
hermes/dialogueManager/sessionEnded {"termination": {"reason": "timeout"}, "sessionId": "default-porcupine-b42b2613-ba54-430e-9d66-095234a0e000", "siteId": "default", "customData": {"action": "GoogleSDK"}}
hermes/hotword/toggleOn {"siteId": "default", "reason": "dialogueSession"}
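
In the log above, hermes/dialogueManager/intentNotRecognized arrives before sessionEnded and carries the customData that was set in continueSession, so it may be a usable trigger. A minimal sketch of such a filter follows; the function name `is_google_request` is hypothetical, and it assumes customData comes back as an object, as it does in this log.

```python
import json

GOOGLE_ACTION = "GoogleSDK"  # the marker placed in customData by continueSession


def is_google_request(topic, payload):
    """True if this message is the not-recognized turn of an askGoogle session."""
    if topic != "hermes/dialogueManager/intentNotRecognized":
        return False
    data = json.loads(payload)
    custom = data.get("customData")
    return isinstance(custom, dict) and custom.get("action") == GOOGLE_ACTION
```

The sessionId from the same payload could then be used to match the audio that was captured for that session.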