Mute music during Rhasspy session

Hello, I am trying to integrate Rhasspy 2.5 with Home Assistant conversation using free speech and Almond assistant.

Using Node Red I am:

  1. listening on hermes/nlu/query for STT results
  2. sending the recognized text to HA using the websocket API
  3. sending the HA reply back to Rhasspy via hermes/tts/say and generating an empty intent

That all works as expected. But I would also like to mute and unmute the snapcast audio playing on my Rhasspy smart speaker.

Muting seems simple: I can listen for either hermes/dialogueManager/sessionStarted or hermes/hotword/toggleOff.

  1. The problem with hermes/dialogueManager/sessionStarted seems to be that it is NOT sent when triggering Rhasspy via the web interface's Recognize button. When I start a session with the wake word, it works OK.

  2. For unmuting I was not able to find a good trigger. I can’t use hermes/dialogueManager/sessionEnded, as it is sent BEFORE a potential reply is read back by Rhasspy TTS.

Also, the sessionId in dialogueManager doesn’t correspond to the sessionId in hermes/audioServer and hermes/tts, so it is very hard to correlate these events.

  3. Almond seems to support having multiple “rounds” inside one session:
    https://almond.stanford.edu/doc/almond-dialog-api-reference.md#interaction-model

It would be very nice to keep the Rhasspy dialogue session open without needing to use the wake word again. This is currently limited, as I think the HA Almond API doesn’t transfer round/session data yet, but that is for another community. :slight_smile:

Did anyone try similar things? Any hints please? Thank you.

I think the problem is going to be that ‘mute’ is such a simple word with many homonyms; ‘unmute’, I guess, isn’t so bad.

‘Rhasspy mute’ and ‘Rhasspy unmute’ are probably going to be far less error-prone, as I guess you could set up multiple KWS commands, which is why I am a bit worried about the simplicity and homonyms of ‘mute’.

I think you can do multiple wake words.

I am sorry for confusing you, but I am not muting Rhasspy! I am muting the snapcast music playing on the same RPi speaker that Rhasspy uses; see my original text:

That all works as expected. But I would also like to mute and unmute the snapcast audio playing on my Rhasspy smart speaker.

So what I am looking for are ONLY the MQTT messages I can use as triggers to mute/unmute my snapcast audio (using Node-RED). The mute/unmute described above is meant for snapcast, not for Rhasspy.

Yeah, but from memory that is all snapcast is doing if you are playing on the same RPi speaker that Rhasspy uses.

You are basically doing MQTT -> Snapcast server -> Snapcast embeds volume -> Snapcast client mutes local volume.

If you have multiple snapcast clients then yeah, your way is better; if only one, then it's a choice of either.

I am afraid you still do not understand what I am trying to achieve. Could you please try to read my post one more time?

In short, I know very well how to mute/unmute a Snapcast client; what I am looking for is the right Rhasspy MQTT message to trigger such actions. I described my issues with the Rhasspy dialogue manager messages, both of which have some quirks.

So if you want to reply, please try to read my post and understand it first, before firing some code. Thank you.

It's not code, it's the Snapcast JSON-RPC control API (server).

It's how you can control snapcast remotely either way.

Hello,

you keep replying to something I did not ask for. :see_no_evil: Please READ first.

  • I do not NEED you to explain how to control Snapcast! I know that very well and I never asked for that! Please forget the word snapcast altogether, OK?
  • I would only like to know WHICH Rhasspy MQTT message is best to use as a trigger to mute ANY audio that is playing, as it would indicate that the Rhasspy dialogue manager is starting, and consequently which message to use as a trigger to unmute the audio after Rhasspy finishes playing its audio response, if any.
  • And no, the obvious hermes/dialogueManager/sessionStarted and hermes/dialogueManager/sessionEnded don’t produce fully correct results. Please read WHY in my original post.

And finally, if you are not able to provide a meaningful, on-topic answer to these questions, then please do not answer at all. Thank you.

I would suggest sticking to the hermes/dialogueManager/sessionStarted and hermes/dialogueManager/sessionEnded topics because they are the correct/best ones for your use case (muting/unmuting playback during a session).
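As a rough illustration of that suggestion, here is a minimal sketch of the mute/unmute decision driven only by the two session topics. The names createMuteTracker and onMessage and the returned { action, siteId } shape are hypothetical. It also assumes that several sessions could overlap on one site, which may never happen in practice.

```javascript
// Sketch: decide when to mute/unmute playback from Hermes dialogue events.
// createMuteTracker and the { action, siteId } result are hypothetical names.
const SESSION_STARTED = "hermes/dialogueManager/sessionStarted";
const SESSION_ENDED = "hermes/dialogueManager/sessionEnded";

function createMuteTracker() {
  // siteId -> Set of session IDs currently open on that site
  const openSessions = new Map();

  return function onMessage(topic, payload) {
    const { siteId, sessionId } = payload;

    if (topic === SESSION_STARTED) {
      const sessions = openSessions.get(siteId) || new Set();
      const wasIdle = sessions.size === 0;
      sessions.add(sessionId);
      openSessions.set(siteId, sessions);
      // Mute only on the transition from idle to busy.
      return wasIdle ? { action: "mute", siteId } : null;
    }

    if (topic === SESSION_ENDED) {
      const sessions = openSessions.get(siteId);
      // Ignore sessions we never saw start.
      if (!sessions || !sessions.delete(sessionId)) return null;
      // Unmute only when the last open session on this site is gone.
      return sessions.size === 0 ? { action: "unmute", siteId } : null;
    }

    return null; // every other topic is ignored
  };
}
```

In Node-RED the same logic could live in a function node fed by two "mqtt in" nodes, with the returned action routed to whatever service call mutes the player.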

@synesthesiam The behavior of the sessionEnded topic as described does not seem correct to me (it should only trigger when the TTS has finished playing). Maybe it is a bug in the dialogue manager.

The “Recognize” button is not there to initiate a session but to test the ASR/NLU systems.

If you really need to use other topics (don’t :wink: ), I would suggest hermes/hotword/toggleOff or hermes/asr/startListening to mute. Unmuting is more complicated… maybe hermes/hotword/toggleOn.

Hope this helps.


Thank you very much for the reply.

I fully agree that hermes/dialogueManager/sessionStarted and hermes/dialogueManager/sessionEnded would be the best way.

Regarding the Recognize button: thank you for the explanation. I was using it only for testing as well, so it is no problem if it doesn’t trigger hermes/dialogueManager/sessionStarted :slight_smile:

Regarding hermes/dialogueManager/sessionEnded: I will check the behavior again to be really sure.

One more question: is there a proper way to create an empty intent? Or is it enough to disable Intent Handling? In Rhasspy 2.4 the original HA conversation integration always generated an empty intent, so I stuck to that with 2.5 as well, but maybe it is not needed…

I agree with @fastjack, you shouldn’t listen to these topics: they are low-level. hermes/dialogueManager/sessionStarted and hermes/dialogueManager/sessionEnded are more suitable for your purpose.

I’m not seeing this issue here, what version of Rhasspy are you running exactly? 2.5.1? And which components?

For comparison, here’s the complete flow of MQTT messages (excluding the raw audio data) when I wake Rhasspy and ask for the time:

hermes/hotword/okay-rhasspy-02.wav/detected {"modelId": "okay-rhasspy-02.wav", "modelVersion": "", "modelType": "personal", "currentSensitivity": 0.22, "siteId": "livingroom", "sessionId": null, "sendAudioCaptured": null, "lang": null}
hermes/hotword/toggleOff {"siteId": "livingroom", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "livingroom", "reason": "playAudio"}
hermes/audioServer/livingroom/playFinished {"id": "4b94584f-41ed-4017-906f-a9ec2eea91c1", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "livingroom", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "livingroom", "reason": "playAudio"}
hermes/dialogueManager/sessionStarted {"sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "siteId": "livingroom", "customData": "okay-rhasspy-02.wav", "lang": null}
hermes/hotword/toggleOff {"siteId": "livingroom", "reason": "dialogueSession"}
hermes/asr/startListening {"siteId": "livingroom", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "lang": null, "stopOnSilence": true, "sendAudioCaptured": true, "wakewordId": "okay-rhasspy-02.wav", "intentFilter": null}
hermes/asr/textCaptured {"text": "what time is it", "likelihood": 1, "seconds": 1.7489265769254416, "siteId": "livingroom", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "wakewordId": null, "asrTokens": null, "lang": null}
hermes/hotword/toggleOff {"siteId": "livingroom", "reason": "playAudio"}
hermes/asr/toggleOff {"siteId": "livingroom", "reason": "playAudio"}
hermes/audioServer/livingroom/playFinished {"id": "e30d0fe0-c72a-435a-876e-1d0c8809089b", "sessionId": ""}
hermes/hotword/toggleOn {"siteId": "livingroom", "reason": "playAudio"}
hermes/asr/toggleOn {"siteId": "livingroom", "reason": "playAudio"}
hermes/asr/stopListening {"siteId": "livingroom", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450"}
hermes/hotword/toggleOn {"siteId": "livingroom", "reason": "dialogueSession"}
hermes/nlu/query {"input": "what time is it", "siteId": "livingroom", "id": null, "intentFilter": null, "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "wakewordId": "okay-rhasspy-02.wav", "lang": null}
hermes/nlu/intentParsed {"input": "what time is it", "intent": {"intentName": "GetTime", "confidenceScore": 1.0}, "siteId": "livingroom", "id": null, "slots": [], "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450"}
hermes/intent/GetTime {"input": "what time is it", "intent": {"intentName": "GetTime", "confidenceScore": 1.0}, "siteId": "livingroom", "id": null, "slots": [], "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "customData": null, "asrTokens": [[{"value": "what", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 4, "time": null}, {"value": "time", "confidence": 1.0, "rangeStart": 5, "rangeEnd": 9, "time": null}, {"value": "is", "confidence": 1.0, "rangeStart": 10, "rangeEnd": 12, "time": null}, {"value": "it", "confidence": 1.0, "rangeStart": 13, "rangeEnd": 15, "time": null}]], "asrConfidence": null, "rawInput": "what time is it", "wakewordId": "okay-rhasspy-02.wav", "lang": null}
hermes/dialogueManager/endSession {"sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "siteId": "livingroom", "text": "It's 19 20", "customData": null}
hermes/hotword/toggleOff {"siteId": "livingroom", "reason": "ttsSay"}
hermes/asr/toggleOff {"siteId": "livingroom", "reason": "ttsSay"}
hermes/tts/say {"text": "It's 19 20", "siteId": "livingroom", "lang": null, "id": "30279b41-0b20-4780-91a0-ad58a76a30e9", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450"}
hermes/audioServer/livingroom/playFinished {"id": "30279b41-0b20-4780-91a0-ad58a76a30e9", "sessionId": ""}
hermes/tts/sayFinished {"siteId": "livingroom", "id": "30279b41-0b20-4780-91a0-ad58a76a30e9", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450"}
hermes/hotword/toggleOn {"siteId": "livingroom", "reason": "ttsSay"}
hermes/asr/toggleOn {"siteId": "livingroom", "reason": "ttsSay"}
hermes/asr/stopListening {"siteId": "livingroom", "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450"}
hermes/dialogueManager/sessionEnded {"termination": {"reason": "nominal"}, "sessionId": "livingroom-okay-rhasspy-02.wav-e5107fc2-6981-4b7f-b380-925e5b3f4450", "siteId": "livingroom", "customData": "okay-rhasspy-02.wav"}
hermes/hotword/toggleOn {"siteId": "livingroom", "reason": "dialogueSession"}

You can see that the same session ID is kept in all relevant messages.

You can also see that in my setup, hermes/dialogueManager/sessionEnded is only triggered after hermes/tts/sayFinished, as it should be.
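To compare a captured log against this ordering, a small check along these lines might help. It is only a sketch: parseLogLine and sayFinishedBeforeSessionEnded are hypothetical helper names. It assumes each line has the "topic {json}" shape shown above, and the sample uses a shortened session ID ("s1") and trimmed payloads.

```javascript
// Sketch: check that sessionEnded arrives after tts/sayFinished for one session.
// parseLogLine and sayFinishedBeforeSessionEnded are hypothetical helper names.

// Split a "topic {json}" log line at the first space.
function parseLogLine(line) {
  const space = line.indexOf(" ");
  return {
    topic: line.slice(0, space),
    payload: JSON.parse(line.slice(space + 1)),
  };
}

function sayFinishedBeforeSessionEnded(lines, sessionId) {
  // Keep only the topics of messages that carry this session ID, in log order.
  const topics = lines
    .map(parseLogLine)
    .filter((m) => m.payload.sessionId === sessionId)
    .map((m) => m.topic);
  const said = topics.lastIndexOf("hermes/tts/sayFinished");
  const ended = topics.indexOf("hermes/dialogueManager/sessionEnded");
  return said !== -1 && ended !== -1 && said < ended;
}

// Shortened sample in the same format as the log ("s1" stands in for the real ID).
const sample = [
  'hermes/dialogueManager/sessionStarted {"sessionId": "s1", "siteId": "livingroom"}',
  'hermes/tts/say {"siteId": "livingroom", "sessionId": "s1"}',
  'hermes/audioServer/livingroom/playFinished {"id": "x", "sessionId": ""}',
  'hermes/tts/sayFinished {"siteId": "livingroom", "sessionId": "s1"}',
  'hermes/dialogueManager/sessionEnded {"sessionId": "s1", "siteId": "livingroom"}',
];

console.log(sayFinishedBeforeSessionEnded(sample, "s1")); // true
```

Note that the playFinished line is skipped by the filter because its sessionId is empty, just as in the log above.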

But isn’t the problem too that you are using hermes/tts/say for the reply? This is a low-level message, you should use hermes/dialogueManager/endSession to end a session correctly.


Thank you very much for the reply. I think that I am doing it wrong, as you mentioned.

First, on my master Rhasspy I set Dialogue manager to Rhasspy and disabled both Intent Recognition and Intent Handling, as IMHO both are handled by my automation chain:

MQTT <> Node Red <> HA <> Almond

As input from Rhasspy to NodeRed I am using STT output in hermes/nlu/query

As output from NodeRed to Rhasspy I am using TTS input hermes/tts/say

When I posted to hermes/tts/say I was also trying to generate an empty intent with hermes/intent/<intentName>, as otherwise the Rhasspy session was stuck waiting until timeout. But I have some problems generating it correctly (maybe this could be the problem as well). As I asked before, is that empty intent really required?

You also mentioned that I should be using hermes/dialogueManager/endSession, but I do not understand what for… I was thinking that when I generate an empty intent and send hermes/tts/say, hermes/dialogueManager/endSession is generated automatically by the Dialogue Manager.

May I please ask you for more details on how the chain should look if I would like to handle Intent Recognition and Handling via my scripts and only use the Rhasspy STT and TTS components? Thank you. :slight_smile:

OK, I read your response again and I now hopefully understand it. :slight_smile: So I changed the behaviour of my scripts as follows:

  • mute on hermes/dialogueManager/sessionStarted
  • unmute on hermes/dialogueManager/sessionEnded
  • STT input to HA conversation from hermes/nlu/query
  • TTS output from HA conversation to hermes/dialogueManager/endSession

In this case, as you mentioned, hermes/dialogueManager/endSession terminates the session correctly and I do not need to generate any intent at all.

With this approach everything works as expected, and the session ID is also consistent throughout the whole session!
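For readers who want the core of this chain without importing a flow, the two message transformations can be sketched as plain functions. The function names are hypothetical. The request fields mirror the conversation/process message used in my flow, and the endSession fields follow the log posted above; nothing here talks to MQTT or HA by itself.

```javascript
// Sketch of the two transformations in the chain; function names are hypothetical.
const END_SESSION_TOPIC = "hermes/dialogueManager/endSession";

// hermes/nlu/query payload -> request for the HA conversation API.
// Returns null when nothing was recognized, so the caller can skip the HA call.
function toConversationRequest(query) {
  if (!query.input) return null;
  return { type: "conversation/process", text: query.input };
}

// HA reply text -> endSession message that speaks the reply and closes the session.
// sessionId and siteId are copied from the original query so the IDs stay consistent.
function toEndSession(query, replyText) {
  return {
    topic: END_SESSION_TOPIC,
    payload: JSON.stringify({
      sessionId: query.sessionId,
      siteId: query.siteId,
      text: replyText,
    }),
  };
}

// Example with made-up values:
const query = { input: "what time is it", siteId: "livingroom", sessionId: "s1" };
console.log(toConversationRequest(query));
console.log(toEndSession(query, "It's 19 20"));
```

Because the reply goes through endSession instead of hermes/tts/say, the dialogue manager speaks the text itself and then emits sessionEnded, which is what the unmute step listens for.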

Thank you very much for help. :vulcan_salute:

If anyone is interested, here is the whole Node-RED flow, which works perfectly on a local Home Assistant installation with the HA conversation API. It effectively replaces the HA Conversation integration that is missing in Rhasspy 2.5. It also includes switching on the speaker, muting music playing on it via the snapcast client (also managed via HA), and unmuting snapcast after Rhasspy terminates the session. :wink:

[{"id":"b01d537b.89964","type":"tab","label":"Rhasspy","disabled":false,"info":""},{"id":"16403908.3e491f","type":"mqtt in","z":"b01d537b.89964","name":"Session Start","topic":"hermes/dialogueManager/sessionStarted","qos":"0","datatype":"json","broker":"f49b320f.8c6958","x":150,"y":140,"wires":[["a1d9654a.320f58","7100a56b.1339ac"]]},{"id":"383dc077.9f584","type":"ha-get-entities","z":"b01d537b.89964","server":"e55307b8.8b98e8","name":"is snapcast","rules":[],"output_type":"random","output_empty_results":false,"output_location_type":"msg","output_location":"payload.entity","output_results_count":1,"x":550,"y":140,"wires":[["89b596b0.4e40e8"]]},{"id":"a1d9654a.320f58","type":"function","z":"b01d537b.89964","name":"","func":"\nmsg.payload.rules = [ { property: \"entity_id\", logic: \"is\", value: \"media_player.snapcast_client_\" + msg.payload.siteId, valueType: \"str\" } ]\n\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":360,"y":140,"wires":[["383dc077.9f584"]]},{"id":"69a842ea.7a0184","type":"api-call-service","z":"b01d537b.89964","name":"mute snapcast","server":"e55307b8.8b98e8","version":1,"debugenabled":false,"service_domain":"media_player","service":"volume_mute","entityId":"","data":"","dataType":"json","mergecontext":"","output_location":"","output_location_type":"none","mustacheAltTags":false,"x":940,"y":140,"wires":[[]]},{"id":"89b596b0.4e40e8","type":"function","z":"b01d537b.89964","name":"","func":"\nif (! 
msg.payload.entity.attributes.is_volume_muted ) {\n    return { payload: { data: { entity_id: msg.payload.entity.entity_id, is_volume_muted: true } } };\n} else {\n    return null;\n}","outputs":1,"noerr":0,"initialize":"","finalize":"","x":740,"y":140,"wires":[["69a842ea.7a0184"]]},{"id":"4774c243.91801c","type":"function","z":"b01d537b.89964","name":"","func":"\nif ( msg.payload.entity.state === \"off\" ) {\n    return { payload: { data: { entity_id: msg.payload.entity.entity_id } } };\n} else {\n    return null;\n}","outputs":1,"noerr":0,"initialize":"","finalize":"","x":740,"y":200,"wires":[["606249be.8fbe48"]]},{"id":"7100a56b.1339ac","type":"function","z":"b01d537b.89964","name":"","func":"\nmsg.payload.rules = [ { property: \"entity_id\", logic: \"is\", value: \"switch.\" + msg.payload.siteId, valueType: \"str\" } ]\n\nreturn msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":360,"y":200,"wires":[["9f753624.a196"]]},{"id":"9f753624.a196","type":"ha-get-entities","z":"b01d537b.89964","server":"e55307b8.8b98e8","name":"is speaker","rules":[],"output_type":"random","output_empty_results":false,"output_location_type":"msg","output_location":"payload.entity","output_results_count":1,"x":550,"y":200,"wires":[["4774c243.91801c"]]},{"id":"606249be.8fbe48","type":"api-call-service","z":"b01d537b.89964","name":"speaker on","server":"e55307b8.8b98e8","version":1,"debugenabled":false,"service_domain":"switch","service":"turn_on","entityId":"","data":"","dataType":"json","mergecontext":"","output_location":"","output_location_type":"none","mustacheAltTags":false,"x":930,"y":200,"wires":[[]]},{"id":"d0efc196.57503","type":"mqtt in","z":"b01d537b.89964","name":"Input text","topic":"hermes/nlu/query","qos":"2","datatype":"json","broker":"f49b320f.8c6958","x":140,"y":340,"wires":[["4ca5d53.dcb782c"]]},{"id":"b273070b.f4692","type":"ha-api","z":"b01d537b.89964","name":"Conversation 
API","server":"e55307b8.8b98e8","debugenabled":false,"protocol":"websocket","method":"get","path":"","data":"","dataType":"json","location":"payload.result","locationType":"msg","responseType":"json","x":550,"y":300,"wires":[["ee106ba2.8a9d68"]]},{"id":"4ca5d53.dcb782c","type":"function","z":"b01d537b.89964","name":"","func":"\nif ( msg.payload.input !== \"\" ) {\n    msg.payload.data = { \"type\": \"conversation/process\", \"id\": msg.payload.sessionId, \"text\": msg.payload.input };\n    return [msg, null];\n} else {\n    return [null, msg];\n}\n\n","outputs":2,"noerr":0,"initialize":"","finalize":"","x":360,"y":340,"wires":[["b273070b.f4692"],["ee106ba2.8a9d68"]]},{"id":"ee106ba2.8a9d68","type":"function","z":"b01d537b.89964","name":"","func":"\nvar newMsg = { \"payload\": { \"siteId\": msg.payload.siteId, \"sessionId\": msg.payload.sessionId, \"text\": msg.payload.result.speech.plain.speech } };\n\nreturn newMsg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":740,"y":340,"wires":[["370df68b.17bfe2"]]},{"id":"45891d11.d1c3f4","type":"mqtt in","z":"b01d537b.89964","name":"Session End","topic":"hermes/dialogueManager/sessionEnded","qos":"2","datatype":"json","broker":"f49b320f.8c6958","x":150,"y":480,"wires":[["3617b3c9.96dbac"]]},{"id":"ac2d6cde.a2e448","type":"ha-get-entities","z":"b01d537b.89964","server":"e55307b8.8b98e8","name":"is snapcast","rules":[],"output_type":"random","output_empty_results":false,"output_location_type":"msg","output_location":"payload.entity","output_results_count":1,"x":550,"y":480,"wires":[["e80a4025.35a69"]]},{"id":"3617b3c9.96dbac","type":"function","z":"b01d537b.89964","name":"","func":"\nmsg.payload.rules = [ { property: \"entity_id\", logic: \"is\", value: \"media_player.snapcast_client_\" + msg.payload.siteId, valueType: \"str\" } ]\n\nreturn 
msg;","outputs":1,"noerr":0,"initialize":"","finalize":"","x":360,"y":480,"wires":[["ac2d6cde.a2e448"]]},{"id":"1afa8327.5dd235","type":"api-call-service","z":"b01d537b.89964","name":"unmute snapcast","server":"e55307b8.8b98e8","version":1,"debugenabled":false,"service_domain":"media_player","service":"volume_mute","entityId":"","data":"","dataType":"json","mergecontext":"","output_location":"","output_location_type":"none","mustacheAltTags":false,"x":950,"y":480,"wires":[[]]},{"id":"e80a4025.35a69","type":"function","z":"b01d537b.89964","name":"","func":"\nif ( msg.payload.entity.attributes.is_volume_muted ) {\n    return { payload: { data: { entity_id: msg.payload.entity.entity_id, is_volume_muted: false } } };\n} else {\n    return null;\n}","outputs":1,"noerr":0,"initialize":"","finalize":"","x":740,"y":480,"wires":[["1afa8327.5dd235"]]},{"id":"370df68b.17bfe2","type":"mqtt out","z":"b01d537b.89964","name":"Output text","topic":"hermes/dialogueManager/endSession","qos":"2","retain":"false","broker":"f49b320f.8c6958","x":930,"y":340,"wires":[]},{"id":"f49b320f.8c6958","type":"mqtt-broker","z":"","name":"local","broker":"192.168.3.2","port":"1883","clientid":"","usetls":false,"compatmode":false,"keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthRetain":"false","birthPayload":"","closeTopic":"","closeQos":"0","closePayload":"","willTopic":"","willQos":"0","willPayload":""},{"id":"e55307b8.8b98e8","type":"server","z":"","name":"Home Assistant","addon":true}]