I have the following setup, with each component running in its own Docker container:
- signal-cli web api → handles incoming messages from Signal messenger and can send messages
- Rhasspy → handles my main siteId (livingroom) and 2 satellites (office -> another Rhasspy instance, signal -> injecting and extracting text commands and responses)
- Home Assistant → provides hardware abstraction for all my smart devices and exposes the entities to Node-RED
- Node-RED → handles all the automation flows as well as incoming and outgoing Signal messages via signal-cli web api
In general, I have a Signal messenger group containing ‘my home’ and me. Text messages in that group are treated like voice commands, and the responses are sent back to the group. It essentially looks like this:
I receive text messages via Signal messenger (siteId: signal) in Node-RED and pass them to /api/text-to-intent?siteId=signal (text-to-intent supports the siteId parameter and then treats the text as if it had come in through that siteId).
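For readers who want to reproduce the text path outside Node-RED, here is a minimal Python sketch of the request described above. The base URL (port 12101 is Rhasspy's usual default), the function names and the plain-text content type are assumptions for illustration; only the endpoint and the siteId query parameter come from my setup.

```python
import json
import urllib.parse
import urllib.request

# Assumption: Rhasspy is reachable on its usual default web port.
RHASSPY_URL = "http://localhost:12101"


def build_text_to_intent_request(text, site_id="signal", base_url=RHASSPY_URL):
    """Build a POST request that submits `text` as if it came from `site_id`."""
    query = urllib.parse.urlencode({"siteId": site_id})
    url = f"{base_url}/api/text-to-intent?{query}"
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),  # the command text is sent as the raw body
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def text_to_intent(text, site_id="signal", base_url=RHASSPY_URL):
    """Send the request and return the recognized intent as a dict."""
    request = build_text_to_intent_request(text, site_id, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling text_to_intent("turn on the light") would then correspond to one incoming group message.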
To get the responses, I subscribe via WebSocket to /api/mqtt/hermes/tts/say and filter for siteId: signal. I extract the text and send it back to my group chat in Signal messenger.
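The filtering step can be sketched roughly as follows in Python. The text and siteId fields are part of the hermes/tts/say payload; whether the WebSocket delivers that payload directly or wrapped in an object with a payload key is an assumption here, so the hypothetical helper below accepts both shapes.

```python
import json


def extract_signal_reply(raw_message, site_id="signal"):
    """Return the TTS text if the message targets `site_id`, else None."""
    try:
        message = json.loads(raw_message)
    except (TypeError, ValueError):
        return None  # not JSON, ignore
    if not isinstance(message, dict):
        return None

    # Assumption: the payload may arrive bare or wrapped under "payload".
    payload = message.get("payload", message)
    if isinstance(payload, str):
        try:
            payload = json.loads(payload)
        except ValueError:
            return None
    if not isinstance(payload, dict):
        return None

    # Keep only responses addressed to the Signal pseudo-satellite.
    if payload.get("siteId") != site_id:
        return None
    return payload.get("text")
```

Every message received on the WebSocket would be passed through this function, and any non-None result forwarded to the Signal group.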
Additionally, I now extract voice messages from Signal (they arrive in AAC format), convert them to WAV, and pass them to /api/speech-to-intent to treat them as voice commands. I wanted to get the response as in the text-based use case above and send it back as text to my Signal group. However, unlike /api/text-to-intent, this endpoint does not support the siteId parameter and automatically uses the default siteId configured for this particular Rhasspy instance.
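As a rough illustration of the voice path, the Python sketch below converts the AAC attachment and posts the WAV data. Using ffmpeg for the conversion, the 16 kHz mono 16-bit target format, the base URL and all function names are assumptions for this example. Note that the request carries no siteId, which is exactly the limitation described above.

```python
import subprocess
import urllib.request

# Assumption: Rhasspy is reachable on its usual default web port.
RHASSPY_URL = "http://localhost:12101"


def build_ffmpeg_command(aac_path, wav_path):
    """Command line that converts AAC to 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", aac_path,
        "-ar", "16000",        # sample rate
        "-ac", "1",            # mono
        "-c:a", "pcm_s16le",   # 16-bit PCM
        wav_path,
    ]


def convert_aac_to_wav(aac_path, wav_path):
    """Run the conversion; requires ffmpeg to be installed."""
    subprocess.run(build_ffmpeg_command(aac_path, wav_path), check=True)


def build_speech_to_intent_request(wav_bytes, base_url=RHASSPY_URL):
    """Build the POST request carrying the WAV data as the body."""
    return urllib.request.Request(
        f"{base_url}/api/speech-to-intent",
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )
```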
A workaround would probably be to run another Rhasspy instance for siteId: signal, forward everything to the main instance for intent handling, and then use /api/speech-to-intent from that new instance. But this seems like a lot of extra steps for this particular use case, especially as it’s already working with the text-to-intent endpoint.