Using external MQTT "in a gentle way": not choking it with WAV messages

I have just started trying out rhasspy-voltron 2.5 and have been playing around a bit to work out the best architecture for my home, where I want to have several light-ish Raspberry Pi devices (each with a Respeaker microphone / speaker combo) and a beefier NVIDIA Jetson TX2 for DeepSpeech + extra stuff.

My initial idea was to centralize MQTT on the “beefy” machine, and have the Raspberry Pi devices do hotword detection and outsource TTS, STT and intent recognition to this central brain. However, as soon as I activated external MQTT, I realized that WAV frames are constantly being sent over MQTT. My scenario doesn’t need to send WAV outside ASR, and the quantity and size of all those messages is significant! I then noticed that if I set a hostname for UDP audio sending, WAV frames stop being sent while outside ASR, which is just what I wanted to achieve.

This is good as a workaround, but it seems very hackish to force rhasspy to send messages to UDP just to ignore a certain output. Am I doing something wrong? Is there some flag or option that allows me to “deactivate WAV messages while not in ASR”? This functionality exists, because it is what happens when UDP audio sending is active; but I don’t want to use UDP, I simply want to do wakeword recognition in-device.

Maybe I am overengineering everything or I missed something. Is there a simpler way to achieve what I have in mind?

UDP is what you want to use in this case. You should set the UDP port of the wake word and microphone service to the same value (leaving off a hostname implies localhost). This will keep all of the WAV traffic local until the ASR is activated, which is definitely what you want with an external MQTT broker that’s shared.

Oh ok! I hadn’t realized that the wakeword UDP had to match this other parameter.

I thought that the wakeword was working out of band (I suppose I tried the manual “start” from the web UI, didn’t realize that the wakeword was not really working, and started to make wrong assumptions). I got completely confused during my experimentation. Thanks for the tip! Makes a lot of sense now that I read it haha.

If I have no satellite and want to use an external broker, how do I set up Rhasspy to work normally, but without audio frames being sent to the external broker? I understand your discussion above, but I have no idea how to set this up using the Rhasspy web interface. Could you be specific about what I would have to do? Thanks for bringing this issue up. I could never track all of hermes/# to see what is going on, because all I get is hermes/audioServer/#. Besides, I’m using my external MQTT server for robotic commands (some hopefully coming from Rhasspy), and I agree with you that this flood of audio frames is unworkable.

From my tests and understanding, if you explicitly set up UDP audio sending (sorry for being a tad cryptic here; my Rhasspy hardware is offline right now and I cannot check it), then audio frames will not be sent to the external broker.

This applies to the audio frames for wakeword recognition. Audio frames after wake are still sent over MQTT for STT.

Once you set up the UDP host and port, you should see hermes/audioServer/# stop flooding: you will see things there after the wakeword is recognized, but not before.

I hope this makes sense and helps you!

You need to enter the same port number (the port should be free) under both the audio recording settings and the wakeword settings. Host can be left empty if both are on the same device. This should stop audio frames from being sent over MQTT until the wakeword is detected. The audio frames for speech to text will still be sent over your MQTT broker.
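
If you are not sure whether a port is free, a small check like the one below can help before you type it into both settings. This is only a sketch: the helper name udp_port_is_free and the port 12202 are made up for illustration, and it only tests the loopback address, which matches the "host left empty" case.

```python
import socket


def udp_port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a UDP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": something else owns the port.
            return False
    # The socket is closed again here, so the port is released for Rhasspy.
    return True


if __name__ == "__main__":
    candidate = 12202  # example value only, pick any unused port you like
    state = "free" if udp_port_is_free(candidate) else "in use"
    print(f"UDP port {candidate} is {state}")
```

Run it while Rhasspy is stopped; once the microphone service is bound to the port, the same check will naturally report it as in use.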

The others have answered your question, but I want to add this tip: you can easily ignore specific MQTT topics (including wildcards) with the -T option of mosquitto_sub. When I’m debugging Hermes messages, I’m always using the following command:

mosquitto_sub -t 'hermes/#' -T 'hermes/audioServer/+/playBytes/+' -T 'hermes/audioServer/+/audioFrame' -v

This shows all Hermes MQTT messages except for the raw audio messages.
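
To make it clearer what those filters do, here is a rough Python sketch of the same idea: standard MQTT wildcard matching ('+' for one level, '#' for the rest) with an include list and an exclude list. It is not how mosquitto_sub is implemented, it ignores special cases such as $-prefixed topics, and the site ID "default" and the intent name in the example are just assumptions.

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """MQTT-style filter match: '+' matches one level, '#' matches the rest."""
    pattern_levels = pattern.split("/")
    topic_levels = topic.split("/")
    for index, level in enumerate(pattern_levels):
        if level == "#":
            return True  # multi-level wildcard swallows everything that is left
        if index >= len(topic_levels):
            return False  # pattern is longer than the topic
        if level != "+" and level != topic_levels[index]:
            return False  # literal level differs
    return len(pattern_levels) == len(topic_levels)


def is_shown(topic: str, include: list, exclude: list) -> bool:
    """A topic is shown if some include filter matches and no exclude filter does."""
    included = any(topic_matches(p, topic) for p in include)
    excluded = any(topic_matches(p, topic) for p in exclude)
    return included and not excluded


# Same filters as the -t / -T options in the command above.
INCLUDE = ["hermes/#"]
EXCLUDE = [
    "hermes/audioServer/+/playBytes/+",
    "hermes/audioServer/+/audioFrame",
]

if __name__ == "__main__":
    for topic in (
        "hermes/intent/ChangeLightState",         # hypothetical intent name
        "hermes/audioServer/default/audioFrame",  # "default" site ID is assumed
        "hermes/audioServer/default/playBytes/abc",
    ):
        print(topic, "->", "shown" if is_shown(topic, INCLUDE, EXCLUDE) else "hidden")
```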


Thanks, that will help a lot. Can you also help me with the other part of this issue?

A. I need to have an external MQTT broker respond to Rhasspy intents. I believe that I can do this in two ways:

  1. Have the Rhasspy MQTT settings point to my external MQTT server.

  2. Get intents from localhost and broadcast what I need to the external server (which multiple IoT devices and robots are connected to). I guess this is similar to using Rhasspy as a voice front end to hassio.

B. My goals would be to:

  1. Have the localhost MQTT broker serving Rhasspy handle all voice output and wakeup (I am confused as to exactly how to do this).

  2. Have the external MQTT broker handle all intents and other Hermes messages.

    What specifics are written to the profile or what specifics are add

It’s not really clear to me what exactly you are trying to do and what you have already done. I suggest you try these configurations in order of increasing difficulty:

  1. Configure Rhasspy with an internal MQTT server.
  2. Configure Rhasspy with your external MQTT server.
  3. Configure UDP audio streaming to keep the audio messages local when you’re using the external MQTT server.

You can consult Rhasspy’s documentation, especially the tutorials, for some guidance.

Test whether these configurations work, and then let us know where you’re stuck, preferably with a screenshot of the relevant settings. This makes it easier for us to follow your steps and help.

I think you have two options to solve this problem:

  1. Set up UDP audio streaming and configure Rhasspy to use your external MQTT server.

    Pros: Quick and easy to set up.
    Cons: After the wake word is detected, Rhasspy will switch from UDP to MQTT audio streaming until the ASR service stops listening. This may not be desired and can be fixed with the second solution.

  2. Use two MQTT brokers with a bridge configuration.

    This is the setup I’m using and it works great for me. I’m using an external, local MQTT broker (Mosquitto) on the same device as Rhasspy and another remote one which is hooked up to my smart home system. I configured the local MQTT broker to relay recognized intents to the remote broker. I’m not using UDP audio streaming for this setup as the audio frames are never leaving the Rhasspy host.

    Pros: Audio is never networked, keeping network congestion to a minimum.
    Cons: Setup is a bit more complex.

    This is my local Mosquitto config:

    connection <name you can freely choose>
    address <address and port of your remote MQTT broker>
    topic hermes/intent/# out 0
    topic hermes/dialogueManager/endSession in 0
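
For the remote side of such a bridge, here is a rough sketch of pulling the intent name and slot values out of a message that arrives on hermes/intent/#. It assumes the usual Hermes intent JSON layout (intent.intentName, siteId, slots[].slotName, slots[].value.value); the function name summarize_intent, the intent ChangeLightState and the sample payload are made up for illustration, so check them against what your own broker actually receives.

```python
import json


def summarize_intent(payload: bytes) -> dict:
    """Reduce a Hermes-style intent message to site, intent name and slot values."""
    message = json.loads(payload)
    intent = message.get("intent", {})
    # Assumed slot layout: {"slotName": ..., "value": {"value": ...}}
    slots = {
        slot["slotName"]: slot["value"]["value"]
        for slot in message.get("slots", [])
    }
    return {
        "site": message.get("siteId"),
        "intent": intent.get("intentName"),
        "slots": slots,
    }


# Hypothetical payload, shaped like the assumed Hermes intent message.
SAMPLE_PAYLOAD = json.dumps(
    {
        "input": "turn on the lamp",
        "intent": {"intentName": "ChangeLightState", "confidenceScore": 1.0},
        "siteId": "default",
        "slots": [
            {"slotName": "state", "value": {"kind": "Unknown", "value": "on"}}
        ],
    }
).encode("utf-8")

if __name__ == "__main__":
    print(summarize_intent(SAMPLE_PAYLOAD))
```

In a real setup you would call something like this from the message callback of whatever MQTT client library your robots or smart home system already use.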

    You can learn more about Mosquitto bridge configuration in the mosquitto.conf documentation.

Thank you. Your second solution is exactly what I need.
