I have just started trying the 2.5 rhasspy-voltron release and have been playing around a bit to work out the best architecture for my home, where I want several light-ish Raspberry Pi devices (each with a ReSpeaker microphone / speaker combo) and a beefier NVIDIA Jetson TX2 for DeepSpeech plus extra stuff.
My initial idea was to centralize MQTT on the “beefy” machine and have the Raspberry Pi devices do hotword detection, outsourcing TTS, STT and intent recognition to this central brain. However, as soon as I activated external MQTT, I realized that WAV frames are constantly being sent over MQTT. My scenario doesn’t need WAV audio outside of ASR, and the quantity and size of all those messages is quite significant! I noticed that if I set a hostname for UDP audio sending, WAV frames stop being published outside ASR, which is exactly what I wanted to achieve.
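For reference, this is roughly what the workaround looks like in my profile JSON; the exact key names (`udp_audio_host`, `udp_audio_port`) are from memory of the audio input settings and may differ slightly depending on your Rhasspy version and microphone system:

```json
{
  "microphone": {
    "system": "arecord",
    "arecord": {
      "udp_audio_host": "127.0.0.1",
      "udp_audio_port": "12202"
    }
  }
}
```

With this set, audio stays on localhost UDP until the wake word service opens the ASR session, instead of being streamed over MQTT all the time.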
This works as a workaround, but it feels very hackish to force rhasspy to send audio to UDP just so I can ignore a certain output. Am I doing something wrong? Is there a flag or option that lets me “deactivate WAV messages while not in ASR”? The functionality clearly exists, because it is exactly what happens when UDP audio sending is active; but I don’t want to use UDP at all, I simply want to do wake word recognition on-device.
Maybe I am overengineering everything, or I have missed something. Is there a simpler way to achieve what I have in mind?