Create Volume Control Endpoints

Would it be possible to add API and MQTT endpoints to turn up, turn down, mute, and unmute the speaker? I have been looking through GitHub trying to figure out how the API works (I’m ignorant but hope to learn something).

Currently the only way to change the volume is by CLI from the host or container… nothing in the GUI or API that I could find. I think this would be a nice addition for automations, and would be very happy to help - if I had any clue what I was doing :slight_smile:.

I have started to dig into the repo, but have no clue even where to find the existing API code to start digesting it and understand how it works.

Thanks,
DeadEnd


Ok, I can help you navigate Rhasspy’s vast code base :slight_smile:

I’ll limit myself to the Hermes MQTT part, because that’s what I’m using.

First you need to define new messages or extend existing messages in the Hermes protocol. Those are defined in the rhasspy-hermes project. The API is documented (not yet completely reworked to the new documentation format, but already usable).

If you want to turn up and down the volume, you’ll need to extend the rhasspyhermes.audioserver module. This is one of the few modules where the documentation is not yet converted to the new, much clearer format, but I’ll see if I can find some time to work on this sooner.

You can ask for information about the current audio devices by sending a rhasspyhermes.audioserver.AudioGetDevices message. This translates to the rhasspy/audioServer/getDevices MQTT topic. Have a look at the code for now to see this, as the documentation of this module is not yet complete. After sending this message, you get a rhasspyhermes.audioserver.AudioDevices message with information about all the available audio devices. This includes their mode (input or output), id, name, description and so on.

Note that you can already mute and unmute the audio: see rhasspyhermes.audioserver.AudioToggleOff and rhasspyhermes.audioserver.AudioToggleOn.

If you also want to turn up and down the volume, what you should do is define a message with MQTT topic rhasspy/audioServer/foobar to set the volume (as a float) for an audio device with a specific ID. You should probably also define a message to ask what the current volume of an audio device is. Or maybe you can extend the current getDevices topic and let it include the current volume of every audio device. I don’t know what’s best.
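As a rough illustration of what such a message could look like, here is a minimal sketch using only the Python standard library. Everything in it is an assumption made for illustration: the class name AudioSetVolume, the topic rhasspy/audioServer/setVolume (standing in for the foobar placeholder above) and the payload field names do not exist in rhasspy-hermes. A real implementation would presumably follow the conventions of the existing messages in rhasspyhermes.audioserver.

```python
import json
from dataclasses import dataclass


@dataclass
class AudioSetVolume:
    """Hypothetical message: set the volume of one audio device."""

    device_id: str            # id of the device, as reported by getDevices
    volume: float             # 0.0 (silent) to 1.0 (full volume)
    site_id: str = "default"  # which satellite should react

    def __post_init__(self):
        # Reject values outside the agreed range early.
        if not 0.0 <= self.volume <= 1.0:
            raise ValueError("volume must be between 0.0 and 1.0")

    @classmethod
    def topic(cls) -> str:
        # Placeholder topic name, not part of the Hermes protocol.
        return "rhasspy/audioServer/setVolume"

    def payload(self) -> str:
        # JSON body that would be published on the topic above.
        return json.dumps(
            {
                "deviceId": self.device_id,
                "volume": self.volume,
                "siteId": self.site_id,
            }
        )
```

A client would then publish AudioSetVolume("speaker0", 0.4).payload() on AudioSetVolume.topic() with whatever MQTT library it already uses.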

After defining these Hermes MQTT messages, you have to implement how Rhasspy reacts to them. This comes in rhasspy-speakers-cli-hermes and rhasspy-server-hermes.

And maybe you also want to do something similar for the microphone? That would be in rhasspy-microphone-cli-hermes and rhasspy-microphone-pyaudio-hermes.

While possible, I think that kind of functionality does not fit the scope of the Rhasspy API.
That would, in my opinion, be the scope of an intent handler, just like turning on a light or any other type of skill.
You can write a skill which controls the volume by means of a command-line or Python script.
I suggest trying it with https://rhasspy-hermes-app.readthedocs.io/en/latest/index.html

Okay, so the better direction would be to create a custom intent handling script - set the intent handler to local command?

Currently mine is disabled (intent handler) since I am using Node-Red and the Rhasspy websocket. What other effect does turning on the intent handler do? Does it just give an extra output from Rhasspy?

In the case of using the local command intent handler - you can only set up one command, correct? I’ve no experience with this, so within the single handler script, do you have to outline each intent you want to execute? The docs example currently isn’t finished I believe… it only has cat written… o wait - the Python example has some detail, just not the bash script. :slight_smile:

You will still be able to use the websocket at the same time.
With command selected for intent handling, the program will be called every time an intent is successfully recognized.

You can use Node-Red as an intent handler as well.

An intent is nothing more than a message on the MQTT hermes/intent/<intentName> topic.
The intentName is the same as the [intentName] in your sentences.ini.
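To make the single-script question concrete, here is a minimal sketch of a command intent handler that dispatches on the intent name. It assumes the handler receives the intent as JSON on standard input and prints JSON on standard output, with the name under intent.name and slot values under slots. The intent name SetVolume, the slot name volume and the ALSA control name Master are placeholders chosen for illustration, so adjust them to your own sentences.ini and sound card.

```python
import json
import subprocess
import sys


def volume_command(percent, control="Master"):
    """Build an amixer command that sets the volume, clamped to 0-100%."""
    percent = max(0, min(100, int(float(percent))))
    return ["amixer", "sset", control, f"{percent}%"]


def handle_intent(intent):
    """Return the command to run for this intent, or None if not handled."""
    name = intent.get("intent", {}).get("name", "")
    slots = intent.get("slots", {})
    # One branch per intent; add further intents here.
    if name == "SetVolume" and "volume" in slots:
        return volume_command(slots["volume"])
    return None


def main(stdin=None, stdout=None, run=subprocess.run):
    """Read one intent as JSON, act on it, and echo the JSON back."""
    stdin = stdin or sys.stdin
    stdout = stdout or sys.stdout
    intent = json.load(stdin)
    command = handle_intent(intent)
    if command is not None:
        run(command, check=False)
    json.dump(intent, stdout)

# When this file is used as the handler program, end it with a call to main().
```

Each extra intent becomes one more branch in handle_intent, so a single program can serve all of them.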

Yes, but Node-Red does not have access to the sound device nor the software to control the audio device. I started going that route, but installing extra software in the container, fixing permissions, adding user groups - it just became a mess and if I could just post something to Rhasspy container it would be much simpler… Rhasspy already has everything it needs to do it.

The problem is that you have to run this app on all your satellites, and you have to configure it to use the right device. And there’s already a service running on these satellites: Rhasspy’s audio service, which is already configured to use the right audio device. So it seems natural to extend the API to add this feature, which can then be used by other apps without having to configure your audio devices again or reading the service’s configuration.

Well, I’ve tried creating a custom intent handler based on the example Python script. So far it doesn’t work, and I can’t find any logs to see what is going on. I’m sure I’m in over my head at this point.

Okay, I figured out some mistakes in the Python script and have it working in the terminal.
I set it up as an intent handler, but have no clue what needs to go into arguments. I tried with it blank (after creating a sentence whose intent is defined in the custom handler) and it isn’t working.

… closer - but still not there…

I agree that it might seem a good idea to add this to a piece of code running on every satellite.
However, I feel that it crosses the separation of duties.

To me it feels more logical to add functional code as separate intent handlers, not built-in intent handlers.
And having to run the app on all satellites is, from my point of view, not a valid argument, since that goes for all apps you want to create.
If anything, with an app you can choose on which satellites you want it to run.

Having said that, if someone wants to create such a volume handler, then it is indeed possible :slight_smile:

Tried following the example hermes-app and same result… nothing happens. Can’t find anything in logs either… I guess I’ll just have to wait and read until I find the solution to getting the local intent handler working in someone else’s problem.

FYI it appears that it may not be working in 2.5… but the hermes-app is supposed to be working.

What if you run your Rhasspy Hermes app with the --debug option to see what’s happening?

Can you share your code maybe? That’s easier for others to debug.

Bouncing between two posts is my fault. Replied in the other about the app. I’ll keep this post to the intent handler within Rhasspy instead. I posted something to the issue posted above but it might not be useful… IDK - still learning :slight_smile:

Thanks everyone for the help!
I was able to get a local command intent handler working!
It now has the ability to adjust the volume between 0% and 100%.
Very excited to have this working - also created a second intent to play music through the speaker.
Only downside is I haven’t figured out how to stop the music yet :laughing: (other than killing the process).
I expect that the speaker/mic combo I’m using has trouble hearing the wake word while it’s playing music… but I’m hoping that it might be possible to have it stop playing when it hears the wake word.

Either way, an endpoint would be great - but this work around will do.

Cheers!
DeadEnd


Great you have it working.
I think that is an issue with wake word detection in general.

@DeadEnded I think wake word detection could still work. My wake word still gets detected with a lot of noise. The only problem is that the end of speech for ASR does not get detected. Depending on your setup, it could work if you mute the music temporarily after wake word detection.

Yes, this is what I expect… I tested last night and for sure I could wake from the GUI, but wasn’t able to test if wake word was working. My thought is that it probably does, but you don’t hear the acknowledgement tone because music is playing.

I have been thinking about how to pause the music when the wake word is detected… currently (because I’m trying to not install any extra software) I am using sox play to play music through Rhasspy… I haven’t found a pause command - the only thing I have found is to kill the process. Long term I will probably be moving to a separate speaker like a Sonos or equivalent, so it might not be worth the effort to hack together something for it.

Update:
I just tested, and when music is playing wake word detection does not work. So it looks like either I have to kill it another way, or just wait for it to finish (currently set up to play single songs by name).
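For what it is worth, killing the process can be done a little more cleanly by keeping a handle on the player and terminating it on request. The sketch below is only an idea under one big assumption: it has to live in a long-running process (for example a Rhasspy Hermes app), because a local command handler is started fresh for every intent and would lose the handle. The Player class is hypothetical and the command passed to start is up to you.

```python
import subprocess


class Player:
    """Keeps track of one external player process so it can be stopped."""

    def __init__(self):
        self._proc = None

    def start(self, command):
        # Stop whatever is still playing before starting something new.
        self.stop()
        self._proc = subprocess.Popen(command)

    def is_playing(self):
        # poll() returns None while the process is still running.
        return self._proc is not None and self._proc.poll() is None

    def stop(self):
        if self.is_playing():
            self._proc.terminate()  # polite request first
            try:
                self._proc.wait(timeout=2)
            except subprocess.TimeoutExpired:
                self._proc.kill()   # force it if it does not exit
                self._proc.wait()
        self._proc = None
```

Something like player.start(["play", "song.mp3"]) for a "play music" intent and player.stop() for a "stop music" intent would be the intended use, with song.mp3 standing in for the real file.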

Back to the original topic of Volume Endpoints…
I finished my automation - and it is now working.

With the custom intent handler in place, I now have an automation in Node-Red that triggers a volume adjustment by sending a payload to the api/mqtt endpoint. When the kids’ door is closed or opened, it triggers the automation to send the following to ws://RhasspyIP:12101/api/mqtt:

{
    "type": "publish",
    "topic":"hermes/nlu/query",
    "payload":{
        "input":"set volume to ## percent"
    }
}

Replacing the ## with whatever volume I want. This works perfectly as it processes the intent and calls the handler.
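For anyone generating this message from a script instead of Node-Red, here is a small sketch that fills in the volume and returns the same JSON as a string. The function name volume_query_message is made up for this example, and the sentence has to match whatever is defined in your own sentences.ini. Sending the string over the websocket is left to the client library of your choice.

```python
import json


def volume_query_message(percent):
    """Build the JSON message shown above for a given volume percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return json.dumps(
        {
            "type": "publish",
            "topic": "hermes/nlu/query",
            "payload": {"input": f"set volume to {percent} percent"},
        }
    )
```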

Very happy to have this solution in place - WAF on the rise :wink:.

Thanks everyone for the help!
DeadEnd