Part 1 - the basics and a first example
As I can well remember my struggles to get Rhasspy going in combination with Home Assistant (HA), I wanted to share a few lines with you, in case you just installed Rhasspy and want to use your first voice command.
This is not a guide about the installation of Rhasspy; there are a lot of good guides out there on the net. This is just a starter to get you going and to give you a working example to build upon. I hope you find it useful.
Our starting point is right after you have installed Rhasspy. It doesn't matter how: Docker, as a HA-AddOn, or whichever way you chose. I'm also assuming you have a running HA instance and you know where to set up your automations in HA.
If you now open the Rhasspy web admin, you'll find this menu in the upper left.
The menu contains the following items from top to bottom:
- Home: Here you find a status page with some testing possibilities
- Sentences: Here you set up your sentences, that is, the words you speak to Rhasspy after the wake word
- Slots: Slots are lists of things or devices that you can load into your sentences, so you don't need to write nearly identical sentences
- Words: This page is about the words and their pronunciation in Rhasspy
- Settings: Here you change the setup for all the different parts of Rhasspy
- Documentation: This opens the Rhasspy documentation (Note: this is the offline documentation that was installed together with Rhasspy)
Now choose the settings page in the menu, and you should be presented with this screen:
Let's check a few things first, to get the setup right.
- siteId
  This is a good time to name your Rhasspy instance. Give it a descriptive name you can remember later. If you ever change to a setup with Rhasspy satellites, you'll need it, and as we use this name later on in automations, it makes sense to set it now.
- MQTT
  These are the settings for your MQTT broker. If you use HA-OS, a supervised install or a standalone HA-core installation, you will normally have an MQTT broker configured already. In my case I'm running HA-OS, so I already use the Mosquitto broker from the HA-AddOn store. If you have a broker running, change the setting to "external" and fill in the data for your broker:
  - Host: Fill in the IP address or the domain name of your MQTT broker. With HA-OS it is the same as your HA address. Example: 192.168.178.100 or homeassistant.local
  - Port: The default port is 1883
  - User: I recommend setting up a new user for your broker in the settings of HA. If you use the AddOn, you find these under Settings > People > Users.
  - Password: Same as above
  If you don't have an MQTT broker running, leave the setting at "internal".
- Audio Recording, Wake Word, Speech to Text, Intent Recognition, Text to Speech, Audio Playing and Dialogue Management are out of the scope of this guide. If you need help with these, please refer to the Rhasspy documentation. As you can see, I went with the recommended options.
- Intent Handling
  This is the important part: here we set up HA as our intent handler. This means that what you speak to Rhasspy gets "translated" and then sent to HA to actually do something, like switching a light.
  So choose "Home Assistant" and restart Rhasspy to apply your change. There are two ways for Rhasspy to talk to HA. One is with intents, the other one is with events. As I couldn't get intents to work correctly, and after reading some tutorials, I chose the event way. In the end it doesn't make a huge difference in function, but events are definitely easier to handle.
  - Hass URL: Fill in the URL of your HA instance
  - Access Token: Set up an access token in HA under your user profile and fill it in here
  - Set the intent handling to Send events to Home Assistant (/api/events)
  - Save your settings and let Rhasspy restart
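To make the "events" option a bit more tangible: HA's REST API lets a client fire an event with an authenticated POST to /api/events/<event_type>. The sketch below only illustrates what such a request looks like; it is not Rhasspy's actual code. The function name, the HA address and the token are placeholders I made up, and "GetDate" is the intent used later in this guide.

```python
import json
import urllib.request


def build_event_request(hass_url, access_token, intent_name, event_data=None):
    """Build (but do not send) a POST request that fires an HA event."""
    # The event type is "rhasspy_" plus the intent name, e.g. rhasspy_GetDate.
    url = f"{hass_url.rstrip('/')}/api/events/rhasspy_{intent_name}"
    body = json.dumps(event_data or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The access token from your HA user profile goes here.
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, replace them with your own HA address and token.
req = build_event_request("http://homeassistant.local:8123", "YOUR_ACCESS_TOKEN", "GetDate")
print(req.get_method(), req.full_url)
# To really fire the event you would call: urllib.request.urlopen(req)
```

If HA rejects a request like this, a wrong URL or an invalid token is a likely cause, which are exactly the two fields you fill in under Intent Handling.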
Now that our setup is complete, we can start writing our first sentence. Open the sentences page (via the menu) and you'll see the default sentences.ini file in the editor window. Delete all the entries. We don't need them for now, and later on we can write our own sentences that really fit our needs.
Now add the following to the editor window:
[GetDate]
what date is today
This is very small, but it shows the principles involved in training Rhasspy and sending something to HA. So what are we looking at?
The first line, [GetDate], is the name of our intent.
The second line is the sentence we need to speak to tell Rhasspy what we want.
Think of it the following way:
- You speak your wake word, Rhasspy wakes up and plays a short signal, so you know it is listening and you can speak.
- Rhasspy tries to match your spoken words to one of the sentences set here and "translates" it to a command (the first line).
- Summed up: you speak, Rhasspy translates that to a command, and this is sent to HA to do something. This is what we call an "intent".
As you might guess, it is not always easy or welcome to have to get the sentence exactly right, so you can set more than one sentence. But in the end, Rhasspy always "translates" these to one command.
Change the text in the editor by adding a third line:
[GetDate]
what date is today
give me the date
Now we can speak either of the two sentences, and Rhasspy always "translates" it to just one command, namely [GetDate]. Just to make it clearer: you need to speak one of the sentences, and Rhasspy will "answer" with that one command.
We will come back to our sentences file later, but for now, save it and let Rhasspy re-train, so it knows the sentences we just added.
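If it helps to see the "many sentences, one command" idea in code, here is a minimal sketch that reads the two-sentence example with Python's configparser and builds a lookup from sentence to intent. This is only an illustration of the mapping, not how Rhasspy trains or recognizes speech, and it ignores everything the real sentences.ini syntax offers beyond plain sentences. The function name is made up.

```python
import configparser

SENTENCES_INI = """
[GetDate]
what date is today
give me the date
"""


def sentence_to_intent(ini_text):
    """Map every sentence to the intent (section name) it belongs to."""
    # allow_no_value lets a line without "=" count as an entry of its own.
    parser = configparser.ConfigParser(allow_no_value=True, delimiters=("=",))
    parser.optionxform = str  # keep the sentences exactly as written
    parser.read_string(ini_text)
    return {
        sentence: intent
        for intent in parser.sections()
        for sentence in parser[intent]
    }


lookup = sentence_to_intent(SENTENCES_INI)
for sentence, intent in lookup.items():
    print(f"{sentence!r} -> {intent}")
```

Both sentences end up pointing at the same intent name, which is the one thing HA will later react to.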
Now we have to do something in HA, as Rhasspy has already done its part of the job. Move over to HA and set up an automation. I'll show the YAML version of the automation here, just because it makes it easier to explain what's going on behind the scenes. You can always build this automation in the UI editor of HA; it's entirely your choice.
Let's see what an automation could look like with the sentences we added before:
automation:
  - id: Rhasspy GetDate
    alias: Rhasspy GetDate
    mode: single
    trigger:
      - platform: event
        event_data: {}
        event_type: rhasspy_GetDate
    action:
      - service: mqtt.publish
        data:
          topic: hermes/dialogueManager/endSession
          payload_template: '{"sessionId": "{{ trigger.event.data._intent.sessionId }}", "text": "Today is {{ states.sensor.date.state }}"}'
We'll go through each line now to explain what's happening here (if there is more to explain, we'll come to that later):
- id: Give your automation a "speaking" id. As you move on, you'll likely get a lot of automations for Rhasspy, and it is easy to lose the big picture. So choose a good name. In my case, I start all automations regarding Rhasspy with "Rhasspy". That makes it easier in the end: if you search for an automation in HA's automation window, for example, you'll have all the Rhasspy entries "grouped" together, as they all start with, you might guess it, "Rhasspy".
- alias: I just copy the id to the alias. This is an optional step, but it makes things clearer down the road.
- mode: This is the mode in which your automation is run. In our case single is the right choice, as you likely won't want the date told more than once. Other modes come in handy if you have a command that should be repeated. For example, if you later want to set your TV volume, you might want to run the automation a few times to increase the volume. Then this setting will change (don't worry, we will come to an example later).
- trigger: This is the part where we use our command from before.
  - platform: event
    As you might remember, we configured Rhasspy to send an event instead of an intent to HA, so we need to use the event platform in HA to recognize it.
  - event_data
    For now we don't need this, but it will come in handy later on, when your automations get more complicated. Just leave the two braces empty.
  - event_type
    This identifies what Rhasspy sends to HA. As you can see, it is the command we configured before, [GetDate]. It is always prefixed with "rhasspy_" and followed by the actual command "GetDate". In combination that makes rhasspy_GetDate. Easy, isn't it?
- action: This is where we configure what HA should do when this automation gets triggered (that is, you spoke something that Rhasspy identified and sent to HA). In this example we don't need HA to do much, as the answer to our question should already be available in HA, namely in sensor.date (this sensor comes from HA's Time & Date integration, so it has to be set up in your HA).
  - service: mqtt.publish
    We want HA to publish something (the answer) on an MQTT topic, so it is sent back to Rhasspy.
  - data
    - topic
      This is the topic Rhasspy listens to. In our case we want to close the session with an answer to our question. I added a few lines about sessions in Rhasspy at the end of this guide, if you're interested in what's happening with sessionIds and so on.
    - payload_template
      Here we tell Rhasspy which session we are in (yes, there could be more than one) and what we want Rhasspy to tell us back (the answer).
      As you can see, we just set up a "text", and it will be sent back to Rhasspy.
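To see what the payload_template boils down to, here is a small Python sketch that builds the same kind of JSON message by hand. It is only meant to show the shape of the payload. The function name, the session id and the date are made up; the only thing taken from the automation is the _intent.sessionId path and the two keys "sessionId" and "text".

```python
import json


def build_end_session_payload(event_data, text):
    """Mirror the payload_template: reuse the session id, add the answer text."""
    # Same path as trigger.event.data._intent.sessionId in the automation.
    session_id = event_data["_intent"]["sessionId"]
    return json.dumps({"sessionId": session_id, "text": text})


# Hypothetical event data, trimmed to the one field we need.
example_event_data = {"_intent": {"sessionId": "example-session-id"}}

payload = build_end_session_payload(example_event_data, "Today is 2024-01-31")
print(payload)
```

The important point is that the session id travels from Rhasspy to HA and back again unchanged, so Rhasspy knows which conversation the answer belongs to.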
This is, in essence, what we need for Rhasspy and HA to work together. It is a very simple example, but the way things go should be clear:
- Rhasspy wakes up
- You speak your sentence
- Rhasspy tries to find out what you want from it and "translates" your sentence into a command
- This command is sent to HA as an event (via the /api/events setting from above)
- HA picks up the command and looks for an automation that fits (actually it's the other way around, but let's not get too techy here), which is triggered by the event named after the command you sent
- HA runs the automation and publishes an "answer" over MQTT
- Rhasspy identifies the session and speaks the text from the MQTT message back to you
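For the curious, the "other way around" remark can be pictured like this: an automation registers for an event type up front, and when an event with that type is fired, everything registered for it runs. The toy event bus below is my own simplification to show that idea. It is not how HA is implemented, and all names and values in it are invented.

```python
from collections import defaultdict

EVENT_PREFIX = "rhasspy_"


class EventBus:
    """A toy stand-in: automations register for an event type in advance."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def listen(self, event_type, callback):
        self._listeners[event_type].append(callback)

    def fire(self, event_type, event_data):
        # Every automation waiting for this event type gets triggered.
        return [callback(event_data) for callback in self._listeners[event_type]]


def answer_date(event_data):
    # Stands in for the mqtt.publish action of the automation.
    session_id = event_data["_intent"]["sessionId"]
    return {"sessionId": session_id, "text": "Today is 2024-01-31"}


bus = EventBus()
bus.listen(EVENT_PREFIX + "GetDate", answer_date)

# Rhasspy recognized the intent "GetDate" and fires the matching event.
answers = bus.fire(EVENT_PREFIX + "GetDate", {"_intent": {"sessionId": "example-session-id"}})
print(answers)
```

An event type nobody registered for simply triggers nothing, which is also what you get if the event_type in your automation doesn't match the intent name exactly.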
Now save your automation, reload the automations in HA and move back to Rhasspy.
For testing purposes, the "Home" page comes in handy. Open it by pressing the "Home" button. If you take a look under the status bar, you'll see the line that starts with the "Recognize" button. This is where we'll test our command and the connection with HA.
Type in one of the sentences exactly as you configured it. In our example, type "give me the date" and push "Recognize". If everything works, you should be presented with the command you configured for this sentence in a red box; here it will be "GetDate". This means your sentence is recognized and "translated" correctly to a command. Yeah! So far we haven't sent anything out; this happens just "inside" Rhasspy, to check whether a sentence works.
If you want to take a look, push the button "Show JSON", and you'll see exactly what Rhasspy is sending over MQTT.
For our guide we are happy right now: our first intent was recognized by Rhasspy. So let's move a step further and check the box on the right that says "Handle". If you now push "Recognize" again, Rhasspy not only recognizes your intent, it additionally sends the command (the JSON you can take a look at) to HA. Move over to the automation list in HA and you should see that the automation "Rhasspy GetDate" was executed. It should show a timestamp for the last execution (it shouldn't be too long ago, depending on how long you needed to switch over to HA).
Note: you won't hear a spoken answer from Rhasspy, this is purely to check the connection to HA!
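If you prefer checking the last execution without clicking through the UI, HA's REST API can return the state of a single entity, and automation entities carry a last_triggered attribute. The sketch below shows one way this could look. The entity id automation.rhasspy_getdate is my assumption of what HA derives from the alias, so check the real id in your HA; the sample response is made up and trimmed to the fields used here.

```python
import json
import urllib.request


def build_state_request(hass_url, access_token, entity_id):
    """Build (but do not send) a GET request for one entity's state."""
    return urllib.request.Request(
        f"{hass_url.rstrip('/')}/api/states/{entity_id}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def last_triggered(state_json):
    """Extract the last_triggered attribute from a state response body."""
    state = json.loads(state_json)
    return state.get("attributes", {}).get("last_triggered")


# Made-up response body, only for illustration.
sample = (
    '{"entity_id": "automation.rhasspy_getdate", "state": "on", '
    '"attributes": {"last_triggered": "2024-01-31T18:30:00+00:00"}}'
)
print(last_triggered(sample))

# Against a real HA you would send the request, for example:
# req = build_state_request("http://homeassistant.local:8123", "YOUR_ACCESS_TOKEN",
#                           "automation.rhasspy_getdate")
# print(last_triggered(urllib.request.urlopen(req).read()))
```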
If this works correctly, now is the time to check whether your voice command and the answer work as well. Leave the "Home" page open and speak your wake word followed by one of the sentences. You should now see your spoken sentence in the "Recognize" field, followed by the command in the red box. And while you're reading, you should hear your answer from HA spoken through Rhasspy.
Congratulations, your first voice command works, Rhasspy is doing its job and HA is ready to answer your questions or to do something for you. Pat yourself on the shoulder, you did great!
You think we're done here? Nope, that's only half the way, but don't worry, from here on it's more about expanding than doing something totally new. The next steps are to refine the sentences and send something to HA that actually does something, like switching a light.