Hello!
For a while now, I’ve been working with and learning all about Rhasspy in order to build (you guessed it) a private voice assistant! I’ve decided to name the end result “Calico”. I was actually surprised when I saw a similar GitHub project of the same name. I promise I meant to be original!
Most of my tinkering has used the Rhasspy Docker image running on an Ubuntu VM with Oracle VirtualBox’s Guest Additions installed. Calico runs on the host OS (Ubuntu) and subscribes and publishes to Rhasspy’s MQTT broker. Since I’m a very novice programmer, I used LLMs to get a lot of the logic right, but there was still plenty of troubleshooting!
Essentially, a main service acts as a listener, then calls the required skill class to execute anything actionable. For example, my test skill, Ask Me Colors, demonstrates back-and-forth communication. Once triggered, Ask Me Colors asks the user for their favorite color, then sends a command to Rhasspy to start a dialogue session that only listens for the Answer Colors intent. Once the user tells Rhasspy their favorite color, Calico collects the answer and sends another dialogue request (one that doesn’t require user input) with feedback.
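For anyone curious what that exchange looks like on the wire, here is a minimal sketch of the two messages, assuming Rhasspy’s Hermes dialogue manager topic hermes/dialogueManager/startSession. This is not Calico’s actual code: the function names, the "AnswerColors" intent name, and the "default" site ID are placeholders I made up for illustration. The functions only build the topic and JSON payload; publishing them is left to whatever MQTT client you use.

```python
import json

START_SESSION_TOPIC = "hermes/dialogueManager/startSession"


def ask_question(text, answer_intent, site_id="default"):
    """Build a message that speaks `text`, then listens only for `answer_intent`."""
    payload = {
        "siteId": site_id,
        "init": {
            "type": "action",  # an action session waits for the user to reply
            "text": text,
            "canBeEnqueued": True,
            "intentFilter": [answer_intent],  # restrict recognition to one intent
        },
    }
    return START_SESSION_TOPIC, json.dumps(payload)


def give_feedback(text, site_id="default"):
    """Build a message that only speaks `text`; no user input is expected."""
    payload = {
        "siteId": site_id,
        "init": {"type": "notification", "text": text},
    }
    return START_SESSION_TOPIC, json.dumps(payload)


# Hypothetical usage: "AnswerColors" is an assumed intent name.
topic, question = ask_question("What is your favorite color?", "AnswerColors")
_, feedback = give_feedback("Blue is a great choice!")
```

The "action" session type is what makes Rhasspy wait for a reply, while "notification" just speaks and ends.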
These “answer intents” are how I’m separating intents that start skills (such as Ask Me Colors, whose intent has the same name) from ones that are only used for follow-up questions. Since these answer intents don’t have their own skill classes, the main listener in Calico ignores them.
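A rough sketch of that dispatch idea, under my own assumptions: the class names, the run() method, and the SKILLS registry are hypothetical and not taken from the repo. The only part that comes from Rhasspy is the shape of the intent payload, where the recognized intent’s name sits under intent.intentName.

```python
import json


class AskMeColors:
    """Hypothetical skill class; the real one would also start the follow-up session."""

    def run(self, slots):
        return "What is your favorite color?"


class TellTime:
    """Hypothetical skill class with a canned reply for illustration."""

    def run(self, slots):
        return "It is time to build more skills."


# Only intents that start a skill are registered here.
SKILLS = {"AskMeColors": AskMeColors, "TellTime": TellTime}


def dispatch(message):
    """Run the skill matching the intent in a JSON payload, or return None."""
    data = json.loads(message)
    intent_name = data.get("intent", {}).get("intentName")
    skill_class = SKILLS.get(intent_name)
    if skill_class is None:
        # Answer intents (e.g. "AnswerColors") have no skill class, so they are skipped.
        return None
    return skill_class().run(data.get("slots", []))
```

With a registry like this, adding a skill means adding one class and one dictionary entry, and anything unregistered falls through without an error.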
There are currently only five skills: Ask Me Colors, Tell Time, Local Temperature (uses weather data), Open Gmail, and Open Settings. The last skill opens a small settings GUI where the user can enter locality information, a preference between Fahrenheit and Celsius, and select metric or imperial for all remaining measurement units.
I’m planning to implement more skills, such as one that gives the forecast for the current or following day. That will be the first skill to use the metric/imperial setting, which currently isn’t used by anything. Further updates would include a launcher GUI, a service that listens for weather alerts, and one for AMBER Alerts, both with alert tones that won’t give anyone a heart attack.
Below is the link to Calico’s GitHub repo. The README sucks, but basically, if you’re on Ubuntu and already have the Docker CLI installed, you just need to run the Start-Calico.sh script to get the other dependencies installed. I’m also deplorable and have been working out of the Documents directory, so you’ll have to create a “Calico” folder there to place the files in.
I’ve been working on building my end goal hardware setup on a Raspberry Pi 5 (8 GB). This was going to be a smart speaker, but I’ve decided to go with a smart display for now.
Some of the components have had to change. For example, Open Gmail can’t access the environment variable for the default browser, so it calls Chromium directly. I’m not sure if Raspberry Pi OS just doesn’t set the variable or if I’m not accessing it correctly. Also, Porcupine doesn’t work on the Pi, and I have yet to investigate why. I’ve switched to Raven for now, and it works fairly well with the 18 reference recordings I made.
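One possible way to handle the browser lookup on both systems is a fallback chain. This is only a sketch under assumptions: I’m guessing the variable in question is BROWSER (which many desktops don’t set at all), that xdg-open is available as the desktop’s default opener, and that the fallback binary is named "chromium". The browser_command name and its parameters are made up for this example.

```python
import os
import shutil


def browser_command(url, env=None, which=shutil.which, fallback="chromium"):
    """Return the command list to open `url`, preferring the user's configured browser."""
    env = os.environ if env is None else env
    # BROWSER may hold several candidates separated by os.pathsep; take the first.
    preferred = env.get("BROWSER", "").split(os.pathsep)[0].strip()
    if preferred:
        return [preferred, url]
    # Otherwise defer to the desktop's default handler, if it is installed.
    opener = which("xdg-open")
    if opener:
        return [opener, url]
    # Last resort: an assumed binary name that may differ on your system.
    return [fallback, url]


command = browser_command("https://mail.google.com")
```

The returned list can be handed to subprocess.Popen. Note that this ignores BROWSER values containing a "%s" placeholder, which some setups use.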
One last thing I have yet to figure out: if I change the system volume or play any sounds on Ubuntu that aren’t from Rhasspy, the audio devices become temporarily unavailable to either Docker or Rhasspy itself. I haven’t checked whether it happens on the Pi yet, but I’m sure I’ll find out this evening.
Anyways, thanks for coming and hearing about my project!
Best,
John P.