I have an Intel NUC with Proxmox.
Proxmox is virtualisation software running on a Linux base. It is easy to install on a device like a NUC or the Dell hardware you have.
You can read something about it here
100 questions incoming!
Are you still using Rhasspy for wake word? Also, what sort of processing speeds are you getting with HA Assist? Are you using Vosk? A GPU?
Thanks man! I’m trying to get a sense of the best way to set this up. Right now I’m running Rhasspy and it’s great. I’ve just tried HA Assist and I was getting 13-second waits, probably because I have a non-AVX CPU. But even so, I’ve heard it is slow. If I could figure out where to put my money into something that can run HA Assist fast, I would… but I also don’t want to be forwarding GPUs through Proxmox and whatnot.
A lot has happened in the past 11 months with Home Assistant and Voice Assist.
Rhasspy 3 … not so much, but just last week Mike re-stated his intention to update Rhasspy 3 with things he has learnt from Voice Assist.
If you are using Home Assistant I suggest you do searches on the HA forum.
I started with HA and Rhasspy all on a RasPi 4 (which runs surprisingly well), but am now running HAOS in a VM under proxmox on an old i5 business PC (with old basic video card because it runs headless); and for a satellite using the same Raspberry Pi 3B with USB mic and speaker I used with Rhasspy 2.11 (now running wyoming-satellite and wyoming-openwakeword). I also have to wait almost 15 seconds after I stop speaking before Voice Assist recognises the silence and starts to process my command - but after that it’s pretty fast. I assume there is a Voice Activity Detection setting I have missed.
I just started with Rhasspy and I love it. I have tried to handle the “No intent recognized” case but can’t. Why do I want to do that? I want to call ChatGPT from HA when the intent has not been recognized by Rhasspy. Does anyone know how I can do it? Is it possible? I do not see any config to send the intent to Home Assistant if it is not recognised in Rhasspy. Thanks!
Are you using Rhasspy 2.5, Rhasspy 3, or Home Assistant’s Voice Assist version?
I recall forwarding unrecognised intents to ChatGPT being discussed previously, but it wasn’t something I personally followed.
I am using Rhasspy 2.5 on a Raspberry Pi. I didn’t find the unrecognised-intents topic in here. Did you find it? Thanks
That’s probably because it was over in Home Assistant’s community.
If you are using Rhasspy with Home Assistant, I strongly recommend you change your focus to HA’s Voice Assist.
Rhasspy 2.5 is great (and I still use it for my production Home Assistant setup) … but Mike has applied his Rhasspy v3 improvements (and more) to Voice Assist … and that’s where the bulk of development effort is currently.
Ok, thank you for taking the time to explain all of this to me. It really helps. I was about to set up everything with Rhasspy because it looks stable and very fast. I have tried to use HA Voice Assistant and it never worked well. Defining “well”: I use it in Spanish instead of English. No idea if that is related to the issue, but most of the time Voice Assistant does not respond. I have HA installed on my Raspberry Pi 4 and everything works like a charm; I have used HA for around 8 years or more. Sadly, Voice Assistant has never worked well for me. I will give it a new try then. But again, Voice Assistant and all the setup it needs to recognise a voice and so on has been a pain over time…
And Rhasspy 3 is English-only, which does not make any sense.
Rhasspy 2.5 is stable, fast and works well … but is now a few years old.
Mike started developing Rhasspy v3 incorporating new ideas he had, and it was only released as a preview … I would recommend that you skip it.
Most Rhasspy users were using it with Home Assistant, so it was a “no brainer” when Mike was employed by Nabu Casa to integrate Rhasspy into Home Assistant. I for one have been surprised how much time was spent building the framework in HA for Voice Assist - but the tight integration has been worth the wait (even though it has meant more learning).
A pity that you have had problems with Voice Assist … but I am also not particularly surprised. Home Assistant supports so many devices and methods, and Voice Assist is no different. It doesn’t help that HA’s documentation reads more like developers’ internal notes than documentation intended for users.
Personally I consider Chapter 5, early this year, to be when Voice Assist became a viable alternative to Rhasspy 2.5. There is still more work to do, and plenty of extra features have been suggested (like the fallback to ChatGPT, and multiple concurrent languages).
How long since you tried Voice Assist? I certainly found it confusing trying to set up … hence my post to add bits of explanation I felt were missing. Are you wanting to run Home Assistant and Voice Assist on the same Raspberry Pi 4? I understand they can run together … and there are Voice Assist options which don’t require so much processor or memory.
Languages other than English do add complexities - though I expect that Spanish would have a reasonable size user base to help identify and fix any issues - but I think Voice Assist not responding may be more likely due to the microphone and recording setup.
Thank you once again for taking the time to write and respond. I have tried using a Voice Assistant, but it doesn’t seem to recognize Spanish very well. For example, when I say “turn the room light off” (in Spanish), the command fails. The debug output shows a misinterpretation: “turn therum light off” (in Spanish). I have experimented with different phrasings, but none have been successful. Text input works fine, but unfortunately, voice recognition in Spanish leaves much to be desired.
Despite being a developer, I’m looking for a more out-of-the-box solution. Home Assistant is fantastic, but I think there needs to be a stronger focus on improving the UI/UX—the user experience is not as good as it could be. Nonetheless, I adore Home Assistant. It has significantly enhanced my life, especially with its stability over time, which I greatly appreciate.
What I really want is an intelligent assistant that doesn’t require setting up each individual intent. Intents are outdated—Alexa uses them, and they feel deprecated. We should have an AI assistant that understands commands effortlessly, whether it’s turning on the bedroom lights (and discerning which bedroom if there are two by using proximity sensors, phone location, or even voice detection) or fetching a recipe, directions, or even trivia like who the best soccer player in the world is (obviously, Messi).
So far, as a developer, I’ve created a middleware that bridges Rhasspy with Home Assistant (HA). Instead of using HA intents, I send commands directly to my middleware, which is connected to an AI (ollama) and has full knowledge of all my devices through the HA status endpoint. For example, if I say “turn on my garden lights,” the AI understands this command as referring to both ‘garden1’ and ‘garden2’ entities—I don’t need to create numerous specific intents.
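As a rough illustration of that lookup (in Python here purely for the example, not my actual middleware code; the HA URL, token, model name and prompt are placeholders, while /api/states and /api/services are Home Assistant’s standard REST endpoints):
import json
import requests

HA_URL = "http://homeassistant.local:8123"   # placeholder - adjust to your install
HA_TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"    # placeholder
OLLAMA_URL = "http://localhost:11434/api/generate"
HEADERS = {"Authorization": f"Bearer {HA_TOKEN}"}

def handle_command(text):
    # 1. Pull every light entity (id + friendly name) from HA's status endpoint
    states = requests.get(f"{HA_URL}/api/states", headers=HEADERS).json()
    lights = [{"entity_id": s["entity_id"],
               "name": s["attributes"].get("friendly_name", "")}
              for s in states if s["entity_id"].startswith("light.")]
    # 2. Ask the local LLM which entities the spoken command refers to
    prompt = (f"Command: {text}\nLights: {json.dumps(lights)}\n"
              "Answer with a JSON list of the matching entity_ids and nothing else.")
    reply = requests.post(OLLAMA_URL, json={"model": "llama3",
                                            "prompt": prompt,
                                            "stream": False}).json()
    entity_ids = json.loads(reply["response"])
    # 3. Call the HA service for every matched entity, e.g. garden1 and garden2
    for entity_id in entity_ids:
        requests.post(f"{HA_URL}/api/services/light/turn_on",
                      headers=HEADERS, json={"entity_id": entity_id})

handle_command("turn on my garden lights")
The real middleware does more (error handling, other domains besides lights, caching the state list), but the flow is the same: fetch states, let the model match, call the service.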
I chose Rhasspy because, despite my attempts, I struggled with Python; I am more proficient with Node.js. Unfortunately, the Python code I wrote had some issues, so I stopped pursuing it. Rhasspy handles what I need it to do, though it cannot process commands without predefined intents yet. I will continue working on this integration, aiming to make it simpler and smarter.
Please let me know if you have any questions or concerns.
Your approach is what I think a home assistant should do. Congratulations on succeeding this way with a home AI. Do you plan to describe exactly what you have done and how you proceed?
Hello @JanWolf - Here’s a detailed overview of my latest project. I’ve successfully integrated Rhasspy with a custom-built middleware called RhasHomeBridge, enhancing our voice-controlled environment. This middleware acts as an intermediary, managing both Intent Handling and Intent Recognition processes through remote HTTP requests directed at RhasHomeBridge.
The system enables complex commands, ranging from basic requests like “turn on my living lights” to more intricate interactions such as narrating a horror story. Depending on the command’s nature, RhasHomeBridge determines whether to forward it to Home Assistant (HA), Ollama, or ChatGPT.
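To give an idea of the shape of the bridge, here is an illustrative Python sketch, not the published RhasHomeBridge code; the /recognize route and keyword rule are made up, and the exact payload Rhasspy sends for remote HTTP recognition should be checked against the Rhasspy documentation:
from flask import Flask, jsonify, request

app = Flask(__name__)

# Crude illustration of the routing rule: device-sounding commands go to
# Home Assistant, everything else (stories, general questions) goes to an LLM.
HOME_WORDS = ("turn", "switch", "light", "lamp", "open", "close", "temperature")

@app.route("/recognize", methods=["POST"])
def recognize():
    text = request.get_data(as_text=True).strip().lower()
    if any(word in text for word in HOME_WORDS):
        target = "home_assistant"
    else:
        target = "llm"   # Ollama locally, or ChatGPT for harder requests
    # A real bridge would forward the text to the chosen backend here and
    # return whatever intent/response format Rhasspy expects.
    return jsonify({"text": text, "target": target})

if __name__ == "__main__":
    app.run(port=5005)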
One of the innovative features is the system’s feedback mechanism. If a response from Ollama is satisfactory, I can command the system to retain this information. Conversely, if an action fails, the system is instructed to log the error and adapt to correct itself in future interactions.
Significantly, this setup eliminates the need for manually writing out intents or managing multilingual translations—Ollama handles these complexities seamlessly. The configuration is streamlined to avoid extensive setup, making the system more user-friendly.
My aim is to reduce the need for constant configuration and to move towards a more plug-and-play approach. I envision a future where AI handles more of our routine tasks, minimizing the need for detailed configuration files. For optimal performance, while Ollama can run on a Raspberry Pi, a more robust hardware setup is advisable to enhance processing speed and efficiency.
Ultimately, this project represents a step towards creating a personal AI ‘pet’ that seamlessly integrates with various devices, providing a smarter, more connected home environment.
Nice, but could you publish details on the middleware you called RhasHomeBridge? Maybe you plan to make it available on GitHub?
Yes, I will once I finish the integration. Right now I am trying to make the ReSpeaker work with Rhasspy, which I haven’t managed yet. Do you have any guidance on that? Thanks!
@donburch I went through your tutorial but it did not work for me. I can’t make the system listen for the wake word. I also don’t get the MQTT part: Internal or External? I do not have satellites; I am just testing with one Raspberry Pi.
That’s the problem with supporting so many different hardware and integrations … it can be so confusing trying to work out which bits are relevant to your own setup … especially when people write documentation and user guides without acknowledging that they apply to their own specific combination.
I use NodeRed for automations (before HA did a major revision) and have 3 satellites - so use the HA Mosquitto Broker add-on as an external MQTT server. If you are running everything on one machine the Internal MQTT should be fine for you.
Are you using one of the reSpeaker RasPi HAT models, or one of their USB models?
I am using one of the reSpeaker RasPi HAT (2 mics) models on a Raspberry Pi 3+.
I found an image here: GitHub - respeaker/seeed-voicecard: 2 Mic Hat, 4 Mic Array, 6-Mic Circular Array Kit, and 4-Mic Linear Array Kit for Raspberry Pi, which should support the reSpeaker. I have installed it and could get the audio output working, but not the microphone.
Also, Rhasspy is unstable in the sense that sometimes it detects your devices and sometimes it does not; you need to click the refresh button to get them detected. It is not intuitive at all.
So maybe you can help me with my setup.
- Do you have, or know of, a working image that I can directly install on my Raspberry Pi?
- If not, and assuming that the image I downloaded from the link I just shared with you (https://files.seeedstudio.com/linux/Raspberry%20Pi%204%20reSpeaker/2021-05-07-raspios-buster-armhf-lite-respeaker.img.xz) works, can you help me configure it? I have followed your steps but they didn’t exactly work for me. I have also tried the Home Assistant Assist plugin with a USB mic connected to the Raspberry Pi 4, and it didn’t work either.
I did:
sudo apt-get update
sudo apt-get install git
git clone https://github.com/respeaker/seeed-voicecard.git
cd seeed-voicecard/
sudo ./install.sh
sudo reboot
Then I tested it, and both the mic and the audio worked:
aplay -l
arecord -l
arecord -D "plughw:2,0" -f S16_LE -r 16000 -d 5 -t wav test.wav
aplay -D "plughw:2,0" test.wav
Then I downloaded Rhasspy:
wget https://github.com/rhasspy/rhasspy/releases/download/v2.5.11/rhasspy_armhf.deb
sudo dpkg -i rhasspy_armhf.deb
sudo apt-get install -y jq libopenblas-base
sudo apt --fix-broken install
sudo dpkg --configure rhasspy
sudo apt-get install python3-venv
python3 -m venv ~/rhasspy_venv
source ~/rhasspy_venv/bin/activate
And finally executed it:
/usr/bin/rhasspy --profile es
Here it said that mosquitto was missing, so:
sudo apt install mosquitto mosquitto-clients
And ran it again:
/usr/bin/rhasspy --profile es
The audio card was found, and the audio statistics were working (screenshots omitted).
When clicking the wake-up button, the popup never disappears. I see some errors, but the problem is not the errors; the issue is that this app is really hard to install. I am looking for a ready-to-use, plug-and-play image.
At a quick read it looks as though that should work … but leaves me with a few questions.
Which tutorial did you follow? This one? Otherwise please post a link to it.
When I set mine up, the .IMG file on seeed website was already over 4 years old, and so I did not use it. It looks as though someone has updated the image to “buster” version (2021), so it should be usable.
When I was setting up my own Rhasspy satellites I had a similar problem, and so made a point in my tutorial of testing the audio input and output with arecord and aplay. There is no point carrying on until you know the hardware works.
Note that “arecord -l” lists the hardware capture devices, whereas “arecord -L” lists every PCM device name that ALSA knows about (including the plug devices). When you did the arecord/aplay of
arecord -D "plughw:2,0" -f S16_LE -r 16000 -d 5 -t wav test.wav
aplay -D "plughw:2,0" test.wav
did you speak and hear your speech? If not, you will have to get this working first. Maybe “plughw:2,0” is not the correct device.
I don’t recall installing libopenblas-base or having to do a --fix-broken install … so I’m wondering if there was a problem here. Is rhasspy_armhf.deb the correct package for a RasPi 4? (My RasPi Zero and RasPi 3A used different package files.)
Both my current Rhasspy satellites use arecord for audio recording … though I remember being surprised a while back to discover that one was using PyAudio … so it may not make a difference.
If you are seeing the figures under “Audio Statistics” change with sound in the room, then Rhasspy is detecting sound.
The “Wake up” pop-up should time out after a minute or so. This suggests that there is a problem … and probably an error message somewhere.
Yeah … NO! Error messages always indicate a problem, especially when we don’t understand what the message means.
Tell us the error messages, and when and where they occur.
I understand that you’re frustrated … it took me 3 attempts over 12 months before I got Home Assistant running, and Rhasspy had almost as many learning curves. The official Rhasspy documentation is better than that of a lot of FOSS projects, but is still noticeably “notes by the developer for other developers”. User documentation is a whole different kettle of fish.
Google and Amazon provide ready plug and play devices. They even have a lot of support in Home Assistant.
Thank you for pointing out the tutorial. I apologize for not sharing the URL earlier; I thought I had included it.
Here’s some fresh and good news: after numerous attempts, I finally got the microphone and speaker to work. This task is now accomplished. How did I do it? I followed some steps in a different order, and it worked. You were right about the ERRORS indicating something specific: a library needed updating, and I needed to understand better how the Test button functions when testing the mic. It seems you have to click it until the system recognizes it and displays a “working” message.
Now, I have another challenge that hasn’t gone as planned, so I might consider moving on from Rhasspy and continue with my previous coding project.
Here’s a summary of my current goals:
- I don’t want to create intents with device names. It involves too much writing for anyone integrating an assistant with Home Assistant (HA). Additionally, it’s not just about the intents but also writing down the device names.
- I want an app that recognizes smart events and can handle conversations without setup, like in Home Assistant. I want to send any input and have a smart engine understand it. I’ve achieved this using Ollama and the ChatGPT API.
Here’s what I’ve done with Rhasspy so far:
- Wake Word Configuration: I’m using Porcupine for the wake word. Rhasspy needs improvement in this area. For instance, it should be listening asynchronously all the time using threads and queues (see the sketch after this list). Currently, it’s a synchronous app, which is not ideal for voice assistance. Continuous listening is crucial, as anyone who uses Alexa will understand.
- Speech to Text: I’ve tried Pocketsphinx and Kaldi. They work fine unless the option [Open transcription mode] is set to true. With this option, they use the entire word library, which is my intention. However, it causes issues:
- It takes 30-60 seconds to recognize simple commands like “turn off the lamp” and often doesn’t recognize them correctly.
- It takes around 10 seconds to process another voice command, likely due to its synchronous nature. It’s not listening continuously like Alexa. I’ve implemented continuous listening in another app I’m working on.
- Intent Recognition and Handling: For intent recognition, I just get the text and return it to Rhasspy, which then calls my middleware for intent handling. The middleware processes commands via HA, Ollama, or ChatGPT, depending on the received command. My next step is creating an Assistant for Ollama. Currently, I have one for ChatGPT that understands and processes commands without prior setup.
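For reference, this is the kind of threads-and-queue pattern I mean by asynchronous listening (a bare sketch using the sounddevice library purely as an example; my own listener is a separate, unfinished project):
import queue
import threading
import sounddevice as sd

audio_q = queue.Queue()

def capture_callback(indata, frames, time_info, status):
    # Called by the audio driver: enqueue the raw frames and return immediately,
    # so capture never waits on recognition.
    audio_q.put(bytes(indata))

def worker():
    while True:
        chunk = audio_q.get()
        # Wake-word detection / speech-to-text would consume the chunk here.
        process(chunk)

def process(chunk):
    pass   # placeholder for the actual recognizer

threading.Thread(target=worker, daemon=True).start()

# 16 kHz, mono, 16-bit capture keeps running while earlier chunks are processed.
with sd.RawInputStream(samplerate=16000, channels=1, dtype="int16",
                       callback=capture_callback):
    threading.Event().wait()   # keep the main thread alive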
Next Steps:
- Edit Rhasspy Code: I could make Rhasspy asynchronous. However, this might take too much time, so I might focus on completing my own async listener.
- Improve Voice Recognition: Continue with synchronous Rhasspy but implement Amazon Transcribe for better voice recognition and faster processing.
- Upgrade Hardware: Install Rhasspy on a more powerful computer to see if using Kaldi or PocketSphinx improves recognition speed and accuracy.
I have two apps: an incomplete async listener and a middleware that’s mostly finished but might have some bugs. I’m willing to open the repository code to everyone, hoping Python developers can help improve both libraries.
Once I finish integrating Amazon Transcribe with Rhasspy (using execute local command), I’ll share the middleware code. I need to work more on the listener before releasing it.
If any developer is interested in helping improve this library, please send me a PM.
Thanks for reading!
Well, I finally did what I wanted to do. I noticed that Rhasspy is too slow to work with any dictionary, and even using Kaldi it isn’t really accurate, for some reason. So I have created a new endpoint which processes the WAV audio generated by Rhasspy much, much faster. I am using the Vosk package and its KaldiRecognizer. With the text I get from the WAV file I just call my middleware, which calls HA.
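In case it is useful to anyone, the heart of that endpoint is roughly the following (a simplified sketch; the model path is a placeholder, and the WAV from Rhasspy needs to be 16 kHz mono 16-bit PCM for this to work well):
import json
import wave
from vosk import Model, KaldiRecognizer

model = Model("vosk-model-small-es")   # placeholder path to a Spanish Vosk model

def transcribe(wav_path):
    wf = wave.open(wav_path, "rb")     # expects 16 kHz, mono, 16-bit PCM
    rec = KaldiRecognizer(model, wf.getframerate())
    while True:
        data = wf.readframes(4000)
        if len(data) == 0:
            break
        rec.AcceptWaveform(data)
    return json.loads(rec.FinalResult())["text"]

text = transcribe("command.wav")       # WAV captured by Rhasspy
# the recognised text is then sent on to the middleware, which calls Home Assistant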
Now I will work on creating one library instead of two, and I will publish it for everyone to use.