Rhasspy with simple Home Assistant "Skill" on PI3A+

orca8119 · September 3, 2020, 9:05am

Hello All,

I just want to share a little of my experience with Rhasspy, and maybe get some feedback and tips.

First of all, I want to thank all the people who made this project possible: synesthesiam and the rest of the Rhasspy contributors, kiboost for his topic on project HermesLedControl, and all the community members sharing tips and ideas.

Secondly, please forgive my mistakes, but I’m not a native speaker.

So, now my setup (even if it is not that impressive):

The basic setup is a ReSpeaker 4 Mic Hat on a Pi3A+. The ReSpeaker has this nice LED ring and I found it cool to have (and I still find it cool). Otherwise the ReSpeaker is not that great in my opinion, but more on that later. By the way, it does not have a case yet.

Rhasspy Installation
I run Rhasspy in a virtual environment with the sources cloned from GitHub, and I must say, since Rhasspy 2.5 installing and running in a venv works fine for me.

Microphone
I find the ReSpeaker is a rather noisy microphone, and the wake word detection and STT are average. So I experimented a little and came up with a small custom denoise script to improve the audio stream (denoising, volume leveling and so on). So, if you have problems with ReSpeaker mics, you might want to try a custom audio record script. Also, I stream the audio locally via UDP because I personally find the permanent audio stream is too much for the MQTT broker.
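To illustrate the idea of a custom record script, here is a minimal sketch of the volume leveling and UDP parts only. It is not the script described above. It assumes 16 kHz, 16-bit mono PCM chunks on a little-endian machine such as the Pi, and that the receiving side accepts each UDP datagram as a small WAV chunk. The function names, `TARGET_PEAK`, `MAX_GAIN` and port 12202 are hypothetical values picked for the example.

```python
import array
import io
import socket
import wave

SAMPLE_RATE = 16000  # assumed input format: 16 kHz, 16-bit, mono
TARGET_PEAK = 20000  # hypothetical target peak (16-bit maximum is 32767)
MAX_GAIN = 8.0       # cap the gain so near-silence is not blown up into noise


def level_chunk(pcm: bytes, target_peak: int = TARGET_PEAK,
                max_gain: float = MAX_GAIN) -> bytes:
    """Scale one chunk of 16-bit PCM so its peak approaches target_peak."""
    samples = array.array("h")
    samples.frombytes(pcm)
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return pcm  # pure silence, nothing to scale
    gain = min(target_peak / peak, max_gain)
    # Clip to the 16-bit range in case of rounding at the edges.
    leveled = array.array(
        "h", (max(-32768, min(32767, int(s * gain))) for s in samples)
    )
    return leveled.tobytes()


def to_wav(pcm: bytes, rate: int = SAMPLE_RATE) -> bytes:
    """Wrap raw PCM in a WAV header so each datagram is self-describing."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def send_chunk(sock: socket.socket, pcm: bytes,
               host: str = "127.0.0.1", port: int = 12202) -> None:
    """Level one chunk and send it as a single UDP datagram."""
    sock.sendto(to_wav(level_chunk(pcm)), (host, port))
```

A record loop would read fixed-size chunks from the microphone (for example from the stdout of a recording command), create one `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)`, and call `send_chunk` for every chunk. Per-chunk peak scaling is crude; a real script would probably smooth the gain over time.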

Wake Word
For wake word detection I use snowboy with JARVIS. The wake words from porcupine aren’t that great (in my opinion) and I cannot get Mycroft Precise (I want Marvin, of course) to trigger reliably. So I named my project JARVIS for now. Unfortunately, with my denoise script, snowboy gives me even more false activations (with sensitivity 0.66).

LED Ring
I use project-alice-assistant / HermesLedControl with the Alexa pattern for the ReSpeaker LED Ring - thanks to KiboOst for pointing me to that.

Intent Handling
The core of my project is my self-built Home Assistant “skill” (along with some other basic intent handlers). This allows me to define the things I want to do directly in the sentences.ini (so no need to change my Home Assistant config to handle the intents). This is for simple cases of course, but for me it is enough to switch lights on and off, activate a scene and so on.
So I can define:

[homeassistant-ChangeLightState]
switch on the light (:){entity_id:light_1}

without tinkering with my Home Assistant or restarting a service. Retraining Rhasspy is enough.
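To make the idea concrete, here is a rough sketch of how an intent handler could turn such an intent into a Home Assistant service call. This is not the code of the skill itself. It assumes the handler receives the Hermes intent message as a dict, and that the call goes to Home Assistant's REST endpoint `/api/services/<domain>/<service>` with a long-lived access token. `HA_URL`, the `INTENT_SERVICES` table, the `light.` prefixing of the slot value and all function names are hypothetical.

```python
import json
import urllib.request

HA_URL = "http://homeassistant.local:8123"  # assumption: adjust to your setup

# Hypothetical mapping from intent name to a (domain, service) pair.
INTENT_SERVICES = {
    "homeassistant-ChangeLightState": ("light", "turn_on"),
}


def slots_to_dict(hermes_intent: dict) -> dict:
    """Flatten the Hermes slot list into {slot name: value}."""
    return {
        slot["slotName"]: slot["value"]["value"]
        for slot in hermes_intent.get("slots", [])
    }


def build_service_call(hermes_intent: dict):
    """Return (url, json_body) for a known intent, or None otherwise."""
    name = hermes_intent["intent"]["intentName"]
    if name not in INTENT_SERVICES:
        return None
    domain, service = INTENT_SERVICES[name]
    entity_id = slots_to_dict(hermes_intent)["entity_id"]
    # Assumption: the slot holds only the object id, so prefix the domain.
    if "." not in entity_id:
        entity_id = f"{domain}.{entity_id}"
    return f"{HA_URL}/api/services/{domain}/{service}", {"entity_id": entity_id}


def call_service(url: str, body: dict, token: str) -> int:
    """POST the service call to Home Assistant and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Example message for the sentence defined above (shortened).
example = {
    "input": "switch on the light",
    "intent": {"intentName": "homeassistant-ChangeLightState"},
    "slots": [{"slotName": "entity_id", "value": {"value": "light_1"}}],
}
```

With this, `build_service_call(example)` yields the URL for `light/turn_on` and the body `{"entity_id": "light.light_1"}`, which `call_service` would then send. Because the entity comes from the slot, adding a new light only needs a new sentence and a retrain.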

I’ve uploaded the sources to GitHub: mk-81 / rhasspy-simple-intent
So, if you are interested, have a look. It is not complete, of course.

The rest is pretty straightforward, I assume:

NanoTTS for text output
Kaldi for STT (Mozilla DeepSpeech didn’t work and is probably out of maintenance by now).
Fsticuffs for intent recognition: I didn’t get Snips working on my Pi3, Mycroft is not supported for now, and Fuzzywuzzy was way too fuzzy for me.

I hope you enjoyed my setup a bit, and if you have any tips, you are welcome to share. Especially for improving wake word detection and reducing false activations, or for getting Mycroft Precise to recognize me.

rolyan_trauts · September 4, 2020, 2:30am

Yeah I have the 4-mic and also the linear.

I wouldn’t worry too much about noise, as a recognition mic has a different focus than a studio mic.
Part of the MFCC process drops low-energy elements, so for recognition that noise is often not even present and has little effect.

I am not a fan of any array mic that doesn’t have beamforming, because without beamforming the extra mics are just duplication and provide zero extra functionality.
They are really inflexible, as the form factor is already set, but the reality is it’s just an omnidirectional planar mic that will often be positioned on a shelf next to a wall.

Until someone provides a decent DOA & beamforming lib, all the wonderful and great array hats range somewhere between over-optimistic and snake oil.
Even the supposedly DSP-embedded EC models (often USB) are far from stellar performers.

Also, if playback and capture are not on the same card with the same clock, then the only good EC, via speexDSP, will not work. There is a version that uses hardware loopback that might work, but I never got it working.

Maybe you could give ec_hw a go.

The Pi3A+ is cool though; bang for the buck, it’s the best SoC board Raspberry Pi makes.
With ML such as TensorFlow it has approx. 40x the performance of a Zero, but costs $10 more.

orca8119 · September 4, 2020, 6:15pm

Hello Rolyan,

thanks for the tip. I’m not sure if everything worked as it should, but I’ve followed the steps from AGC/Denoise and EC. Now I have a whole lot more devices, but at least EC runs. I’ll try the system over the next few days.