DIY Alexa on ESP32 with INMP441


I just started with Home Assistant and am already amazed by how advanced things are. I was away from the electronics hobby for a few years and was still in the Arduino / PIC dark ages.
I was wondering how to set up a homemade Alexa without any cloud interaction. Rhasspy seems to be a winner. Thanks for all the good work done by @synesthesian.
In my case, Home Assistant is installed as a virtual appliance in a VM without access to hardware. I also don’t find it practical to have to speak near the server. So the ESP32 seems like a good idea for a satellite solution that passes voice commands to Rhasspy from more than one place.

Then I stumbled upon this video from Atomic14 on YouTube. Not sure I can post it here.

This is half the work done: it already has wake word detection and passes the WAV to a destination. I left a comment for Atomic14 asking him to have a look at this forum. I hope this is helpful for everyone following this topic.
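For anyone who wants to try the "pass the WAV to a destination" part from a PC before touching the ESP32, here is a minimal Python sketch. It assumes the firmware delivers raw 16-bit mono PCM at 16 kHz, and that Rhasspy is reachable at the made-up host name `rhasspy.local` on its default web port 12101 with the `/api/speech-to-intent` endpoint. The helper names `pcm_to_wav` and `build_request` are only for illustration, not part of any project.

```python
import io
import wave
import urllib.request

# Assumption: host name is hypothetical; 12101 is Rhasspy's default web port.
RHASSPY_URL = "http://rhasspy.local:12101/api/speech-to-intent"


def pcm_to_wav(pcm, sample_rate=16000):
    """Wrap raw 16-bit mono PCM bytes in a WAV container (in memory)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def build_request(wav_bytes, url=RHASSPY_URL):
    """Prepare (but do not send) an HTTP POST carrying the WAV data."""
    return urllib.request.Request(
        url,
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


if __name__ == "__main__":
    silence = b"\x00\x00" * 16000          # one second of silence as a stand-in
    request = build_request(pcm_to_wav(silence))
    # Sending needs a running Rhasspy instance at the assumed address:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode())
    print(request.get_method(), request.full_url)
```

On the ESP32 itself the same idea would be done in C++, but the WAV framing and the HTTP POST stay the same.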

I feel grateful for getting in touch with people who have the same hobbies as I do.

Hope I’m not building castles in the air and this is all doable.

I just received my ESP32 dev kit + INMP441 and am setting up Visual Studio Code.

Count me in for any testing/developing on this branch of the project. Not a pro programmer, but will do my best.

Thank you!


Welcome and good luck!

Hi beared,

I had the same idea and found your post while searching for solutions. Thanks for the hint about the Atomic14 repo. At the moment I’m setting up Rhasspy on a faster desktop, with some Pi 1s I had lying around acting as satellites/wakeword clients with microphones, as a replacement for my Snips system.

As I only have 3 Pi 1s and around 6 rooms to cover with voice control, I’m always looking for cost-effective alternatives to the Pi. With this setup a satellite would only cost around 20 dollars, which would even allow me to equip our basement with voice control ;-).

I don’t think these are castles in the air; about 90% of the job may already have been done by Atomic14.
I will also try to get the hardware components so I can try to get it working with Rhasspy and contribute some code snippets. (I’m a Java dev, so C++ is writable but not my speciality.) I will post the code repo here as soon as I have the hardware and the time to make some progress.

I also toyed with the idea a while ago:

For instance, I have this little piece of hardware lying on my desk:

Unfortunately I haven’t found the time yet to try to implement something. But I’d definitely like to have something cheaper and less power-hungry than a Raspberry Pi to work as a Rhasspy satellite.

@koan that is a nice device!
I wanted to have a small device as well and this seems exactly what I need.
My plan is to have a device like this in each room, with some wakewords.
For instance, when my daughter enters her bedroom, she can say “lights on” and the light in her room switches on.
The device would have:

  • lights on wakeword
  • lights off wakeword

Both of them would take action right away, so it is not a voice assistant but rather a very simple system with very few commands.

Why? Because when you enter a small room, first activating the assistant and then asking it to switch on a light takes too much time. It is much faster to flick a switch.

The Matrix Voice is not suited for such use cases; mine is not even in use at the moment since I have no nice case for it.

So you would train it then to recognize just a limited set of commands/wakewords? That’s actually a nice idea, a sort of ‘audio switch’ instead of a tactile switch.

Yes indeed, most of the time you just want to turn on a couple of lights or activate a scene.
So, if you walk into your living room you can say “lights on” or something like that.
Rhasspy can be trained with Raven, and you should be able to get something going with MQTT and Node-RED to create actions that respond directly to wakeword activation. In this case, turning a set of lights on or off.
Even better would be on-device wakeword detection, but that is harder to do.
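
To make the MQTT route a bit more concrete, here is a rough Python sketch of the "audio switch" logic. It assumes Rhasspy publishes wakeword detections on Hermes topics of the form `hermes/hotword/<wakewordId>/detected` with a JSON payload containing `siteId`. The wakeword ids, the entity name and the helper `action_for` are made up for illustration, and the wiring to a real MQTT client or to Node-RED is left out.

```python
import json

# Hypothetical mapping: wakeword id -> (Home Assistant service, entity).
# Both the ids and the entity name are assumptions, not real configuration.
ACTIONS = {
    "lights_on": ("light.turn_on", "light.bedroom"),
    "lights_off": ("light.turn_off", "light.bedroom"),
}


def action_for(topic, payload):
    """Return (service, entity, site_id) for a known wakeword, else None."""
    parts = topic.split("/")
    # Expected topic shape: hermes/hotword/<wakewordId>/detected
    if len(parts) != 4:
        return None
    if parts[0] != "hermes" or parts[1] != "hotword" or parts[3] != "detected":
        return None
    action = ACTIONS.get(parts[2])
    if action is None:
        return None
    # siteId tells us which satellite (room) heard the wakeword.
    site_id = json.loads(payload).get("siteId", "default")
    return action + (site_id,)


if __name__ == "__main__":
    print(action_for("hermes/hotword/lights_on/detected", '{"siteId": "bedroom"}'))
```

In a real setup this function would be called from the MQTT client's message callback, and the returned service would be sent on to Home Assistant. Using `siteId` to pick the entity per room would be the obvious next step.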