Trying to figure out what hardware and microphone to get

I want to build one, or at a later stage several, voice assistants, primarily to control lights and other devices in Home Assistant. I would also like the option to use them to play back music or other media. I want to use them in German.

I am currently running Home Assistant on a Pi 3 (venv install), but I am planning to move that to an x64 server with Docker when I get the new MQTT-ZWave running on the Pi.

CPU/memory-wise it seems like a good idea to go the Rhasspy on server + satellite route. But on the other hand, if I need at least a Pi 3 for good wake word detection and low latency anyway, why not do the speech recognition on the device too?

Another question is, what microphone to get. I found that really confusing.

At first I thought about getting a Matrix Voice ESP32 and using it as a standalone satellite. The advantage would be to have a small device with low power usage that I could easily put inside a case with a speaker.

BUT: It doesn’t seem to do any kind of audio processing like echo cancellation or beamforming, because Matrix never finished those advertised features. It also cannot play back audio in good quality by itself over MQTT.

AEC in general seems really important if you want to play back audio and still be able to use wake word detection.

Another option is the ReSpeaker microphones. Do they all support hardware AEC and other audio processing? From what I understand, to use it you are limited to 16000 Hz playback. That doesn’t seem great for music. It seems like I need to wait for a future device to support more than that, or do AEC in software?

Which of the ReSpeaker products would be my best option?

I also looked at USB microphones like the Jabra Speak 510. You can get a speaker+microphone in a professional case and can hide a Pi somewhere else. But: How well do they work in practice when you are a few meters away?

It does not do AEC, indeed. Audio playback is not very good quality, but it is OK for basic feedback.
For music playback: no.
Also if you want a semi-consumer product: no

Only the more expensive ReSpeaker USB ones have hardware AEC.
Yeah, 16000 Hz with the USB ones.
16000 Hz actually ain’t that bad, as it’s the same as wideband audio as sold on audio-quality Bluetooth…
It’s the price of the complete unit that starts to get way higher than commercial units if you go that route.
But again, I have heard many mixed reviews on overall audio quality.
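To put the 16000 Hz figure in perspective, here is a tiny sketch of the arithmetic: a sample rate can only carry frequencies up to half its value. The function name is made up for illustration; the rates are narrowband telephony (8000 Hz), the wideband rate discussed here (16000 Hz) and CD audio (44100 Hz).

```python
def nyquist_bandwidth_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


if __name__ == "__main__":
    # Compare telephony, wideband and CD sample rates.
    for rate in (8000, 16000, 44100):
        print(rate, "Hz sampling ->", nyquist_bandwidth_hz(rate), "Hz audio bandwidth")
```

So 16000 Hz playback tops out at 8 kHz, which covers speech well but drops the upper part of the range that music uses.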

If you have playback and capture on the same clock you can do software AEC. To be honest, I don’t know how it compares, but it should be good enough for ‘barge in’.
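For anyone wondering what software AEC does in principle, here is a minimal sketch using an NLMS adaptive filter, which is one common textbook approach and not necessarily what any particular AEC package uses. It assumes the playback (reference) and capture streams are aligned sample for sample, which is exactly why the shared clock matters. All names, the tap count and the simulated "room" are made up for illustration.

```python
import math
import random


def nlms_echo_cancel(mic, ref, taps=16, mu=0.5, eps=1e-6):
    """Subtract an adaptive estimate of the speaker echo from the mic signal.

    mic: samples captured by the microphone (echo plus anything else)
    ref: samples sent to the speaker, aligned sample for sample with mic
    Returns the residual (mic minus estimated echo).
    """
    weights = [0.0] * taps   # estimated echo path (FIR filter)
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_est = sum(w * x for w, x in zip(weights, history))
        err = m - echo_est
        # Normalise the step by the reference power so loud playback
        # does not make the adaptation unstable.
        power = sum(x * x for x in history) + eps
        step = mu * err / power
        weights = [w + step * x for w, x in zip(weights, history)]
        residual.append(err)
    return residual


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


if __name__ == "__main__":
    random.seed(0)
    n = 4000
    ref = [random.uniform(-1.0, 1.0) for _ in range(n)]
    # Hypothetical room: the speaker reaches the mic delayed by 3 and 7 samples.
    mic = [0.6 * (ref[i - 3] if i >= 3 else 0.0)
           + 0.3 * (ref[i - 7] if i >= 7 else 0.0) for i in range(n)]
    out = nlms_echo_cancel(mic, ref)
    print("echo RMS before:", rms(mic[-1000:]))
    print("echo RMS after: ", rms(out[-1000:]))
```

If the two streams ran on separate clocks, the delay between `ref` and `mic` would slowly drift and the filter would keep chasing it, which is the clock drift problem.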

$65 just for the mic and LEDs is just shy of complete midrange commercial units.

The Jabra 510 is supposedly a conference mic/speaker, so it should have some range.

The ReSpeaker 2-mic has playback & capture on the same card and clock, so there is no clock drift and software AEC works OK, but the software drivers are pretty meh.

The Edimax USB Stereo Sound Card has a stereo ADC, as does the Syba sd-aud20101. Again, playback and capture are on the same card, so there is no clock drift problem and software AEC works.
You can add passive or, even better, powered mic modules to a USB card and get pretty good sound output and a sensitive mic array.
Sound cards are $10-$20 and mic modules $2-$6.

Maybe start low and build up with just 2x I2S mic modules and the Adafruit wiring and software libs.

The same GPIO wires just double up to each mic, as left and right are a high/low word in a single serial stream.
(5 wires)
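To illustrate what "left and right in one serial stream" means on the software side, here is a small sketch that splits an interleaved stereo buffer back into the two mic channels. It assumes the capture has already been converted to 16-bit little-endian PCM (the mics themselves may deliver wider words); the function name and the sample values are made up.

```python
import struct


def split_stereo_s16le(frames: bytes):
    """Split interleaved 16-bit little-endian stereo PCM into (left, right) lists."""
    if len(frames) % 4:
        raise ValueError("expected whole stereo frames of 4 bytes each")
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames)
    # Samples alternate left, right, left, right, ...
    return list(samples[0::2]), list(samples[1::2])


if __name__ == "__main__":
    # Three made-up stereo frames: left = 1, 2, 3 and right = -1, -2, -3.
    raw = struct.pack("<6h", 1, -1, 2, -2, 3, -3)
    left, right = split_stereo_s16le(raw)
    print(left, right)
```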
Then use the Pi 3.5mm output. Your source will still be effectively mono without physical spacing, irrespective of having two speakers, as sound just doesn’t really work that way.
The 3.5mm output isn’t great quality, and there are some tweaks you can do to improve it, but it gets you going at minimum cost and is a base camp to maybe stay at or to press on from to what’s next.
I have read you can use the HiFiBerry DAC dtb and also run 2x I2S mics, but that is a project I still have to confirm. A google will get you the info though.

A Pi 3A+ is my basic minimum at $25, but a Pi 4 2GB is also only $35, and I wouldn’t buy a 3B+ anymore.

What you haven’t mentioned is amplifier and speaker.
I like these and run them off 24V with a 5.1V buck converter for the Pi.
It’s cheap but also has a standby mode, enabled by pulling to ground, which is handy as you can often get DC hum when not in use. When in use it certainly can bang out the volume.
Speakers are strange: probably due to weight and bulk, the best buys are often local, and usually there is one brand that is quite common.
In the UK & EU, the Visaton FRS 8 (4 Ohm) are a good buy.

I found getting a case problematic and, because it’s a really dense engineering material, ended up using 210mm water pipe from a shop that sold short lengths, cutting them to make pucks.
Cut-to-size plastic discs are easy and cheap to get on eBay, and voila: my s-pipe voice-ai, as I fondly call them.

I really wish we had a simple satellite system that is also a wireless speaker system, as I find that a natural and common use.
Snapcast is amazing for the speaker system and in reverse can supply mic inputs, but there is no interface to a Rhasspy server.
I would love to see a Rhasspy GUI and have it set up on the main TV, as then you could get really clever and start to AEC all room audio systems by being connected, and have one hell of a voice AI that is far better than any commercial offering I know.
If your audio system is your voice-ai then you’re already there with the AEC you have.

Thank you for your very long answer. It seems that there are no simple ready-made solutions for this yet… The idea to start small is worth considering.

It’s all really confusing.

And yes, a fully integrated solution including the TV and all other audio would be nice.

If you just want to get started and give things a trial, then order some budget I2S mics from AliExpress, as you should find some for about $2, sometimes a little less.
I have an Adafruit one here; I just haven’t got round to testing whether there is much difference, but they are the ones who provide the know-how.

I’m not sure whether the HiFiBerry hack works, but if so that would be my de facto route. The Class D amps and speakers are pretty mighty, but it’s still not audiophile, so it makes little sense to get audiophile DACs.

Get a Pi 3A+, 2x I2S mics, a 24V PSU and maybe even an audio-grade capacitor to go across the output, as it will certainly do no harm to the Pi and will help, since the SMPS units you can get can be a bit hit and miss.
Approx $10 for the amp and the same again for the speaker, and just give that a whirl.
PS: give Snapcast a whirl as well if you have a PC that you can dual boot and use as a makeshift server.
Again, an Intel NUC or ITX would be lovely, but a Pi 4 is relatively capable and much cheaper; strangely, or maybe not so strangely, the add-ons such as sound cards are similarly priced.
Then off to the plumbers for some S-Pipe :) PS: I meant 120mm, not 220mm.

You can make a pretty damn good wifi speaker for about $60 that is extremely competitive commercially. Even one is pretty damn loud; with 2x you will probably never go above a third to half volume, and with x4 and a sub bass you may start getting complaints from the neighbours.

I will probably start with the voice input and some TTS feedback with a very basic speaker setup (some old PC speakers?) and can always figure out better audio output later.

The problem with the I2S mics from AliExpress is the waiting time… Also, it would be nice not to depend on the GPIO pins of a Pi, allowing me to switch to something else later (another problem with some of the ReSpeaker solutions).

What I really would like for a start is something similarly priced to an Amazon Echo that includes speaker, mic, CPU/Wifi and, if possible, audio out in a single nice case, and that can either run Rhasspy directly or do offline wake word detection and be connected as a satellite to a Rhasspy instance running on my server.

Have you got a USB sound card? They are cheap.

Has a stereo Mic in but dream bass is far too pronounced for me.
If it doesn’t work out you can use it elsewhere as they are always handy.

I don’t know of a single nice case that has a speaker mount; Pi cases fit a Pi and that’s about it.
There is a singular puck shaped case I know of that still doesn’t fit a speaker.

It’s $15 for a bit of plastic, but you can shop around as the price varies.

If you find a case please post.

Doh, my memory: it’s 110mm, as the 4" (100mm) pipe I got was an extremely tight fit with some 3.5" cones.