Echo dot => No Alexa and Yes Rhasspy



I am very very bad at English so I hope someone on the forum speaks French.

I just installed Rhasspy on a VM with Docker. However, this machine has neither a speaker nor a microphone.

Is it possible to take Amazon Echo devices and use them with Rhasspy, wiping everything from Amazon off them so that it stays purely local?

Hoping you can help me,
Thanks in advance!

We wish! Google and Amazon have invested a lot to make very good hardware … but it is locked to their cloud services.
With their low hardware prices, I wonder how they make a profit :thinking:


In my opinion (I don't have Alexa), no, unless you strip it down for the components! But, at least for testing, how about a headset + microphone while waiting for "something better"?

It's been strange watching Alexa: they were first, but they sort of came to a halt on the technology side and instead added more to their hardware in terms of mics, speakers and amps.

Google dropped the number of mics to 2, but had a huge advantage in the ML IP it owns, and now processes audio in a completely different way under much less load than Alexa.
Google likely dropped beamforming in favour of VoiceFilter-Lite, so in the price war between the two, Google has reduced its manufacturing cost since the first models whilst Amazon's has increased.
But this has taken an interesting turn, as the war was to garner services: in voice AI, Amazon almost became another Microsoft with a platform monopoly.
Google is now also the leader in cutting-edge offline ASR, running ridiculously powerful ML on a few watts on its Tensor chip, totally offline.

Prices have come down, with Black Friday deals offering a Google Pixel 6 for £200, and I was tempted purely to test it; that IP is already at budget price levels and will get cheaper.
But Google has also been handing Amazon a beating in the cloud, again with ML, where Google's services offer far more per $/watt. Across a wider range, Google is only losing $1.6Bn but doesn't care, given the $65.1 billion in total revenue it brought in. Amazon, meanwhile, has been late to the party with its Graviton 3 Arm-based chips.

If open source is going to compete in any way, it needs to follow a similar infrastructure of heavy reliance on central ML, where my new toy, a Rock 5B, has performance close to that of a Pixel 6 phone.
It's an octa-core A76/A55 @ 2.2/1.8 GHz that gives approximately 2 TOPS worth of ML performance in about 5 watts, with a Mali-G610 MP4 GPU that likely gives the same 2 TOPS of ML but in 1 watt, and also an NPU with 3 cores of 2 TOPS each, giving 6 TOPS total. It was on OKdo for £140 with free delivery using the Black Friday code.
Mics are just mics: you need multiple low-cost microcontrollers to service multiple zones and feed a single low-energy ML server that, in terms of ML compute, could be roughly 30x that of a Pi 4.

If you want the best cutting-edge ASR then buy a Pixel 6 phone, as Google is making little to no money on it but is staking a claim in the future of embedded AI.

Apple too, with their new M1 Arm-based computers: the ML performance per watt is out of this world. They have even created a new Arm instruction set called AMX-2, so unlike an NPU or GPU it works in the same memory space, as it is the CPU, but at Apple prices. Neon is a coprocessor that works in a different memory space, so the copy there and back doesn't happen with AMX-2, and that is likely the 2x over Neon.

Benchmark results · Issue #89 · ggerganov/whisper.cpp · GitHub is a really interesting benchmark thread where users with Macs, Graviton, Raspberry Pis and myself have been posting results; the Macs and Graviton are obviously posting huge figures.

I will look into that.

On another forum I was recommended a Raspberry Pi and an Orange Pi. I am not sure which of the two to choose, as I plan to set up quite a few satellites, and PoE power would clearly be a plus. I am waiting for opinions so I can pick one or the other, as long as it supports PoE.

I was also told about another site, where I found: 6 Mic Array for Pi - ReSpeaker
6 mics would be pretty good! But I was told there are driver issues... So... I am waiting for information on that as well ^^

The 6 mic is bad, and TDM seems to be bad on the Pi generally. What TDM does is trade maximum sample rate for channels: the usual maximum of, I think, 192 kHz I2S stereo becomes 6 channels, which could run at 64 kHz, but they run it at the more normal audio rate of 48 kHz.
The problem seems to be that there is no sync, so any one of what are really 3 stereo pairs could be pulled in as the first word pair.
Simply put, the channels arrive in a random order, so even if a beamforming algorithm were available it may not work, depending on the algorithm.
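To make the TDM arithmetic above concrete, here is a rough sketch in Python. It assumes the 192 kHz stereo figure from the post, and it models the missing sync as the frame starting on any one of the 3 stereo pairs; the function names are made up for illustration and do not come from any driver.

```python
def tdm_max_rate_hz(stereo_rate_hz: int, channels: int) -> float:
    """Per-channel rate when a stereo link's sample budget is shared by `channels` slots."""
    total_samples_per_second = stereo_rate_hz * 2  # stereo = 2 slots per frame
    return total_samples_per_second / channels


def possible_channel_orders(channels: int) -> list:
    """Channel orders seen by software if any stereo pair may arrive first (assumed model)."""
    order = list(range(channels))
    # Rotate by whole pairs: the frame may start at pair 0, pair 1, pair 2, ...
    return [order[start:] + order[:start] for start in range(0, channels, 2)]


max_rate = tdm_max_rate_hz(192_000, 6)
orders = possible_channel_orders(6)

print(max_rate)
for o in orders:
    print(o)
```

With 6 channels that gives 64 kHz per channel and three possible orderings, which is why an algorithm that assumes mic 0 always arrives in slot 0 cannot be relied on.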

If you want to buy one, buy via PayPal, as you will likely want a refund, like the one I have gathering dust on my desk.

The only HAT with a working software beamforming algorithm that I know of is the 2 mic, as I am the only one to have created a working beamformer of any use on the Pi.

ProjectEars/ds at main · StuartIanNaylor/ProjectEars · GitHub but, like all the other hardware beamformers, it's not very good unless you sync the KWS and beamformer to lock onto a command sentence.
I never did implement that part.

Also, with the resources of a Pi, the only type of beamformer that will run is a GCC-PHAT delay-sum, as I tried others and they were slower than realtime.
This has further problems, as delay-sum needs a specific geometry, whilst the 6 mic was only designed to look like a beamformer: its geometry for delay-sum is totally wrong and it cannot run another type.
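For anyone wondering what a GCC-PHAT delay-sum actually does, below is a minimal two-mic sketch in plain Python. It is not the ProjectEars code. It assumes whole-sample delays, treats the block as circular, and uses a naive DFT only to stay dependency-free (real code would use an FFT library); all names are illustrative.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) DFT, used only to keep this sketch dependency-free."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(ref, sig, max_delay):
    """Estimate by how many samples `sig` lags `ref` (negative means it leads)."""
    n = len(ref)
    cross = []
    for r, s in zip(dft(ref), dft(sig)):
        c = s * r.conjugate()
        mag = abs(c)
        # PHAT weighting: keep only the phase of each frequency bin.
        cross.append(c / mag if mag > 1e-12 else 0j)
    cc = [v.real for v in dft(cross, inverse=True)]
    # The correlation peaks at the lag (modulo n) that best aligns the two mics.
    return max(range(-max_delay, max_delay + 1), key=lambda d: cc[d % n])


def delay_and_sum(ref, sig, delay):
    """Shift `sig` back by `delay` samples (circularly) and average with `ref`."""
    n = len(ref)
    return [(ref[t] + sig[(t + delay) % n]) / 2 for t in range(n)]


# Synthetic test: mic B hears the same noise as mic A, 3 samples later.
rng = random.Random(0)
n, true_delay = 64, 3
mic_a = [rng.gauss(0.0, 1.0) for _ in range(n)]
mic_b = [mic_a[(t - true_delay) % n] for t in range(n)]

delay = gcc_phat_delay(mic_a, mic_b, max_delay=8)
beam = delay_and_sum(mic_a, mic_b, delay)
print(delay)  # 3 for this synthetic input
```

The `max_delay` bound is where the geometry comes in: the mic spacing and sample rate fix the largest physically possible delay, so a board whose spacing does not suit delay-sum gives the search very little to work with.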

It's one of those wonderful pieces of tech that was created not because it can but because someone could.

OK, so... if I have understood everything correctly, we clearly forget the 6 mic.
I gather there are several kinds of 2 mic; which one do you recommend? Do you have a link?

To use on a Pi, Florian?

They are essentially all the same, built around a WM8960 codec or something like that; whatever the chip is, it's the same on all of them and they all share the same drivers.
Get a clone, as to be honest I don't think it matters.
Some say the ReSpeaker has better mics than the Keyestudio, but I have my doubts and think they are likely the same.

The geometry isn't perfect but really not that far off, so they are OK. But like all HATs, having the mics onboard makes things awkward: it's better to have the ports facing the person speaking than lying horizontal, as that garners some natural reduction of incoming rear sound.

Also, being onboard makes it hard to isolate the mics against vibration, and isolating them against the audio output is in fact near impossible with the AEC/NS we have.

That's why I feel using mic-only satellites, or broadcasting to a wireless audio system such as RaspiAudio LMS, AirPlay or Snapcast, is just better: the simple physics of some distance to the mic.
But many get them because the interest is in playing with open-source voice systems rather than having a production-ready system.

If you are just going to have a mic, or 'Ear' as I call them, I2S mic modules off AliExpress with the best params you can find can likely be used with the Adafruit driver, which I think I can fix but never got round to (I think it records as 32-bit when the mics are actually 24-bit, which is probably why people complain they are quiet).
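As a back-of-envelope illustration of why that would sound quiet: the sketch below assumes the 24-bit sample ends up right-justified in a 32-bit word and is then read as a full 32-bit value. Whether the driver really does this is unconfirmed, so treat it as one possible reading; the helper names are made up.

```python
import math


def peak_dbfs(sample_bits: int, container_bits: int) -> float:
    """Peak level, relative to container full scale, of a right-justified sample."""
    return 20 * math.log10(2 ** (sample_bits - 1) / 2 ** (container_bits - 1))


def left_justify(sample: int, sample_bits: int = 24, container_bits: int = 32) -> int:
    """Shift a sample up so its full scale matches the container's full scale."""
    return sample << (container_bits - sample_bits)


loss_db = peak_dbfs(24, 32)
print(round(loss_db, 2))
print(hex(left_justify(0x7FFFFF)))
```

Under that assumption the loudest possible sample sits about 48 dB below full scale, and an 8-bit left shift (or the equivalent gain) would restore it.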

It works with any I2S (not PDM) mic that I know of.
They used to be really cheap on AliExpress but, like everything, have increased in price. I have forgotten which have the best sensitivity and SNR, but all that info is available.
You can set them at an exact distance, and likely mount and isolate them much better, in any direction relative to the board.
But you will not have any audio out, unless maybe via an HDMI audio extractor, and those only work when plugged into an HDMI monitor or screen.

PS I tend to use these on the Pi GPIO: 2x +/- rows that help to quickly make multiple connections to a single pin with Dupont jumper leads; otherwise solder up a cable.

They are cheap; I usually just get 5 at a time and keep them around.

I find it hard to recommend any particular hardware for a satellite at the moment.

  • Raspberry Pi was popular for Rhasspy, mostly because it used to be easily available and fairly cheap - but now they are hard to find and expensive :frowning: I believe Orange Pi is essentially a copy of Raspberry Pi. RasPi Zero is a good size for a satellite, and it works OK, but I think a bit slow. The Zero 2 W or a RasPi 3A+ are better options … but hard to get.

  • for microphones, I have a reSpeaker 2-mic HAT, reSpeaker 4-mic HAT, and Adafruit 2-mic HAT boards - but I agree with rolyan that their differences are minor; and their driver does not make good use of the hardware.
    The Raspberry Pi IQaudio Codec Zero uses a different chip to the reSpeaker devices, and has a different driver. I have not used this board, so cannot comment on it.
    My latest satellite uses just a cheap USB microphone and gives much the same result. Of course I am just a user of this stuff, but rolyan is obviously an expert in the audio field.

  • There are other multi-microphone units with firmware providing features like Voice Activity Detection, Direction of Arrival, Beamforming, Noise Suppression, De-reverberation, Acoustic Echo Cancellation … but at a price.
    Several conferencing microphones are similar.

  • rolyan has pointed out the ESP32-S3 chip as being a much better choice for a voice assistant … but I believe we are still waiting for the software. I am hoping that with @synesthesiam joining Nabu Casa (who are also behind ESPHome) this might change next year.

Yeah, I should have mentioned what Don mentioned: almost any cheapo USB sound card (there are a few bad ones) and a unidirectional mic.
Also, Plugable does one that I think is now the only reasonably priced USB sound card with a stereo mic input.

What Raspberry Pi is doing, turning their backs on makers whilst they supply commercial customers, is completely, utterly crazy, as projects are starved and look at other solutions.
The ESP32-S3 hasn't got enough oomph for an all-in-one voice assistant. Espressif tried that with their ESP32-Box, and for its ASR I will use the technical term of crap (just too low powered, but they did cram it all in).
It makes a perfect wireless mic that could run a pretty hefty KWS model, and it enables a paradigm shift to a single central home brain, with distributed mics in a room and a wireless audio player. For a single room it's pretty expensive; for multiple rooms it's extremely competitive, and I have been banging on about it for what now seems ages.
Whisper.cpp Benchmarks

| CPU | OS | Config | Model | Threads | Load [ms] | Encode [ms] |
| --- | --- | --- | --- | --- | --- | --- |
| RK3588 | Ubuntu20.04 | NEON | tiny | 8 | 226.48 | 2681.05 |
| Raspberry Pi 4 - 2GB | OpenVoiceOS | NEON | tiny | 4 | 743.37 | 10122.80 |

Also, someone will have to try this with a Pi 4, as I don't have one.
These are the figures if I compile with -march=native -ffast-math and run on just the big cores, but maybe the Pi gains much the same:

| CPU | OS | Config | Model | Threads | Load [ms] | Encode [ms] |
| --- | --- | --- | --- | --- | --- | --- |
| RK3588 | Debian11 | NEON | tiny | 4 | 228.24 | 1177.55 |

The CPU alone is 3.775x a Pi 4 on encode time, which makes its approx. $150 price quite good value compared to the 8 GB Pi 4, if you could get one. I will ignore the native compile, which is 8.59x.
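Those multipliers are just the ratio of the Encode times from the benchmark rows above; a quick check in Python, using only the figures posted in this thread:

```python
# Encode times in ms, copied from the whisper.cpp benchmark rows in this thread.
PI4_ENCODE_MS = 10122.80
RK3588_ENCODE_MS = 2681.05         # default build, 8 threads
RK3588_NATIVE_ENCODE_MS = 1177.55  # -march=native -ffast-math, 4 big cores


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


default_speedup = speedup(PI4_ENCODE_MS, RK3588_ENCODE_MS)
native_speedup = speedup(PI4_ENCODE_MS, RK3588_NATIVE_ENCODE_MS)

print(f"default build: x{default_speedup:.3f}")
print(f"native build:  x{native_speedup:.3f}")
```

The results land at roughly 3.78 and 8.60, matching the truncated x3.775 and x8.59 quoted above. Bear in mind the native figure compares an optimised RK3588 build against a default Pi 4 build, so it is not like for like.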

With the right models it could service a whole house worth of wireless ESP32-S3 mics, mainly because the chance of collision is pretty low due to the nature of voice commands.

So could an Odroid, which is probably somewhere between the two (nearer the Pi 4 than the RK3588).

Ameridroid also do the Rock 5B.

If you are in Australia, China is probably your best bet, with Allnet?

PS there is a Strange Fruit available at a ridiculously low price.

That is a real deal, as the distro images would have to be really bad to make that a bad price.