Simple cheap USB Microphone / Soundcard

That is your opinion, but doesn’t a project like Rhasspy also have to look at what the userbase is actually using, listen, and aim its development at that?
Because otherwise you will lose a huge chunk of people.
And no, there are a few people, including me, who actually use tools like Rhasspy or Voice2json in production/day-to-day life on that toy hardware, as you call it. It just turned off my TV and my girlfriend set a timer for the tea we made.
Then I asked what the weather tomorrow will be like, and that is all working rather nicely.
My mom wasn’t involved to look at it and say what a good boy I am, unfortunately…

There is no user base apart from you, there is a lack of skills and people like Kibo who think compared to Snips this sucks and is currently useless.
I haven’t used Mycroft or Rhasspy for a long time because they are so poor.
I keep looking for alternatives and hardware that might make a difference but I am not going to convey anything other than the truth.

The hobbyist programmers should enjoy what they are doing but need to rein in claims of effectiveness.

Wow, you just disenfranchised a lot of people here. FYI, for me it performs on par with Snips, which I also used for a year before it was shut down.
Don’t you think the project would be dead and the forum full of shouting people if I was the only one successfully using it?
You should really leave the sinking ship and jump to the next project.

Look at the post count: you're about 80% of the community, and if truth disenfranchises I have no care.
I came to create working opensource voice ai not make friends.

Opensource is about user driven software and currently this needs to be driven hard. You seem at times disingenuous, as after never using Snips the only vibe I get is that Rhasspy is no Snips.

It could be worse as my opinion of Mycroft is utter snakeoil :slight_smile:

No, opensource is about contribution. It's about all the people who quietly contribute, be it you helping people with audio hardware problems, somebody designing a nice printable case, small pull requests and so on. There is no "them" to drive; it's an "us".
But I guess that's just my point of view.

From Stallman to Eric Raymond, Apache, Libreoffice to the Linux Kernel, it's about how contribution can make effective user driven shared ownership software.
Contributing dross en masse has little use.

PS An armour case with a 12V 40mm fan on 5V makes a super easy and low cost Pi4 2.0GHz machine.

We do need to sort out the start of the input chain with voiceai and audio processing.
Garbage in, garbage out and currently things are not good in common domestic environs.

I know, I've been running mine overclocked for half a year now in a passive heatsink case.

I tried passive with the Pi4 but under stress it just throttles.
Even with the fan it may eventually throttle, maybe not, but with constant load it quickly gets up to 65°C.

Just found that the fan simply stuck on 5V is practically silent and gives far more load headroom.
Double sided sticky tape rules :slight_smile:

I never said Rhasspy is useless.
It has been a long road, with API integration, bug finding/fixing after the 2.5 remake, wakeword etc.

But since the beginning the intent definition and ASR have been better than Snips. Raven has closed the gap to provide a workable open source wakeword.

I have had Snips working for two years in the house, used by all the family, plugged into Jeedom, and everyone here uses it every day for a lot of stuff.

Actually Rhasspy is ready, plugged into my production Jeedom or test Jeedom with a switch. And it works better than Snips. BUT the only thing preventing me from getting rid of Snips is this particular problem of infinite listening with background noise/music. And there are some nice ideas here to get this improved. A max duration setting in STT listening would also help a lot. And I have no doubt @synesthesiam will soon find solutions :grinning:

As already said, Snips was a team of a lot of devs and Rhasspy is driven by one man and a few helpers. And apart from this listening problem I still think it is better than Snips.

The world is full of people saying it’s impossible while some are doing it …

One day we will all be able to ditch Snips and I could never thank @synesthesiam enough for that.

We should ditch Snips as it's completely derailed the excellent work provided by @synesthesiam and took Rhasspy the wrong way.

PS if you are sitting next to a Triangle Antal then maybe just go out and buy something that does work :slight_smile:

If you get your hands on one, really do try the ReSpeaker USB 2.0 array. It's a bit overpriced for what it is, but the quality jump from the 2 and 4 mic Pi HATs is very noticeable and that might be what you need.

Hey, I would like to ask for suggestions about the product below: does anyone know whether it works with the Pi, or whether it is reliable enough to give it a try … ?

That's just the mic board; the I2S board is another layer.

Thanks @rolyan_trauts :+1:, but I have a question: its datasheet says that it supports beamforming, so can this be efficiently useful for voice based projects?

Yes, it's 7 I2S mics on a board and all you have to do is develop your beamforming algs.

The only opensource beamforming I know of is ODAS & https://distantspeechrecognition.sourceforge.io/

I keep meaning to have a go with the latter, as I got as far as compiling it and that does work.

Speechbrain also say they are going to make ‘beamforming’ part of an all-in-one speech kit, but that is still to be released.

Commercial entities don’t seem to release freeware beamforming libs. You can try to find some Sipeed ones, but my hunch is they do not exist.

thanks @rolyan_trauts:+1:

I haven’t worked out https://distantspeechrecognition.sourceforge.io/ and, even though it's brilliant stuff, it is quite complex, so I have my fingers crossed for Speechbrain: if beamforming does become part of a toolkit, that solves a problem. Even with the DSR Toolkit, as it is opensource, it just takes one of us to work it out and then share.

It does compile, I can honestly tell you that, and I spent a short time trying to work out how to set the mic geometries, which from the examples in the utils section of the defaults is probably doable.

Invensense do a great application note, AN-1140, on just basic beamforming of the simplest types, Broadside & Endfire, which are sort of self-descriptive, and the read is brilliantly simple without science overkill.

There is this huge urban myth that arrays of omni-directional MEMS mics are good, and the truth couldn’t be any more different when you lack the software to turn that array into a beamformer.
It's actually better to use a single mic and channel, but many are summing (I was guilty of it too), and depending on the orientation of the incoming audio that creates 1st order high pass filters, which means you can get a totally different tonal pattern purely from placement.

The simple math is in the above and my head can not get past high order broadside or endfire arrays.
I haven’t a clue with the geometry of what you propose. There is a smattering of complex examples in other projects (ODAS) that maybe could be hacked, but for me it is pure guesswork.
I did run ODAS a while ago and the DOA alone seemed to max out a Pi3; it might have even been my Pi4 but I can not remember exactly.

If you can run software, develop your own algs or purchase running beamforming its why I created this thread and often comment so.
If you have a dumb array you really need to know what you are doing. You might have multiple mics, but it's much worse than them counting for nothing: they can produce terrible results if you just sum them, and at best just cause confusion over varying results.

Have a read of the app note. It's not the shortest, but once you understand what it contains much becomes apparent.

I have a big review to do on my mic collection and you can do some simple stuff with FFmpeg as it does have a delay that you can set by samples.

The simplest beamformer, the broadside (the PS3eye style), basically adds a delay equal to the time sound takes to travel the distance between the mics.
At that point, at a specific frequency, it creates a big null notch, and from that every octave (half frequency) you get -3 dB of attenuation whilst side on, as it's a high pass filter.
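
To put rough numbers on that, here is a minimal Python sketch of my own, not something from the app note. It assumes sound travels at 343 m/s, and the 75 mm spacing (`SPACING_M`) is a made-up value, so measure your own board. It works out the side-on delay between two mics and the lowest frequency at which a plain sum of the two cancels, which is where that delay equals half a period.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed)
SPACING_M = 0.075       # hypothetical mic spacing, measure your own array


def side_on_delay_s(spacing_m):
    """Time for sound to travel from one mic to the other when it arrives side on."""
    return spacing_m / SPEED_OF_SOUND


def delay_in_samples(spacing_m, sample_rate_hz):
    """The same delay expressed in samples (usually not a whole number)."""
    return side_on_delay_s(spacing_m) * sample_rate_hz


def first_null_hz(spacing_m):
    """Lowest frequency where two summed mics cancel for side-on sound.

    The two copies are exactly out of phase when the delay equals
    half a period, so f = 1 / (2 * delay).
    """
    return 1.0 / (2.0 * side_on_delay_s(spacing_m))


if __name__ == "__main__":
    print(f"delay: {side_on_delay_s(SPACING_M) * 1e6:.0f} us")
    print(f"at 16 kHz: {delay_in_samples(SPACING_M, 16000):.2f} samples")
    print(f"first null: {first_null_hz(SPACING_M):.0f} Hz")
```

The closer the mics, the higher that first null sits, so the spacing decides which part of the voice band gets hit.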

The initial logic is that we can just sum them and 2x mics is twice as much, but that couldn’t be further from the truth, as the sine waves you're adding together are out of phase.
With a broadside, front and back are untouched; it's only the sides that are affected, so as you approach 90° and 270° the effects can be detrimental.
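
A quick way to see the angle dependence is to model two ideal omni mics and a single sine wave. This is only a sketch under those assumptions (343 m/s, a hypothetical 75 mm spacing, no room reflections), and `summed_gain_db` is just a name I picked for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def summed_gain_db(freq_hz, spacing_m, angle_deg):
    """Gain of (mic1 + mic2) / 2 relative to a single mic, in dB.

    angle_deg: 0 = straight in front (broadside), 90 or 270 = side on.
    """
    # Arrival-time difference between the two mics for this direction
    delay = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    phase = 2.0 * math.pi * freq_hz * delay
    magnitude = abs(1.0 + cmath.exp(-1j * phase)) / 2.0
    return 20.0 * math.log10(max(magnitude, 1e-12))  # clamp to avoid log10(0)


if __name__ == "__main__":
    spacing = 0.075  # hypothetical spacing in metres
    for angle in (0, 30, 60, 90):
        print(angle, round(summed_gain_db(2000.0, spacing, angle), 1))
```

Front on, the two signals line up and nothing is lost; as the source moves round to the side the sum drops away, and by how much depends on the frequency.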

An endfire is just a PS3eye style broadside turned side on at 90 degrees, but rather than a plain sum, the signal of the rear mic is inverted, so it's subtracted when summed via a delay. This creates a 2nd order high pass filter of 6 dB per octave and the cardioid pattern of a uni-directional mic, where most of the attenuation is at the back.
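
The delay-and-subtract idea can be sketched the same way. This assumes two ideal omni mics, a rear-mic delay set exactly to the travel time between them, 343 m/s for the speed of sound and a hypothetical 20 mm spacing; `endfire_gain` is an illustrative name, not anything from a library.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def endfire_gain(freq_hz, spacing_m, angle_deg):
    """Linear magnitude of front(t) - rear(t - spacing / c) for a sine wave.

    angle_deg: 0 = sound arriving from the front, 180 = from the back.
    """
    internal_delay = spacing_m / SPEED_OF_SOUND  # delay applied to the rear mic
    # Extra time the wavefront needs to reach the rear mic after the front mic
    acoustic_delay = internal_delay * math.cos(math.radians(angle_deg))
    phase = 2.0 * math.pi * freq_hz * (internal_delay + acoustic_delay)
    return abs(1.0 - cmath.exp(-1j * phase))


if __name__ == "__main__":
    spacing = 0.02  # hypothetical spacing in metres
    front = endfire_gain(500.0, spacing, 0)
    for angle in (0, 90, 135, 180):
        print(angle, round(endfire_gain(500.0, spacing, angle) / front, 2))
```

From the back the two delays cancel, so the subtraction leaves nothing, while side on you get roughly half of what arrives from the front, which is the cardioid shape.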

As said you can use FFmpeg with basic array types but after that it quickly gets akin to rocket science.
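
As a rough idea of what that could look like, here is a Python sketch that only builds the FFmpeg arguments. It relies on FFmpeg's `adelay` filter (an `S` suffix means the delay is in samples) and its `pan` filter, so check both against the docs for your build. The 43 mm spacing, the file names and the guess that channel 1 is the rear mic are all assumptions, and I have not tuned any of it on real hardware.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed


def spacing_delay_samples(spacing_m, sample_rate_hz):
    """Mic spacing expressed as a delay, rounded to whole samples."""
    return round(spacing_m / SPEED_OF_SOUND * sample_rate_hz)


def endfire_filter(spacing_m, sample_rate_hz):
    """-af string: delay channel 1 (assumed rear mic), then subtract it from channel 0."""
    n = spacing_delay_samples(spacing_m, sample_rate_hz)
    return f"adelay=0|{n}S,pan=mono|c0=c0-c1"


def broadside_filter():
    """-af string: plain average of the two channels, no delay."""
    return "pan=mono|c0=0.5*c0+0.5*c1"


def ffmpeg_command(in_wav, out_wav, audio_filter):
    """Argument list for subprocess.run; no shell, so the '|' needs no quoting."""
    return ["ffmpeg", "-i", in_wav, "-af", audio_filter, out_wav]


if __name__ == "__main__":
    # Hypothetical 43 mm spacing recorded at 16 kHz works out to about 2 samples
    print(ffmpeg_command("in.wav", "endfire.wav", endfire_filter(0.043, 16000)))
    print(ffmpeg_command("in.wav", "broadside.wav", broadside_filter()))
```

Because the delay is rounded to whole samples, at 16 kHz you only get steps of about 21 mm of spacing, so a higher sample rate gives finer control.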
But for many, given what you gain, a simple cheap single mic/soundcard might be a better option.

Pretty sure, even before I run tests, that the Boya mic is by far the best. It's built like a tank, it screws onto a mini tripod, and it also makes a good broadcast mic, or you can put on the deadcat and use it as a shotgun mic on a camera.
It's a bit bigger, about the length of a Pi, and at a tad more than £15 it's more than my budget hoped, but because it's a 1/2" camera mount there is a whole manner of tripods and clamps that you can position it with, and your Pi can go somewhere more discreet.
It's the cheapest thing by far that Boya make, and for the price, from what I have tested, it's really good.
But yeah, if your Rhasspy doesn’t work out then it is good for other things on a camera or PC, which is why it's become my choice.