Simple cheap USB Microphone / Soundcard

You see, that is something that has been said before and is slightly paradoxical, as feeding denoised input into a model that was not trained on denoised audio is obviously going to be prone to artefact noise.

If you are creating models you must train with the same tools you record with (i.e. run your dataset through the denoiser), or you force your runtime tools to be as close as possible to the original training audio.
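As a rough illustration of what "run your dataset through denoise" could look like, here is a minimal sketch that batch-processes a folder of WAV files through ffmpeg's arnndn (RNNoise) filter before training. The folder layout and model filename are assumptions for the example, and it presumes an ffmpeg build (4.3 or newer) with the arnndn filter plus an RNNoise model file.

```python
#!/usr/bin/env python3
"""Sketch: pre-denoise a training set with the same tool used at runtime.

Assumptions (not from this thread): ffmpeg >= 4.3 built with the arnndn
filter, an RNNoise model file at MODEL, and WAV input files.
"""
import subprocess
from pathlib import Path

RAW_DIR = Path("dataset/raw")         # original recordings (assumed layout)
CLEAN_DIR = Path("dataset/denoised")  # denoised copies used for training
MODEL = "std.rnnn"                    # RNNoise model file (assumed name)

CLEAN_DIR.mkdir(parents=True, exist_ok=True)

for wav in sorted(RAW_DIR.glob("*.wav")):
    out = CLEAN_DIR / wav.name
    # Run each clip through the same denoiser the live microphone audio
    # will see, so the model trains on the artefacts it meets at runtime.
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(wav),
         "-af", f"arnndn=m={MODEL}",
         "-ar", "16000", "-ac", "1", str(out)],
        check=True,
    )
```

The point is simply that the exact same filter chain sits in front of both the training data and the live microphone.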

rnnoise is old tech, but if the Pi4a does make an appearance in spring it is quite possible to make noise-resilient, denoised KWS models. Really, though, that infrastructure is a crazy model, as with AI, quality and results equal raw power, and because voice commands are inherently sporadic, a single central processor with more horsepower than a Pi could serve a much better multi-user / multi-room role.

I mean this is a Pi4 with rnnoise, but you should hear the results of RTX Voice with a newer GPU, or the Facebook Research PyTorch CUDA-based voice technologies.
The same goes for Tacotron 2 + WaveGlow: it is actually awesome, but on the Pi just forget about it.

Yes, that is true for keyword models, but it would be a massive effort for Kaldi ASR models, as you need a lot more speech plus transcribed text corpora to train those. And the available open source datasets are mostly just not recorded that way.

Yeah, that is why I see distributed models on low-end hardware as probably pointless. But you could.

Yes, but most people will not have that. They will have one of the many available single board computers like the Pi, a RockPro or an Odroid.

Actually no: compared to the availability of x86 machines and GPUs from the GTX 780 up, more people have those.
It is you who has a Pi, and in market share it is actually much less.

I think you have to look at the people who are actually interested in DIY assistants, home automation and so on. Node-RED, for example, just had a big community survey, and a big majority run all their projects like this on single board computers.
The crowd who run x86 systems with beefy graphics cards as a 24/7 server is very small.
And I would bet that if you did a survey here it would be the same result.
I know the same is true for the user base of things like openHAB, where I used to be active, and Home Assistant, where a lot of Rhasspy users come from.
There might be a few people who also have a beefy machine at home, but even those often run their server things on Raspberry Pis.
I really do talk from experience on this. It is the most popular hardware choice; just look at the questions in the forum.

My point is that when it comes to voice AI, the best you can produce is for those who want to say “look mum what I have built”.

Otherwise it is extremely poor in comparison to $30 big-data silicon, and horsepower is the only option if you wish to be private and have something that rivals, maybe even beats, the big guys.

So enjoy it, but for me you are talking toys.

Node-RED is IoT, and that is far more wide-ranging than poor voice AI.

That is your opinion, but doesn’t a project like Rhasspy also have to look at what the user base is actually using and listen to / aim its development at that?
Because otherwise you will lose a huge chunk of people.
And no, there are a few people, including me, who actually use tools like Rhasspy or Voice2json in production / day-to-day life on that hardware you call toys. It just turned off my TV and my girlfriend set a timer for the tea we made.
Then I asked what the weather will be like tomorrow, and all of that is working rather nicely.
My mom wasn’t involved to look at it and say what a good boy I am, unfortunately…

There is no user base apart from you, there is a lack of skills, and there are people like Kibo who think that compared to Snips this sucks and is currently useless.
I haven’t used Mycroft or Rhasspy for a long time because they are so poor.
I keep looking for alternatives and hardware that might make a difference, but I am not going to convey anything other than the truth.

The hobbyist programmers should enjoy what they are doing, but they need to rein in claims of effectiveness.

Wow, you just disenfranchised a lot of people here. FYI, for me it performs on par with Snips, which I also used for a year before it was shut down.
Don’t you think the project would be dead and the forum full of shouting people if I were the only one successfully using it?
You should really leave the sinking ship and jump to the next project.

Look at the post count: you are about 80% of the community, and if the truth disenfranchises, I have no care.
I came to create working open source voice AI, not to make friends.

Open source is about user-driven software, and currently this needs to be driven hard; you seem at times disingenuous, as after never using Snips the only vibe I get is that Rhasspy is no Snips.

It could be worse, as my opinion of Mycroft is that it is utter snake oil :slight_smile:

No, open source is about contribution. It is about all the people who quietly contribute, be it you helping people with audio hardware problems, somebody designing a nice printable case, small pull requests, and so on. There is no “them” to drive it; it is an “us”.
But I guess that is just my point of view.

From Stallman to Eric Raymond, Apache and LibreOffice to the Linux kernel, it is about how contribution can make effective, user-driven, shared-ownership software.
Contributing dross en masse has little use.

PS: An armour case with a 12 V 40 mm fan run at 5 V makes a super easy and low-cost 2.0 GHz Pi4 machine.
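For anyone trying the same, this is roughly the /boot/config.txt overclock that pairing implies; the exact values below are my assumption of common Pi4 settings rather than anything stated in the post, and they require adequate cooling and up-to-date firmware.

```
# /boot/config.txt -- hypothetical Pi 4 overclock, assuming adequate cooling
over_voltage=6    # raise the core voltage so the higher clock stays stable
arm_freq=2000     # 2.0 GHz ARM clock (stock is 1.5 GHz)
```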

We do need to sort out the start of the input chain with voice AI and audio processing.
Garbage in, garbage out, and currently things are not good in common domestic environments.

I know, I’ve been running mine overclocked for half a year now in a passive heatsink case.

I tried passive with the Pi4, but under stress it just throttles.
Even with the fan it may eventually throttle, maybe not, but with constant load it quickly gets up to 65 °C.

I just found that the fan simply stuck on 5 V is practically silent and gives far more load headroom.
Double-sided sticky tape rules :slight_smile:
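If you want to check whether a given case and fan actually hold the clock, a small monitor like the sketch below can be left running while you load the Pi; it only assumes Raspberry Pi OS with vcgencmd on the PATH, which is the default.

```python
#!/usr/bin/env python3
"""Sketch: watch Raspberry Pi temperature, throttling flags and ARM clock
while the board is under load. Assumes Raspberry Pi OS with vcgencmd
available on the PATH (the default)."""
import subprocess
import time

def vcgencmd(*args):
    # vcgencmd prints short key=value strings, e.g. "temp=64.5'C"
    result = subprocess.run(["vcgencmd", *args],
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()

while True:
    temp = vcgencmd("measure_temp")           # SoC temperature
    throttled = vcgencmd("get_throttled")     # bitmask of throttle conditions
    clock = vcgencmd("measure_clock", "arm")  # current ARM clock in Hz
    print(f"{temp}  {throttled}  {clock}")
    time.sleep(5)
```

A get_throttled value of 0x0 means no under-voltage, frequency capping or throttling has occurred since boot; any non-zero value means the firmware has had to intervene at some point.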

I never said Rhasspy is useless.
There has been a long road, with API integration, bug finding/fixing after the 2.5 remake, wake words, etc.

But since the beginning the intent definition and ASR have been better than Snips. Raven has closed the gap to provide a workable open source wake word.

I have had Snips working for two years in the house, used by the whole family, plugged into Jeedom, and everyone here uses it every day for lots of stuff.

Actually Rhasspy is ready, plugged into my production Jeedom or test Jeedom with a switch, and it works better than Snips. BUT the only thing preventing me from getting rid of Snips is this particular problem of infinite listening with background noise/music. There are some nice ideas here to get this improved; a max duration setting in STT listening would also help a lot. And I have no doubt @synesthesiam will soon find solutions :grinning:
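To illustrate the max-duration idea (purely a hypothetical sketch, not an existing Rhasspy or Snips setting): a recording loop that normally stops on silence can also be given a hard time cap, so background music can never keep the STT session open indefinitely. The read_frame/is_silence helpers below are stand-ins for whatever audio source and VAD you use.

```python
# Hypothetical sketch of a hard cap on STT listening time; the names and the
# read_frame()/is_silence() helpers are illustrative, not a Rhasspy API.
import time

MAX_LISTEN_SECONDS = 10.0   # hard cap, even if the VAD never detects silence
SILENCE_SECONDS = 1.0       # normal end-of-utterance condition

def record_command(read_frame, is_silence, frame_seconds=0.03):
    frames = []
    silent_for = 0.0
    started = time.monotonic()
    while True:
        frame = read_frame()            # one chunk of microphone audio
        frames.append(frame)
        silent_for = silent_for + frame_seconds if is_silence(frame) else 0.0
        timed_out = time.monotonic() - started >= MAX_LISTEN_SECONDS
        # Stop on normal end-of-utterance silence, or bail out when
        # background noise/music keeps the VAD open past the hard limit.
        if silent_for >= SILENCE_SECONDS or timed_out:
            return b"".join(frames), timed_out
```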

Like I have always said, Snips was a team of a lot of devs, and Rhasspy is driven by one man and a few helpers. And apart from this listening problem, I still think it is better than Snips.

The world is full of people saying it’s impossible while some are doing it …

One day we will all be able to ditch Snips, and I could never thank @synesthesiam enough for that.


We should ditch Snips, as it has completely derailed the excellent work provided by @synesthesiam and taken Rhasspy the wrong way.

PS: if you are sitting next to a Triangle Antal then maybe just go out and buy something that does work :slight_smile:

If you get your hands on one, really do try the ReSpeaker USB 2.0 mic array. It is a bit overpriced for what it is, but the quality jump over the 2- and 4-mic Pi HATs is very noticeable, and that might be what you need.

Hey, I would like to ask for suggestions about the product below: does anyone know whether it works with the Pi, or whether it is reliable enough to be worth a try…?

That’s just the mic board; the I2S board is another layer.