Better usb sound card + separate microphone and speaker. https://www.amazon.it/gp/product/B01N905VOY/ref=ox_sc_act_title_4?smid=AXZ3JQ1GVFPIF&psc=1
Or a microphone like that? https://www.amazon.it/gp/product/B08TWTBRW3/ref=ox_sc_act_title_2?smid=A25N751GH96RMO&psc=1
This is too unreliable? https://www.amazon.it/gp/product/B0757JT9S7/ref=ox_sc_act_title_1?smid=A2TG03Y1TO9F5Y&psc=1
If you want to use any of the EC (echo cancellation) or NS (noise suppression) utils, nearly all of them expect a synced hardware clock, meaning a single device that has both input and output so they share one hardware clock.
So a USB mic with audio out is fine, but any combination where audio out and audio in sit on separate devices does have some drawbacks.
There are a rake of Boya clones, like the original Boya 3.5mm, such as https://www.ebay.co.uk/itm/274677302882
I quite like my Boya, as when it is not doing voice AI it sits on a little mini tripod as a broadcast mic, but really if you look at the specs they are all very much the same.
It can be a bit of a lottery, as sometimes the rated specs and description can be a bit dubious.
But for £3.50 or less you can get 3.5mm cardioid mics that, as I have one, are very hard to tell apart from it and are so much smaller and handier for voice AI.
With soundcards you either choose on channel count or pick one with good AGC & gain. To be honest, I have a Ugreen and if I remember rightly it did not have hardware AGC, which often gives much better SNR than the software versions.
https://www.ebay.co.uk/itm/173262756574 does have good AGC & gain, much better than many, and again is often around £3.50, or you can buy it from SparkFun, or is it The Pi Hut? You will have to google.
The main thing, as there are no beamforming algs on the Pi, is that the cardioid pattern is effectively a fixed beamforming pattern: it gives you an element of noise rejection through placement and direction, where omnidirectional mics have none, and like EC & NS that can be a big consideration.
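To put rough numbers on that "set beamforming pattern", here is a minimal Python sketch of the textbook first-order cardioid response. It assumes an ideal cardioid; real £3.50 capsules will deviate from it, and the function names are made up for illustration.

```python
import math


def cardioid_gain(angle_deg: float) -> float:
    """Ideal first-order cardioid sensitivity.

    1.0 for sound arriving on-axis (0 degrees), 0.0 for sound arriving
    from directly behind (180 degrees). Real capsules only approximate this.
    """
    return 0.5 * (1.0 + math.cos(math.radians(angle_deg)))


def gain_db(gain: float) -> float:
    """Convert a linear amplitude gain to dB (-inf for zero gain)."""
    return 20.0 * math.log10(gain) if gain > 0 else float("-inf")


if __name__ == "__main__":
    # Print the relative sensitivity for a few arrival angles.
    for angle in (0, 90, 135, 180):
        g = cardioid_gain(angle)
        print(f"{angle:3d} deg  gain={g:.3f}  ({gain_db(g):.1f} dB)")
```

The takeaway is that rejection depends strongly on pointing the rear of the mic at the noise source, which is why placement matters so much with these mics.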
Also with high-end mics you need to be careful that there is Linux software, as with some like the Blue Yeti the software with all the goodies is only on M$ & Mac$.
You can usually find on Amazon what is on eBay, I just already had those links.
The CM108 modules are not bad and you definitely know what you're getting as they are not in a case. The white USB adapters sometimes have a worthless Intel chipset in them, when what you want, and what for the cost is really excellent, shows up in lsusb as 1b3f:2008 Generalplus Technology Inc., not Intel.
Quite handy to have the USB one with a flex, and it shows the size difference between the Boya type and a simple 3.5mm cardioid mic with the foam and dead cat removed.
If you go the DIY route then a mic module with an electret will give much better AGC and SNR than any I have tested.
With the MAX9814 it is prob the onboard regulator, separate from the Pi 5V, that helps so much, and the elevated signal level means you can feed the soundcard with its gain turned down low, which gives great AGC with much better SNR and better far-field.
https://www.ebay.co.uk/itm/152293733901 You need to source an electret and do a bit of cable work, but it works extremely well. The electret, with a rubber grommet or whatever, is very small and quite easy to mount in a more professional finished manner, as being round it just needs a drilled hole. It is more of a maker project than just plug in, but you can get extremely good far-field.
Stereo ADC with AGC https://www.scan.co.uk/products/enermax-ap001e-dreambass-usb-soundcard-plus-earphones-genie-with-integrated-80-hz-plus6-db-bass-boos
The AGC isn’t as good as on the cheaper white ones, but it is the only stereo ADC card with AGC that I know of.
Thanks for the very exhaustive explanation! I chose this: https://www.amazon.it/gp/product/B087V62BYH/ref=ox_sc_act_title_2?smid=A2YV8RMOY2PTG4&psc=1 And this microphone: https://www.amazon.it/gp/product/B07NSBBL2S/ref=ox_sc_act_title_1?smid=AZMZUWUKWXIPR&psc=1 OK to begin with?
Yeah alsamixer
F6 to select device then F5 to show capture PCMs.
Turn up the gain to full and drop one or 2 steps.
The toggles such as AGC strangely often can not be set by alsamixer
amixer -cX controls
to list the controls
amixer -cX cset numid=Y 1
X = card number, Y = numid of the AGC control
alsactl store X
to make it persistent (you may have to use sudo to overwrite)
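If you would rather script the numid lookup than read it by eye, something like this Python sketch could work. It only parses text and builds the command, it does not run amixer. The sample control list is hypothetical (real names and numids differ per card) and the helper names are made up.

```python
import re

# Hypothetical sample of `amixer -cX controls` output. Real control names and
# numids vary from card to card, so treat these lines as illustration only.
SAMPLE_CONTROLS = """\
numid=3,iface=MIXER,name='Mic Playback Switch'
numid=8,iface=MIXER,name='Mic Capture Volume'
numid=9,iface=MIXER,name='Auto Gain Control'
"""

CONTROL_LINE = re.compile(r"numid=(\d+),iface=\w+,name='([^']*)'")


def find_numid(controls_text, name_fragment):
    """Return the numid of the first control whose name contains name_fragment."""
    for line in controls_text.splitlines():
        match = CONTROL_LINE.match(line.strip())
        if match and name_fragment.lower() in match.group(2).lower():
            return int(match.group(1))
    return None


def cset_command(card, numid, value):
    """Build the argv list for `amixer -cX cset numid=Y value` (not executed here)."""
    return ["amixer", f"-c{card}", "cset", f"numid={numid}", str(value)]


if __name__ == "__main__":
    numid = find_numid(SAMPLE_CONTROLS, "auto gain")
    print(" ".join(cset_command(1, numid, 1)))
```

On a real system you would feed it the actual output of the controls listing and pass the resulting list to subprocess.run.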
Don’t worry about the high volume and the level of noise, as it is perfectly OK for recognition and your voice signal will be using the full scale up to near 0dB.
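A quick way to check how close a recording gets to 0dB full scale is to look at its peak level. A minimal Python sketch, assuming signed 16-bit samples (what arecord -f cd produces); the function name is made up for illustration.

```python
import math

FULL_SCALE_16BIT = 32768  # magnitude of the most negative signed 16-bit sample


def peak_dbfs(samples):
    """Peak level of signed 16-bit samples in dB relative to full scale.

    0.0 means the loudest sample hits full scale, more negative means quieter.
    Returns -inf for silence or an empty list.
    """
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20.0 * math.log10(peak / FULL_SCALE_16BIT)


if __name__ == "__main__":
    # A made-up handful of samples whose peak is at half of full scale.
    print(round(peak_dbfs([0, 1200, -16384, 900]), 2))
```

If speech at your usual distance peaks within a few dB of 0, the gain setting is about where it should be.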
The mic stand screws onto any 1/4" camera tripod / adapter… (I think it's 1/4", the standard camera mount anyway)
Looks identical to the Boya
Thanks, I’m starting to understand how it works. At the moment I use the integrated microphone of a ThinkPad X230: the response is OK and processing is OK, even at a distance and with noise. The goal is to use Rhasspy with a Qplus (Allwinner, 64-bit) TV box converted to an Armbian distro. On the ThinkPad the AGC controls are not present, so thanks for giving me a method to control this. I am waiting for the new hardware for further tests. Sorry for my terrible English.
English is better than mine.
The cardioid electrets just have holes in the back that let sound pressure from the rear cancel that at the front.
Shame we don’t have adaptive beamforming for this, as you will prob have to play with placement. Putting the mic in front of the noise source such as the TV, facing the person speaking, will only provide a small amount of attenuation but can make a big difference in recognition.
The AEC algs we have need to be the ones playing the media so they can subtract what is played from the mic signal, but often what is playing comes from another source.
I think the Qplus is approx similar to the Pi 3 with its 4x A53 cores? So the load can quickly add up.
Arrived!!! The microphone is very clean. I start with the first tests then I’ll let you know.
Yeah the mic should be OK as they all seem very similar. I did get another white USB adapter and yeah, it just seems a lottery what is inside, as I got another meh and useless Intel one. (Terrible mic volume & AGC)
I use a Sony PlayStation PS3 Eye and it works very well.
It has been tested with Snips and is very good:
And you can get it cheap (used or new) on Amazon:
Rhasspy installed on the Qplus with Armbian 64. With Rhasspy the CM108 works perfectly and is immediately recognized. AGC also works with alsamixer. With the TV on and at a distance of more than 5 meters in a large room, it correctly recognizes the wake word and commands. Perfect! The Qplus is a multimedia player costing 40 euros, with power supply included, a nice case and multicolor LEDs, a 40 GB disk, a 4x 2 GHz CPU and 4 GB RAM. With HA and Rhasspy + graphical environment + MQTT server + zigbee2mqtt, it uses 1.4 GB RAM and 10% processor. Not bad! I use precise + kaldi + fsticuffs + larynx. The CM108 only works with arecord.
The Eye was among my choices. But the Qplus does not have a microphone input, so I was forced to buy an external USB audio adapter to also have an output.
The above Snips article is a hilarious collection of audio engineering with zero science behind it.
It collates a collection of microphones and tests, and the results it gets basically prove that the article is far from factual.
Audio and sine waves just don’t work like that and any audio engineer will be screaming, especially as on a linear array such as the PS3 Eye the mics will be out of phase by the time sound takes to travel the distance between the mics.
I have forgotten whether a broadside array at 90° provides an n-mic or an (n-1)-mic order high-pass filter, but it is one of them.
I guess the DSP does phase and latency matching across all frequency bands, but to compare totally different microphone hardware without even the basic knowledge that maybe this should be mentioned is a complete failure, and absolutely brilliant.
I say brilliant as it was absolute brilliance that Snips managed to sell out to Sonos. To be fair, Snips had moved on from this early article and the respeaker libresoftware that included simple beamforming was later used, but the above article has much comedic value.
Arrays have nothing at all to do with far-field, absolutely nothing, nada, zilch, zero, as it is simple: sound is attenuated with distance, so amplify it, and to cope with different distances use an AGC.
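To put a number on "attenuated via distance", here is a small Python sketch of the free-field inverse-distance rule. That rule is an idealisation: in a real room reflections mean the level usually falls off less than this, so treat the figures as an upper bound on the gain an AGC has to make up. The function name is made up.

```python
import math


def level_drop_db(near_m, far_m):
    """Drop in sound pressure level (dB) when moving from near_m to far_m.

    Assumes a point source in free field, where pressure falls as 1/distance,
    i.e. roughly 6 dB for every doubling of distance.
    """
    if near_m <= 0 or far_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(far_m / near_m)


if __name__ == "__main__":
    # Compare a desk-distance voice with the same voice across a large room.
    for far in (1.0, 2.0, 5.0):
        print(f"0.5 m -> {far} m: {level_drop_db(0.5, far):.1f} dB quieter")
```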
I never did work out if it was just embarrassingly unfactual or deliberate snake oil, but hey, good on them for making a $ or 2 out of Sonos.
Success rate of hotword detection from a fixed distance (1.5m) and varying tilt, in a silent room.
Says it all to me as boy yes definitely there is a large amount of tilt included!
I have an eye. How did you get it to work?
I followed the instructions in this article on Raspbian Buster on a Pi 4:
alsamixer does not work for setting the capture input volume, but amixer works.
Here is my config:
Hope this will help you.
sudo vi /etc/udev/rules.d/70-alsa-permanent.rules
SUBSYSTEM!="sound", GOTO="my_usb_audio_end"
ACTION!="add", GOTO="my_usb_audio_end"
ATTRS{idVendor}=="1415", ATTRS{idProduct}=="2000", ATTR{id}="VOICE"
LABEL="my_usb_audio_end"
sudo vi /etc/asound.conf
pcm.array {
    type hw
    card VOICE
}
pcm.array_gain {
    type softvol
    slave {
        pcm "array"
    }
    control {
        name "Mic Gain"
        count 2
        card 0
    }
    min_dB -40.0
    max_dB 10.0
    resolution 80
}
pcm.cap1 {
    type plug
    slave {
        pcm "array_gain"
        channels 4
    }
    route_policy sum
}
sudo systemctl restart alsa-*
arecord -f cd -D'cap1' > a.wav
aplay -D'hw:CARD=seeed2micvoicec,DEV=0' a.wav
amixer
pi@rasp4-1:~ $ amixer scontrols
Simple mixer control 'Headphone',0
Simple mixer control 'Mic Gain',0
pi@rasp4-1:~ $ amixer scontents
Simple mixer control 'Headphone',0
  Capabilities: pvolume pvolume-joined pswitch pswitch-joined
  Playback channels: Mono
  Limits: Playback -10239 - 400
  Mono: Playback -2000 [77%] [-20.00dB] [on]
Simple mixer control 'Mic Gain',0
  Capabilities: volume
  Playback channels: Front Left - Front Right
  Capture channels: Front Left - Front Right
  Limits: 0 - 79
  Front Left: 63 [80%]
  Front Right: 63 [80%]
pi@rasp4-1:~ $ amixer get 'Mic Gain'
Simple mixer control 'Mic Gain',0
  Capabilities: volume
  Playback channels: Front Left - Front Right
  Capture channels: Front Left - Front Right
  Limits: 0 - 79
  Front Left: 63 [80%]
  Front Right: 63 [80%]
pi@rasp4-1:~ $ amixer set 'Mic Gain' 90%
Simple mixer control 'Mic Gain',0
  Capabilities: volume
  Playback channels: Front Left - Front Right
  Capture channels: Front Left - Front Right
  Limits: 0 - 79
  Front Left: 72 [91%]
  Front Right: 72 [91%]
pi@rasp4-1:~ $ amixer get 'Mic Gain'
Simple mixer control 'Mic Gain',0
  Capabilities: volume
  Playback channels: Front Left - Front Right
  Capture channels: Front Left - Front Right
  Limits: 0 - 79
  Front Left: 72 [91%]
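For anyone wondering what the 63 and 72 in that output mean in terms of gain, here is a rough Python sketch. The percentages follow directly from the 0 - 79 limits. The dB figures rest on an assumption that softvol spaces its steps evenly in dB between min_dB and max_dB from the config above, so treat them as an estimate rather than what ALSA is guaranteed to apply.

```python
# Values copied from the softvol block in the config above.
MIN_DB = -40.0
MAX_DB = 10.0
RESOLUTION = 80  # gives control steps 0..79, matching "Limits: 0 - 79"


def step_to_percent(step):
    """Percentage of the control range, rounded to match the bracketed values amixer prints."""
    return round(100.0 * step / (RESOLUTION - 1))


def step_to_db(step):
    """Estimated gain in dB for a control step.

    Assumption: steps are evenly spaced in dB from MIN_DB (step 0) to
    MAX_DB (top step). Not verified against the softvol plugin on this setup.
    """
    return MIN_DB + step * (MAX_DB - MIN_DB) / (RESOLUTION - 1)


if __name__ == "__main__":
    for step in (63, 72, 79):
        print(f"step {step}: {step_to_percent(step)}%  approx {step_to_db(step):+.2f} dB")
```

Under that assumption step 63 sits close to unity gain and step 72 adds a few dB of software boost on top of whatever the hardware delivers.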
Alsamixer will also work, or should, but you have to use the device once so that it becomes visible.
The PS3 Eye does work but the point is that it is a whole load of pointless, and simple summing of arrays creates filter effects due to the mics being out of phase.
The clever bit of the PS3 Eye was the DSP in the PS3, and without it it is no better than any single mic; summed off-centre sound will not be as clear as from a single simple mic.
It works without the above ALSA hassle on just a single channel, and probably better that way as you are not summing out-of-phase signals… without DSP…
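The filter effect from plain summing can be sketched in a few lines of Python. This models an idealised plane wave hitting identical mics on a line and simply averages them, which is roughly what a sum route policy does; it is not what the PS3 DSP did. The PS3 Eye positions are the ones from the mic_geometry value quoted later in this thread, 343 m/s is an approximate speed of sound, and the function names are made up.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature

# Mic x-positions in metres, taken from the PS3 Eye mic_geometry in this thread.
PS3_EYE_X = [-0.03, -0.01, 0.01, 0.03]


def summed_gain(freq_hz, angle_deg, positions_m):
    """Magnitude of the plain average of all mics for a plane wave.

    angle_deg is measured from straight in front of the array (0 = broadside).
    1.0 means no loss, 0.0 means the mics cancel each other completely.
    """
    sin_a = math.sin(math.radians(angle_deg))
    total = sum(
        cmath.exp(-2j * math.pi * freq_hz * x * sin_a / SPEED_OF_SOUND)
        for x in positions_m
    )
    return abs(total) / len(positions_m)


def first_null_hz(spacing_m, angle_deg):
    """Lowest frequency that two summed mics cancel for the given arrival angle."""
    delay_s = spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    if delay_s == 0:
        return float("inf")  # no delay between mics, so no cancellation
    return 1.0 / (2.0 * abs(delay_s))


if __name__ == "__main__":
    for angle in (0, 30, 60, 90):
        gains = [round(summed_gain(f, angle, PS3_EYE_X), 2) for f in (1000, 3000, 6000)]
        print(f"{angle:2d} deg: gain at 1/3/6 kHz = {gains}")
```

Straight in front nothing is lost, but as the talker moves off centre the upper speech frequencies start to dip, which is the "not as clear as a single simple mic" effect.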
Thanks for this information, appreciated.
OK I c. Thanks for this insight.
I had a look at PulseAudio beamforming again, as it has been a long time since I ruled it out on the grounds of what is the point of a non-steerable beamformer.
pactl load-module module-echo-cancel use_master_format=1 aec_method='webrtc' aec_args='"beamforming=1 mic_geometry=-0.029,0,0,0.029,0,0"'
Apart from audio artefacts it seems to work quite well with 2x unidirectional mics; the above is for the 58mm spacing of a respeaker 2mic. For the PS3 Eye: mic_geometry=-0.03,0,0,-0.01,0,0,0.01,0,0,0.03,0,0
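Those geometry strings are just x,y,z triplets in metres, one per mic, so for other spacings they can be generated rather than typed. A minimal Python sketch, assuming all mics sit on the x axis and are centred on zero as in the PS3 Eye value; both helper names are made up.

```python
def centred_line(n_mics, spacing_m):
    """x-positions (metres) of n_mics equally spaced mics centred on zero."""
    half = (n_mics - 1) / 2.0
    return [round((i - half) * spacing_m, 6) for i in range(n_mics)]


def mic_geometry(x_positions_m):
    """Format positions as the comma-separated x,y,z list used by mic_geometry.

    y and z are fixed at 0, i.e. every mic is assumed to lie on the x axis.
    """
    coords = []
    for x in x_positions_m:
        coords.extend([x, 0.0, 0.0])
    return ",".join(format(c, "g") for c in coords)


if __name__ == "__main__":
    print("mic_geometry=" + mic_geometry(centred_line(4, 0.02)))   # PS3 Eye style, 20mm pitch
    print("mic_geometry=" + mic_geometry(centred_line(2, 0.058)))  # two mics 58mm apart
```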
The respeaker 2mic quickly fills the journal with complaints but continues to work. Remember to have your board vertical, as the PCB and the direction of the ports will increase rear rejection. The complaints seem to be a respeaker thing, as everything else seems to work without complaint.
You could sort of work backwards and use the alsa pulse plugin to expose it to alsa.
https://wiki.archlinux.org/title/PulseAudio#Expose_PulseAudio_sources,_sinks_and_mixers_to_ALSA
It will give you far more control, and even if the AEC just fails above a relatively low threshold, the digital AGC is quite good.
https://wiki.archlinux.org/title/PulseAudio#Microphone_echo/noise_cancellation