Yeah, without echo cancellation Rhasspy does a Forrest Gump on you: it just keeps running while you scream “Stop!”
On Raspbian, libspeexdsp is an older version than alsa-plugins requires to compile.
You get libspeex, but the speex DSP plugin with echo cancellation doesn’t compile.
The speexdsp alsa-plugins are missing on Raspbian, so you can either jump to Arch Linux ARM or
compile http://downloads.us.xiph.org/releases/speex/speexdsp-1.2.0.tar.gz yourself.
Grab alsa-plugins 1.8.1 (from memory, on Raspbian), recompile, and you will get the ALSA speexdsp plugin.
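From memory, the build goes something like this (a sketch only, assuming a stock Raspbian with build tools; the exact alsa-plugins tarball and install paths may differ on your image, so treat every path and version here as unverified):

```shell
# Sketch only: versions and paths from memory, verify against your image.
sudo apt-get install -y build-essential libasound2-dev

# 1. Build and install a new enough libspeexdsp
wget http://downloads.us.xiph.org/releases/speex/speexdsp-1.2.0.tar.gz
tar xf speexdsp-1.2.0.tar.gz
(cd speexdsp-1.2.0 && ./configure && make && sudo make install)

# 2. Rebuild alsa-plugins against it so the speex pcm plugin gets built
#    (grab the matching tarball from alsa-project.org first)
tar xf alsa-plugins-*.tar.bz2
(cd alsa-plugins-*/ && ./configure && make && sudo make install)

# 3. Check the plugin landed where libasound looks for it
#    (the directory varies between /usr/lib and /usr/local/lib)
find /usr -name 'libasound_module_pcm_speex.so' 2>/dev/null
```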
Then you can do things like the following, defined in asound.conf:
pcm.!default {
    type asym
    playback.pcm "plughw:CARD=ALSA,DEV=0"
    capture.pcm "cap"
}

pcm.array {
    type hw
    card 1
}

pcm.cap {
    type plug
    slave {
        pcm "array"
        channels 4
    }
    route_policy sum
}

pcm.echo {
    type speex
    slave.pcm "cap"
    echo yes
    frames 256
    filter_length 1600
    denoise false
}

pcm.agc {
    type speex
    slave.pcm "echo"
    agc 1
    denoise yes
    dereverb yes
}
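For context on the two magic numbers in pcm.echo, here is my reading of what they mean at the speex plugin’s 16 kHz processing rate (the rate is my assumption, not stated in the config):

```python
# What the numbers in pcm.echo above work out to, assuming the
# speex plugin processes at 16 kHz.
rate = 16000          # Hz (assumed processing rate)
frames = 256          # processing block size from pcm.echo
filter_length = 1600  # echo tail the canceller can model, in samples

block_ms = frames / rate * 1000         # ms per processing block
tail_ms = filter_length / rate * 1000   # echo tail the AEC covers, in ms

print(block_ms, tail_ms)  # 16.0 100.0
```

So that config only models 100 ms of echo tail; a livelier room would need a longer filter_length.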
But that still makes one big mistake: if you read the speex manual, clock drift across separate cards will stop speexdsp from working. You need an all-in-one sound card that does both playback and capture off the same clock, with no async, like I have.
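To put a rough number on the drift problem (illustrative figures only; the 100 ppm mismatch is a guess, not a measurement of any real card pair):

```python
# Back-of-the-envelope clock drift between two independent sound cards.
rate = 16000   # Hz, assumed speex processing rate
ppm = 100      # assumed relative clock error between the two crystals
frames = 256   # processing block size from the config above

drift_per_s = rate * ppm / 1e6          # samples of slip per second
secs_per_block = frames / drift_per_s   # seconds until slip equals one block

print(drift_per_s, secs_per_block)  # 1.6 160.0
```

The adaptive filter loses alignment long before a full block of slip accumulates, which is why separate playback and capture cards kill speexdsp.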
It’s BS that you can not do software EC; quite likely the software we have available just isn’t implemented correctly, because software EC has been done.
How well it works and how much load it produces is another matter, but yeah, for one reason or another, especially on the Pi, the software we have doesn’t seem to work.
PulseAudio with WebRTC AEC and speex DSP seems to have added drift compensation, but I am not sure it works on embedded ARM.
They seem to have fudged some things due to ‘incorrect latency’ reporting: https://gitlab.freedesktop.org/pulseaudio/webrtc-audio-processing/-/blob/master/webrtc/modules/audio_processing/echo_cancellation_impl.cc#L63
I am not sure the Pi fits any of the latency profiles they have provided.
// Measured delays [ms]
// Device                Chrome  GTP
// MacBook Air           10
// MacBook Retina        10      100
// MacPro                30?
//
// Win7 Desktop          70      80?
// Win7 T430s            110
// Win8 T420s            70
//
// Daisy                 50
// Pixel (w/ preproc?)           240
// Pixel (w/o preproc?)  110     110
// The extended filter mode gives us the flexibility to ignore the system's
// reported delays. We do this for platforms which we believe provide results
// which are incompatible with the AEC's expectations. Based on measurements
// (some provided above) we set a conservative (i.e. lower than measured)
// fixed delay.
//
// WEBRTC_UNTRUSTED_DELAY will only have an impact when |extended_filter_mode|
// is enabled. See the note along with |DelayCorrection| in
// echo_cancellation_impl.h for more details on the mode.
//
// Justification:
// Chromium/Mac: Here, the true latency is so low (~10-20 ms), that it plays
// havoc with the AEC's buffering. To avoid this, we set a fixed delay of 20 ms
// and then compensate by rewinding by 10 ms (in wideband) through
// kDelayDiffOffsetSamples. This trick does not seem to work for larger rewind
// values, but fortunately this is sufficient.
//
// Chromium/Linux(ChromeOS): The values we get on this platform don't correspond
// well to reality. The variance doesn't match the AEC's buffer changes, and the
// bulk values tend to be too low. However, the range across different hardware
// appears to be too large to choose a single value.
//
// GTP/Linux(ChromeOS): TBD, but for the moment we will trust the values.
#if defined(WEBRTC_CHROMIUM_BUILD) && defined(WEBRTC_MAC)
#define WEBRTC_UNTRUSTED_DELAY
#endif
PulseAudio is a bit of a stinker to set up in a Docker container anyway, and the advantages of Docker outweigh PulseAudio use, but once again the software might not work on embedded ARM.
The $75 mic array alone is near as dammit the cost of a full-size Amazon or Google complete unit, which makes Pi-based versions extremely expensive as equivalent private AIs.
Software EC should work, as it has and does elsewhere, but you’re right: the software on Raspbian doesn’t seem to.
SoCs like the Rockchip RK3308 and Allwinner R328 might surprisingly come to the Pi’s aid and act as satellite mics/speakers for a centralised Pi, as with embedded DSP/codec they are extremely low cost.
Software should work, but unfortunately for many the cost of a Pi system with hardware EC doesn’t justify the benefit.
I quite like the idea of a centralised Pi with satellites, as the far-field claims of the array mics don’t hold up against a single prominent noise source.
In industrial environments of loud, dispersed noise where the speaker is the predominant signal, they work well.
Put a TV or HiFi between you and the AI as the predominant source, though, and beamforming wonders or not, good luck with recognition.
Multiple cheap satellites make sense and also let you create multi-channel audio, but if the Allwinner R329 with built-in NLP is as low cost as the R328, we might still be on Debian while the Raspberry connection no longer makes cost sense.
Software can work for EC, because the criteria for EC don’t demand crystal-clear recording quality; it just needs to be adequate to cork/duck playing media.
But that probably means someone will have to fix or hack what we have into something fit for purpose, as yeah, it doesn’t seem to work with the Pi.
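One way to sanity-check “adequate”, if anyone wants to measure it: compute the echo return loss enhancement (ERLE) between the raw mic signal and the canceller’s output. A minimal sketch on a synthetic signal (erle_db is my own throwaway helper, not part of any library):

```python
import math

def erle_db(mic, residual):
    """ERLE: how much echo power the canceller removed, in dB."""
    p_mic = sum(x * x for x in mic) / len(mic)
    p_res = sum(x * x for x in residual) / len(residual)
    return 10 * math.log10(p_mic / p_res)

# Synthetic example: pretend the canceller knocks the echo down
# to 10% of its amplitude, i.e. 1% of its power.
echo = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
residual = [0.1 * x for x in echo]

print(round(erle_db(echo, residual)))  # 20
```

Even around 20 dB of suppression is usually enough for wake-word barge-in; the residual doesn’t need to sound clean to a human.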
I am still exploring some cheaper options: the ReSpeaker 2-Mic and the Linear 4-Mic, which does have hardware loopback to use with alsa speex.
Still waiting on deliveries due to the current situation, so I just don’t know.