Speaker Cancellation

duch · December 27, 2019, 5:34pm

Hi all,

Rhasspy + Buster + Respeaker 2 + speaker connected to Respeaker 2.

I’d like to play some audio through the speaker, for example a webradio MP3 stream. My problem is that as soon as anything comes out of the speaker, the Respeaker is completely deaf to voice commands.
It only works if I set playback to 20ish in alsamixer, but of course that is not loud enough to listen to the radio.

This seems completely logical, since the sound coming out of the speaker is much closer to the mic than my voice.

Is there a way to cancel the stream going to speaker in ALSA?
To schematize, something like capture = capture - playback

I know Pulseaudio has some noise cancellation features, but I’m not aware of this kind of cancellation, and Pulseaudio gives me headaches.

fastjack · December 27, 2019, 7:03pm

For this you’ll need Acoustic Echo Cancellation (AEC).
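
To illustrate why a plain "capture - playback" is not enough: the mic hears the playback delayed and coloured by the speaker and the room, so an echo canceller has to estimate that path adaptively and subtract the estimate. Below is a minimal sketch in Python using NLMS, a textbook adaptive filter. The short echo path, the white-noise playback and every name and number are made up for illustration; this is not how the XMOS chip, Pulseaudio or ec implement it.

```python
import random

def nlms_echo_cancel(playback, mic, taps=16, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of `playback` from `mic` (NLMS)."""
    weights = [0.0] * taps      # current estimate of the echo path
    history = [0.0] * taps      # most recent playback samples, newest first
    residual = []
    for x, d in zip(playback, mic):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - echo_estimate   # what is left once the estimated echo is removed
        norm = sum(h * h for h in history) + eps
        step = mu * e / norm
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(e)
    return residual

def energy(samples):
    return sum(s * s for s in samples)

random.seed(0)
playback = [random.uniform(-1.0, 1.0) for _ in range(4000)]

# Hypothetical echo path: the mic hears the speaker 2 samples late,
# followed by two weaker reflections.
echo_path = [0.0, 0.0, 0.6, 0.3, 0.1]
mic = [
    sum(g * playback[n - k] for k, g in enumerate(echo_path) if n - k >= 0)
    for n in range(len(playback))
]

residual = nlms_echo_cancel(playback, mic)
reduction = energy(residual[-1000:]) / energy(mic[-1000:])
print(f"echo energy left after adaptation: {reduction:.2e} of the original")
```

This toy case has no voice in the mic signal. A real canceller also has to avoid adapting while you are talking over the music, which is a large part of why the software options are harder to get right.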

The Respeaker Mic Array v2 does this natively on its XMOS chip, though the playback audio quality is limited to 16 kHz.

Pulseaudio provides an AEC module, but it requires more CPU resources and I was not able to get good results on a Raspberry Pi (it might be possible, though).

Otherwise there is software from the guys at Seeed Studio that works with a custom ALSA pipe device they provide, but I was not able to get it to work as well as the Respeaker built-in AEC.

Hope this helps.

duch · December 27, 2019, 7:33pm

one word : amazing

i’ll try the ALSA plugin, thx

fastjack · December 27, 2019, 7:48pm

If you manage to get good results I’ll be interested

With the Respeaker Mic Array V2 I’m able to detect the wakeword from across the room (5m+) with music playing pretty loud (the Respeaker/raspberry/speakers case design is paramount).

When the wakeword is detected you’ll have to lower/pause the music playback to get good ASR results.
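
A rough sketch of that ducking step in Python, assuming the volume is controlled through `amixer` and a mixer control named `Playback`. The control name, the two volume levels and the `ducked` helper are assumptions for illustration; how you get notified of the wakeword depends on your Rhasspy setup and is left out.

```python
import subprocess
from contextlib import contextmanager

def amixer_set_cmd(control, percent):
    """Build the amixer command that sets `control` to `percent` percent."""
    return ["amixer", "-q", "sset", control, f"{percent}%"]

@contextmanager
def ducked(control="Playback", low=20, normal=80, run=subprocess.run):
    """Lower the volume while the block runs, then restore it."""
    run(amixer_set_cmd(control, low), check=False)
    try:
        yield
    finally:
        # Restore the volume even if the ASR step raised an error.
        run(amixer_set_cmd(control, normal), check=False)

# Dry run: print the commands instead of touching the sound card.
def show(cmd, check=False):
    print(" ".join(cmd))

with ducked(run=show):
    print("(capture and recognise the voice command here)")
```

Dropping the `run=show` argument makes it call `amixer` for real.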

duch · December 27, 2019, 7:53pm

I think I’m far from getting it right. I’ve installed ec and the ALSA plugins, but:

in one terminal i launch
./ec -i 'plughw:CARD=seeed2micvoicec,DEV=0' -o 'plughw:CARD=seeed2micvoicec,DEV=0' -d 200

and in another one i launch
arecord -q -r 16000 -f S16_LE -c 1 -t raw -D 'fifo' test.wav

I stop it, but at playback I need to double the rate to get normal sound speed. Rhasspy won’t like that:
aplay -q -r 32000 -f S16 -c 1 -t raw test.wav
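
One possible explanation for the doubled rate, assuming ec writes 2-channel frames to /tmp/ec.output (the ec README example quoted later in this thread names its capture file 16k_s16le_stereo_audio.raw) and the fifo PCM passes them through unchanged: interleaved stereo read as mono contains twice as many "mono" samples per second, so it only sounds right at 32000 Hz. The arithmetic, as a small Python sketch (the function name is made up):

```python
def apparent_mono_rate(true_rate_hz, actual_channels, assumed_channels=1):
    """Rate at which interleaved data plays at normal speed when the channel count is misread."""
    return true_rate_hz * actual_channels // assumed_channels

print(apparent_mono_rate(16000, actual_channels=2))  # 32000
```

If that is the cause, recording with -c 2 (or declaring the channel count in the fifo PCM) should keep the rate at 16000.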

Furthermore, I cannot play anything through mplayer once ./ec is running, since ec opens the hardware device plughw:CARD=seeed2micvoicec,DEV=0 directly.

feeling dumb…

fastjack · December 27, 2019, 8:15pm

ALSA can be quite frustrating sometimes…

I think you need to use a pipe device for playback as well.

From the README of ec

# terminal #1, run ec
./ec -h
./ec -i plughw:1 -o plughw:1 -s

# terminal #2, play 16k fs, 16 bits, 1 channel raw audio
cat 16k_s16le_mono_audio.raw > /tmp/ec.input

# terminal #3, record
cat /tmp/ec.output > 16k_s16le_stereo_audio.raw

You may have to create a dmix device in front of the pipe device to allow multiple processes to access the same device.

duch · December 27, 2019, 8:37pm

here is my asound.conf

pcm.!default {
    type asym
    playback.pcm "playback"
    capture.pcm "capture"
}

pcm.playback {
    type plug
    slave.pcm "dmixed"
}

pcm.capture {
    type plug
    slave.pcm "array"
}

pcm.dmixed {
    type dmix
    slave.pcm "hw:seeed2micvoicec"
    ipc_key 555555
}

pcm.array {
    type dsnoop
    slave {
        pcm "hw:seeed2micvoicec"
        channels 2
    }
    ipc_key 666666
}

pcm.fifo {
    type fifo
    file "/tmp/ec.input"
    infile "/tmp/ec.output"
    rate 16000
    format S16_LE
}

i tried to add a couple of PCMs such as

pcm.mplayer {
    type plug
    slave.pcm "dmixed"
}

or

pcm.mplayer {
    type dmix
    slave.pcm "hw:seeed2micvoicec"
    ipc_key 444444
}

but i must admit : i don’t understand what i’m doing

if anyone has a working asound.conf…

fastjack · December 27, 2019, 9:02pm

Have you tried this?

https://github.com/voice-engine/ec/blob/master/asound.conf


pcm.!default {
    type asym
    playback.pcm "eci"
    capture.pcm "eco"
}


pcm.eci {
    type plug
    slave {
        format S16_LE
        rate 16000
        channels 1
        pcm {
            type file
            slave.pcm null
            file "/tmp/ec.input"
            format "raw"
        }

(This file has been truncated.)

I had trouble getting this to work for multiple playbacks at the same time (voice, music, etc.) since dmix devices only accept a “hardware” device as slave, and the pipe is not considered “hardware” by ALSA. If you only want to play audio from a single mplayer then it might work. Otherwise you need a software mixer like Pulseaudio.

Using Pulseaudio was easier to set up, but prepare for high CPU usage (with an RPi 4 it might be OK though). I was not able to get convincing results using Pulseaudio, but I might have configured it incorrectly at the time.

That is why I chose the Respeaker Mic Array v2. It works flawlessly and does not consume CPU resources. It’s expensive though.

duch · December 27, 2019, 9:33pm

Respeaker Mic Array v2 has built-in AEC? No need for extra software or ALSA config?
If yes, I’ll go for that straight away, thx

EDIT: just out of curiosity, why not choose the ReSpeaker Core v2.0 directly?

fastjack · December 27, 2019, 9:56pm

The Respeaker Core does not have AEC built in.

fishertimj · December 28, 2019, 1:51pm

Curious: what’s the difference between the Respeaker Mic Array v2 and the ReSpeaker 6-Mic Circular Array kit for Raspberry Pi?

rolyan_trauts · April 8, 2020, 11:32am

I actually got it running, but whether it works is another question.

Anyway, I will share it, as maybe you can tell me if it makes a difference. The load isn’t bad, though.
My main problem was that when I stopped recording, ec would stop too.
I got past that by using a loopback device: sudo modprobe snd-aloop

So firstly, my /etc/asound.conf:

pcm.!default {
    type asym
    playback.pcm "eci"
    capture.pcm "plughw:CARD=Loopback,DEV=1"
}


pcm.eci {
    type plug
    slave {
        format S16_LE
        rate 16000
        channels 1
        pcm {
            type file
            slave.pcm null
            file "/tmp/ec.input"
            format "raw"
        }
    }
}

pcm.eco {
    type plug
    slave.pcm {
        type fifo
        infile "/tmp/ec.output"
        rate 16000
        format S16_LE
        channels 2
    }
}

pcm.cap {
    type plug
    slave {
        pcm "plughw:CARD=CameraB409241,DEV=0"
        channels 4
    }
    route_policy sum
}

Start ec with:
./ec -i 'cap' -o 'plughw:CARD=ALSA,DEV=0' &
Then redirect the echo-cancelled mic signal to the loopback:
arecord -D eco -q -r 16000 -f S16_LE -c 2 | aplay -D plughw:CARD=Loopback,DEV=0 &

The big question is whether it actually works. In the CLI I can see AEC switching on and off during playback,
but does it make a difference?

fastjack · April 8, 2020, 1:50pm

@rolyan_trauts AEC is essential if you plan on playing music or radio streams through your assistant and you want it to respond during playback. I was not able to make the EC software work correctly, so congrats.

@fishertimj The difference is that the Respeaker Mic Array v2.0 has all the audio processing algorithms (NS, BF, AEC, etc.) directly on the integrated XMOS chip (no CPU required), whereas the 6-Mic does not.

Seeed provides closed-source software on their custom version of Raspbian.

I cannot vouch for this software’s performance, but I’m very satisfied with the Mic Array v2.0.

My assistant can hear me from more than 5 meters away even while playing loud music (2x3W). The loudspeaker placement and case design are paramount though.

Hope this helps

rolyan_trauts · April 18, 2020, 9:51am

Yeah the Forrest Rhasspy can be a problem as you scream “Stop!”

On Raspbian, libspeexdsp is an older version than alsa-plugins requires to compile.
You get the speex library, but the DSP part with echo cancellation doesn’t compile,
so the speexdsp ALSA plugins are missing on Raspbian. You can either jump to Arch Linux ARM or
compile http://downloads.us.xiph.org/releases/speex/speexdsp-1.2.0.tar.gz yourself.
Grab alsa-plugins 1.1.8 (from memory, the version on Raspbian), recompile, and you will get the ALSA speexdsp plugin,
so you can do things like the following in asound.conf:

pcm.!default {
    type asym
    playback.pcm "plughw:CARD=ALSA,DEV=0"
    capture.pcm  "cap"
}

pcm.array {
 type hw
 card 1
}

pcm.cap {
 type plug
 slave {
   pcm "array"
   channels 4
   }
 route_policy sum
}

pcm.echo {
 type speex
 slave.pcm "cap"
 echo yes
 frames 256
 filter_length 1600
 denoise false
}

pcm.agc {
 type speex
 slave.pcm "echo"
 agc 1
 denoise yes
 dereverb yes
}

But I am still making a big mistake: if you read the speex manual, clock drift across separate cards will stop speexdsp from working. You need an all-in-one sound card that handles both playback and capture, not two asynchronous devices like I have.
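
To get a feel for the numbers, here is a back-of-the-envelope sketch in Python. It assumes a 16 kHz sample rate and a 100 ppm clock mismatch between the two cards (both values are assumptions, not measurements); the filter length of 1600 is taken from the config above.

```python
def drift_samples_per_second(rate_hz, mismatch_ppm):
    """Samples per second by which two nominally identical clocks diverge."""
    return rate_hz * mismatch_ppm / 1_000_000

def seconds_until_window_slips(filter_length, rate_hz, mismatch_ppm):
    """Time for the echo delay to slide across the whole adaptive-filter window."""
    return filter_length / drift_samples_per_second(rate_hz, mismatch_ppm)

rate_hz = 16000        # assumed capture rate
mismatch_ppm = 100     # assumed clock mismatch between playback and capture cards
filter_length = 1600   # filter_length from the speex PCM above

print(filter_length / rate_hz * 1000, "ms of echo tail covered by the filter")
print(drift_samples_per_second(rate_hz, mismatch_ppm), "samples of drift per second")
print(seconds_until_window_slips(filter_length, rate_hz, mismatch_ppm),
      "s until the delay has moved by a full filter window")
```

Under these assumptions the filter presumably has to keep re-adapting to a moving delay long before the window itself is exhausted, which would explain why even a small mismatch hurts.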

It’s BS that you cannot do software EC, since software EC has been done, but quite likely the software we have available isn’t implemented correctly.
How well it works and how much load it produces is another matter, but for one reason or another, especially on the Pi, the software we have doesn’t seem to work.

Pulseaudio with WebRTC AEC and speex DSP seems to have added drift compensation, but I am not sure it works on embedded ARM.
They seem to have fudged some things due to ‘incorrect latency’ reporting: https://gitlab.freedesktop.org/pulseaudio/webrtc-audio-processing/-/blob/master/webrtc/modules/audio_processing/echo_cancellation_impl.cc#L63
I am not sure if the Pi fits any of the latency profiles they have provided.

// Measured delays [ms]
// Device                Chrome  GTP
// MacBook Air           10
// MacBook Retina        10      100
// MacPro                30?
//
// Win7 Desktop          70      80?
// Win7 T430s            110
// Win8 T420s            70
//
// Daisy                 50
// Pixel (w/ preproc?)           240
// Pixel (w/o preproc?)  110     110

// The extended filter mode gives us the flexibility to ignore the system's
// reported delays. We do this for platforms which we believe provide results
// which are incompatible with the AEC's expectations. Based on measurements
// (some provided above) we set a conservative (i.e. lower than measured)
// fixed delay.
//
// WEBRTC_UNTRUSTED_DELAY will only have an impact when |extended_filter_mode|
// is enabled. See the note along with |DelayCorrection| in
// echo_cancellation_impl.h for more details on the mode.
//
// Justification:
// Chromium/Mac: Here, the true latency is so low (~10-20 ms), that it plays
// havoc with the AEC's buffering. To avoid this, we set a fixed delay of 20 ms
// and then compensate by rewinding by 10 ms (in wideband) through
// kDelayDiffOffsetSamples. This trick does not seem to work for larger rewind
// values, but fortunately this is sufficient.
//
// Chromium/Linux(ChromeOS): The values we get on this platform don't correspond
// well to reality. The variance doesn't match the AEC's buffer changes, and the
// bulk values tend to be too low. However, the range across different hardware
// appears to be too large to choose a single value.
//
// GTP/Linux(ChromeOS): TBD, but for the moment we will trust the values.
#if defined(WEBRTC_CHROMIUM_BUILD) && defined(WEBRTC_MAC)
#define WEBRTC_UNTRUSTED_DELAY
#endif

Pulseaudio is a bit of a stinker to set up in a Docker container anyway, and the advantage of Docker outweighs that of Pulseaudio, but once again the software might not work on embedded ARM.
The $75 mic array alone costs nearly as much as a complete full-size Amazon or Google unit, which makes Pi-based versions extremely expensive private equivalents.
Software EC should work, as it has been done and does work elsewhere, but you’re right: the software on Raspbian doesn’t seem to.

SoCs like the Rockchip RK3308 and Allwinner R328 might surprisingly come to the Pi’s aid and act as satellite mic/speakers to a centralised Pi, since with embedded DSP/codec they are extremely low cost.
Software should work; unfortunately, for many people the cost of a Pi system with hardware EC doesn’t justify the benefit.

I quite like the idea of a centralised Pi with satellites, as the far-field claims of the array mics don’t hold up well with a single prominent noise source.
In industrial environments with loud, dispersed noise, where the person speaking is the predominant signal, they work well.
Put a TV or hi-fi that is the predominant source between you and the AI and, beamforming wonders or not, good luck when it comes to recognition.

Multiple cheap satellites make sense and also allow you to create multi-channel audio. But if the Allwinner R329 with built-in NLP is as low cost as the R328, we might still be on Debian, but the Raspberry connection might not make cost sense.

Software can work for EC, because EC does not need to produce crystal-clear recording quality; it just needs to be adequate to cork/duck the playing media.
But that probably means someone will have to fix or hack what we have into something fit for purpose, as it doesn’t seem to work on the Pi.

I am still exploring some cheaper options with speex-alsa: the Respeaker 2-mic and the linear 4-mic that does have hardware loopback.
I am still waiting for deliveries due to the current situation, so I just don’t know yet.

rolyan_trauts · April 18, 2020, 10:30am

I ran out of links due to my noob status, but https://github.com/OAID all looks interesting for embedded, irrespective of SoC.
http://www.tengine.org.cn/

rolyan_trauts · April 18, 2020, 10:48am

Split again due to the link limit: I’m still thinking either the RK3308 or R328 could make a great https://rhasspy.github.io/rhasspy-voltron/tutorials.html#server-with-satellites.

I’m still waiting for my https://wiki.radxa.com/RockpiS ($13.99), but it might be painful with a very new DTS and Debian image, and it looks like I have to figure out the mic circuit myself.
I know it’s called Rhasspy, but the framework is so great and flexible that these might make a really great option.

fastjack · April 18, 2020, 11:01am

I read some good reviews of the Respeaker 2-mic HAT regarding sensitivity and far-field capture. If ALSA AEC can be achieved using the loopback of the HAT (or maybe using the ALSA loopback plugin?) it will be huge progress for voice assistants.

Really looking forward to your feedback

Bozor · April 18, 2020, 11:18am

@fastjack how do you manage to play music alongside Rhasspy while it is still listening for the wakeword? What software are you using? I’ve been trying to play radio through Mopidy while keeping Rhasspy listening for the wakeword, yet Mopidy tells me that the resource is busy when I start the radio stream. A busy resource makes sense to me, which is why I’m wondering how you’re doing this.

rolyan_trauts · April 18, 2020, 11:50am

@fastjack

Yeah, the Respeaker 2-mic (or a clone) is a great option due to cost, and if EC could be brought to a level where VAD could kick in, then maybe something workable could come of it.
VAD is in Speex and AEC is in SpeexDSP, but it’s a maybe.

I will let you know, but I’ve been waiting for some time for delivery of https://www.ebay.co.uk/itm/Blesiya-ReSpeaker-2-Mic-Pi-HAT-V1-0-Expansion-Board-I2C-For-Raspberry-Pi/253876990739
The master/satellite mode of Rhasspy gets really cost effective with the Rockchip/Allwinner, as they are just the right combination on silicon and not much use for anything other than satellite mic/speakers.
But the Pi 3A+ and the above £10 mic/soundcard ain’t that bad a price.

@Bozor

pcm.!default {
    type plug
    slave.pcm "dmix"
}

You probably just need to add something like this to your asound.conf and use a dmix PCM that multiple applications can share, rather than a single PCM or device.
It’s “dsnoop” for mics and “dmix” for outputs; just do a Google search.

Also, I have a PS3 Eye, but with the driver problems, even though it has a great array, I think it’s a really bad device to use, as it can cause all sorts of confusion with alsa-utils failures. For a couple of dollars more, a 2-mic HAT even has a little amp built in and some LEDs.
So yeah, if EC can be made to work it’s definitely a top pick, and even without EC, but that is another problem.
I have seen these on sale in China and have my fingers crossed that they will turn up on eBay or AliExpress at a similar price.
http://bbs.16rd.com/shop_product-1-381.html
But that might be Yuan, not Yen.

rolyan_trauts · April 18, 2020, 12:34pm

But supposedly similar to these: https://detail.tmall.com/item.htm?id=569134471494
It is supposedly an R328, so fingers crossed either way.