Has anyone tried using librespeaker from Seeed Technology?

arpagor62970 · May 3, 2021, 7:24am

I have a 6-mic array ReSpeaker and I would like to know whether librespeaker brings any improvement, since it offers many settings such as echo suppression, AGC, and detection of the direction of the sound source.

Thanks
Arpagor

rolyan_trauts · May 3, 2021, 9:57am

@sskorol managed to get it working and can probably help. I don’t think the results were that great, but check with him.

sskorol · May 3, 2021, 2:33pm

You can check my lib based on librespeaker: https://github.com/sskorol/respeaker-websockets. It demonstrates a full feature-set of this library. Not a final version though.

I’ve tested almost all of the algorithms so far.

NS (noise suppression) and AGC (automatic gain control) work well.
BF (beamforming) and DOA (direction of arrival) produce more or less acceptable results only if you have a single input source. With additional distractors such as a TV, they can’t accurately detect the direction and focus on a speaker.
KWS (keyword spotting, Snowboy) is good with the default words. However, it’s not synchronised with the beamformer; if it were, focusing on a speaker would work as expected.
AEC (acoustic echo cancellation) seems fine if your playback volume is not at 100%; 70% worked best for me. But be aware that heavy bass is not suppressed. According to the librespeaker devs, AEC doesn’t fully remove the output from the input stream. It just applies ~20 dB of suppression to the output signal, so you will still hear it in the background.
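
To put the ~20 dB figure in perspective, here is a quick back-of-the-envelope conversion. It is only arithmetic on the decibel definition, not anything taken from librespeaker itself; the function names are made up for illustration.

```python
def db_to_amplitude_ratio(attenuation_db: float) -> float:
    """Linear amplitude ratio left after attenuating by `attenuation_db` dB."""
    return 10 ** (-attenuation_db / 20)


def db_to_power_ratio(attenuation_db: float) -> float:
    """Linear power ratio left after attenuating by `attenuation_db` dB."""
    return 10 ** (-attenuation_db / 10)


# Assumed figure from the post above: roughly 20 dB of echo suppression.
residual_amplitude = db_to_amplitude_ratio(20.0)  # about 0.1 of the original amplitude
residual_power = db_to_power_ratio(20.0)          # about 0.01 of the original power

print(f"residual amplitude: {residual_amplitude:.3f}")
print(f"residual power:     {residual_power:.3f}")
```

So about a tenth of the playback amplitude is still in the mic signal, which fits with the echo remaining audible in the background, and more so at high playback volume.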

My observations regarding the pitfalls of BF and AEC were officially confirmed by the SeeedStudio and Alango devs, so we can’t do anything about it. However, librespeaker uses a limited version of the Alango VEP package. Maybe the full version would work much better, but unfortunately they don’t provide support for non-commercial products, and there is no option for individuals to buy a personal licence. So either use librespeaker as is or don’t buy a ReSpeaker at all.

rolyan_trauts · May 3, 2021, 3:16pm

NS is another weird one, as in some cases it can make things worse. I have an Anker PowerConf C300 webcam and a PowerConf speakerphone, but the models I had expected noise, so with NS on they actually performed worse than with NS turned off.
AGC can be similar, as most implementations are too fast for speech (attack/delay): they quickly ramp up the noise when there is no speech, and there is a delay before they ramp up again.
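
A toy sketch of that AGC behaviour, assuming a very simple gain controller with separate smoothing coefficients for lowering the gain (attack) and raising it (release). This is not the algorithm librespeaker or any particular device uses; the function, parameters, and signal levels are all hypothetical.

```python
def agc_gains(levels, target=0.5, attack=0.5, release=0.5, max_gain=20.0):
    """Return the gain a toy AGC applies to each frame.

    levels  : per-frame input level (e.g. RMS), linear scale
    attack  : smoothing coefficient (0..1) used when the gain has to drop
    release : smoothing coefficient (0..1) used when the gain may rise
    """
    gain = 1.0
    gains = []
    for level in levels:
        # Gain that would bring this frame to the target level, capped.
        desired = min(max_gain, target / max(level, 1e-6))
        coeff = attack if desired < gain else release
        gain += coeff * (desired - gain)
        gains.append(gain)
    return gains


# Hypothetical input: 5 frames of speech, then 10 frames of background noise.
noise_level = 0.02
levels = [0.5] * 5 + [noise_level] * 10

fast = agc_gains(levels, release=0.5)   # gain rises quickly in the pause
slow = agc_gains(levels, release=0.02)  # gain rises slowly in the pause

noise_fast = fast[-1] * noise_level
noise_slow = slow[-1] * noise_level
print(f"noise level after pause, fast release: {noise_fast:.3f}")
print(f"noise level after pause, slow release: {noise_slow:.3f}")
```

With the fast release the noise floor is boosted to nearly speech level within a few frames of silence, while the slow release leaves it much lower. That is the trade-off being described: a controller tuned too fast for speech pumps the noise up in every pause.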

The answer is to provide custom models trained on the hardware in use; then they are actually better. Conversely, if you get the wrong models they can actually make things worse.

@sskorol seems to have got the best out of the ReSpeaker offering, but it is fairly basic, though better than nothing.

Onboard mics are near impossible to isolate and give AEC a hard job, as in a small enclosure much of the playback will bleed straight into an omnidirectional mic.

So NS is no big loss: to us it sounds better with the noise cut out of the non-speech parts, but for AI recognition it’s not much of an advantage and it also increases artefacts.
The AEC would be fine if you could isolate the mics better, but that’s the problem with many all-in-one boards, and all the common HATs apart from the Pi Codec Zero don’t give alternative inputs.