Microphone noise filtering

So I have a PS3 Eye plugged into a Pi Zero W that I’m using as a Rhasspy satellite, and overall it’s been a great experience testing Rhasspy 2.5-pre. However, I’ve noticed that when the mic gain is turned up, it produces a fair amount of background hiss, which prevents Rhasspy from detecting silence and stopping the recording after I finish speaking. I can reliably get it to stop if I reduce the mic gain in alsamixer as soon as I finish speaking.

Even with the webrtcvad VAD sensitivity set to 3, silence isn’t reliably detected after I finish speaking.

I attempted to set up PulseAudio with the module-echo-cancel module, but something isn’t compiled right for the Pi Zero/armv6 on Raspbian Buster, so PulseAudio crashes with illegal instruction errors.

My next idea was to try noise reduction with sox by generating a noise profile when the mic is silent:

# Record a 5 second file with no speaking, just background noise
# (-f S16_LE -c 1 requests 16-bit mono; arecord otherwise defaults to 8-bit)
arecord -f S16_LE -c 1 -r 16000 -d 5 /tmp/noise.wav

# Create background noise profile from the wav
sox /tmp/noise.wav -n noiseprof /tmp/noise.prof

# Record a 5 second file with speaking, in the same format
arecord -f S16_LE -c 1 -r 16000 -d 5 /tmp/speech.wav

# Remove noise from wav using profile
sox /tmp/speech.wav /tmp/fixed.wav noisered /tmp/noise.prof 0.21

This seems to work very well and the fixed audio has much less background hiss.
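Since the right noisered amount depends on the mic and the room, one way to pick it is to render the same recording at several amounts and listen to each result. Here is a minimal shell sketch of that idea, assuming sox is installed and that a speech recording and a noise profile like the ones above already exist. `try_noisered_amounts` is a made-up helper name, not part of sox or Rhasspy.

```shell
# Sketch: write one denoised copy of a recording per candidate noisered amount.
# try_noisered_amounts is a hypothetical helper, not a sox or Rhasspy command.
try_noisered_amounts() {
    if [ "$#" -lt 4 ]; then
        echo "usage: try_noisered_amounts INPUT.wav PROFILE OUTDIR AMOUNT..." >&2
        return 2
    fi
    input=$1
    profile=$2
    outdir=$3
    shift 3   # the remaining arguments are the amounts to try

    # Fail early if the recording or the noise profile is missing
    if [ ! -r "$input" ] || [ ! -r "$profile" ]; then
        echo "missing input recording or noise profile" >&2
        return 1
    fi
    mkdir -p "$outdir" || return 1

    for amount in "$@"; do
        # noisered takes the profile file and an amount between 0 and 1
        sox "$input" "$outdir/fixed-$amount.wav" noisered "$profile" "$amount" || return 1
        echo "$outdir/fixed-$amount.wav"
    done
}
```

For example, `try_noisered_amounts /tmp/speech.wav /tmp/noise.prof /tmp/compare 0.15 0.21 0.30` would write three files to compare by ear. Higher amounts remove more hiss but can start to distort the speech.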

So I have some questions for the gurus and for @synesthesiam:

  1. Are there settings I’m overlooking in webrtcvad that could fix this issue?
  2. If not, is it possible to modify the arecord pipeline in Rhasspy to include sox doing noise reduction in the middle? Something like this could work:
    arecord -t raw -r 16000 -f S16_LE -c 1 | sox -t raw -b 16 -e signed -c 1 -r 16k - -r 16k -t wav - noisered /tmp/noise.prof 0.21

I could buy better hardware, but I’d love to see if I could make this work before investing money. Thanks in advance for any ideas and thoughts!

I don’t know much about this, but I would also be very interested in answers.

If you have an existing noise profile, you could probably record directly with sox and feed that output to Rhasspy. This seems to be possible already in 2.4.19, since we can switch from pyaudio to arecord. @synesthesiam, where would we need to hook in there? Would it make sense to allow specifying an arbitrary script for recording?

That’s a great idea to record with sox actually, since it’s already part of the Docker image for 2.5-pre. Perhaps it could be another option in the Audio Recording dropdown with a text box for additional options and/or the noise profile file location.

This would actually be quite well supported in 2.5 by setting audio input to “command” and then specifying sox as the recording program along with appropriate arguments.

Of course! Completely forgot about the “Local Command” for audio input. I’ll give it a try. Thanks @synesthesiam!

Well, partial success. I still need to tweak the values a bit more to suit the room that the mic is in. This is the relevant config entry:

    "microphone": {
        "command": {
            "record_arguments": "-q --buffer 2048 -b 16 -c 1 -r 16000 -t alsa default -t raw - noisered /profiles/en/noise.prof 0.19",
            "record_program": "sox"
        }
    }
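
While tweaking, it can help to build the argument string in one place and try it outside Rhasspy first. The sketch below only assembles the same arguments as the config entry above for a given profile, amount and ALSA device. `sox_record_args` is a hypothetical helper name, and the commented test recording assumes sox was built with ALSA support.

```shell
# Sketch: print a record_arguments string for a given profile and amount.
# sox_record_args is a hypothetical helper, not part of sox or Rhasspy.
sox_record_args() {
    if [ "$#" -lt 2 ]; then
        echo "usage: sox_record_args PROFILE AMOUNT [ALSA_DEVICE]" >&2
        return 2
    fi
    profile=$1
    amount=$2
    device=${3:-default}   # ALSA device name; "default" matches the config above
    printf '%s\n' "-q --buffer 2048 -b 16 -c 1 -r 16000 -t alsa $device -t raw - noisered $profile $amount"
}

# Possible manual test (needs a microphone, so it is left commented out):
# capture 5 seconds with the same arguments, then wrap the raw audio in a wav.
# The unquoted $(...) relies on word splitting, so it breaks on paths with spaces.
#   sox $(sox_record_args /profiles/en/noise.prof 0.19) trim 0 5 > /tmp/test.raw
#   sox -t raw -r 16000 -b 16 -e signed -c 1 /tmp/test.raw /tmp/test.wav
```

The printed string can be pasted into "record_arguments" once a value sounds right.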

Is it possible to specify a UDP port with Local Command like there is for Arecord and Pyaudio?

Yes. I’ll push an update for this tonight.

I also have a small test setup running 2.5 now, with plenty of crappy microphones (I’m currently getting old C.H.I.P. computers working as satellites with a cheap USB mic). Any new results on this, @hawkeye217? Maybe we could find something that constantly records the ambient noise and generates a filter from it, so this could be automated.
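
One rough way to automate this would be a small function, run from cron or a timer, that re-records a few seconds of ambient noise and rebuilds the sox profile. This is only a sketch under some loud assumptions: the room is quiet while it records, the microphone is not held exclusively by another process at that moment, and whatever runs sox is restarted afterwards, since sox presumably reads the profile only when it starts. `refresh_noise_profile` is a hypothetical name.

```shell
# Sketch: re-record ambient noise and rebuild the sox noise profile.
# refresh_noise_profile is a hypothetical helper, not part of sox or Rhasspy.
refresh_noise_profile() {
    profile=${1:-}
    seconds=${2:-5}
    if [ -z "$profile" ]; then
        echo "usage: refresh_noise_profile PROFILE [SECONDS]" >&2
        return 2
    fi
    case $seconds in
        ''|*[!0-9]*|0)
            echo "duration must be a positive whole number of seconds" >&2
            return 2 ;;
    esac

    tmpdir=$(mktemp -d) || return 1
    # Record 16-bit mono at 16 kHz, build the profile in the temp directory,
    # and replace the old profile only if both steps succeeded.
    if arecord -q -f S16_LE -c 1 -r 16000 -d "$seconds" "$tmpdir/noise.wav" \
        && sox "$tmpdir/noise.wav" -n noiseprof "$tmpdir/noise.prof" \
        && mv "$tmpdir/noise.prof" "$profile"; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}
```

Something like `refresh_noise_profile /profiles/en/noise.prof 5` could then run at a time when nobody is expected to be talking. It does not detect speech, so a capture taken while someone is speaking would produce a bad profile.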

See my first post in the thread. I made a noise profile with sox and then used it to record audio into Rhasspy, with sox set as the Local Command for audio input.