I2S Directional Mic

@fastjack @synesthesiam

Hi guys, I have been a bit dumb searching for an ALSA delay, as the delay I need is unlikely to be more than three samples.

If you read https://invensense.tdk.com/wp-content/uploads/2015/02/Microphone-Array-Beamforming.pdf and take 2x INMP441 boards, you will notice they are 14mm in diameter.
This is handy, as at the speed of sound 14mm corresponds to roughly 2 samples @ 48000Hz.
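
To make that arithmetic easy to check, here is a minimal sketch. The 343 m/s speed of sound is my assumption (dry air at about 20 C) and the function names are only illustrative.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 C
SAMPLE_RATE_HZ = 48000


def mm_per_sample(sample_rate_hz=SAMPLE_RATE_HZ, c=SPEED_OF_SOUND_M_S):
    """Distance sound travels during one sample period, in millimetres."""
    return c / sample_rate_hz * 1000.0


def delay_in_samples(spacing_mm, sample_rate_hz=SAMPLE_RATE_HZ, c=SPEED_OF_SOUND_M_S):
    """Acoustic travel time across the mic spacing, expressed in samples."""
    return spacing_mm / mm_per_sample(sample_rate_hz, c)


if __name__ == "__main__":
    print(f"{mm_per_sample():.2f} mm per sample")
    for spacing in (14, 21):
        print(f"{spacing} mm spacing is {delay_in_samples(spacing):.2f} samples")
```

Neither spacing lands exactly on a whole sample, so rounding to 2 or 3 samples is an approximation.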

I was wondering if either of you two could have a go with portaudio and the following, or could get someone else to give it a go.

So rather than L/R we have F/R (Front/Rear).
Rdelay1 = 0
Rdelay2 = 0

What we do is create a mono stream from the stereo input.
Current = -Rdelay2 + F
Rdelay2 = Rdelay1
Rdelay1 = R
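
The three steps above can be sketched in plain Python as follows. This is only an offline illustration on lists, not the real-time version; `endfire_mono` and its `delay` parameter are names I made up, and the deque plays the role of Rdelay1/Rdelay2.

```python
from collections import deque


def endfire_mono(front, rear, delay=2):
    """Return front[n] - rear[n - delay] per frame; needs delay >= 1.

    The rear history starts at zero, like Rdelay1 = Rdelay2 = 0.
    """
    history = deque([0] * delay, maxlen=delay)  # oldest rear sample first
    out = []
    for f, r in zip(front, rear):
        out.append(f - history[0])  # Current = -Rdelay2 + F
        history.append(r)           # shift the delay line along by one
    return out


if __name__ == "__main__":
    # A sound from behind reaches the rear mic first and the front mic
    # two samples later, so it should cancel completely.
    rear = [1, 2, 3, 4, 0, 0]
    front = [0, 0, 1, 2, 3, 4]
    print(endfire_mono(front, rear))
```

With the delay matching the spacing, anything arriving from directly behind is subtracted from itself, which is where the directionality comes from.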

You know me and my aversion to dev nowadays, but I think it's fairly easy to open up a stream for playback. If it's easier to play back to a snd_loopback sink then do that, as I'm not sure how to create a source.
The other side of the loopback will be available to the system as a source…

That is it, that is our directional mono mic using a simple beamforming alg.
It's likely to sound not so great as it creates frequency-dependent attenuation, but that is relatively easy to correct.
See Figure 12 (Frequency Response of an Endfire Beamformer at Different Incident Angles) in https://invensense.tdk.com/wp-content/uploads/2015/02/Microphone-Array-Beamforming.pdf
From the null frequency we accumulate 6dB of attenuation for each octave we go down.
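
As a rough check of that slope, here is a sketch of the idealised on-axis response. It assumes two perfect omni mics and a rear delay equal to the acoustic travel time between them, which gives a magnitude of 2·|sin(π·f/f_null)|; the function names and the 8000 Hz null are hypothetical, and real boards will need measuring.

```python
import math


def on_axis_gain_db(freq_hz, null_hz):
    """On-axis level of (front - delayed rear) in dB relative to one mic.

    Assumes the rear delay equals the travel time between the mics.
    """
    magnitude = 2.0 * abs(math.sin(math.pi * freq_hz / null_hz))
    return 20.0 * math.log10(magnitude)


def octave_table(null_hz, octaves=6):
    """Rows of (frequency, level dB, make-up gain dB) at null/2, null/4, ..."""
    rows = []
    for k in range(1, octaves + 1):
        freq = null_hz / 2 ** k
        level = on_axis_gain_db(freq, null_hz)
        rows.append((freq, level, -level))
    return rows


if __name__ == "__main__":
    for freq, level, makeup in octave_table(8000.0):  # hypothetical null
        print(f"{freq:7.1f} Hz  level {level:6.2f} dB  make-up {makeup:6.2f} dB")
```

In this model the steps between octaves start at about 3 dB near the top and settle towards 6 dB per octave lower down, so the make-up gain column is roughly what a stepped EQ would need to add.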

So alsaeq to the rescue, plus some asound.conf to resample to 16kHz, and by the end of it we should have a directional I2S mono 16kHz mic.

We can just use alsaeq to flatten the attenuation by adding the corresponding stepped gain. Probably drop the low freqs or just leave them alone, and the same at the 16kHz end if 14mm spacing is used.

Probably even I could hack some Python portaudio to get the above working to test, but I thought I would just shout out to someone more accomplished to give it a go and maybe also add the following.
If we are doing this work then I guess we could add a noise gate, where anything under a threshold is just set to zero.
Also a compressor is just a bit shift (>> 15 or more) to lose the LSBs, or at least it would act like a pseudo one and only affect low-order energy.
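
A minimal sketch of both ideas on integer samples, under my own assumptions: `noise_gate`, `drop_lsbs`, the threshold, and the shift count are all illustrative names and values, not tested settings. Note the shift only discards low-order bits (coarser resolution), so it is the pseudo compressor described above rather than a real one.

```python
def noise_gate(samples, threshold):
    """Zero every sample whose absolute value is below threshold."""
    return [s if abs(s) >= threshold else 0 for s in samples]


def drop_lsbs(samples, bits):
    """Clear the lowest `bits` bits of each integer sample.

    Shifting right then left rounds towards negative infinity.
    """
    return [(s >> bits) << bits for s in samples]


if __name__ == "__main__":
    block = [3, -200, 50, -10, 1023]
    gated = noise_gate(block, threshold=20)  # hypothetical threshold
    print(gated)
    print(drop_lsbs(gated, bits=4))          # hypothetical shift count
```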

A little C utility would likely create much less load and be more performant, but the routine is so few steps and so simple that I thought I would shout out for someone to maybe give it a go in Python.
Guess you could even use https://numba.pydata.org/ to accelerate the routine.

That is it, and apart from testing & wiring (https://learn.adafruit.com/adafruit-i2s-mems-microphone-breakout/raspberry-pi-wiring-test) we have directional I2S MEMS on a Pi, maybe with a noise gate & compressor?

Also, with other boards you might be able to get 11/12mm spacing so the null happens past 16kHz?

I mentioned the INMP441 as they are cheap but other board shapes with a little overlap could make closer spacing.
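
To put numbers on the spacing question, here is a quick sketch. It assumes a 343 m/s speed of sound and a rear delay that exactly matches the travel time across the spacing, in which case the first on-axis null sits at c / (2·d); with whole-sample delays at 48kHz the match is only approximate, so treat these as ballpark figures.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: dry air at about 20 C


def null_frequency_hz(spacing_mm, c=SPEED_OF_SOUND_M_S):
    """First on-axis null, c / (2 * d), for a delay matched to the spacing."""
    return c / (2.0 * spacing_mm / 1000.0)


if __name__ == "__main__":
    for spacing in (11, 12, 14, 21):
        print(f"{spacing} mm spacing: null near {null_frequency_hz(spacing):.0f} Hz")
```

Under this model 11mm comes out just short of 16kHz, so the spacing would need to be a little under 11mm for the null to clear it.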

#!/usr/bin/env python3
"""Pass the front channel through; output the rear channel inverted and delayed.

https://app.assembla.com/spaces/portaudio/git/source/master/test/patest_wire.c

"""
import argparse

import sounddevice as sd
import numpy as np



def int_or_str(text):
    """Helper function for argument parsing."""
    try:
        return int(text)
    except ValueError:
        return text


parser = argparse.ArgumentParser(add_help=False)
parser.add_argument(
    '-l', '--list-devices', action='store_true',
    help='show list of audio devices and exit')
args, remaining = parser.parse_known_args()
if args.list_devices:
    print(sd.query_devices())
    parser.exit(0)
parser = argparse.ArgumentParser(
    description=__doc__,
    formatter_class=argparse.RawDescriptionHelpFormatter,
    parents=[parser])
parser.add_argument(
    '-i', '--input-device', type=int_or_str,
    help='input device (numeric ID or substring)')
parser.add_argument(
    '-o', '--output-device', type=int_or_str,
    help='output device (numeric ID or substring)')
parser.add_argument(
    '-c', '--channels', type=int, default=2,
    help='number of channels')
parser.add_argument('--dtype', help='audio data type')
parser.add_argument('--samplerate', type=float, help='sampling rate')
parser.add_argument('--blocksize', type=int, help='block size')
parser.add_argument('--latency', type=float, help='latency in seconds')
args = parser.parse_args(remaining)
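DELAY = 3  # rear-channel delay in samples (about 21mm of spacing at 48kHz)
tail = np.zeros(DELAY)  # inverted rear samples held over from the previous block


def callback(indata, outdata, frames, time, status):
    """Copy the front channel; invert the rear channel and delay it by DELAY samples."""
    global tail
    if status:
        print(status)
    # Assumes 2 channels (front = 0, rear = 1) and frames >= DELAY.
    inverted = -indata[:, 1]
    # Start with the samples held back last time, then all but the newest DELAY samples.
    rear = np.concatenate((tail, inverted[:frames - DELAY]))
    # Hold the newest DELAY samples back for the next block.
    tail = inverted[frames - DELAY:]
    outdata[:] = np.stack((indata[:, 0], rear), axis=1)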

#def callback(indata, outdata, frames, time, status):
#    if status:
#        print(status)
#    outdata[:] = indata

try:
    with sd.Stream(device=(args.input_device, args.output_device),
                   samplerate=args.samplerate, blocksize=args.blocksize,
                   dtype=args.dtype, latency=args.latency,
                   channels=args.channels, callback=callback):
        print('#' * 80)
        print('press Return to quit')
        print('#' * 80)
        input()
except KeyboardInterrupt:
    parser.exit('')
except Exception as e:
    parser.exit(type(e).__name__ + ': ' + str(e))

This should, if I am not mistaken, invert the rear channel and delay it by 3 samples.

At 48kHz sound travels just over 7mm per sample, so 3 samples corresponds to 21mm mic spacing.

I have to sudo modprobe snd-aloop as we need to do some /etc/asound.conf trickery for the rest, but when I actually get 2x mics and find my glue gun I will give it a try.

The above Python code was relatively easy and stolen from python-sounddevice which seems a great util.

So input from 2x mics @48Khz output that to a loopback
Sum and resample to 16Khz

Think something like this

pcm.micdir { type plug slave { pcm "loopback" channels 2 } route_policy sum }
This lets alsa-plugins sum the inverted rear and front mic.
I think you can add the sample rate there too; I will have to check, or it's another step in the chain.
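
Just to illustrate what the sum-and-resample step amounts to, here is a crude offline sketch in plain Python. It is not what ALSA does internally: averaging each group of three samples is a very basic stand-in for a proper resampler, and `sum_and_decimate` is a made-up name.

```python
def sum_and_decimate(front, inverted_rear, factor=3):
    """Sum two channels to mono, then average each group of `factor` samples.

    factor=3 takes 48kHz down to 16kHz; a trailing partial group is dropped.
    """
    mono = [f + r for f, r in zip(front, inverted_rear)]
    usable = len(mono) - len(mono) % factor
    return [sum(mono[i:i + factor]) / factor for i in range(0, usable, factor)]


if __name__ == "__main__":
    front = [1, 2, 3, 4, 5, 6, 7]
    inverted_rear = [0, 0, 0, -1, -1, -1, 9]
    print(sum_and_decimate(front, inverted_rear))
```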

That should sound like holy hell, but as above alsaeq should come to the rescue to flatten out the frequency response.

I was trying to be more efficient with numpy and the very small bit of code in the callback really needs to be as optimal as possible but guess you could add any effect there you wish if the Pi3A+ can cope.
If anyone fancies doing anything more than my horrid hack please do.
I failed at efficient code and did what I thought would work :slight_smile:

Need to test as me being me there is probably something stupid but that should get you a basic directional mic with I2S mems.

I noticed that the Adafruit RPi I2S driver is great but still not quite correct and could probably be improved.

I know what is wrong, but it's C, and figuring out how to put it right is often another thing. The WM8960 repo should show how to add the clock timings and format to the code; currently they are just missing, so it works on defaults, which is likely why it's not so great.

Also, at the end the idea is to tack the speex alsa-plugin on to the asound.conf chain and use its AGC to garner better far field.

Slightly confused: if I output to headphones I can hear both channels clearly, and the sound waves will not cancel even though one is inverted, as apart from in my brain they never meet.
When I play to a loopback and record from the other side I suddenly get loads of distortion, which doesn't make sense, as I still haven't summed into a mono stream; I have purely taken what I was listening to on headphones and passed it through a loopback to record.

I expected my code to be a problem, but not python-sounddevice playing through a loopback, so off for a google.