C vs Python, and then some thoughts on audio DSP

rolyan_trauts · March 2, 2022, 7:40pm

Run nano c_loop.c and paste the following:

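#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    int NUMBER, i, s;
    /* Require the loop count as the only argument. */
    if (argc != 2) {
        fprintf(stderr, "usage: %s NUMBER\n", argv[0]);
        return 1;
    }
    NUMBER = atoi(argv[1]);
    /* Count up to NUMBER one increment at a time. */
    for (s = i = 0; i < NUMBER; ++i) {
        s += 1;
    }
    printf("s: %d\n", s);
    return 0;
}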

gcc c_loop.c -o c_loop
Then run time ./c_loop 450000000 and you should get something like the following.

pi@raspberrypi:~/c-test $ time ./c_loop 450000000
s: 450000000
real    0m2.094s
user    0m2.094s
sys     0m0.001s
pi@raspberrypi:~/c-test $ gcc -O3 c_loop.c -o c_loop
pi@raspberrypi:~/c-test $ time ./c_loop 450000000
s: 450000000
real    0m0.002s
user    0m0.002s
sys     0m0.000s

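-O0 (no optimisation) is the gcc default; -O3 brings in most optimisations and, as you can see, is fast. Bear in mind that with a loop this simple the compiler can work out the answer without actually running the loop, so the 0.002s is mostly process start-up.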

Now Python.

Run nano python_loop.py and paste the following:

#!/usr/bin/env python3
import sys
NUMBER = int(sys.argv[1]) 
s = 0
for i in range(NUMBER):
    s += 1

Run it with time python python_loop.py 10000000, and do it twice so you're sure the second timing is a warm run and not skewed by first-run start-up costs.

 time python python_loop.py 10000000

real    0m1.964s
user    0m1.930s
sys     0m0.037s

That is a basic example of in-process C vs Python, but what really sucks is the marshalling across from Python to a C lib, which is how so many Python pip packages work. Also, because I am impatient I dropped the Python loop count, so multiply the time result by 45 (450,000,000 / 10,000,000). That makes it about 88 seconds, so we are really talking about a difference of roughly x44,000 against the -O3 build, or about x42 against the unoptimised one.
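
For anyone who wants to check the sums, here is a rough sketch of that arithmetic using the "real" times from the runs above. It assumes the Python time scales linearly with the loop count, which is my assumption and not something I measured, and the variable names are just made up for illustration.

```python
# Wall-clock ("real") seconds copied from the runs above.
C_O0_SECONDS = 2.094   # ./c_loop 450000000, default build
C_O3_SECONDS = 0.002   # ./c_loop 450000000, built with -O3
PY_SECONDS = 1.964     # python python_loop.py 10000000

C_ITERATIONS = 450_000_000
PY_ITERATIONS = 10_000_000

# Scale the Python run up to the same loop count as the C runs
# (assumes the run time grows linearly with the loop count).
scale = C_ITERATIONS / PY_ITERATIONS
py_scaled_seconds = PY_SECONDS * scale

ratio_vs_o0 = py_scaled_seconds / C_O0_SECONDS
ratio_vs_o3 = py_scaled_seconds / C_O3_SECONDS

print(f"scale factor: x{scale:.0f}")
print(f"python scaled to 450M iterations: {py_scaled_seconds:.1f} s")
print(f"python vs C default build: x{ratio_vs_o0:.0f}")
print(f"python vs C -O3 build: x{ratio_vs_o3:,.0f}")
```

The exact figure moves around depending on whether you take the real or user time, but it lands in the same ballpark either way.
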
Python is a great language: really useful, easy to read and generally fast enough. But when it comes to any audio DSP where we are marshalling audio frames, it absolutely sucks, and by far more than the plain loop-speed difference.

I have been waiting for what feels like eons, puzzled why there seems to be a total lack of community progress on various basic DSP functions, so I have picked up the mantle myself. I don’t rate my chances, as my MS really does mean others have a head start on my normally challenged self, but hey.

I am picking up C with an eye to the ESP32-S3, but I am also completely confused why no-one has ever provided a few DSP basics for the original Zero, or for what I know for sure the Zero 2 can do with some fairly cheap hardware such as the ReSpeaker 2/4 mic hats.
I am not expecting much from myself, even if I do manage some awful hacks, but I posted this out of my ever-puzzled frustration that no-one in the community has provided some simple DSP utils, especially with all the GitHub examples available and supposed professionals involved…

I just thought I would post the above because, while many know C is faster, a simple loop is this much faster before we even start on the real killer, which is marshalling data across the C and Python boundary. It's just crazy that we are complaining about hardware when the problem is the language we use for certain processes.
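
To show what I mean by the boundary cost, here is a crude sketch. The standard-library struct module (which is C underneath) stands in for any C-backed pip package, and the two function names and the made-up frame are mine, purely for illustration. One version crosses into C once per sample, the other once per frame, and both return the same answer.

```python
import struct
import timeit

FRAME_SAMPLES = 128  # frame size picked for illustration

def peak_per_sample(buf):
    """Cross into C once per sample: one struct call for every 2 bytes."""
    peak = 0
    for offset in range(0, len(buf), 2):
        (sample,) = struct.unpack_from("<h", buf, offset)
        peak = max(peak, abs(sample))
    return peak

def peak_per_frame(buf):
    """Cross into C once per frame: a single struct call unpacks the lot."""
    samples = struct.unpack(f"<{len(buf) // 2}h", buf)
    return max(map(abs, samples))

# One frame of made-up 16-bit little-endian samples (-64 .. 63).
frame = struct.pack(f"<{FRAME_SAMPLES}h", *range(-64, 64))

# Both approaches must agree on the result.
assert peak_per_sample(frame) == peak_per_frame(frame) == 64

# Time 2000 frames through each version; the numbers vary by machine.
for fn in (peak_per_sample, peak_per_frame):
    seconds = timeit.timeit(lambda: fn(frame), number=2000)
    print(f"{fn.__name__}: {seconds:.4f} s for 2000 frames")
```

I have not put timings here as they will differ per machine, but the per-sample version makes 128 calls across the boundary for every frame where the per-frame version makes one.
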

Python is great, but it's absolute dirt for audio DSP, and much of the required input of a voice AI is audio DSP alone.

Also, breaking up items such as beamforming, EC and MFCC into individual packages is nuts, as marshalling across the language boundary takes place in each package. Through the pipeline things get worse still, as what could be a single load-intensive FFT routine is often repeated in each package.
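
A toy sketch of the repeated-FFT point. Everything in it is hypothetical: the naive DFT stands in for a proper optimised FFT, and the two "stages" are made-up placeholders, not real beamforming or MFCC code. It only counts how many transforms each layout needs per frame.

```python
import cmath
import math

calls = {"dft": 0}  # counts transforms so the saving is visible

def dft(frame):
    """Naive O(n^2) DFT; stands in for a real optimised FFT routine."""
    calls["dft"] += 1
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def band_energy(spectrum, lo, hi):
    """Hypothetical stage 1: energy in bins lo .. hi-1."""
    return sum(abs(c) ** 2 for c in spectrum[lo:hi])

def peak_bin(spectrum):
    """Hypothetical stage 2: strongest bin in the lower half of the spectrum."""
    half = spectrum[: len(spectrum) // 2]
    return max(range(len(half)), key=lambda k: abs(half[k]))

def separate_packages(frame):
    """Each 'package' takes raw audio and runs its own transform."""
    return band_energy(dft(frame), 0, 4), peak_bin(dft(frame))

def fused_pipeline(frame):
    """One transform per frame, shared by both stages."""
    spectrum = dft(frame)
    return band_energy(spectrum, 0, 4), peak_bin(spectrum)

# A made-up 16-sample frame holding a tone that sits exactly on bin 3.
frame = [math.cos(2 * math.pi * 3 * t / 16) for t in range(16)]

for pipeline in (separate_packages, fused_pipeline):
    calls["dft"] = 0
    pipeline(frame)
    print(f"{pipeline.__name__}: {calls['dft']} transform(s) per frame")
```

With only two stages the saving is one transform per frame, but it grows with every extra package in the chain that does its own FFT.
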

If you want to dig deeper than my Dummies Guide level of C, have a look at some of the great work at https://github.com/orgs/42io/repositories
This is very interesting for me: I already think this guy is amazing at C, but I am pretty sure I can make those datasets much better for noise and operation.

rolyan_trauts · March 3, 2022, 7:00am

Probably one of the best implementations of various beamformers, AGC and NS I have seen is https://github.com/athena-team/athena-signal and it's quite likely that adding PortAudio or piped input isn't that much work for someone with a smattering of C.

It will be an age before I get to that level and get around to it, as I am trying to get to grips with the ESP32.
But if anyone fancies adding streaming to something like the above, which is not much more than chunking frames through efficient, optimised code, please do, and release it as a standalone for all to use.

Athena grabs 128 frames at a time, which is 8ms chunks at a 16kHz sample rate, so with a modicum of knowledge streaming should not be that hard to do. Commercially there are a rake of offerings, but apart from research Python code there is a huge hole in the Linux ecosphere for optimised libs to do what has become a pretty common HMI.
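
Just to show the shape of the chunking, here is a sketch in Python (the real thing wants doing in C, this is only to show how little there is to it). It assumes raw 16-bit little-endian mono PCM at 16kHz, which is my assumption and not taken from the Athena source, and the frames() helper is a made-up name.

```python
import io
import struct

SAMPLE_RATE = 16_000   # Hz, assumed
FRAME_SAMPLES = 128    # samples per chunk, as quoted above
BYTES_PER_SAMPLE = 2   # 16-bit mono PCM, assumed
FRAME_BYTES = FRAME_SAMPLES * BYTES_PER_SAMPLE

def frames(stream):
    """Yield tuples of FRAME_SAMPLES ints from a raw PCM byte stream.

    A trailing partial frame at end of stream is dropped.
    """
    while True:
        chunk = stream.read(FRAME_BYTES)
        if len(chunk) < FRAME_BYTES:
            return
        yield struct.unpack(f"<{FRAME_SAMPLES}h", chunk)

# Demo on an in-memory stream of 300 made-up samples: two full frames,
# with the 44 left-over samples dropped.
pcm = struct.pack("<300h", *range(300))
chunks = list(frames(io.BytesIO(pcm)))
frame_ms = 1000 * FRAME_SAMPLES / SAMPLE_RATE
print(len(chunks), "full frames of", frame_ms, "ms each")
```

For piped input you would pass sys.stdin.buffer in place of the BytesIO, with something like arecord -r 16000 -f S16_LE -c 1 -t raw feeding the pipe.
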

If that was done, or even better the great hybrid of athena and https://github.com/breizhn/DTLN in https://github.com/avcodecs/DTLNtfliteC, then the Zero 2 and Pi 3A+ with the ReSpeaker 2-mic / 4-mic hat would be an option. It is not microcontroller cheap, but for many an extra $10-20 is no big deal for the ease and added horsepower, which it has in bundles, just not enough for Python research code.

It has been a complete mystery why the many source repos have not been adapted to streaming interfaces and why adopted Linux libs don’t already exist, and it's shocking to realise it is purely a lack of coding ability to port what already exists, without even the need for innovation!
I still think station-side targeted voice extraction would enhance that greatly, but just some simple, efficient audio input algs for KWS ‘ears’ is such a basic need.