It does work; the coder might have made one mistake and, like the best of us, slipped up and probably got things the wrong way round.
The frame_size is a single FFT block and should be a power of 2, but the code does it the other way round and sets the frame size to a division of the sample rate to give 10 ms worth. The tail_length has been provided as a power of 2 (4096), when really it just needs to cover 100 ms or more of the sample rate.
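Roughly what I mean (a minimal sketch against the stock speexdsp API, assuming 16 kHz audio; the constants are my suggestion, not what the code currently uses):

```c
#include <speex/speex_echo.h>

#define SAMPLE_RATE 16000
/* frame_size: one FFT block, so make it a power of 2 (256 = 16 ms at 16 kHz) */
#define FRAME_SIZE  256
/* tail_length: echo tail to model, 100 ms or more (4096 = 256 ms at 16 kHz) */
#define TAIL_LENGTH 4096

int main(void)
{
    int rate = SAMPLE_RATE;
    /* arguments are (frame_size, filter_length), in that order */
    SpeexEchoState *st = speex_echo_state_init(FRAME_SIZE, TAIL_LENGTH);
    speex_echo_ctl(st, SPEEX_ECHO_SET_SAMPLING_RATE, &rate);
    /* then per frame: speex_echo_cancellation(st, rec, play, out); */
    speex_echo_state_destroy(st);
    return 0;
}
```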
It probably just needs someone to check over some simple compile optimisations.
I did try the fftw3 libs in the speexdsp compile the second time (Pi 3), but whether through my inexperience or because the internal FFT routine is specific to what is needed, it either didn't make things better or ran slightly worse.
Is there an overhead in calling external libs versus internal code? I presume so.
I haven't a clue, and this is with my lack of knowledge, but if the FFTs can be batched then maybe gpu_fft would be a major boost.
The WebRTC AEC is generally regarded as better, though, and it's a much more complex piece of code.
It's a shame it doesn't have Python wrappers for the whole thing, as from what I have seen it's just the VAD that seems to crop up. Is this complete or just the VAD?
It's part of PulseAudio, but they have added drift compensation, and I am not sure if that's a positive or a negative for embedded.
It was done to allow WebRTC to run with separate sound cards for playback and capture and cope with the clock drift between them.
I have a hunch that if that were removed and vanilla webrtc_audio_processing were used in its entirety, with an AEC that worked in a similar way to the FIFO system above, we might get something much better than Speex.
But speexdsp is running quite well, and this was something many said would never work.
It's not about being cheap; it's about enabling multiple satellites in wide-array network microphone systems, which even with a Pi3A+ & $10 sound card isn't really cheap when you're talking 2 or 4 of them.
But wow, it's at least possible, as opposed to $70 USB mics before you even have a SoC. $140 is much more palatable than $500+ if you want a 4x speaker/mic setup, or $70 vs $250 for 2x.
I have even been thinking, with Snapcast, that a room might also have an echo channel carrying not just what the device is playing but what the whole room might be playing, which could be sourced from HDMI ARC.
Then, in a domestic situation, if we can get AEC working to a reasonable level (and maybe the WebRTC clock-drift compensation is needed after all), the common source of domestic media noise interference is negated.
I would love to see a native WebRTC AEC running; I'm not sure about the drift compensation freedesktop implemented as far as the Pi is concerned, but WebRTC rocks.
I will set up PulseAudio tomorrow, see how it runs with an el cheapo sound card with AEC, and post some audio and opinions.
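If anyone wants to try the same, it should be something like this (a sketch; aec_method=webrtc selects the WebRTC canceller in module-echo-cancel, and the source/sink names are just my own labels):

```
pactl load-module module-echo-cancel aec_method=webrtc source_name=ec_source sink_name=ec_sink
pactl set-default-source ec_source
pactl set-default-sink ec_sink
```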
We have no audio preprocessing in what we are doing unless it's bought in through hardware, which I think is probably unnecessary.
There are filters we can attach, 'notch' for frequency, AEC and compression, that could give really good results, especially if the model is recorded with the same processing too.
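Just to illustrate the sort of chain I mean (a sketch using sox as a stand-in; the notch frequency is made up, and the compand settings are the stock example from the sox manual):

```
# notch out a fixed interference tone at 1 kHz (100 Hz wide), then compress
sox in.wav out.wav bandreject 1000 100h compand 0.3,1 6:-70,-60,-20 -5 -90 0.2
```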
It would be really good to settle on a standard audio preprocessor and also have it in the project, as it may vastly improve accuracy with noise and media.
PS: not going to edit that, as my dyslexia got things wrong in a sentence about things being the wrong way round; change the code and ignore my rambles.
@fastjack I haven't got one, but doesn't the ReSpeaker have a loopback, so you could actually loop back what you play on another card? Why is another question, but you could, and still get AEC?
Actually, yeah, maybe you could go all audiophile and use a USB DAC for audio out but loop that audio back for AEC? (It still plays through the WM8960 that the 2-mic ReSpeaker has.)