I’d also like to see this. I wrote a Tasker script that records voice when I press my Bixby button, but unfortunately Tasker only records 3gpp and mp4. If I trigger an external app instead, it sometimes takes a very long time for the app to start/stop.
I’m thinking we could do this pretty easily with ffmpeg, as long as the HTTP Content-Type header is set appropriately when you POST to /api/speech-to-intent (or /api/speech-to-text).
It looks like ffmpeg supports 3gp audio, so this should work as long as we include it in the Docker image.
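Here's a rough sketch of what I have in mind, not a real implementation. The names (`SUFFIX_BY_CONTENT_TYPE`, `suffix_for`, `ffmpeg_command`, `to_wav`) are made up for illustration, the list of Content-Type values is just my guess at what clients would send, and I'm assuming the recognizer wants 16 kHz 16-bit mono WAV. It writes the upload to a temp file rather than piping it, since 3gp/mp4 files generally need to be seekable for ffmpeg to read them.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical mapping from Content-Type to a file suffix ffmpeg can probe.
# The set of accepted types is an assumption, not an existing API contract.
SUFFIX_BY_CONTENT_TYPE = {
    "audio/3gpp": ".3gp",
    "video/3gpp": ".3gp",
    "audio/mp4": ".m4a",
    "video/mp4": ".mp4",
    "audio/wav": ".wav",
    "audio/x-wav": ".wav",
}


def suffix_for(content_type):
    """Return the file suffix for a Content-Type value, or None if unknown."""
    # Drop parameters such as "; codecs=..." and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return SUFFIX_BY_CONTENT_TYPE.get(media_type)


def ffmpeg_command(in_path, out_path):
    """Build an ffmpeg command that converts in_path to 16 kHz mono 16-bit WAV."""
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", str(in_path),
        "-vn",                   # ignore any video stream
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate (assumed target)
        "-acodec", "pcm_s16le",  # 16-bit little-endian PCM
        "-f", "wav",
        str(out_path),
    ]


def to_wav(body, content_type):
    """Convert a POSTed audio body to WAV bytes based on its Content-Type."""
    suffix = suffix_for(content_type)
    if suffix is None:
        raise ValueError(f"unsupported Content-Type: {content_type!r}")
    if suffix == ".wav":
        return body  # already WAV, pass through untouched

    with tempfile.TemporaryDirectory() as tmp:
        in_path = Path(tmp) / f"input{suffix}"
        out_path = Path(tmp) / "output.wav"
        in_path.write_bytes(body)
        # Requires the ffmpeg binary on PATH (e.g. installed in the Docker image).
        subprocess.run(ffmpeg_command(in_path, out_path), check=True)
        return out_path.read_bytes()
```

The HTTP handler would then call something like `to_wav(request_body, content_type_header)` before handing the audio to the recognizer. WAV uploads pass straight through, so existing clients wouldn't need to change.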