Implement web interface for audio streaming - could alleviate need for satellite rhasspy instances

If Rhasspy had a “client” webpage that could stream audio to and from the base, practically any device with a web browser and audio could become a satellite, with no need for multiple Rhasspy devices or platform-specific applications. Optionally, all speech-to-text and intent processing could then happen directly on the base.

I support the proposal.
Such a web microphone could be embedded in any smart home control system, since they all have a web interface.

I’m working on this on my own right now: Possible to use Tablet as Mic?

Fantastic, I was mulling over doing this myself - I will hold off for the time being.

It seems the easiest strategy would be to stream audio from the browser over a websocket and, on the backend, publish the audio chunks to MQTT under hermes/audioServer/SITEID, while also subscribing to the necessary TTS stream so command responses play back on the device.
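
To make the backend half of that concrete, here is a rough sketch of the "take a chunk from the websocket, publish it to MQTT" step. It is only an illustration under some assumptions: as far as I understand the Hermes protocol, audio frames go to hermes/audioServer/SITEID/audioFrame as small WAV-encoded chunks, and I am assuming the browser sends raw 16 kHz, 16-bit, mono PCM. The function names and the `publish` callback are made up for this example; in practice `publish` would be whatever your MQTT client provides.

```python
import io
import wave
from typing import Callable

# Assumed audio format coming from the browser (not confirmed in this thread).
SAMPLE_RATE = 16000  # samples per second
SAMPLE_WIDTH = 2     # bytes per sample (16-bit)
CHANNELS = 1         # mono


def audio_frame_topic(site_id: str) -> str:
    """Build the MQTT topic for microphone audio from one site."""
    return f"hermes/audioServer/{site_id}/audioFrame"


def pcm_to_wav(pcm: bytes) -> bytes:
    """Wrap a raw PCM chunk in a WAV header so it is a standalone WAV payload."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(CHANNELS)
        wav_file.setsampwidth(SAMPLE_WIDTH)
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


def forward_chunk(
    pcm: bytes,
    site_id: str,
    publish: Callable[[str, bytes], None],
) -> None:
    """Convert one websocket audio chunk and hand it to the MQTT publisher."""
    publish(audio_frame_topic(site_id), pcm_to_wav(pcm))
```

The websocket handler would call `forward_chunk` for every binary message it receives. Playback would be the reverse direction: subscribe to the TTS/playback topic for the same site and push the received bytes back down the websocket to the browser.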

Hopefully the frontend could be flexible enough to ship as a basic webpage containing the websocket JavaScript code, while allowing user-customizable JavaScript/CSS/HTML for all skinning possibilities.

Obviously it is a fairly large undertaking; I am looking forward to your work!

Linto are doing this; they just created some MFCC libs in Dart so they will run in a browser.

Yep! Basically the plan! My first iteration will have this all communicating through Node-RED (via websocket) so I can exercise some extra control over a few aspects. Regardless, once I’m done, I’ll share the code and the rest of you much-smarter folks can incorporate it into documentation or a cookbook and go from there.

It’s going well. I’m not that far from demoing. The issue is finding enough time!!

I’m very excited about this! It solves so many problems for me.

I may eventually find that the best way forward is to rely on a framework of sorts, but for now I’m working on building this all with vanilla JS. The efficiency nerd in me can’t help but try. 🙂