There is always going to be some need for an audio service, but to be honest I am the same as @koan, as I have never run snapcast or mopidy.
I am sort of curious how this is handled in server/satellite situations; being a Rhasspy noob, I am not sure how it's handled or what the future plans are.
Each satellite has a microphone and VAD to stop streaming when there is no voice, and with multiple satellites that obviously makes huge savings in bandwidth.
You might have 5.1 or 7.1 or whatever multichannel audio is in fashion, and as that also makes a wide-area mic array, that can be a lot of bandwidth to save.
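To put a rough shape on the saving, here is a minimal sketch of a satellite that only forwards frames when a gate thinks there is voice. It is not Rhasspy's actual VAD: it uses a plain RMS energy threshold as a stand-in, and the sample rate, frame size, threshold, hangover length, and the helper names `frame_rms`, `gate_frames`, and `make_frame` are all assumptions made up for the illustration. At the assumed 16 kHz, 16-bit mono, an ungated mic channel is 256 kbit/s, and that multiplies by every channel on every satellite.

```python
import math
import struct

SAMPLE_RATE = 16000   # assumed capture rate in Hz
FRAME_MS = 20         # assumed frame length in milliseconds
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000
RMS_THRESHOLD = 500.0  # assumed energy threshold, would need tuning per mic


def frame_rms(frame: bytes) -> float:
    """RMS level of one frame of 16-bit little-endian mono PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack(f"<{count}h", frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def gate_frames(frames, threshold=RMS_THRESHOLD, hangover=5):
    """Yield voiced frames plus a short tail of `hangover` quiet frames."""
    quiet = hangover  # gate starts closed
    for frame in frames:
        if frame_rms(frame) >= threshold:
            quiet = 0
            yield frame
        elif quiet < hangover:
            quiet += 1
            yield frame


def make_frame(amplitude: int) -> bytes:
    """Synthetic test frame: a square wave at the given amplitude."""
    samples = [amplitude if i % 2 == 0 else -amplitude
               for i in range(FRAME_SAMPLES)]
    return struct.pack(f"<{FRAME_SAMPLES}h", *samples)


# Two seconds of synthetic input: quiet, a short loud burst, quiet again.
frames = [make_frame(10)] * 50 + [make_frame(3000)] * 10 + [make_frame(10)] * 40
sent = list(gate_frames(frames))

total_bytes = sum(len(f) for f in frames)
sent_bytes = sum(len(f) for f in sent)
print(f"streamed {sent_bytes} of {total_bytes} bytes "
      f"({100 * sent_bytes / total_bytes:.0f}%)")
```

The hangover tail is there so the end of a word is not clipped when the level drops. The numbers only describe this synthetic input; the real saving depends on how much of the time someone is actually speaking.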
So I am presuming there will always be an audio service, even if just for input.
You might still have two-tier recognition with an authoritative server or a satellite model.
So even though it's been modularized, doesn't the need reappear because it can now be partitioned into tiers?
My head is spinning a bit because with media it's likely to be streaming channels, with at least a single stream channel accompanied by a room echo channel + mic channel.
If you have a satellite mode, then audio duck/cork is initially processed locally and continues to duck/cork until the server either authorises it or releases it back to normal.
That's a local audio process, isn't it, and a Rhasspy process? I thought I would ask as I am not sure what snapcast or mopidy do, or whether they even have any function for, or awareness of, the need for a room echo channel or input.
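To make the duck/cork question concrete, the local-first flow described above could be sketched as a small state machine on the satellite. This is purely hypothetical: `DuckState`, `SatelliteDucker`, the three event methods, and the volume levels are invented for the illustration and are not Rhasspy, snapcast, or mopidy APIs.

```python
from enum import Enum


class DuckState(Enum):
    NORMAL = "normal"        # media plays at full volume
    DUCKED = "ducked"        # local voice detected, waiting on the server
    CONFIRMED = "confirmed"  # server authorised the session, stay ducked


class SatelliteDucker:
    """Hypothetical local duck/cork controller for one satellite."""

    def __init__(self, normal_volume=1.0, ducked_volume=0.2):
        self.normal_volume = normal_volume
        self.ducked_volume = ducked_volume
        self.state = DuckState.NORMAL

    @property
    def volume(self):
        """Playback volume implied by the current state."""
        if self.state is DuckState.NORMAL:
            return self.normal_volume
        return self.ducked_volume

    def on_local_voice(self):
        """Duck straight away, without waiting for a server round trip."""
        if self.state is DuckState.NORMAL:
            self.state = DuckState.DUCKED

    def on_server_accept(self):
        """Server authorised the session: keep the media ducked."""
        if self.state is DuckState.DUCKED:
            self.state = DuckState.CONFIRMED

    def on_server_release(self):
        """Server rejected or finished the session: restore the media."""
        self.state = DuckState.NORMAL


ducker = SatelliteDucker()
ducker.on_local_voice()      # satellite hears something, ducks locally
print(ducker.state.value, ducker.volume)
ducker.on_server_release()   # server says it was nothing, back to normal
print(ducker.state.value, ducker.volume)
```

The point of the sketch is only that the first transition happens on the satellite, so whatever plays the media there would need some way to be told to duck, which is the part I am unsure snapcast or mopidy cover.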
I have had a brief look over snapcast, which looks awesome and seems to be the default audio backend, whilst apps like mopidy are just audio clients that push streams onto snapcast.
Can you stream channels to and from snapcast, or does each client pull the whole stream and then get assigned a channel?