Thank you for getting started on this! Once it’s in a workable state, let’s talk about the best way to get it into users’ hands.
One integration approach is to add it as a git submodule in rhasspy-voltron and add a configuration section in the web interface.
Another approach is to leave it separate, and come up with an easy way for users to add external services to their Rhasspy installation (probably via Docker). The user would then select “Hermes MQTT” for speech to text, and probably do additional configuration via the service itself (or something in their profile).
The first approach is probably easier for users, but the second approach gives you more control over the defaults, how configuration works, and which version of your service is in use, all without having to go through a pull request.
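To make the second approach a bit more concrete, here is a rough sketch of the message an external speech to text service would hand back over MQTT. It only builds the topic and JSON payload, and leaves the actual publishing to whatever MQTT client the service uses. The topic and field names follow the Hermes protocol as I understand it, so please check them against the Rhasspy documentation; `build_text_captured` is a made-up helper name, not part of any library.

```python
import json


def build_text_captured(text, likelihood, site_id="default", session_id=""):
    """Return (topic, payload) for a transcription result.

    Assumption: topic and key names mirror the Hermes ASR
    "text captured" message; verify before relying on them.
    """
    topic = "hermes/asr/textCaptured"
    payload = json.dumps(
        {
            "text": text,
            "likelihood": likelihood,  # the service's own confidence value
            "seconds": 0.0,            # transcription time, unknown in this sketch
            "siteId": site_id,
            "sessionId": session_id,
        }
    )
    return topic, payload


# Hypothetical usage: the external service would publish this pair
# with its MQTT client of choice.
topic, payload = build_text_captured("turn on the light", 0.93)
```

With something like this, Rhasspy itself only needs to know that “Hermes MQTT” is selected; everything else stays inside the external service.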
Let me know your thoughts.
No, I just haven’t found a good way to reconcile the different ways each speech system represents “confidence”. I think it will ultimately have to be a setting within each service.
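As a minimal sketch of what a per-service setting could look like: each service keeps a threshold in its own native scale, plus a flag for whether higher or lower scores are better, so nothing has to be normalized across systems. The class name, the service names, and the numbers are all hypothetical and only there for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfidenceSetting:
    """Threshold expressed in the service's own confidence scale."""

    min_confidence: float
    higher_is_better: bool = True

    def accepts(self, raw_score: float) -> bool:
        # Compare in the native scale; no cross-service normalization.
        if self.higher_is_better:
            return raw_score >= self.min_confidence
        return raw_score <= self.min_confidence


# Hypothetical services with very different notions of "confidence".
SETTINGS = {
    "service_a": ConfidenceSetting(min_confidence=0.8),      # probability in 0..1
    "service_b": ConfidenceSetting(min_confidence=-3000.0),  # log-likelihood style score
    "service_c": ConfidenceSetting(0.2, higher_is_better=False),  # cost or distance
}


def is_confident(service: str, raw_score: float) -> bool:
    """Apply the named service's own threshold to its raw score."""
    return SETTINGS[service].accepts(raw_score)
```

The trade-off is that each service has to document what its threshold means, but users never have to compare numbers between systems.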