I wanted to share some ideas that came to mind for creating a skills system for Rhasspy. Surely someone will have better ideas than mine. Some topics to manage skills could be:
rhasspy/skills/<skills name>/register: this is the main topic, where a skill sends both its sentences and its slots. It could work in a similar way to the message that instructs the NLU system to re-train. Once the data is received, it is saved in a folder that could be named skills, which would contain one folder per skill holding its slots and sentences. After saving, a re-train request is sent to Rhasspy. If all goes well, rhasspy/skills/<skills name>/registered is sent and the skill's name is saved in a slot; otherwise the saved files are erased and rhasspy/skills/<skills name>/error is sent.
rhasspy/skills/<skills name>/unregister: once received, all of the skill's sentences and slots are erased, the skill's name is removed from the slot that holds the names of all skills, and the NLU is retrained. If all goes well, rhasspy/skills/<skills name>/unregistered is sent.
rhasspy/skills/<skills name>/updateSlots and rhasspy/skills/<skills name>/updateSentences: to update a skill's slots and sentences.
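To make the register flow a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the payload layout (a `sentences` string plus a `slots` mapping), the file names, and the `handle_register` and `skill_topic` helpers are all hypothetical, and the actual retraining is passed in as a callable because how Rhasspy would be asked to re-train is not decided here. Keeping the slot with the names of all skills is left out for brevity.

```python
import shutil
from pathlib import Path
from typing import Callable

TOPIC_PREFIX = "rhasspy/skills"  # prefix taken from the proposal above


def skill_topic(skill_name: str, action: str) -> str:
    """Build a topic such as rhasspy/skills/weatherSkill/register."""
    return f"{TOPIC_PREFIX}/{skill_name}/{action}"


def handle_register(
    skills_dir: Path,
    skill_name: str,
    payload: dict,
    retrain: Callable[[], bool],
) -> str:
    """Save the skill's files, retrain, and return the topic to publish next.

    `payload` is assumed to look like:
    {"sentences": "<ini text>", "slots": {"<slot name>": ["value", ...]}}
    """
    skill_dir = skills_dir / skill_name
    (skill_dir / "slots").mkdir(parents=True, exist_ok=True)

    # One folder per skill, holding its sentences and its slots.
    (skill_dir / "sentences.ini").write_text(payload["sentences"], encoding="utf-8")
    for slot_name, values in payload.get("slots", {}).items():
        slot_file = skill_dir / "slots" / slot_name
        slot_file.write_text("\n".join(values) + "\n", encoding="utf-8")

    # `retrain` stands in for the re-train request; True means it succeeded.
    if retrain():
        return skill_topic(skill_name, "registered")

    # Training failed: erase the saved files and report the error.
    shutil.rmtree(skill_dir)
    return skill_topic(skill_name, "error")
```

The unregister handler would be the mirror image: remove the skill's folder, retrain, and answer on the unregistered topic.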
All these topics could be handled by a new module that could be called rhasspy-skills. Once the registration of skills is handled, the most difficult part comes: managing conflicts, although that could be done later. Intent names are not a big problem, because an intent name should start with the skill's name, followed by an underscore and then the name of the intent (weatherSkill_isWindy). Identical sentences are more difficult. My idea, for now, is that when the same sentence occurs in different intents during the training phase, it is reported on an MQTT topic, and these reports can be picked up by the Dialogue manager. When a recognized text matches one of these conflicting sentences, there are two ways to resolve it (the ones that have come to mind so far). The first is to keep some kind of state: for example, if we previously started the music and then say stop, Rhasspy stops the music because it is playing, even if there is also a stop for the timer. The second, when there is no way to resolve the conflict automatically, is to ask the user through continueSession, so the user says which skill should handle the intent.
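As a rough illustration of the naming rule and the conflict handling, here is a small Python sketch. All function names are made up for the example, and it compares plain sentences only; real Rhasspy sentence templates with alternatives and optional words would have to be expanded first, which is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def qualified_intent(skill_name: str, intent_name: str) -> str:
    """Prefix the intent with the skill's name, e.g. weatherSkill_isWindy."""
    return f"{skill_name}_{intent_name}"


def find_conflicts(sentences_by_intent: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each sentence that appears in more than one intent to those intents."""
    owners = defaultdict(set)
    for intent, sentences in sentences_by_intent.items():
        for sentence in sentences:
            # Ignore case and extra whitespace when comparing sentences.
            normalized = " ".join(sentence.lower().split())
            owners[normalized].add(intent)
    return {s: sorted(intents) for s, intents in owners.items() if len(intents) > 1}


def resolve(candidates: List[str], active_skill: Optional[str]) -> Optional[str]:
    """Pick the candidate intent that belongs to the currently active skill.

    Returns None when the state does not settle it, in which case the
    user would have to be asked (for example through continueSession).
    """
    if active_skill is not None:
        matches = [c for c in candidates if c.startswith(active_skill + "_")]
        if len(matches) == 1:
            return matches[0]
    return None
```

With a musicSkill_stop and a timerSkill_stop that both contain the sentence stop, find_conflicts would report the sentence with both intents, and resolve would pick musicSkill_stop while the music skill is the active one.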
If you want, I could start converting some of my ideas into code. I’m not the best Python programmer, but not the worst either. Sorry for the bad English, but I hope I was able to express my ideas on how to integrate skills into Rhasspy, so that in the future there could be a marketplace where everyone can download skills easily.
I’m still not sure myself what the best API for skills would be. There have been some discussions about skills/apps in the past:
Have you taken a look at these for some inspiration?
I’m also not sure yet that we need something like the rhasspy-skills module you propose. At least for now, there’s barely an app ecosystem yet, so working on an app manager looks like putting the cart before the horse now. And some of the problems like conflicts between sentences look actually unsolvable anyway.
I had read some of these posts, but I hadn’t found the issue on GitHub, so thank you for finding it. You’re right, I was thinking too far ahead. However, I think even a simple module that can only add the files containing the sentences is worthwhile, because otherwise, at least as far as I can tell, the topic management would end up in another Rhasspy component. With a separate module you could keep things more orderly and easier to expand in the future.
I think this is what’s proposed in https://github.com/rhasspy/rhasspy-hermes/issues/12, isn’t it? Rhasspy would expose an MQTT topic/REST endpoint that the app could use to add its sentences. On installation or startup of the app, it could run this to retrain Rhasspy. Or do you mean something else?
For what it’s worth, I really like this idea.
Mycroft has skills, but of course they are a cloud-based install.
It would be great (similar to Home Assistant) to be able to pull down or copy/paste local components that bring functionality into Rhasspy, since when you build Rhasspy it’s pretty much from scratch.
Being able to play music from Spotify or Sonos, or set a timer, or do other tasks we take for granted (and that the community can work on) is very appealing, as I’m sure there are currently lots of Node-RED flows in the mix trying to parse out the intents.