WIP skills Project

LordQuasar · July 26, 2020, 9:26pm

I am Gido, an ex-Snips user and amateur skill developer. I normally work as a frontend developer and Python isn’t my native language. I tried installing Rhasspy a few times but only got it working after seeing this video from ulno: Privacy Aware Voice Assistant, now for real: Mycroft, Rhasspy, and Node-RED - Part 2 (Rhasspy) - YouTube. I also didn’t want to depend on a Docker service and wanted to install Rhasspy locally.

A few days ago I started reprogramming and optimizing a collection of skills I had already made for Snips, combining them all into one service that listens for the MQTT commands. The project is in German and English (even though I haven’t got the timer running in English yet).

My goal is to replace my Alexa device. Therefore I need a kitchen timer, a way to play music (preferably the files I really own as MP3s), perhaps hear some streaming radio, watch YouTube, make a playlist (which I presume is still not possible with Alexa), turn the lights in the apartment on and off (already working on this), and last but not least face recognition (or, if you Rhasspy guys get it working, voice recognition) to know who is talking to the device and later on train the device on the needs of the people in the apartment, and of course totally offline… And I don’t want to use home automation… I want to code it myself, totally offline… Okay, I do use the WaveNet voices from Google, but who cares…

I didn’t understand the weather skill (which was shown here in the forum). Compared to the Snips skill it had a zillion more files and presumably is way better, but I presume I will take the Snips version and the Snips way of thinking: just acting on the MQTT messages. There are no skills in Rhasspy; the Alice Project has them. I don’t think the skills at Snips were an ideal world, for there is no overall command like stop, cancel, next or previous… How can the Raspi know what “next” means when I play music or play video if I make 2 different skills? Therefore I am now making one overall project with all the skills I need and an overall next etc. command.

You can see my Rhasspy project working here, in German

(The volume of the device is not very high and it is hard to hear; also, while playing music with the speakers too near, mycroft doesn’t react anymore.)

Perhaps it is baby Python, though it works almost perfectly in German, but I am still struggling with the English timer. I noticed that using number ranges like (1…10000) makes the training last for ages, so I reduced those numbers.
You can find the baby WIP code here

If you have music, just put it in the /home/pi/Music directory. If you say “scan the music”, this directory gets scanned and 2 files are generated: one with all information about the MP3s and one with a JSON list of all artists. So far I have used the JSON list to add the artists to the slot “artists” manually, but I will try to fill the artist slot automatically and then retrain Rhasspy. This is great, for not even Snips had this: Snips had a super long list of artists, but a lot of the artists in my personal MP3s were not on their list and therefore unreachable…
I will also add a virtual display of what is happening in Rhasspy as soon as I have my skills done…
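
The music scan described above could be sketched roughly like this. It is only an illustration, not the code from the project: it assumes files are named "Artist - Title.mp3" (a real scanner would more likely read ID3 tags), and the function names and the output file names `tracks.json` and `artists.json` are made up for the example.

```python
import json
import os


def scan_music(music_dir):
    """Walk music_dir and return one record per .mp3 file.

    Assumption: files are named "Artist - Title.mp3". Files without
    the " - " separator get the artist "unknown".
    """
    tracks = []
    for root, _dirs, files in os.walk(music_dir):
        for name in sorted(files):
            stem, ext = os.path.splitext(name)
            if ext.lower() != ".mp3":
                continue  # skip everything that is not an MP3
            artist, sep, title = stem.partition(" - ")
            if not sep:
                artist, title = "unknown", stem
            tracks.append({
                "artist": artist.strip(),
                "title": title.strip(),
                "path": os.path.join(root, name),
            })
    return tracks


def write_scan_files(tracks, out_dir):
    """Write the full track list and a sorted, de-duplicated artist list."""
    artists = sorted({track["artist"] for track in tracks})
    with open(os.path.join(out_dir, "tracks.json"), "w", encoding="utf-8") as f:
        json.dump(tracks, f, indent=2)
    with open(os.path.join(out_dir, "artists.json"), "w", encoding="utf-8") as f:
        json.dump(artists, f, indent=2)
    return artists
```

The artist list is kept as a plain JSON array so it can be pasted into, or later sent to, the “artists” slot.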

I am hoping there will be a personal wake word somehow so I can distinguish who is talking to the machine (that did work with Snips), and the possibility for Rhasspy to ask a question, either to confirm something or just to ask something. And of course to be able to translate longer texts which contain unknown words… I know this does work with Kaldi. I installed Kaldi a while ago and used my translation skill from Snips, and it was able to translate with Microsoft Azure on the fly. I know there is a variable which one can set here, but then it applies to everything you say and works slowly. It would be great to have something like: I say “translate from Russian to English”, Rhasspy says “okay lets start” and then starts translating everything it hears till I say “stop translating”… dreams

awesome project! love it!

Daenara · July 26, 2020, 9:38pm

I started tinkering with Snips before it went down but never got out of testing, and I wrote that weather skill that is here. The Snips weather skills at the time I played around with them only had logic to tell the current weather, nothing more. I started adding to them by adding the weather in the future and the ability to ask if certain items were needed, and it outgrew the structure it was in, so I did a rewrite, which is what I ported to Rhasspy. It works fine in 2.4 but is completely broken in 2.5 at the moment, so I would advise against using it until either the legacy command functionality works in Rhasspy or there is a way to dynamically add lists of words via MQTT. Before the latter is achieved there is no need to port it to an MQTT skill.

If you are okay with the Snips weather skill’s functionality (or if it has evolved since I last looked at it), then there is no need to even try using my chaos, because right now that is what it is. I need to work some more on that, add more to the documentation and make it easier to use, but for now I have other things to solve first, like getting my respeaker4 to work again or adding other things into my system, so the weather skill hasn’t been getting much love as of late.

LordQuasar · July 26, 2020, 9:42pm

Hey, no offense, your code is so much better… I already saw you typing seconds after I submitted this text… and thought, oh no, this thread should not be about the weather app… I just mentioned it to compare awesome code and my baby code…
Check my video… hope you like it… I just didn’t know how to get it working, that’s all… Presumably my approach is the wrong one, but it works for me.

LordQuasar · July 26, 2020, 9:44pm

Ah, btw, the code I wrote will not be broken by an update of Rhasspy unless the MQTT JSONs change… for I only listen to the MQTT JSONs and then do other stuff… If Rhasspy sends different JSONs over MQTT in version 2.9 or 3.7, I only need to adapt the intentHandler.py.
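
A minimal sketch of that idea, assuming the intent arrives as a Hermes-style JSON message with an `intent.intentName` field and a `slots` list. The function name `normalize_intent` and the returned dictionary layout are invented for this example and are not taken from the project’s intentHandler.py; the field names should be checked against the Rhasspy version in use.

```python
import json


def normalize_intent(topic, payload):
    """Reduce an intent message to the few fields a handler needs.

    Only this function has to change if the JSON layout changes;
    the skills themselves keep working on the normalized dictionary.
    """
    message = json.loads(payload)
    intent = message.get("intent", {})
    # Fall back to the last topic segment if the payload has no intent name.
    name = intent.get("intentName") or topic.rsplit("/", 1)[-1]
    # Flatten the slot list into {slot name: value}.
    slots = {
        slot["slotName"]: slot.get("value", {}).get("value")
        for slot in message.get("slots", [])
    }
    return {
        "intent": name,
        "slots": slots,
        "site_id": message.get("siteId", "default"),
        "text": message.get("input", ""),
    }
```

An MQTT client callback would call this with the topic and payload of each received message and pass the result on to the matching skill.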

Daenara · July 26, 2020, 9:55pm

I am pretty sure I will make use of quite a bit of what you wrote; you tackled problems I haven’t even gotten to, like a timer. When I adapted the weather skill there was no MQTT, and it uses slot programs to dynamically add word lists from the skill to Rhasspy. That isn’t possible with MQTT, so it still works with the legacy variant, which is a Python script, and that script handling is currently broken in 2.5: it just isn’t even called, which is why you would have had problems using it if you tried. Basically not compatible with 2.5, but hopefully it will be one day.

If you want to integrate more into Rhasspy and make it easier for others to use, you might want to take a look at this. I personally plan to use it for my weather monstrosity once slot programs work via MQTT. The plan is to build a skill repository like Snips used to have, to make installing different skills easier.

LordQuasar · July 26, 2020, 10:10pm

I guess I will make a backup first before I enter

pip3 install --upgrade rhasspy-hermes-app

I don’t want to integrate my code into Rhasspy. I am a frontend (angular\react\etc) developer… I see Rhasspy as my backend and use a normalizer to get the right JSONs. I checked all the projects for months to find an AI voice project I can work on, and I am really sorry that your code doesn’t work anymore. The weather project at Snips was, I guess, number one in Germany. I copied the timer code from a French skill and adapted it, and after improving it for Rhasspy it now really works in German. It never really did everything it was supposed to at Snips…
I am sure there is some way to add slot entries from outside the Rhasspy project and let Rhasspy train again; since there is a frontend which does this, I am sure it is possible to trigger a training on command…
The Rhasspy skill maker community isn’t very large; you are number one, I guess I will be number two… let’s skill this place!
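
One way adding slot entries from outside and retraining might look, sketched under assumptions: it presumes Rhasspy’s web server exposes `POST /api/slots` (a JSON object mapping slot names to value lists) and `POST /api/train`, and that it listens on port 12101. These should be verified against the HTTP API documentation of the installed version. The helper names are made up, and the sketch only builds the requests so it can be inspected without a running Rhasspy.

```python
import json
import urllib.request

# Assumed default address of the Rhasspy web server.
RHASSPY_URL = "http://localhost:12101"


def build_slot_update(slot_name, values, base_url=RHASSPY_URL):
    """Build a POST request that adds values to one slot."""
    body = json.dumps({slot_name: list(values)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/slots",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_train_request(base_url=RHASSPY_URL):
    """Build a POST request that asks Rhasspy to retrain the profile."""
    return urllib.request.Request(f"{base_url}/api/train", data=b"", method="POST")
```

Sending would then be `urllib.request.urlopen(build_slot_update("artists", artists))` followed by `urllib.request.urlopen(build_train_request())`, with the artist list coming from the music scan.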

Daenara · July 26, 2020, 10:14pm

Installing rhasspy-hermes-app should not break anything. It is just an easier way to integrate with MQTT, and there are plans to add automated installation, including adding sentences and word lists and then automatically training Rhasspy with those. It might be a few months before it gets to that point though, because part of it needs support from the Rhasspy side of things. For now it is just a way to listen to intents with less hassle than when using MQTT directly.

LordQuasar · July 26, 2020, 11:08pm

The comment from No_one was deleted… I used pygame as the MP3 player and of course had a hard time playing a playlist without blocking the intentHandler… I am totally unsure about my solution, but I used “threading” in Python to start playing a song so the intentHandler continues even after starting a song… It seems to me that with threading in Python one can do a kind of async playback, but I am totally unsure about this.
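
The threading approach can be sketched as follows. This is not the project’s code: the class `PlaylistPlayer` and the `play_track` callback are invented for illustration, and the actual audio call is left out so the example runs without pygame. The playlist runs in a background thread, and a `threading.Event` lets a later “stop” intent interrupt it.

```python
import threading


class PlaylistPlayer:
    """Play a playlist in a background thread so the caller is not blocked."""

    def __init__(self, play_track):
        # play_track(track, stop_event) must block until the track has
        # finished or stop_event is set. With pygame this would wrap the
        # mixer calls and poll until playback ends (an assumption here).
        self._play_track = play_track
        self._stop_event = threading.Event()
        self._thread = None

    def start(self, playlist):
        """Stop anything still playing, then start the new playlist."""
        self.stop()
        self._stop_event = threading.Event()
        self._thread = threading.Thread(
            target=self._run,
            args=(list(playlist), self._stop_event),
            daemon=True,
        )
        self._thread.start()  # returns immediately

    def _run(self, playlist, stop_event):
        for track in playlist:
            if stop_event.is_set():
                break
            self._play_track(track, stop_event)

    def stop(self):
        """Signal the worker to stop and wait for it to exit."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
            self._thread = None
```

The intent handler would call `start()` for a “play” intent and `stop()` for a “stop” intent, and stays free to handle other intents in between.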

LordQuasar · July 27, 2020, 9:17am

Looks like I deleted too much of the timer code before pushing the code to GitHub… it is not working anymore, I’ll fix that today

LordQuasar · July 27, 2020, 3:53pm

fixed the timer problem, like I thought, I deleted a bit too much code

LordQuasar · August 3, 2020, 7:43pm

Thank you Daenara, I think I finally got what you meant with the

pip3 install --upgrade rhasspy-hermes-app

I see now that I should rewrite my code so it fits the rhasspy-hermes-app (RHA), for it has what I need!
I tried the RHA before, but that gave an error and I forgot about it. After reading some other threads here I got back to the RHA and realized that it needed to be updated, and now it works… awesome!

I will reprogram the skills I made and adapt them to the RHA!

Thanks!

Daenara · August 3, 2020, 8:46pm

Once you have done that, there should be a thread somewhere in this forum where you can have your skills added to a list of every skill working in 2.5. I haven’t seen that one active after it was started, but that might be because there aren’t that many ppl working on skills.

No_one · August 3, 2020, 10:01pm

There is an AWESOME list here:

and if you use the RHA, maybe leave a message here as well:

Daenara · August 3, 2020, 10:05pm

Exactly the list I meant, but I think there was a thread about it in this forum. If the thread gets a bit more love more ppl will find out about the list.

synesthesiam · August 4, 2020, 12:30am

Awesome project, @LordQuasar! We need help getting Rhasspy skills/apps off the ground, so thank you for your contributions. Would you be OK with me linking to your video from the docs?

This is a great point, and one I’ve thought about a lot. One way would be to design a set of MQTT topics that abstract the functionality of a media player (like media/player1/stop or something). Then, you would just need to write the small app that takes in your specific intents and emits the abstract messages. As a community, we could define and support these protocols.
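
The small app described here might look like the sketch below. The topic layout follows the `media/player1/stop` example, while the intent names, the mapping table and the empty JSON payload are assumptions made purely for illustration; no such protocol has been defined yet.

```python
import json

# Hypothetical intent names mapped to abstract media actions.
INTENT_TO_ACTION = {
    "StopMusic": "stop",
    "NextSong": "next",
    "PreviousSong": "previous",
    "PauseMusic": "pause",
}


def abstract_message(intent_name, player_id="player1"):
    """Translate a specific intent into an abstract media player message.

    Returns a (topic, payload) pair, or None if the intent is not a
    media command.
    """
    action = INTENT_TO_ACTION.get(intent_name)
    if action is None:
        return None
    return f"media/{player_id}/{action}", json.dumps({})
```

Any player that subscribes to `media/player1/#` could then react to these messages without knowing which intents or which language produced them.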

Another way is to have Rhasspy represent things internally, like media players, lights, etc. I don’t want to go the full Home Assistant route, where Rhasspy cares which brand of smart lock you have. But it could be useful to build a knowledge base about common home automation things, formally represent it in Rhasspy, and have intents modify that representation – e.g., change the state of a light to “on” when a TurnLightOn intent is emitted with its ID.

You’re in luck! Raven is still pretty new, but it may do what you want. You can record multiple keywords from different speakers, and then use the keyword to know who spoke it. The underlying technique is similar to what Snips did.

LordQuasar · August 4, 2020, 5:52am

@synesthesiam I am flattered, yes, please use any of my videos wherever you want to. I just uploaded a new one where I gave the Raspi a home-printed case for the respeaker

That sounds great! I’ll get to that after I have reprogrammed the skills in Rhasspy-friendly code. The alternative would of course be to add a camera and let the device check who is in the room. I looked into that for a while; there are some great, easy scripts to do this, one just needs to add OpenCV.

I think there needs to be a global status for the Rhasspy skills so one knows what the device is doing at the moment and can interpret next, previous, stop and continue differently each time. If next, previous etc. are global and I know which state the device is in, the number of errors is reduced… E.g. if I am playing music, stop will always end the song. But if I only have a timer running, stop should automatically mean stop the timer. Only if both music and the timer are running do I need a different input, like stop the timer or stop the music.
Also, e.g., an intent called Help while music is running should tell you what you can do with the music player, but if a timer is running, help means what can I do with the timer. If I remember correctly, Amazon’s Alexa does this too, but there you need to start a skill so it knows which help you need… With Rhasspy I of course don’t want to have to say “start music skill” before playing music…
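
The rule described above can be written down as a small resolver. This is only a sketch of the idea; the function name, the activity names and the returned dictionary are invented for the example.

```python
def resolve_global_command(command, running):
    """Decide what a global command like "stop" or "help" refers to.

    running is the list of activities currently active on the device,
    for example ["music"] or ["music", "timer"].
    """
    if not running:
        # Nothing is active, so there is nothing to stop or explain.
        return {"action": "ignore"}
    if len(running) == 1:
        # Exactly one activity: the command can only mean that one.
        return {"action": "dispatch", "target": running[0], "command": command}
    # Several activities: the user has to say which one is meant.
    return {"action": "clarify", "candidates": sorted(running)}
```

With only the timer running, `resolve_global_command("stop", ["timer"])` dispatches to the timer; with music and timer both running, the device would have to ask, or wait for “stop the timer” or “stop the music”.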

I don’t think my Python is good enough to help out with the architectural implementation of skills within the Rhasspy skills API, but I am happy to program the skills and think along with you. When I am done reprogramming I’ll let you know.

Daenara · August 4, 2020, 11:35am

I don’t think this is something rhasspy itself should do, it feels like it is too much. We do have a skill system in the works, it might fit in there better.

My idea would be to let the skills that keep running keep track of that and post their status via mqtt. They should also publish a list of valid controls via mqtt and then when a general command like stop comes in, skills can check if something else that could react to that command is running and decide if they should go ahead or not.
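
A rough sketch of how a skill could follow that idea. The topic name `skills/<name>/status`, the payload fields and the class `SkillRegistry` are all hypothetical; nothing like this is defined in Rhasspy. Each skill publishes its own status and remembers what the other skills have published.

```python
import json


def status_message(skill, running, controls):
    """Build the (topic, payload) a skill would publish about itself."""
    topic = f"skills/{skill}/status"  # hypothetical topic layout
    payload = json.dumps(
        {"skill": skill, "running": running, "controls": sorted(controls)}
    )
    return topic, payload


class SkillRegistry:
    """Track the published status of the other skills."""

    def __init__(self, own_name):
        self.own_name = own_name
        self.others = {}

    def on_status(self, payload):
        """Call this for every status message received over MQTT."""
        status = json.loads(payload)
        if status["skill"] != self.own_name:
            self.others[status["skill"]] = status

    def competitors(self, command):
        """Names of other running skills that also accept this command."""
        return sorted(
            name
            for name, status in self.others.items()
            if status["running"] and command in status["controls"]
        )

    def should_handle(self, command):
        """Go ahead only if no other running skill could react too."""
        return not self.competitors(command)
```

A music skill would check `should_handle("stop")` when a general stop comes in, and hold back or ask for clarification if the timer skill is running and also lists “stop” among its controls.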

With this, the only thing that would be needed in rhasspy, if even that, would be a way to declare general intents somehow, to differentiate them from app specific intents so the skills don’t need to check every command. That seems more like something rhasspy should do, it has a broad range of use, rather than the pretty specific state tracking that only seems useful if you are trying to build a smart home or something and that is pretty redundant with whatever system ppl are using for that.