Tons of questions from a Mycroft user of 1-2 years back

It seems like this project sprang up to fill a gap for users who wanted offline operation. I was someone of similar sentiment.

  1. What speeds are we looking at as bottlenecks right now for offline use on a Raspberry Pi (4 GB)?
    Back a year or two ago I was getting around 4-5 seconds for a response on an RPi 3. I don’t know what the market looks like now. I heard some get 1-second times? I remember using PocketSphinx.
    Are language bottlenecks a problem as well? Like CPython vs PyPy?

  2. Is it possible to multithread? Can I have the voice assistant talk to me and tell it to do something at the same time? Like with Google Home, I can tell it to “stop”.

  3. I used to hack Mycroft to do offline only. What is the difference here? I see Python, or whatever language you can do, and a lot of shell scripting. Do I even need Node-RED? To what extent is Node-RED better than old-fashioned programming? I see you can interact with devices more easily, but if I want to pull some data off the web, should I still be using something like Python?

  4. Is it easy to nest intents or commands with Rhasspy? Like, say I want context for a command based on the previous one?

  5. Is there any boilerplate for this? Like boilerplate skill stuff just to try out? (Although I know it’s optional.)

  6. Are we still looking at smaller dictionaries to speed up performance times? How do we compare to Google/Alexa/Mycroft in terms of speed?

  7. Can I run this on an Android phone with something like a Linux container, BusyBox, or some other VM?

  8. My main use cases for things like this: local dictionary lookups of word definitions (Webster EPUB/PDF), reading books/PDFs, playing music, etc. Do you think I could do all this with Python? Do you have any better suggestions?

  9. Are there any issues with PulseAudio just dropping and having to restart it?

  10. Has anyone tried making their own voice the playback voice?

  11. Are there any tips/tricks I should be aware of? I’m currently looking at buying a ReSpeaker 2. I currently have a Raspberry Pi (4 GB) with a camera.

  12. I was thinking of using this in my car too, or buying another one. Would you recommend a UPS for the Pi, or is it just overkill?

So, I’m running Rhasspy 2.5.10, and it sounds like 2.5.11 is about to be released, which may change the answers a bit. That said:

It depends a lot on the length of the speech being detected / read out. I’m running on a Raspberry Pi 4, though, so if there’s a piece of text you’d like me to put into the TTS, or read into the STT, I could try it and let you know the speed.

Well, firstly, you don’t need to hack Rhasspy to do offline! :stuck_out_tongue_winking_eye: Beyond that, Mycroft is pretty monolithic, whereas Rhasspy acts more as glue between independent systems. Practically, what this means is that each part of Rhasspy has alternatives which can be swapped at any time, upgraded independently with new features, or simply chosen because they suit your use case better. Don’t like how one text-to-speech voice sounds? Pick another! Are you more interested in high-accuracy recognition of a pre-defined wake word, or in making your own wake word? There are different best options for both!

You don’t actually need Node-RED to use Rhasspy at all (I don’t use it). Node-RED gives you the flexibility to program and interface with other systems visually, but Rhasspy alone is remarkably flexible, and plain Python works fine for handling intents.
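As a sketch of the "old fashioned programming" route (the `GetTime` intent name and the `{"speech": {"text": ...}}` reply shape here are illustrative assumptions; Rhasspy delivers recognized intents as JSON, so check your intent-handling docs for the exact contract), a plain Python dispatcher might look like:

```python
import datetime

def handle_intent(intent_json: dict) -> dict:
    """Dispatch a Rhasspy-style intent JSON payload to plain Python code.

    Returns the payload with a {"speech": {"text": ...}} reply merged in,
    which is the shape a remote intent handler can answer with (assumption:
    verify against your Rhasspy version's intent-handling documentation).
    """
    name = intent_json.get("intent", {}).get("name", "")
    if name == "GetTime":  # hypothetical intent name for this sketch
        now = datetime.datetime.now().strftime("%H:%M")
        text = f"It is {now}"
    else:
        text = "Sorry, I don't know that one"
    intent_json["speech"] = {"text": text}
    return intent_json

result = handle_intent({"intent": {"name": "GetTime"}})
print(result["speech"]["text"])
```

This is where you'd also do things like pulling data off the web with `urllib` or `requests` before building the spoken reply.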

Not… that I… know of. (Correct me if I’m wrong, anyone!) That said, there has been user interest in this posted elsewhere, and I don’t think anyone is violently against the idea.

There are a couple of default phrases Rhasspy recognizes out of the box. You can see them in the “Sentences” section, which also gives you a pretty good idea of how to add new sentences to be recognized.
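To give a taste of the template format (this is an illustrative sketch in the `sentences.ini` style, not the exact defaults, which vary by profile):

```ini
# Each [Section] is an intent name; each line under it is a voice command.
[GetTime]
what time is it
tell me the time

[ChangeLightState]
# (a | b) = alternatives, [word] = optional, {tag} = slot in the intent JSON,
# name = ... defines a rule referenced as <name>
light_name = (bedroom | kitchen) {name}
turn (on | off) {state} [the] <light_name> light
```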

Heh, I’ve been tempted, actually! @synesthesiam has said that it takes about 1.5 hours of voice recordings to make a new voice. You can always help out in smaller chunks, though, by contributing to the Mozilla Common Voice project: reading out single sentences, or listening to and rating the recordings of others on their website.

Depends… how important is it to you to milk the last bit of uptime out of your device, versus the value of the money spent on the UPS?


Inside a single session this can be achieved using continueSession, which has the option of passing a user-defined context (customData).
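As a minimal sketch (assuming the Hermes MQTT dialogue-manager API that Rhasspy 2.5 speaks; the session id and context values below are made up, so double-check field names against your Rhasspy docs), a skill can keep the session open and attach its own context like this:

```python
import json

# Topic per the Hermes dialogue-manager convention (assumption: verify
# against your Rhasspy version's MQTT API reference).
TOPIC = "hermes/dialogueManager/continueSession"

def continue_session_payload(session_id: str, prompt: str, context: dict) -> str:
    """Build a continueSession message that keeps the session open,
    asks a follow-up question, and carries user-defined context."""
    return json.dumps({
        "sessionId": session_id,
        "text": prompt,                     # spoken back to the user
        "customData": json.dumps(context),  # your own context, as a string
    })

payload = continue_session_payload(
    "abc123", "Which room?", {"last_intent": "LightsOn"}
)
print(payload)
```

You would publish that string to `TOPIC` with your MQTT client of choice; when the follow-up intent arrives, the `customData` string comes back with the session, so your handler can recover the previous command's context.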