The Future of Rhasspy

I hope you are wrong as well. I was not using Mycroft, but reading your post makes me think twice (and then some).

I will be sticking with Rhasspy anyway, and I hope it will still evolve, maybe slower, or maybe other developers will pitch in.

Congratulations on your new job! This should help you toward a better life…

Thanks for all your work on Rhasspy, very useful and truly private by design.
Things I really appreciate: fully offline TTS and STT, Docker installation, custom wake words (I’m not fond of “OK Google/Alexa/Jarvis/my lad…”), and the ability to change the speech recognition engine (DeepSpeech is the future, well, I hope so).

Best regards,
Damien

2 Likes

Thank you for the reply, @VoxAbsurdis. I’ve only worked at Mycroft now for a month, but I may have a little more insight than I did previously.

A lot of the problems you mention seem to stem from the same issue: Mycroft’s (current) dependence on Google for their speech to text. If they send audio to Google (aggregated at least, but still), then their privacy policy must be at least as bad as Google’s. My hope is to get them off Google, and they are all for that :slight_smile:

As far as users under 13, I actually totally get that. “Child” is legally defined differently around the globe, so some cutoff is surely needed (not sure who came up with 13). I learned about this problem when I found out how bad the open source face recognition software was for photos with my kids: there’s almost no training data. Turns out, anyone who collects a bunch of data on children (pictures, audio, etc.) is usually not doing it for the good of humanity :frowning:

I had considered this, and I appreciate the offer; do you have any idea how I would sustain such a thing? I don’t really have anything to offer as a subscription (unlike, say, the Home Assistant folks at Nabu Casa). What would I have asked people to crowdfund?

I came across the Libre Endowment Fund, and thought something like that might have been a good fit. If anyone knows about something similar – a public fund for open source software – or a way to fund the development of open voice software for people with disabilities, I’d love to hear about it.

There are potentially a few options for open source crowdfunding for something like Rhasspy, ranging from the simple GitHub Sponsors or Patreon approach (where you could have a devblog for folks who contribute) to a Kickstarter for a “year’s worth of development”, which I’ve seen some folks do. You don’t necessarily need something additional to offer, as the folks who would contribute likely just want Rhasspy: it’s established, and you’re always keeping it up-to-date (not to mention writing software that also has research benefits). Some folks have also been successful with getting grants for their open source work, but you’d have to find them first.

The downside would be increasing overhead (GitHub Sponsors the least, grant-writing likely the most), and getting up to a full developer salary would be tough. It’d likely require a bit of self-promotion. I’m not sure if we’re ready to value open source software properly as a culture, but I hope we get there.

2 Likes

Yep, that was my read from the start. And sadly it’s been the norm in the speech recognition industry for decades now. Every ASR/TTS platform that’s any good gets bought by one of the GAFAM gang members so they retain control over the industry. The effect is higher prices and more difficulty for independent developers to build voice apps, which is probably their main goal: to make sure only the big gorilla companies own the voice apps. My guess is Mycroft’s goal/exit strategy is to get acquired by a gorilla. And because of this, I don’t think any dev should waste time building apps on/with it; otherwise you’ll be wasting your time like the devs who built on Snips.

3 Likes

This reminds me of Jeff Geerling, who I follow on a few platforms. He seems to be doing great, but I don’t think I could stand doing all of the necessary social media promotion in addition to actual development.

Absolutely agree. One person can make such a difference, but for some reason there is very little support. In my time working for the U.S. government, I saw millions of dollars wasted (in my opinion) on projects that, even if successful, got us no closer to something actually useful.

Just to be clear: I went to Mycroft asking for a job; they didn’t come to me. I interviewed with 4 other companies as well: 3 of them said that if I were hired, Rhasspy was to be shut down immediately and everything I did going forward was going to be closed source. The 4th company was currently open source, but the backend they planned to create was not going to be (remind you of Snips?)

I get the cynicism, but I’d suggest looking at the broader picture. Even if Mycroft gets acquired at some point in the future, everything is still open. Contrast this with Snips, where they published an amazing paper about their training backend, but it was only ever a promise that they would open source it – and they lied!

I still believe our two best defenses against the GAFAM gang are (1) don’t support any company that isn’t fully open source, and (2) build interoperable standards between voice assistants so that when they inevitably do get acquired or abandoned, we can just shift to a new project without starting from scratch.

8 Likes

@synesthesiam Congratulations on your new job. Topic-wise it seems like the perfect fit.

Rhasspy is quite impressive considering that it was mainly made by one person. I have always been impressed by the amount of work and care you have put into it. Thank you.

But it was also sad to see how little traction Rhasspy was getting. There is a huge OSS community for smart homes (Home Assistant and others), and a lot of those users are using Google or Amazon for voice control, so I hoped more would switch to Rhasspy. The end result is that it still very much depends on you: there are just not enough users, and therefore not enough developers.

I hope that making Mycroft more OSS and 100% offline usable works out. Until then I hope Rhasspy lives on.

(I would have been willing to support it on Patreon btw. But I don’t think there are enough users to pay you a fair wage using Patreon)

What I would need to use Mycroft is: 100% offline usage, German TTS, and Hermes or a similar openly usable protocol (I am using it to show feedback to voice commands on LED matrix screens).

My naive plan/hope/strategy would have been:

You’ve mentioned Nabu Casa: I think Rhasspy fits very well into their open smart home vision. They don’t really have a working offline voice assistant solution in HA. It would be great if they could support a few months of work on Rhasspy with a focus on out-of-the-box, user-friendly usage with HA (integration into Home Assistant OS, integration with entities, auto-generation of slots, and some form of repository for intents/automations/scripts). With that done, they could market Rhasspy as the default offline voice solution for HA. Rhasspy would get more users from the HA community, and more users means more supporters (devs and possible Patreon supporters). The other issue, though, is readily available hardware that you can put in a normal living room without much DIY work: basically a Pi with a speaker and good microphone(s) in a nice-looking case.
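The slot auto-generation idea could be sketched roughly like this. To be clear, this is a hypothetical sketch, not Home Intent or any existing integration: the sample payload only mimics the shape of Home Assistant’s `/api/states` response, and the output format assumes Rhasspy’s `(spoken name):substitution` slot syntax.

```python
# Hypothetical sketch: turn Home Assistant entity states into a Rhasspy
# slot list, mapping spoken friendly names back to entity IDs.

def entities_to_slot(states, domain):
    """Return Rhasspy slot lines for one HA domain (e.g. 'light')."""
    lines = []
    for state in states:
        entity_id = state["entity_id"]
        if not entity_id.startswith(domain + "."):
            continue
        # Prefer the human-friendly name for the spoken form.
        name = state["attributes"].get("friendly_name", entity_id)
        # Rhasspy substitution syntax: (spoken text):emitted value
        lines.append(f"({name.lower()}):{entity_id}")
    return lines

# Sample data shaped like a GET /api/states response (assumed, trimmed).
sample_states = [
    {"entity_id": "light.kitchen", "attributes": {"friendly_name": "Kitchen Light"}},
    {"entity_id": "light.office", "attributes": {"friendly_name": "Office Light"}},
    {"entity_id": "switch.fan", "attributes": {"friendly_name": "Fan"}},
]

print("\n".join(entities_to_slot(sample_states, "light")))
```

A real integration would fetch the states over the HA REST API and write the lines to a slot file that sentences like `turn on ($lights)` could reference.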

In any way: Thank you for your work on Rhasspy and good luck on your new job.

2 Likes

It’s fun that you mention that, as that is what I’ve been working on with Home Intent. It’s intended to bridge the gap between HA and Rhasspy by auto-creating slots/sentences/responses, so users can just connect it to Home Assistant and it manages Rhasspy for them.

In the next month or so, I’m planning to get better satellite support and start looking into the Home Assistant add-on store.

1 Like

Congratulations Michael on the new job! It’s great that you are working at something you love, getting appreciated for it financially, and being closer to your family!

I’ve had a quick look at Mycroft, and am also of two minds. Both projects will benefit from a closer association. I just hope that it continues the way you hope.

What is your relationship with Nabu Casa? If they can dedicate a programmer to ESPHome (15% of their user base), then why are they not doing more to provide an offline voice assistant to the 75% who currently use the big commercial ones?

1 Like

Oh nice. Didn’t know about it. Will give it a try when I find some time.

Just a suggestion: I think it would be great to add, somewhere early on the site, that Home Intent is based on Rhasspy (with a link to this community) and is for use with Home Assistant. Or that it is the bridge between Rhasspy and Home Assistant, or something like that. As it is now, I would have thought “oh, a new alternative to Rhasspy” if I randomly found the website.

Do you have any plans to integrate something like “apps” or scripts for functions that are more than just Home Assistant intents? (Maybe just add a simple UI to enter Python scripts that use some of the APIs developed by others in this community.)

Edit: I should have read more of the documentation before asking about the script/app issue. The Component feature is for that. Then my question would be: what about custom components and some form of repository for those?

Ah, I do actually mention it right away on the GitHub page. I’m actually not sure where people would head to first, but yeah, I’ll update the homepage of the docs to indicate it’s based on Rhasspy!

I do like the idea of a repository of custom components that are accessible via the UI! I don’t know of anyone who has written a custom component yet, but it’s definitely a consideration for down the line.

My pet beef with Mycroft was an uneasy feeling that it presented itself as far beyond what it was actually capable of.
It’s only in the Mark II that they actually process the incoming audio with DSP echo cancellation and beamforming; Rhasspy suffered the same, as it was really unusable in the presence of noise.
The delays, the cancelled crowdfunders, and this “patent trolls” business just had my alarm bells going.
They have a new CEO, and we will have to see which way it goes, but there is a strong possibility you could be right.

synesthesiam, like all of us, needs $ and has to work, and if you have got to work, you have to work.
I am not a fan of Mycroft at all, because to me it still seems the goal is to present the semblance rather than the real thing, and hence why I am dubious.
As for working for them: his core skills are bang on the button, and the guy’s got to work. Even though I am not a fan of the company, likely 99% of us are not doing what we want but doing our best to earn $.

I have told synesthesiam what I think of Mycroft, but I don’t and shouldn’t have any opinion on what someone needs to do to earn $.

2 Likes

I’ve spoken with Paulus (creator of Home Assistant) a few times. He definitely sees the value in local voice assistant tech for HA. I’m hoping to collaborate with Nabu Casa through Mycroft; who knows what will become of that :wink:

Looking at the history of Mycroft, I agree. They were talking about full-on conversational assistants years ago, and the software they have today is nowhere even close to that. The new CEO seems much more down-to-earth, though he’s still forward-thinking.

On this topic, I think people may not have realized that my last job was for the U.S. military. So even with the many reservations folks have about Mycroft, it’s way better than the alternative! I’m grateful to my previous employer for getting me off to a good start, of course, but my work would have ultimately been shut down or relicensed.

6 Likes

Hi @synesthesiam, first of all, congratulations on your new job. It’s always good to work on something you have fun with :)

Would be great to continue the concepts we started in the other thread.

If you can work with a non-commercial-only restriction, I might have a nice dataset source for your trainings.


Regarding your questions about missing features in Mycroft, for me the most important, which also were some of the reasons to build Jaco, were:

  • Missing full offline capability
  • Problems with recognition accuracy because of the missing language model adaptation
  • Similar to Snips, skills only supported Python, and not arbitrary languages and dependencies

And now in comparison to Jaco it’s also missing the skill privacy concept and some of the modularity.


By the way, it would be great if you could add Mycroft with your new STT approach to the benchmarks: Jaco-Assistant / Benchmark-Jaco · GitLab

Greetings, Daniel

1 Like

I thought your announcement was a bad dream when you first made it, but alas it is not.

Seems I’m a bit late to the game again: first it was Snips before the Sonos purchase, and now Rhasspy. I guess if you end up shutting Rhasspy down, at least what I have now will still work.

There are a few things that I don’t like about Mycroft. First is the obvious: it’s cloud-based. From what I’ve been reading, it appears that Mycroft does less to protect its users’ personal data than Apple does with Siri.
The second part is Mycroft’s terrible wake word detection. Now, to be fair, the last time I had Mycroft running was nearly 2 years ago, but the experience was terrible. Snips detected everyone in the house for “Hey Snips”; there wasn’t a blasted thing I could do to get Mycroft to detect my wife or kids when they wanted Mycroft’s attention. The FAF (Family Approval Factor) was zero, which led me to Rhasspy.

I hope you will have sway with Mycroft, but I fear it won’t be enough and we’ll lose possibly the best offline assistant around.

1 Like

Can we sadly say that we will never see Rhasspy 2.5.12?

After Snips, and maybe now Rhasspy, which is still far better than anything else, I sadly ask myself whether to go the Google/Alexa route, so as not to relive one more time having to redo everything… :cry:

Of course Rhasspy still works nicely, but if you are standing still, you are actually going backwards…

Congratulations on your new job!

It seems very positive; Mycroft will probably reuse some parts of Rhasspy and empower it.

Mycroft has a business model, plus people paid to work on a voice assistant. That’s what is missing from the current Rhasspy project for it to one day compete with the big ones.

1 Like

Mycroft is small enough that I’m not worried about not having enough sway. Offline speech to text and text to speech are already being promised for the Mark II’s release :slight_smile:

A difference with Rhasspy, of course, is that we need to find some way of funding development. I proposed that we offer to train custom voices or speech to text models for businesses, and use that money to keep the lights on.

I absolutely agree (though I would argue Raven is even worse :laughing:). I’ve already started working on a new wake word system (based on this paper). Let’s hope it performs better in the end!

No, I’m not abandoning the project. I have some stuff in the works for 2.6, in fact! Besides spare time, what’s holding me up right now is that so many things have changed in the past year that need to be integrated.

For example, Larynx has grown up as a TTS system and (as “Mimic 3” under Mycroft) is fast enough to be useful on a Pi. Additionally, Vosk and Coqui STT are now mature and ready for use (though they can’t be re-trained like my Kaldi system).

Another exciting thing I’ve been working on is a hybrid STT system that can recognize fixed commands and fall back to an open system like Vosk/Coqui for everything else.

Thanks! I think the future looks bright :slight_smile:

3 Likes

What do you mean by this?
I’ve been using Coqui and DeepSpeech before this in my homebrew Node-RED pipeline, and I train a custom scorer based on my own domain-specific language model, which also adds new vocabulary. I actually find it a lot easier than Kaldi for this.

I mean re-trained from scratch quickly on-device. You can definitely create a custom scorer for Coqui STT/Deepspeech, but adding new vocabulary/sentences to the pre-trained scorers isn’t possible (as far as I know) without recreating the language model or doing an expensive merge on the n-gram counts.
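To illustrate what merging n-gram counts means conceptually, here is a toy Python sketch. Real toolkits such as KenLM operate on count tables with millions of entries and then re-estimate smoothed probabilities, which is why redoing this for a pre-trained scorer is expensive; this sketch only shows the counting-and-combining step on two tiny corpora.

```python
from collections import Counter

def ngram_counts(sentences, n=3):
    """Count n-grams (with sentence boundary markers) across a corpus."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(len(words) - n + 1):
            counts[tuple(words[i:i + n])] += 1
    return counts

base = ngram_counts(["turn on the light", "turn off the light"])
custom = ngram_counts(["turn on the fan"])

# "Merging" requires a full pass over both count tables; at the scale of
# a pre-trained scorer, this amounts to rebuilding the language model.
merged = base + custom
print(merged[("turn", "on", "the")])  # seen once in each corpus -> 2
```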

Both Vosk and Coqui STT let you boost existing vocabulary at runtime, which is awesome. My goal with the hybrid STT is to allow for fast re-training of fixed commands, but have it “know what it doesn’t know” and let Vosk/Coqui do what they do best (open-ended transcription).
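That “know what it doesn’t know” control flow could be sketched like this. Everything here is a stand-in: the recognizer classes, the confidence scores, and the 0.8 threshold are all hypothetical, and audio is represented as text purely to show the routing logic between a fixed-command recognizer and an open-vocabulary fallback.

```python
from dataclasses import dataclass

@dataclass
class Transcript:
    text: str
    confidence: float  # 0.0 .. 1.0

class FixedCommandSTT:
    """Stand-in for a small grammar-based recognizer
    (e.g. a Kaldi model trained only on fixed commands)."""
    COMMANDS = {"turn on the light", "turn off the light"}

    def transcribe(self, audio: str) -> Transcript:
        # Toy scoring: high confidence only on an exact command match.
        if audio in self.COMMANDS:
            return Transcript(audio, 0.95)
        return Transcript("", 0.1)

class OpenSTT:
    """Stand-in for an open-vocabulary engine like Vosk or Coqui STT."""
    def transcribe(self, audio: str) -> Transcript:
        return Transcript(audio, 0.7)

def hybrid_transcribe(audio, fixed, fallback, threshold=0.8):
    """Use the fixed-command recognizer when it is confident;
    otherwise fall back to open-ended transcription."""
    result = fixed.transcribe(audio)
    if result.confidence >= threshold:
        return result.text, "fixed"
    return fallback.transcribe(audio).text, "open"

print(hybrid_transcribe("turn on the light", FixedCommandSTT(), OpenSTT()))
print(hybrid_transcribe("what is the weather", FixedCommandSTT(), OpenSTT()))
```

The interesting design problem, of course, is making the fixed recognizer produce honest confidence scores for out-of-grammar speech, which is exactly the “know what it doesn’t know” part.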

1 Like