Newbie build problem

I went ahead and created the PRs for all the repositories that address the build issues discussed here:

  1. /rhasspy/rhasspy-snips-nlu-hermes/pull/2
  2. /rhasspy/rhasspy-snips-nlu/pull/2
  3. /rhasspy/rhasspy-server-hermes/pull/33
  4. /rhasspy/rhasspy-nlu-hermes/pull/4
  5. /rhasspy/rhasspy-nlu/pull/18
  6. /rhasspy/rhasspy-fuzzywuzzy-hermes/pull/2
  7. /rhasspy/rhasspy-fuzzywuzzy/pull/5
  8. /rhasspy/rhasspy-asr-vosk-hermes/pull/1
  9. /rhasspy/rhasspy-asr-pocketsphinx-hermes/pull/3
  10. /rhasspy/rhasspy-asr-pocketsphinx/pull/1
  11. /rhasspy/rhasspy-asr-kaldi-hermes/pull/1
  12. /rhasspy/rhasspy-asr-kaldi/pull/3
  13. /rhasspy/rhasspy/pull/315

I think it is, but only infrequently, as @synesthesiam is busy working with Nabu Casa now on their Home Assistant priorities.

As @fluidvoice mentioned, it has been infrequent since I changed jobs in 2021 (and then again in 2022).

When I have time, I’ve spent it working towards Rhasspy version 3. I am purposely keeping services/tools external to the core in version 3, using isolated virtual environments (or just calling pre-compiled programs directly).

Thanks for the reply. I didn’t want to spend effort if it wasn’t the right path forward. Should I be working with version 3? Is that available somewhere? Sorry if this is documented somewhere; I’m just trying to get my local setup working and am encountering various issues that I can help with.

It’s not available just yet, but I’m getting closer every day :slight_smile:

I’d be curious to hear about your use case for Rhasspy. Are you using it as a self-contained voice assistant on a single machine, as a server with satellites, or just as an API backend to open source voice services?


Hi @synesthesiam
I want to create skills for non-home-automation uses in the fastest way possible.
My ideal config would be a simple pre-configured STT/TTS server (English by default) that I can spin up in an LXD/LXC container, or on a local or hosted Pi 4, with a USB mic or an Android phone app sending audio over MQTT. Meaning: no setup/config GUI, just a config file to edit, with the goal of reducing the time to download or build. E.g. spin up an LXC container and run a script that downloads a pre-configured Rhasspy for English (or some other language), set to use ALSA/PulseAudio audio in/out.

The objective is to get to a working server with audio in/out as quickly as possible for skills development.

Maybe also a second pre-configured setup (or a simple config file edit) for a cloud-hosted VM or RPi 4 that uses an MQTT-enabled client app on a mobile phone.
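For the audio-over-MQTT piece, here is a minimal stdlib sketch of what the phone/client side could look like, assuming Rhasspy’s Hermes convention of publishing WAV chunks to `hermes/audioServer/<siteId>/audioFrame`; the site id and the commented paho-mqtt publishing step are assumptions, not tested config:

```python
"""Sketch: stream raw PCM audio to Rhasspy over MQTT (Hermes audio convention)."""
import io
import wave

SITE_ID = "default"  # must match the site id configured in Rhasspy (assumption)
TOPIC = f"hermes/audioServer/{SITE_ID}/audioFrame"

def pcm_to_wav(pcm: bytes, rate: int = 16000, width: int = 2, channels: int = 1) -> bytes:
    """Wrap raw little-endian PCM samples in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(width)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buf.getvalue()

def wav_chunks(pcm: bytes, chunk_samples: int = 2048, width: int = 2):
    """Yield WAV-wrapped chunks sized for individual audioFrame messages."""
    step = chunk_samples * width
    for i in range(0, len(pcm), step):
        yield pcm_to_wav(pcm[i:i + step])

# Hypothetical publishing loop (requires `pip install paho-mqtt` and a broker):
#
#   import paho.mqtt.client as mqtt
#   client = mqtt.Client()
#   client.connect("localhost", 1883)
#   for frame in wav_chunks(one_second_of_pcm):
#       client.publish(TOPIC, frame)
#   client.disconnect()
```

The WAV wrapping is the important part: each MQTT payload is a small, self-describing WAV file rather than bare PCM, so the receiving side knows the sample rate and width.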

I think you mentioned v3 will be more modular, so maybe this will help with the above, and also help people build and contribute to a less “monolithic” Rhasspy.

Exactly. I’ve wasted enough time trying to figure out how to get v2.5 to build.
And I don’t even care about 80% of it: the GUI, the various STT/TTS engines, the extra spoken languages. I just want something easily buildable that has good ASR/TTS performance in English etc., which I can use to create skills and hack on to experiment with integrations and app ideas.

Personally, being completely new to Python, I found building this app to be a nightmare of broken dependencies, as shown above. Hopefully the modular v3 will help with this.
If you want to start learning the language and writing skills, it’s a big overhead.
Docker is a similar story: if you’re new to it, getting it to work and debugging problems is a total PITA. I found learning LXD/LXC containers much more straightforward.

My intention is simply to turn a light on or off in Home Assistant with my voice, without going to the cloud. I don’t intend to use satellites or anything.
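A use case like this doesn’t need much glue. A hedged sketch of the mapping step: the intent and slot names (`ChangeLightState`, `name`, `state`) are hypothetical, defined by whatever sentences you train Rhasspy on, while the request shape follows Home Assistant’s `/api/services` REST endpoint:

```python
"""Sketch: turn a Home Assistant light on/off from a recognized Rhasspy intent."""

def intent_to_service_call(intent: dict):
    """Map a recognized intent dict to (path, payload) for HA's REST API."""
    name = intent["intent"]["name"]
    slots = intent.get("slots", {})
    if name == "ChangeLightState":  # hypothetical intent name
        service = "turn_on" if slots.get("state") == "on" else "turn_off"
        path = f"/api/services/light/{service}"
        payload = {"entity_id": f"light.{slots['name']}"}
        return path, payload
    raise ValueError(f"unhandled intent: {name}")

# Hypothetical usage with a long-lived access token (stdlib only):
#
#   import json, urllib.request
#   path, payload = intent_to_service_call(recognized_intent)
#   req = urllib.request.Request(
#       "http://homeassistant.local:8123" + path,
#       data=json.dumps(payload).encode(),
#       headers={"Authorization": "Bearer <token>",
#                "Content-Type": "application/json"},
#   )
#   urllib.request.urlopen(req)
```

Rhasspy can also POST intents to Home Assistant directly; this is just the shape of the round trip if you handle intents yourself.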

@fluidvoice I think v3 is exactly what you’ll want. Only install what you need, and all of the services just have the bare minimum of what they need to run.

@chrismiceli What I’ll be working on at Nabu Casa for the Year of Voice may be of more interest to you. We already have an intent recognizer built into Home Assistant, and we’re planning to automate the installation of STT/TTS add-ons so it can all “just work”. Sort of like Rhasspy Junior, but for more than just English.


@synesthesiam Thanks for the reply, Mike, and for all that you have done and are doing for the open-source, “can’t be evil” (don’t spy on me) speech assistant space. I feel like you’re one of the few technically savvy people fighting the gorillas that have held back speech recognition for so long by hoarding the talent, code, and speech models, i.e. Microsoft, Google, and Amazon, all of which have been acquiring the smaller ASR/TTS companies for over 20 years. Hearing about Alexa losing billions of dollars felt like justified karma. Long term, my hope is that open-source speech tech wins in a similar fashion to Linux.

I very much look forward to v3 being released, and will try to help the project however I can.
Cheers, Brad.


@synesthesiam Mike, any idea when even an alpha of v3 will be released?
It would be great if even a single working English STT + TTS config were released, so we could test and hack on it, write skills… and maybe also help out in some way.


In my use case I have an Intel Atom system running other services, in addition to a Raspberry Pi running HassOS/HA. I see many advantages to NOT running Rhasspy on the RPi/HA server when other physical servers are available. The one downside is that I had to run Rhasspy in a VM running an older version of Debian, and I discovered that some audio hardware that works in Debian does NOT work in a Debian VM. On the up side, although Rhasspy has been locking up its VM every week or so, this affects only Rhasspy and none of the other services I’m running.

In the beginning I tried Rhasspy in a Docker container. That failed immediately due to Docker issues, and the Docker documentation I could find was completely inadequate. I find Linux VM technology easier to learn and debug; googling for info was simply more effective.

In my case I use Rhasspy to handle voice command input and TTS output for my music server (Logitech Media Server / Squeezebox). It also does timers, and tells me the time and temperatures.


Thanks! It’s been exciting to learn about all the new developments in STT, TTS, NLU in the last few years, and disappointing at the same time to see just how much of it is behind closed doors. Maybe I’m weird, but I just don’t get the point of companies publishing about things they can’t even share. Just say it runs on magic and don’t tease us :stuck_out_tongue:

I’m hoping this week or next! I’ll start a thread once I’m ready and ping people I know are interested. There’s no GUI or automated installation of services yet, so this will be only for people who want to get their hands dirty.

Were you trying to share a microphone with the Docker container? I found this to be one of the most frustrating experiences of my career. So much so that in v3 I will recommend everyone use a streaming mic input (like GStreamer), even when the mic is on the same machine.
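A sketch of what that streaming setup could look like, assuming GStreamer’s `gst-launch-1.0` on the mic machine and a UDP listener (e.g. `udpsrc`) on the Rhasspy side; the host and port here are placeholders:

```python
"""Sketch: stream a local mic to Rhasspy with GStreamer instead of
sharing the audio device with a container."""

def build_gst_command(host: str = "127.0.0.1", port: int = 12333):
    """gst-launch-1.0 pipeline: ALSA mic -> 16 kHz mono S16LE -> UDP."""
    return [
        "gst-launch-1.0",
        "alsasrc", "!",
        "audioconvert", "!",
        "audioresample", "!",
        "audio/x-raw,rate=16000,channels=1,format=S16LE", "!",
        "udpsink", f"host={host}", f"port={port}",
    ]

# To actually stream (requires GStreamer installed):
#
#   import subprocess
#   subprocess.run(build_gst_command("192.168.1.10", 12333))
```

The container (or VM) then only needs to receive a network stream, sidestepping device passthrough entirely.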


@synesthesiam I tried rhasspy junior but ran into some issues.

  1. On WSL, TensorFlow doesn’t properly detect the architecture, so tflite_runtime fails to install.
  2. I switched to a native machine, where the install script works but the train script fails. Initially it fails opening a lexicon.db, which seems to be created later. After skipping that error by commenting it out, I get another complaining about missing Kaldi models. I am following the README. Any advice there? Should I create issues in the GitHub repo?
  3. The Docker build fails too, unable to find the model files (see the Dockerfile at master in rhasspy/rhasspy-junior on GitHub).

As I recall, the command to launch the container failed and reported an error about a missing audio device which actually existed. But even before that I was quite frustrated working with the Docker examples. It could have been a personal problem, but while I could find documentation, it often seemed to lack the information I needed.

Getting Rhasspy in a VM to use a microphone was a frustrating experience for me too. Some of it was getting the audio hardware to pass through to the VM (adding the correct edits for hardware passthrough to the VM’s XML config resulted in the config file failing validation), some of it was an audio driver that worked in Debian but failed in a Debian VM, some of it was my usual frustration with ALSA, and some was learning a bit about PyAudio and how to test it. I think my installation had some unmet dependencies.

My music servers, and the VM for Rhasspy, are built from a fresh install of Debian with no desktop. On the music servers I then install mpd, even though I don’t use it. I don’t know exactly what it pulls in, but setting up my audio after that just seems to work, whereas without the mpd install I can’t seem to get my audio working the way I want.

I suspect I could write a tutorial covering all the steps to end up with Rhasspy running in a VM on a headless Debian server, maybe including a section on testing the audio configuration BEFORE trying to get it working in Rhasspy.
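For the “test audio BEFORE Rhasspy” step, a minimal PyAudio sanity check might look like this (hedged: assumes `pip install pyaudio` and a working default input device; the RMS helper is pure stdlib):

```python
"""Sketch: sanity-check a microphone with PyAudio before involving Rhasspy."""
import math
import struct

def rms(pcm: bytes) -> float:
    """Root-mean-square level of little-endian 16-bit PCM samples."""
    if not pcm:
        return 0.0
    samples = struct.unpack(f"<{len(pcm) // 2}h", pcm)
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mic_check(seconds: float = 1.0, rate: int = 16000) -> float:
    """Record briefly from the default input device and return its RMS level."""
    import pyaudio  # imported here so the rms() helper works without it
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=1024)
    frames = b"".join(stream.read(1024)
                      for _ in range(int(rate / 1024 * seconds)))
    stream.stop_stream()
    stream.close()
    pa.terminate()
    return rms(frames)

# Usage: a level near 0.0 while you speak usually means the wrong device,
# a muted capture channel in alsamixer, or a VM passthrough problem.
#
#   print(f"mic RMS: {mic_check():.1f}")
```

Running something like this inside the VM separates “ALSA/passthrough is broken” from “Rhasspy is misconfigured”, which were two distinct failure modes in my case.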

I use Rhasspy successfully with FHEM for home automation. I am pleased with the announcement that Rhasspy will continue to be open to all systems. Please add me to the group of version 3 testers. Greetings, Jens

@GregD @jens-schiffke @chrismiceli @fluidvoice


Thanks, it’s working! :+1:


mosdef will :eyes: asap. thanks! :love_you_gesture:
