Hey Matt, Rhasspy author here (Mike). I’m enjoying watching your videos! This is giving me some great feedback about what I need to fix with the web UI for first-time users.
Most people probably want to set up Rhasspy like an Alexa, so I think there needs to be a “one-click” method when you first start up that just selects all of the recommended systems. You fortunately found the “best” setup for most people – Porcupine, Kaldi, and Fsticuffs (+ Dialogue Manager).
Answers to some questions I saw so far:
- Why is the fallback TTS for Google Wavenet gone?
- In 2.5, each TTS system is a separate service now (everything was previously one big Python program). We haven’t gotten multiple services working together yet, but it’s on the roadmap. This should enable fallback TTS as well as (hopefully) multiple speech to text languages simultaneously.
- Why is there no Websocket API for text to speech, etc.?
- There is, sort of. You can publish any MQTT message into /api/mqtt via Websocket.
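For anyone curious what that looks like in practice, here's a rough sketch. The `hermes/tts/say` topic and its `text`/`siteId` payload fields come from the Hermes protocol Rhasspy uses; the `{"type": "publish", ...}` envelope and the `ws://localhost:12101/api/mqtt` URL are my assumptions for illustration (12101 is Rhasspy's default port), so double-check the docs for the exact message shape:

```python
import json

# Hermes topic for text to speech (part of the Hermes protocol).
TTS_TOPIC = "hermes/tts/say"

def make_tts_message(text: str, site_id: str = "default") -> str:
    """Build a JSON message to publish over the /api/mqtt Websocket.

    NOTE: the {"type": "publish", ...} envelope here is an assumption
    about the bridge's message shape, not a confirmed format.
    """
    return json.dumps({
        "type": "publish",
        "topic": TTS_TOPIC,
        "payload": {"text": text, "siteId": site_id},
    })

# Actually sending it would look roughly like this (needs the third-party
# "websockets" package, and assumes Rhasspy is on its default port 12101):
#
#   import asyncio, websockets
#   async def say(text):
#       async with websockets.connect("ws://localhost:12101/api/mqtt") as ws:
#           await ws.send(make_tts_message(text))
#   asyncio.run(say("Hello from Rhasspy"))

print(make_tts_message("Hello from Rhasspy"))
```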
Let me know if you have any more questions, or get stuck. Thanks for having the patience to work through the difficult parts of setting up Rhasspy, and I look forward to seeing more videos!
P.S. Something that is totally not obvious: the timeouts that you were seeing from text to speech were because you hit “Speak” before the TTS service had started. There’s no indication of this besides the text vomit in the logs, so I apologize.