Where to start?

I am also using Kaldi and PicoTTS (it’s OK, but compared to Google or Amazon it is still really bad :frowning: ).

Wake word is an issue. I am currently using Porcupine with one of the universal models. Problem: I don’t really like the wake word choices. They all feel weird to say for a native German speaker and are not easy to pronounce without a German accent. I am currently using “pico voice” as the wake word. It recognizes me most of the time, with very few false activations.

I am thinking about making my own custom wake word with Precise. How did you do it (what was the source of the training data)?

Is it possible to use Rhasspy with Amazon Polly, like it was with Snips?

Have you tried MaryTTS? I don’t know the quality of the German voices, but I’m using the voice dfki-prudence-hsmm for English and it’s really enjoyable.

@DanielW If you want more information on training a custom wake word model with Mycroft Precise, take a look at this: Mycroft Precise Model.
I followed the instructions from Mycroft for training your own wake word model.

What kind of hardware are you running your system on?
How fast does MaryTTS run on a Pi 4?

I wrote down some of my experiences with training a custom wake word in this post: Mycroft Precise Model Problem (computer-en.pb), especially regarding training a noise-resistant model.

This is on a Raspberry Pi 4. I haven’t benchmarked it, but it’s fast enough for me.

I just tried the three German voices, but for me there is no huge difference in quality compared to PicoTTS, and it uses more RAM and CPU. (It is OK on a Pi 4, but longer texts will slow down Rhasspy.)

Got it up and running. Will first create functionality and try around with different voices later :slight_smile:

Have a look at this:


It explains how to set up master/satellite with Rhasspy, as you are used to with Snips.

Of course, no problem, really works well

What mics are you using? Respeaker or Matrix or …?

If Respeaker or Matrix, have a look at HLC LED Control in this forum. It lights up your LEDs, triggered by events.

I use the PSeye cam on the big Raspi, which is working great. Later I want to try my Zeros with a two-mic board attached (I may check the name later), which worked great with Snips before. I think those satellites have an LED that I might start using.

I found the Respeakers just too expensive when creating multiple devices. Most are even placed where they are invisible :slight_smile:

I think my other mics are Seeed HATs, if I remember correctly.

Hi @koan, how did you get your setup to work with MaryTTS?

The playback is sped up, as if it were being fast-forwarded.

Could you give some pointers?

Hi @koan!! :slight_smile: Could you please provide instructions for building/installing that MaryTTS voice dfki-prudence-hsmm? I took a very quick look at the MaryTTS GitHub and am not certain where to get started. Will I need to install a full MaryTTS environment on my Raspbian OS? Thank you in advance for EVERYTHING you do for this Rhasspy community.

@FredTheFrog are you using Docker? synesthesiam released a Docker image for MaryTTS here: https://rhasspy.readthedocs.io/en/latest/text-to-speech/#marytts
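
Once a MaryTTS server is running, you can try a voice directly over HTTP before pointing Rhasspy at it. Here is a minimal sketch that only builds the request URL. It assumes the server listens on MaryTTS's usual port 59125 on localhost, that the dfki-prudence-hsmm voice is installed, and that its locale is en_GB; adjust all three for your setup. The function name `marytts_url` is made up for this example.

```python
from urllib.parse import urlencode


def marytts_url(text, voice="dfki-prudence-hsmm", locale="en_GB",
                base="http://localhost:59125"):
    """Build a URL asking a MaryTTS server to synthesize `text` as a WAV file.

    Assumptions: host/port, voice name and locale are placeholders for
    your own installation.
    """
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
        "VOICE": voice,
    }
    return f"{base}/process?{urlencode(params)}"


if __name__ == "__main__":
    # Paste the printed URL into a browser, or fetch it with curl/wget,
    # to download the spoken sentence as a WAV file.
    print(marytts_url("Hello from MaryTTS"))
```

Fetching that URL and saving the result gives you a WAV file you can play outside of Rhasspy, which helps to tell a MaryTTS problem from an audio output problem.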

Thank you!! Most appreciated.

I didn’t do anything special: I swapped out eSpeak for MaryTTS and it just worked. This doesn’t seem to be a MaryTTS issue, but an issue with the settings for your audio device.

I use the Docker image, works fine here.

Hmm :thinking: if my audio settings were the issue, I would think that eSpeak should also exhibit the same issue, but it does not. eSpeak is the only TTS that works (meaning I can hear the words, and not something that sounds like it’s fast-forwarded). Can anyone point me to some docs to help me start diagnosing this issue? I’m not really sure what keywords to search for.
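
One possible cause of fast-forwarded playback is a sample rate mismatch: the WAV file is produced at one rate and the audio device plays it at another. That would also fit eSpeak sounding fine if it happens to output a different rate than the other TTS systems, though this is only a guess about this setup. A minimal sketch for checking it, using Python's standard `wave` module to read the header of a WAV file (the helper names `wav_info` and `make_silence` are made up for this example):

```python
import io
import wave


def wav_info(data: bytes) -> dict:
    """Read channel count, sample width, sample rate and duration from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def make_silence(sample_rate_hz: int, seconds: float = 0.5) -> bytes:
    """Create a small mono 16-bit WAV of silence, only used to demo wav_info."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * int(sample_rate_hz * seconds))
    return buf.getvalue()


if __name__ == "__main__":
    # Replace the generated clip with the bytes of a file saved from your TTS,
    # e.g. open("marytts.wav", "rb").read()
    print(wav_info(make_silence(16000)))
```

Save one sentence from eSpeak and one from the TTS that sounds sped up, run `wav_info` on both, and compare `sample_rate_hz`. If they differ, “sample rate” and “resampling” for your playback device are the keywords to search for.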