Mimic 3 TTS Preview

sve · May 13, 2022, 4:46pm

Was it difficult to get working on the phone at all?

No. It was as easy as for a Debian or Ubuntu computer.

rejoe2 · May 13, 2022, 5:30pm

Hi there, I got the server up and running (manually for now).

The MaryTTS settings are saved as follows:

    "text_to_speech": {
        "marytts": {
            "voice": "thorsten_low"
        },
        "system": "marytts"
    },

The other keys mentioned in the documentation (text-to-speech/#marytts) are not explicitly stored in the JSON, but they are visible in the Rhasspy UI (de_DE, thorsten_low).
Putting both combined into the “Voice” field doesn’t help. Changing it back leads to the locale also being stored in the JSON, but this still results in

TtsException: file does not start with RIFF id

What did I miss or could do better?

Tests with “http://external-ip:59125/” work quite well, but calling it with the “openapi” suffix results in a 404 error…
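
The “file does not start with RIFF id” message is the one Python’s wave module raises when the data it is given is not a WAV file, so one way to narrow this down is to check what the TTS endpoint actually returned. A minimal sketch, assuming the response body is available as bytes; looks_like_wav and make_silent_wav are hypothetical helper names, and the HTML payload is made up for illustration:

```python
import io
import wave


def looks_like_wav(data: bytes) -> bool:
    """Return True if data starts with a RIFF/WAVE header."""
    # Bytes 0-3 are "RIFF", bytes 4-7 the chunk size, bytes 8-11 "WAVE".
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def make_silent_wav(num_frames: int = 100) -> bytes:
    """Build a tiny valid WAV file in memory (16-bit mono silence)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(22050)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


# Illustrative payloads (assumptions, not captured from a real server):
# an HTML error page fails the check, a real WAV file passes it.
html_error = b"<!DOCTYPE html><html><body>404 Not Found</body></html>"
print(looks_like_wav(html_error))         # False
print(looks_like_wav(make_silent_wav()))  # True
```

If the check fails on a real response, the body is probably an error page or some other non-WAV payload rather than audio, which points at the URL or the request parameters.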

jens-schiffke · May 14, 2022, 6:51am

@synesthesiam
I like the announcement!
Is there already a date when the mimic3 repository will be online on GitHub? I would like to test the Debian packages.

Greetings, Jens

fluidvoice · May 14, 2022, 8:47pm

Wow, how could Portuguese (Brazilian) not be on the list?

synesthesiam · May 15, 2022, 2:58pm

I’ll have to check this myself. The Mimic 3 server should also work with Rhasspy’s “remote TTS” option, but I need to double check I haven’t broken anything with that either!

Hopefully next month, but I sent you a link with the beta packages.

It was, but people told me that the voice I trained wasn’t understandable. I used this dataset: https://github.com/Edresson/TTS-Portuguese-Corpus

Do you know of any other TTS Portuguese datasets?

fluidvoice · May 15, 2022, 8:33pm

Sorry, no. I’m clueless about language models, data, etc.

fluidvoice · May 16, 2022, 5:23pm

Did you look here? Hugging Face – The AI community building the future.

jens-schiffke · May 17, 2022, 3:52pm

Nice!

    "text_to_speech": {
        "command": {
            "say_arguments": " --ssml --voice 'de_DE/m-ailabs_low#rebecca_braunert_plunkett' ",
            "say_program": "mimic3"
        },
        "satellite_site_ids": "default",
        "system": "command"
    },

Das ist ein Test in deutsch <voice name="en_US/vctk_low#p236">and this is an test in english.</voice>

… and Rhasspy speaks two languages in one sentence - cool.
It runs a bit slow on my old machine without GPU. With enough power and cache it will definitely get better.

Greetings, Jens

synesthesiam · May 17, 2022, 4:13pm

I didn’t, but I don’t see any useful data there.

Awesome! The way to speed this up is to run mimic3-server as a service (check the source code for a systemd unit example), and then use mimic3 --remote ... so it will use the web server instead.

jens-schiffke · May 17, 2022, 4:19pm

Calling it up via the web interface wasn’t faster either. Now I have to pimp my base a bit first…

fluidvoice · May 18, 2022, 12:18pm

Btw, I think Mycroft should link to some demos in their Mimic 3 blog post announcement.
If people could hear how good the TTS presumably sounds, they’d be more likely to sign up and get involved. My 2 cents.

CrankyCoder · May 31, 2022, 10:03pm

Will this be a drop-in replacement?

The1And0 · June 28, 2022, 8:28pm

Hello,

Unfortunately, I cannot send PMs as a new user and therefore can’t test real-time factors (RTFs) for different architectures. Can someone give hints about RTFs, maybe for ARM?

Thanks

synesthesiam · June 29, 2022, 12:40pm

Hi @The1And0, on 64-bit ARM you can get an RTF of around 0.5. 32-bit ARM is slower, around 1.2 or 1.3. If you’re on a 64-bit x86 (x86_64) machine though, it can be 10x faster than ARM.
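
For anyone unfamiliar with the term: the real-time factor is the time spent synthesizing divided by the duration of the audio produced, so anything below 1.0 is faster than real time. A small sketch of that arithmetic, using the rough figures quoted above; the function names are hypothetical, and 1.25 is just an assumed midpoint of “1.2 or 1.3”:

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the resulting audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def estimated_synthesis_seconds(rtf: float, audio_seconds: float) -> float:
    """Estimate how long it takes to synthesize a clip at a given RTF."""
    return rtf * audio_seconds


# Rough RTFs from the post above; 1.25 is an assumed midpoint.
approximate_rtfs = {"64-bit ARM": 0.5, "32-bit ARM": 1.25}

for platform, rtf in approximate_rtfs.items():
    seconds = estimated_synthesis_seconds(rtf, audio_seconds=10.0)
    print(f"{platform}: about {seconds} s to synthesize 10 s of speech")
```

So at an RTF of 0.5, a 10-second sentence takes about 5 seconds to generate, while at 1.25 it takes longer than the sentence itself lasts.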

Try it out for yourself: https://github.com/mycroftAI/mimic3

AndreKR · June 29, 2022, 12:58pm

Oh, there’s a Docker image now. (Although apparently without harvard-glow_tts yet?)

Is it compatible with the “Remote HTTP” TTS option of Rhasspy?

synesthesiam · June 29, 2022, 1:52pm

It is! Just set this as the URL: http://localhost:59125/api/tts

You can change the voice like this: http://localhost:59125/api/tts?voice=en_US/vctk_low#p236
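
One caveat when building this URL yourself: “#” normally starts the URL fragment, so some HTTP clients will drop “#p236” before sending the request. Whether Rhasspy or the server compensates for that is not something confirmed here. A sketch using only Python’s standard library that percent-encodes the speaker suffix; tts_url is a hypothetical helper name:

```python
from urllib.parse import urlencode, urlsplit

BASE_URL = "http://localhost:59125/api/tts"


def tts_url(voice: str, base_url: str = BASE_URL) -> str:
    """Build the TTS URL with the voice safely encoded as a query parameter."""
    # Keep "/" readable, but escape "#" (as %23) so it stays in the query.
    return f"{base_url}?{urlencode({'voice': voice}, safe='/')}"


# Parsed as written, the speaker part lands in the fragment, not the query.
raw = f"{BASE_URL}?voice=en_US/vctk_low#p236"
print(urlsplit(raw).query)     # voice=en_US/vctk_low
print(urlsplit(raw).fragment)  # p236

encoded = tts_url("en_US/vctk_low#p236")
print(encoded)  # http://localhost:59125/api/tts?voice=en_US/vctk_low%23p236
```

If the unescaped URL always gives you the default speaker, trying the %23 form is a cheap thing to test.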

tipofthesowrd · July 1, 2022, 5:18am

I think I’m missing something with the Docker image.
After running the image I can access the web server, but I cannot synthesize voices.

My guess is that I still have to download the voices manually. But how do you do that using the Docker image?
I tried locating the mimic3-download command in the image, but no luck.

    ERROR:mimic3_http.synthesis:Error during inference
    Traceback (most recent call last):
      File "/home/mimic3/app/mimic3_http/synthesis.py", line 125, in do_synthesis_proc
        result = do_synthesis(item, mimic3)
      File "/home/mimic3/app/mimic3_http/synthesis.py", line 81, in do_synthesis
        raise e
      File "/home/mimic3/app/mimic3_http/synthesis.py", line 61, in do_synthesis
        mimic3.speak_text(params.text, text_language=params.text_language)
      File "/home/mimic3/app/mimic3_tts/tts.py", line 368, in speak_text
        voice = self._get_or_load_voice(self.voice)
      File "/home/mimic3/app/mimic3_tts/tts.py", line 579, in _get_or_load_voice
        voice = Mimic3Voice.load_from_directory(
      File "/home/mimic3/app/mimic3_tts/voice.py", line 283, in load_from_directory
        onnx_model = Mimic3Voice._load_model(
      File "/home/mimic3/app/mimic3_tts/voice.py", line 403, in _load_model
        onnx_model = onnxruntime.InferenceSession(
      File "/home/mimic3/app/.venv/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 335, in __init__
        self._create_inference_session(providers, provider_options, disabled_optimizers)
      File "/home/mimic3/app/.venv/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 370, in _create_inference_session
        sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
    RuntimeError: /onnxruntime_src/onnxruntime/core/platform/posix/env.cc:183 onnxruntime::{anonymous}::PosixThread::PosixThread(const char*, int, unsigned int (*)(int, Eigen::ThreadPoolInterface*), Eigen::ThreadPoolInterface*, const onnxruntime::ThreadOptions&) pthread_setaffinity_np failed, error code: 0 error msg:

rolyan_trauts · July 1, 2022, 5:39am

Connect to the Docker container:
sudo docker exec -it nginx-test /bin/bash
(this assumes the container is named nginx-test, so substitute your own container name).
It’s like SSH: you connect to the container and, by default, you are logged in as root.
Then use wget as you would on a host; you may need to install it first with ‘apt-get install’.

To connect to a Docker container, it’s always sudo docker exec -it <container-name> /bin/bash

tipofthesowrd · July 1, 2022, 6:18am

That’s what I meant by running the mimic3-download command ‘from the image’.
I should’ve specified that I was running it through docker exec /bin/bash.

synesthesiam · July 5, 2022, 3:16pm

@tipofthesowrd You may need to run this before running the Docker image:

mkdir -p "${HOME}/.local/share/mycroft/mimic3"
chmod a+rwx "${HOME}/.local/share/mycroft/mimic3"

The Docker image runs as an unprivileged user for security, so it may not have permission to download voices.