No audio playback from MaryTTS

devjklein1 · August 12, 2020, 3:56pm

I got MaryTTS set up and can access the web interface at http://localhost:59125, but when I run a command that works with eSpeak, I don't hear anything from MaryTTS. Here is my Rhasspy log:

[ERROR:2020-08-12 11:53:00,478] rhasspyserver_hermes: file does not start with RIFF id
Traceback (most recent call last):
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
    result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
    return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/main.py", line 1612, in api_text_to_speech
    results = await asyncio.gather(*aws)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/main.py", line 1598, in speak
    session_id=session_id,
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 596, in speak_sentence
    raise TtsException(say_response.error)
rhasspyserver_hermes.TtsException: file does not start with RIFF id
[ERROR:2020-08-12 11:53:00,476] rhasspyserver_hermes: TtsError(error='file does not start with RIFF id', site_id='default', context='06094371-412b-4155-802c-3c7be7deeb44', session_id='')
[DEBUG:2020-08-12 11:53:00,465] rhasspyserver_hermes: Handling TtsError (topic=hermes/error/tts, id=80ba8b41-f9dc-4b72-a4fe-ce9ee8f43181)
[DEBUG:2020-08-12 11:53:00,456] rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/default/playBytes/06094371-412b-4155-802c-3c7be7deeb44, id=80ba8b41-f9dc-4b72-a4fe-ce9ee8f43181)
[DEBUG:2020-08-12 11:53:00,416] rhasspyserver_hermes: Publishing 125 bytes(s) to hermes/tts/say
[DEBUG:2020-08-12 11:53:00,415] rhasspyserver_hermes: -> TtsSay(text='It is 11 53 AM.', site_id='default', lang='en', id='06094371-412b-4155-802c-3c7be7deeb44', session_id='')
[DEBUG:2020-08-12 11:53:00,398] rhasspyserver_hermes: Sent 370 char(s) to websocket
[DEBUG:2020-08-12 11:53:00,396] rhasspyserver_hermes: Handling NluIntent (topic=hermes/intent/GetTime, id=9ccda5d8-603a-4c1c-adbd-ae86acdfd33f)
[DEBUG:2020-08-12 11:53:00,391] rhasspyserver_hermes: <- NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default', id='2e289aae-b3af-483c-a716-c7c936533980', slots=[], session_id='2e289aae-b3af-483c-a716-c7c936533980', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id=None, lang=None)
[DEBUG:2020-08-12 11:53:00,377] rhasspyserver_hermes: Publishing 204 bytes(s) to hermes/nlu/query
[DEBUG:2020-08-12 11:53:00,376] rhasspyserver_hermes: -> NluQuery(input='what time is it', site_id='default', id='2e289aae-b3af-483c-a716-c7c936533980', intent_filter=None, session_id='2e289aae-b3af-483c-a716-c7c936533980', wakeword_id=None, lang=None)

In the MaryTTS Docker log, I noticed that Rhasspy appears to be sending the text without quotes (so not a string?). I am not sure whether this is causing a problem when Mary segments the text. Here is the log:

2020-08-12 15:53:00,444 [I/O dispatcher 5] INFO marytts.server Connection from null

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server New synthesis request: /process

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server INPUT_TEXT=It is 11 53 AM.

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server VOICE=cmu-slt-hsmm

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server LOCALE=en

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server INPUT_TYPE=TEXT

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server OUTPUT_TYPE=AUDIO

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server AUDIO=WAVE

2020-08-12 15:53:00,444 [I/O dispatcher 5] DEBUG marytts.server No audio effects requested

2020-08-12 15:53:00,444 [I/O dispatcher 5] INFO marytts.R 12 New request (input type "TEXT", output type "AUDIO", voice "cmu-slt-hsmm", effect "", audio "WAVE")

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.IO Setting text input: It is 11 53 AM.

2020-08-12 15:53:00,445 [I/O dispatcher 5] INFO marytts.server Read: It is 11 53 AM.

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.R 12 Now converting the following input data from TEXT to RAWMARYXML:

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.IO Writing Text output:

It is 11 53 AM.

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.R 12 Determining which modules to use

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.ModuleRegistry Module TextToMaryXML converts TEXT into RAWMARYXML (locale en, voice cmu-slt-hsmm)

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.ModuleRegistry found path through modules

2020-08-12 15:53:00,445 [I/O dispatcher 5] INFO marytts.R 12 Handling request using the following modules:

2020-08-12 15:53:00,445 [I/O dispatcher 5] INFO marytts.R 12 - TextToMaryXML (marytts.modules.TextToMaryXML)

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.R 12 Handing the following data to the next module:

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.IO Writing Text output:

It is 11 53 AM.

2020-08-12 15:53:00,445 [I/O dispatcher 5] INFO marytts.R 12 Next module: TextToMaryXML

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.TextToMaryXML textNodeString=`It is 11 53 AM.’

2020-08-12 15:53:00,445 [I/O dispatcher 5] DEBUG marytts.R 12 Now splitting the following RAWMARYXML data into chunks:

2020-08-12 15:53:00,446 [I/O dispatcher 5] DEBUG marytts.IO <?xml version="1.0" encoding="UTF-8"?>

It is 11 53 AM.

2020-08-12 15:53:00,447 [I/O dispatcher 5] DEBUG marytts.R 12 Now converting the following input data from RAWMARYXML to AUDIO:

2020-08-12 15:53:00,447 [I/O dispatcher 5] DEBUG marytts.IO <?xml version="1.0" encoding="UTF-8"?>

It is 11 53 AM.

2020-08-12 15:53:00,447 [I/O dispatcher 5] DEBUG marytts.R 12 Determining which modules to use

2020-08-12 15:53:00,448 [I/O dispatcher 5] DEBUG marytts.ModuleRegistry Module JTokeniser converts RAWMARYXML into TOKENS (locale en, voice cmu-slt-hsmm)

2020-08-12 15:53:00,448 [I/O dispatcher 5] DEBUG marytts.ModuleRegistry Module Preprocess converts TOKENS into WORDS (locale en, voice cmu-slt-hsmm)

2020-08-12 15:53:00,448 [I/O dispatcher 5] DEBUG marytts.ModuleRegistry Module OpenNLPPosTagger converts WORDS into PARTSOFSPEECH (locale en, voice cmu-slt-hsmm)

2020-08-12 15:53:00,448 [I/O dispatcher 5] ERROR marytts.server Processing failed.

java.lang.UnsupportedOperationException: No known way of generating output (AUDIO) from input(RAWMARYXML), no processing path through modules.

at marytts.server.Request.processOneChunk(Request.java:524)

at marytts.server.Request.processOrLookupOneChunk(Request.java:403)

at marytts.server.Request.process(Request.java:337)

at marytts.server.http.SynthesisRequestHandler.process(SynthesisRequestHandler.java:261)

at marytts.server.http.SynthesisRequestHandler.handleClientRequest(SynthesisRequestHandler.java:91)

at marytts.server.http.BaseHttpRequestHandler.handle(BaseHttpRequestHandler.java:138)

at org.apache.http.nio.protocol.BufferingHttpServiceHandler$RequestHandlerAdaptor.handle(BufferingHttpServiceHandler.java:189)

at org.apache.http.nio.protocol.SimpleNHttpRequestHandler.handle(SimpleNHttpRequestHandler.java:51)

at org.apache.http.nio.protocol.AsyncNHttpServiceHandler.processRequest(AsyncNHttpServiceHandler.java:453)

at org.apache.http.nio.protocol.AsyncNHttpServiceHandler.requestReceived(AsyncNHttpServiceHandler.java:225)

at org.apache.http.nio.protocol.BufferingHttpServiceHandler.requestReceived(BufferingHttpServiceHandler.java:127)

at org.apache.http.impl.nio.DefaultNHttpServerConnection.consumeInput(DefaultNHttpServerConnection.java:161)

at org.apache.http.impl.nio.DefaultServerIOEventDispatch.inputReady(DefaultServerIOEventDispatch.java:147)

at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:161)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:335)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:275)

at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)

at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:542)

at java.lang.Thread.run(Thread.java:748)

2020-08-12 15:53:00,448 [I/O dispatcher 5] DEBUG marytts.http Returning HTTP status 500: Processing failed.

java.lang.UnsupportedOperationException: No known way of generating output (AUDIO) from input(RAWMARYXML), no processing path through modules.

at marytts.server.Request.processOneChunk(Request.java:524)

at marytts.server.Request.processOrLookupOneChunk(Request.java:403)

at marytts.server.Request.process(Request.java:337)

at marytts.server.http.SynthesisRequestHandler.process(SynthesisRequestHandler.java:261)

at marytts.server.http.SynthesisRequestHandler.handleClientRequest(SynthesisRequestHandler.java:91)

at marytts.server.http.BaseHttpRequestHandler.handle(BaseHttpRequestHandler.java:138)

at org.apache.http.nio.protocol.BufferingHttpServiceHandler$RequestHandlerAdaptor.handle(BufferingHttpServiceHandler.java:189)

at org.apache.http.nio.protocol.SimpleNHttpRequestHandler.handle(SimpleNHttpRequestHandler.java:51)

at org.apache.http.nio.protocol.AsyncNHttpServiceHandler.processRequest(AsyncNHttpServiceHandler.java:453)

at org.apache.http.nio.protocol.AsyncNHttpServiceHandler.requestReceived(AsyncNHttpServiceHandler.java:225)

at org.apache.http.nio.protocol.BufferingHttpServiceHandler.requestReceived(BufferingHttpServiceHandler.java:127)

at org.apache.http.impl.nio.DefaultNHttpServerConnection.consumeInput(DefaultNHttpServerConnection.java:161)

at org.apache.http.impl.nio.DefaultServerIOEventDispatch.inputReady(DefaultServerIOEventDispatch.java:147)

at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:161)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:335)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)

at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:275)

at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)

at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:542)

at java.lang.Thread.run(Thread.java:748)

2020-08-12 15:53:00,448 [I/O dispatcher 5] INFO marytts.server Request couldn’t be handled successfully.

2020-08-12 15:53:00,450 [I/O dispatcher 5] INFO marytts.server Connection closed: [closed]

2020-08-12 15:54:18,746 [I/O dispatcher 2] INFO marytts.server Connection closed: [closed]

2020-08-12 15:54:18,746 [I/O dispatcher 3] INFO marytts.server Connection closed: [closed]

Edit: Clarified findings in MaryTTS docker log
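
For reference, "file does not start with RIFF id" is the message Python's standard wave module raises when the bytes it is given do not begin with a RIFF header. Given the HTTP 500 in the MaryTTS log, it seems likely (the thread does not confirm it) that Rhasspy received an error body instead of WAV data. A minimal sketch that reproduces the message; the helper names and the sample error body are hypothetical:

```python
import io
import wave


def make_silent_wav(n_frames: int = 160, rate: int = 16000) -> bytes:
    """Build a tiny valid mono 16-bit WAV in memory (silence only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def looks_like_wav(data: bytes) -> bool:
    """Cheap header check: a WAV file starts with 'RIFF', 4 size bytes, then 'WAVE'."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"


def try_open(data: bytes) -> str:
    """Return 'ok' if the wave module can parse data, else the error text."""
    try:
        with wave.open(io.BytesIO(data), "rb"):
            return "ok"
    except wave.Error as err:
        return str(err)


if __name__ == "__main__":
    # Hypothetical stand-in for whatever a failed synthesis request returns.
    error_body = b"<html>Processing failed.</html>"
    print(looks_like_wav(make_silent_wav()), try_open(make_silent_wav()))
    print(looks_like_wav(error_body), try_open(error_body))
```

Checking the first bytes of the TTS response this way separates "the server returned an error" from "the audio device is not playing".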

devjklein1 · August 12, 2020, 4:47pm

I got a log of a good MaryTTS operation by opening this URL in my web browser:

localhost:59125/process?INPUT_TYPE=TEXT&AUDIO=WAVE_FILE&OUTPUT_TYPE=AUDIO&LOCALE=en_us&INPUT_TEXT=It%20is%2011%2053%20AM.

This produces a wave file with audio.

I then compared this log to the log of the operation sent from Rhasspy, and the only difference I can find is that Rhasspy is for some reason sending “en” as the locale and not “en_us”. I have “en_us” set in my profile.json, but Rhasspy is still sending “en”. Is this possibly a bug?
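
A minimal sketch of building the same /process URL in Python, so that only the LOCALE value changes between tries. The function name is hypothetical; the parameter names and values are copied from the browser URL above. It only builds the URL and does not contact the server.

```python
from urllib.parse import quote, urlencode

MARYTTS_URL = "http://localhost:59125/process"


def build_process_url(locale: str, text: str = "It is 11 53 AM.") -> str:
    """Return a MaryTTS /process URL with the given LOCALE and INPUT_TEXT."""
    params = {
        "INPUT_TYPE": "TEXT",
        "AUDIO": "WAVE_FILE",
        "OUTPUT_TYPE": "AUDIO",
        "LOCALE": locale,
        "INPUT_TEXT": text,
    }
    # quote_via=quote encodes spaces as %20 (the default would use '+').
    return f"{MARYTTS_URL}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    for locale in ("en_us", "en"):
        print(build_process_url(locale))
```

Opening each printed URL in a browser (or with curl) is one way to try the two locales without going through Rhasspy.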

Gfawkes · August 12, 2020, 4:52pm

This is what I have in my profile.json, and it's been fine since 2.5.0 pre.
You could start by trying caps for the last bit, or en-GB?
Edit: or it might be the underscore.

"text_to_speech": {
    "marytts": {
        "locale": "en-GB",
        "url": "http://myserver.local:59125/process",
        "voice": "dfki-prudence-hsmm"
    },

devjklein1 · August 12, 2020, 5:07pm

I am absolutely stumped. I tried every combination of en-us, en_us, en-US, en_US, and nothing worked. The log still shows that the locale submitted from Rhasspy is "en". I tried changing the voice setting in Rhasspy and that did nothing either. It's like Rhasspy is ignoring its profile.json.

devjklein1 · August 12, 2020, 6:34pm

Figured it out! Everything in Rhasspy was correct. In my Home Assistant configuration.yaml, I had this code:

rest_command:
  espeak:
    url: http://localhost:12101/api/text-to-speech?voice=en
    method: POST
    headers:
      content_type: text/plain
    payload: '{{ message }}'

Of course, the voice=en query parameter in the URL was screwing up the request to MaryTTS. I set it to en_us and everything works fantastically now!
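
A sketch of the equivalent request in Python, which may be handy for testing a voice value outside Home Assistant. The function name is hypothetical; the endpoint, the voice query parameter, the method and the content type are taken from the rest_command above. Actually sending it assumes Rhasspy is listening on localhost:12101.

```python
from urllib.parse import urlencode
from urllib.request import Request

RHASSPY_TTS_URL = "http://localhost:12101/api/text-to-speech"


def build_tts_request(message: str, voice: str = "en_us") -> Request:
    """Build (but do not send) a POST to Rhasspy's text-to-speech endpoint."""
    query = urlencode({"voice": voice})
    return Request(
        f"{RHASSPY_TTS_URL}?{query}",
        data=message.encode("utf-8"),  # plain-text body, like the YAML payload
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request("It is 11 53 AM.")
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req)
```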