Hi,
I’m following the “Server with Satellites” tutorial (tutorials/#server-with-satellites), but I can’t get TTS to work over HTTP.
As far as I can tell, the satellite is sending a siteId parameter (the URL it is POSTing to is http://<myserver>:12101/api/text-to-speech?play=false&siteId=<mysatellite>). This is causing the server to respond in plaintext, with the input text as the body, and a header of content-type: text/html; charset=utf-8.
As a result, the satellite fails to speak, with the messages “Expected audio/wav content type, got text/html; charset=utf-8” and “wave.Error: file does not start with RIFF id” (stack trace below).
When I make the same POST manually without the siteId parameter (api/text-to-speech?play=false), I correctly get WAV data. However, I cannot figure out how to stop the satellite from sending its siteId property as part of the POST, and I’m not sure whether that would even be the correct behaviour.
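For anyone who wants to reproduce the comparison, here is a rough Python sketch of the two manual requests. The helper names (build_tts_url, looks_like_wav, post_text) are made up for illustration and are not part of Rhasspy; the only things taken from my setup are the endpoint path and the play and siteId query parameters.

```python
import urllib.parse
import urllib.request


def build_tts_url(base_url, site_id=None, play=False):
    """Build the text-to-speech URL, optionally adding a siteId parameter."""
    params = {"play": "true" if play else "false"}
    if site_id is not None:
        params["siteId"] = site_id
    return base_url + "?" + urllib.parse.urlencode(params)


def looks_like_wav(content_type, body):
    """True if the response claims audio/wav and the body starts with RIFF."""
    media_type = content_type.split(";")[0].strip().lower()
    return media_type == "audio/wav" and body[:4] == b"RIFF"


def post_text(url, text):
    """POST plain text to the server and return (content_type, body)."""
    request = urllib.request.Request(
        url, data=text.encode("utf-8"), method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.headers.get("Content-Type", ""), response.read()
```

Calling post_text(build_tts_url(base), "Hello World") and then the same with site_id="pi1", where base is my server’s http://10.0.1.3:12101/api/text-to-speech, and passing each result to looks_like_wav should show the difference I described: WAV data in the first case and a text/html body in the second.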
I can see someone had a similar issue here, but the solution in that thread seemed to be a workaround of “use MQTT instead”.
Issue #262 on GitHub seems to describe the exact problem, but it is still open and unresolved. I tried downgrading to 2.5.9, but that did not solve the issue.
Can someone help me figure out what I might be doing wrong, or whether this is a bug in Rhasspy?
Thank you!
Satellite: Raspberry Pi 3 running Rhasspy 2.5.11 in Docker
Server: Ubuntu running Rhasspy 2.5.11 in Docker
Stack Trace on the satellite:
rhasspy | [DEBUG:2023-01-09 14:31:05,871] rhasspyserver_hermes: TTS timeout will be 30 second(s)
rhasspy | [DEBUG:2023-01-09 14:31:05,876] rhasspyserver_hermes: -> TtsSay(text='Hello World', site_id='pi1', lang=None, id='a1c19df1-f005-4ba9-b636-c7899ee29125', session_id='', volume=1.0)
rhasspy | [DEBUG:2023-01-09 14:31:05,877] rhasspyserver_hermes: Publishing 132 bytes(s) to hermes/tts/say
rhasspy | [DEBUG:2023-01-09 14:31:05,895] rhasspyremote_http_hermes: <- TtsSay(text='Hello World', site_id='pi1', lang=None, id='a1c19df1-f005-4ba9-b636-c7899ee29125', session_id='', volume=1.0)
rhasspy | [DEBUG:2023-01-09 14:31:05,898] rhasspyremote_http_hermes: http://10.0.1.3:12101/api/text-to-speech
rhasspy | [WARNING:2023-01-09 14:31:09,193] rhasspyremote_http_hermes: Expected audio/wav content type, got text/html; charset=utf-8
rhasspy | [DEBUG:2023-01-09 14:31:09,198] rhasspyremote_http_hermes: -> AudioPlayBytes(11 byte(s)) to hermes/audioServer/pi1/playBytes/a1c19df1-f005-4ba9-b636-c7899ee29125
rhasspy | [DEBUG:2023-01-09 14:31:09,202] rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/pi1/playBytes/a1c19df1-f005-4ba9-b636-c7899ee29125, id=80fe37d5-3890-4b30-87e0-dab711acedfc)
rhasspy | [DEBUG:2023-01-09 14:31:09,204] rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/pi1/playBytes/a1c19df1-f005-4ba9-b636-c7899ee29125, id=48eeae67-bbbe-4e7d-a76a-03c2f882d9b8)
rhasspy | [DEBUG:2023-01-09 14:31:09,204] rhasspyremote_http_hermes: -> TtsSayFinished(site_id='pi1', id='a1c19df1-f005-4ba9-b636-c7899ee29125', session_id='')
rhasspy | [DEBUG:2023-01-09 14:31:09,208] rhasspyremote_http_hermes: Publishing 80 bytes(s) to hermes/tts/sayFinished
rhasspy | [DEBUG:2023-01-09 14:31:09,208] rhasspyspeakers_cli_hermes: <- AudioPlayBytes(11 byte(s))
rhasspy | [DEBUG:2023-01-09 14:31:09,209] rhasspyspeakers_cli_hermes: ['aplay', '-q', '-t', 'wav']
rhasspy | [ERROR:2023-01-09 14:31:09,220] rhasspyspeakers_cli_hermes: handle_play
rhasspy | Traceback (most recent call last):
rhasspy | File "/usr/lib/rhasspy/rhasspy-speakers-cli-hermes/rhasspyspeakers_cli_hermes/__init__.py", line 248, in convert_to_wav
rhasspy | with io.BytesIO(sound_bytes) as sound_io, wave.open(sound_io, "rb"):
rhasspy | File "/usr/lib/python3.7/wave.py", line 510, in open
rhasspy | return Wave_read(f)
rhasspy | File "/usr/lib/python3.7/wave.py", line 164, in __init__
rhasspy | self.initfp(f)
rhasspy | File "/usr/lib/python3.7/wave.py", line 131, in initfp
rhasspy | raise Error('file does not start with RIFF id')
rhasspy | wave.Error: file does not start with RIFF id
rhasspy |
rhasspy | During handling of the above exception, another exception occurred:
rhasspy |
rhasspy | Traceback (most recent call last):
rhasspy | File "/usr/lib/rhasspy/rhasspy-speakers-cli-hermes/rhasspyspeakers_cli_hermes/__init__.py", line 255, in convert_to_wav
rhasspy | audio_data, sample_rate = soundfile.read(sound_file)
rhasspy | File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/soundfile.py", line 257, in read
rhasspy | subtype, endian, format, closefd) as f:
rhasspy | File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/soundfile.py", line 629, in __init__
rhasspy | self._file = self._open(file, mode_int, closefd)
rhasspy | File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/soundfile.py", line 1184, in _open
rhasspy | "Error opening {0!r}: ".format(self.name))
rhasspy | File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/soundfile.py", line 1357, in _error_check
rhasspy | raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
rhasspy | RuntimeError: Error opening <_io.BytesIO object at 0x75ec35a0>: File contains data in an unknown format.
rhasspy |
rhasspy | During handling of the above exception, another exception occurred:
rhasspy |
rhasspy | Traceback (most recent call last):
rhasspy | File "/usr/lib/rhasspy/rhasspy-speakers-cli-hermes/rhasspyspeakers_cli_hermes/__init__.py", line 81, in handle_play
rhasspy | wav_bytes = self.convert_to_wav(sound_bytes)
rhasspy | File "/usr/lib/rhasspy/rhasspy-speakers-cli-hermes/rhasspyspeakers_cli_hermes/__init__.py", line 265, in convert_to_wav
rhasspy | temp_file.name, backends=self.audioread_backends
rhasspy | File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/audioread/__init__.py", line 116, in audio_open
rhasspy | raise NoBackendError()
rhasspy | audioread.exceptions.NoBackendError
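One detail in the trace: the AudioPlayBytes payload is 11 bytes, which is exactly the length of “Hello World”. That fits the idea that the plain-text response body is being passed on to the speaker as if it were audio. This is my reading of the log, not something I have confirmed in the Rhasspy source. The first exception can be reproduced outside Rhasspy with a small sketch like this (riff_error is a made-up helper name):

```python
import io
import wave

# The text I asked the satellite to speak, encoded the way an HTTP body would be.
payload = "Hello World".encode("utf-8")


def riff_error(data):
    """Return the wave.Error message for data, or None if it parses as WAV."""
    try:
        with wave.open(io.BytesIO(data), "rb"):
            return None
    except wave.Error as err:
        return str(err)


print(len(payload), riff_error(payload))
```

The remaining two exceptions in the trace appear to be the fallback decoders (soundfile, then audioread) also failing on the same non-audio bytes.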
Satellite profile:
{
    "dialogue": {
        "system": "rhasspy"
    },
    "handle": {
        "system": "hass"
    },
    "home_assistant": {
        "access_token": "<redacted>",
        "url": "<redacted>"
    },
    "intent": {
        "remote": {
            "url": "http://10.0.1.3:12101/api/text-to-intent"
        },
        "system": "remote"
    },
    "microphone": {
        "pyaudio": {
            "device": "1",
            "siteId": "pi1"
        },
        "system": "pyaudio"
    },
    "mqtt": {
        "enabled": "",
        "host": "<redacted>",
        "password": "<redacted>",
        "site_id": "pi1",
        "username": "<redacted>"
    },
    "sounds": {
        "error": "${RHASSPY_PROFILE_DIR}/sounds/xp-critical-stop.wav",
        "recorded": "${RHASSPY_PROFILE_DIR}/sounds/xp-hw-remove.wav",
        "system": "aplay",
        "wake": "${RHASSPY_PROFILE_DIR}/sounds/xp-hw-insert.wav"
    },
    "speech_to_text": {
        "remote": {
            "url": "http://10.0.1.3:12101/api/speech-to-text"
        },
        "system": "remote"
    },
    "text_to_speech": {
        "larynx": {
            "default_voice": "northern_english_male"
        },
        "remote": {
            "url": "http://10.0.1.3:12101/api/text-to-speech"
        },
        "system": "remote"
    },
    "wake": {
        "porcupine": {
            "keyword_path": "computer_raspberry-pi.ppn",
            "sensitivity": "0.7"
        },
        "system": "porcupine"
    }
}
Server profile:
{
    "intent": {
        "satellite_site_ids": "pi1",
        "system": "fsticuffs"
    },
    "mqtt": {
        "enabled": "true",
        "host": "<redacted>",
        "password": "<redacted>",
        "site_id": "root",
        "username": "<redacted>"
    },
    "sounds": {
        "aplay": {
            "device": "front:CARD=PCH,DEV=0"
        }
    },
    "speech_to_text": {
        "satellite_site_ids": "pi1",
        "system": "kaldi"
    },
    "text_to_speech": {
        "larynx": {
            "default_voice": "northern_english_male",
            "vocoder": "vctk_medium"
        },
        "nanotts": {
            "language": "en-GB"
        },
        "satellite_site_ids": "pi1",
        "system": "larynx"
    }
}