Satellite does not give TTS response to requests. Wake sound and "Speak" button work

Wetzel402 · September 8, 2022, 9:00pm

Hello,

I currently have a server with a couple of test satellites set up. The server runs on my QNAP NAS with the following profile…

{
    "dialogue": {
        "satellite_site_ids": "rhasspy-dell-tablet,default-sat-test",
        "system": "rhasspy"
    },
    "handle": {
        "satellite_site_ids": "rhasspy-dell-tablet,default-sat-test",
        "system": "hass"
    },
    "home_assistant": {
        "access_token": "xxx",
        "url": "http://192.168.1.xxx:8123/"
    },
    "intent": {
        "satellite_site_ids": "rhasspy-dell-tablet,default-sat-test",
        "system": "fsticuffs"
    },
    "mqtt": {
        "site_id": "rhasspy-server"
    },
    "speech_to_text": {
        "satellite_site_ids": "rhasspy-dell-tablet,default-sat-test",
        "system": "pocketsphinx"
    },
    "text_to_speech": {
        "nanotts": {
            "language": "en-GB"
        },
        "satellite_site_ids": "rhasspy-dell-tablet,default-sat-test",
        "system": "larynx"
    },
    "wake": {
        "porcupine": {
            "keyword_path": "jarvis_raspberry-pi.ppn"
        }
    }
}

I am using Pocketsphinx for STT since Kaldi does not work with my hardware. I have tried both NanoTTS and Larynx for TTS.

My test satellites are running in a VM on Windows with the following profile…

{
    "dialogue": {
        "satellite_site_ids": "rhasspy-server",
        "system": "rhasspy"
    },
    "intent": {
        "remote": {
            "url": "http://192.168.1.xxx:12101/api/text-to-intent"
        },
        "satellite_site_ids": "rhasspy-server",
        "system": "remote"
    },
    "microphone": {
        "system": "pyaudio"
    },
    "mqtt": {
        "site_id": "default-sat-test"
    },
    "sounds": {
        "system": "aplay"
    },
    "speech_to_text": {
        "remote": {
            "url": "http://192.168.1.xxx:12101/api/speech-to-text"
        },
        "satellite_site_ids": "rhasspy-server",
        "system": "remote"
    },
    "text_to_speech": {
        "remote": {
            "url": "http://192.168.1.xxx:12101/api/text-to-speech"
        },
        "satellite_site_ids": "rhasspy-server",
        "system": "remote"
    },
    "wake": {
        "porcupine": {
            "keyword_path": "jarvis_linux.ppn"
        },
        "system": "porcupine"
    }
}
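As a quick sanity check on the two profiles, here is a minimal sketch that confirms the satellite's site ID is listed under every relevant service on the server. It assumes `satellite_site_ids` is a comma-separated string, as it appears in the profiles above. The helper name `missing_satellite_ids` and the trimmed copy of the server profile are only for illustration.

```python
import json

# Services on the server that carry a satellite_site_ids entry in the profile above.
SERVICES = ("dialogue", "handle", "intent", "speech_to_text", "text_to_speech")


def missing_satellite_ids(server_profile, satellite_site_id, services=SERVICES):
    """Return the server services whose satellite_site_ids omit the given site ID."""
    missing = []
    for name in services:
        raw = server_profile.get(name, {}).get("satellite_site_ids", "")
        # Assumption: the value is a comma-separated list of site IDs.
        listed = {part.strip() for part in raw.split(",") if part.strip()}
        if satellite_site_id not in listed:
            missing.append(name)
    return missing


# Trimmed copy of the server profile (only the keys the check needs).
server_profile = json.loads("""
{
    "dialogue": {"satellite_site_ids": "rhasspy-dell-tablet,default-sat-test"},
    "handle": {"satellite_site_ids": "rhasspy-dell-tablet,default-sat-test"},
    "intent": {"satellite_site_ids": "rhasspy-dell-tablet,default-sat-test"},
    "mqtt": {"site_id": "rhasspy-server"},
    "speech_to_text": {"satellite_site_ids": "rhasspy-dell-tablet,default-sat-test"},
    "text_to_speech": {"satellite_site_ids": "rhasspy-dell-tablet,default-sat-test"}
}
""")

# Site ID taken from the satellite profile's mqtt.site_id.
print(missing_satellite_ids(server_profile, "default-sat-test"))
```

An empty list means every checked service on the server lists the satellite. That rules out a site ID typo, but it does not by itself explain the missing speech.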

Interestingly, the satellites do speak text when I use the “Speak” button in the web GUI, and the wake sound plays when I say the wake word, but the satellite never responds with speech. Home Assistant intents do get handled; I just don’t get a spoken response back.
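To exercise the same TTS path without the web GUI, something like the following sketch could be used. It builds a POST to the `/api/text-to-speech` endpoint that the satellite profile already points at. The `siteId` query parameter is my assumption about how the HTTP API targets a specific site. The function names are hypothetical, and the base URL has to be filled in with the real server address.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_tts_request(base_url, text, site_id):
    """Build (but do not send) a POST to the text-to-speech endpoint."""
    # Assumption: the API accepts a siteId query parameter to choose where audio plays.
    query = urlencode({"siteId": site_id})
    url = f"{base_url.rstrip('/')}/api/text-to-speech?{query}"
    return Request(
        url,
        data=text.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "text/plain"},
    )


def speak(base_url, text, site_id, timeout=10):
    """Send the request and return the HTTP status code."""
    with urlopen(build_tts_request(base_url, text, site_id), timeout=timeout) as resp:
        return resp.status
```

Calling `speak(base_url, "testing", "default-sat-test")` with the server's address would show whether audio requested by the server reaches the satellite outside of a dialogue session.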

The following are logs from a request (newest entries first)…

The server:

[DEBUG:2022-09-08 15:25:59,743] rhasspyserver_hermes: Sent 377 char(s) to websocket
[DEBUG:2022-09-08 15:25:59,722] rhasspyserver_hermes: Handling NluIntent (topic=hermes/intent/GetTime, id=cbda10f3-608f-4eff-a175-c27667dc7aa8)
[DEBUG:2022-09-08 15:25:59,720] rhasspyserver_hermes: <- NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='rhasspy-server', id='7ee5bbd3-e8df-4dc7-82ac-9f3970637b85', slots=[], session_id='7ee5bbd3-e8df-4dc7-82ac-9f3970637b85', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id=None, lang=None)
[DEBUG:2022-09-08 15:25:59,623] rhasspyserver_hermes: Publishing 278 bytes(s) to hermes/nlu/query
[DEBUG:2022-09-08 15:25:59,622] rhasspyserver_hermes: -> NluQuery(input='what time is it', site_id='rhasspy-server', id='7ee5bbd3-e8df-4dc7-82ac-9f3970637b85', intent_filter=None, session_id='7ee5bbd3-e8df-4dc7-82ac-9f3970637b85', wakeword_id=None, lang=None, custom_data=None, asr_confidence=None, custom_entities=None)
[DEBUG:2022-09-08 15:25:59,620] rhasspyserver_hermes: Subscribed to hermes/error/nlu
[DEBUG:2022-09-08 15:25:54,534] rhasspyserver_hermes: Handling AsrTextCaptured (topic=hermes/asr/textCaptured, id=10a787dd-7bb4-45aa-a3dc-30bb861f3a9b)
[DEBUG:2022-09-08 15:25:47,404] rhasspyserver_hermes: Publishing 81 bytes(s) to hermes/asr/stopListening
[DEBUG:2022-09-08 15:25:47,404] rhasspyserver_hermes: -> AsrStopListening(site_id='rhasspy-server', session_id='f6f64d5f-476c-46c2-aee9-0703f1229722')
[DEBUG:2022-09-08 15:25:47,402] rhasspyserver_hermes: Sent 63712 byte(s) of WAV data
[DEBUG:2022-09-08 15:25:47,397] rhasspyserver_hermes: Publishing 188 bytes(s) to hermes/asr/startListening
[DEBUG:2022-09-08 15:25:47,396] rhasspyserver_hermes: -> AsrStartListening(site_id='rhasspy-server', session_id='f6f64d5f-476c-46c2-aee9-0703f1229722', lang=None, stop_on_silence=False, send_audio_captured=True, wakeword_id=None, intent_filter=None)
[DEBUG:2022-09-08 15:25:47,393] rhasspyserver_hermes: Subscribed to hermes/error/asr

And the satellite:

[DEBUG:2022-09-08 15:26:00,518] rhasspyserver_hermes: Sent 429 char(s) to websocket
[DEBUG:2022-09-08 15:26:00,496] rhasspyserver_hermes: <- NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default-sat-test', id=None, slots=[], session_id='default-sat-test-jarvis_linux-7c73b867-71f6-40a8-ae18-a6e527db2931', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id='jarvis_linux', lang=None)
[DEBUG:2022-09-08 15:25:40,948] rhasspyserver_hermes: <- HotwordDetected(model_id='/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/pvporcupine/resources/keyword_files/linux/jarvis_linux.ppn', model_version='', model_type='personal', current_sensitivity=0.5, site_id='default-sat-test', session_id=None, send_audio_captured=None, lang=None)

Any advice would be greatly appreciated.

Thanks!