Lost in dialogues: customData, intentFilter and dialogueManager/configure

Continuing the discussion from Add context setting to intents - easy building dialogues (PROBLEM and SOLUTION):

As the original thread seems to have a different focus, here are some questions about how to build up dialogues.

Background:
The goal is to build a generic solution (a kind of plugin) for FHEM (which itself is written in Perl). A German discussion on dialogues can be found starting here in the FHEM forum; for more details on my own environment, see below.
So far, the implementation provides for two categories of intents:

  • do something (or answer a question like temperature or on/off state of a device)
  • dialogue related intents like
    – “do it” (ConfirmAction)
    – “cancel” (don’t execute command or user is not willing to make the requested choice) (CancelAction)
    – “make a choice” (either room or device) (ChoiceRoom / ChoiceDevice)
    sentences.ini are (more or less) described here.

My idea was to use dialogueManager/configure to deactivate all of the dialogue-related intents, and then to switch them on at runtime when they are appropriate.
Unfortunately, this doesn’t seem to work as intended, as

  • settings via dialogueManager/configure do not seem to work, and
  • using additional data fields (hermes/dialogueManager/continueSession: customData, intentFilter and sendIntentNotRecognized) causes problems with the TTS system:
    If none of the above is used, nanoTTS will speak the text content, the choice may be accepted and the appropriate answer is given. So basically, the code on the FHEM side seems to work. But:
    using one of these may break the voice response and give a “no one managed”-timeout result without an option to make a choice.
    picoTTS seems to be more tolerant, but using intentFilter in combination with one of the others will also lead to no spoken response/choice option and a timeout.


So first things to check and ask for feedback are:

  • Is the general setup (organisation of intents) ok or are there suggestions for improvement?
  • Are the payloads to the respective topics ok?
    Imo, the latter would be the first thing to check, so here’s what the code produces as output (as seen with mosquitto_sub -h <host> -p <port> -v -t hermes/dialogueManager/# -t hermes/hotword/# -t hermes/intent/#) :
    Initialization of disabled intents (not working as expected):
hermes/dialogueManager/configure {"intents":[{"enable":"false","intentId":"de.fhem:ConfirmAction"},{"enable":"false","intentId":"de.fhem:CancelAction"},{"enable":"false","intentId":"de.fhem:ChoiceRoom"},{"enable":"false","intentId":"de.fhem:ChoiceDevice"}],"siteId":"motox"}
hermes/dialogueManager/configure {"intents":[{"enable":"false","intentId":"de.fhem:ConfirmAction"},{"enable":"false","intentId":"de.fhem:CancelAction"},{"enable":"false","intentId":"de.fhem:ChoiceRoom"},{"enable":"false","intentId":"de.fhem:ChoiceDevice"}],"siteId":"büro"}

Regarding the boolean values: they are packed in quotes due to the way Perl/fhem.pl encodes JSON. But as JensS (@ FHEM forum) also did some tests without quotes, I’m unsure whether investing in postprocessing code would help.
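
To make the postprocessing idea concrete, here is a minimal sketch in Python (not the Perl code used in FHEM) of how quoted booleans could be turned into real JSON booleans before publishing. The helper name normalize_booleans and the set BOOLEAN_KEYS are made up for illustration. The sketch assumes that only the keys enable and sendIntentNotRecognized carry boolean values in these payloads.

```python
import json

# Keys that carry booleans in the payloads shown in this thread.
# Assumption: no other field legitimately holds the text "true" or "false".
BOOLEAN_KEYS = {"enable", "sendIntentNotRecognized"}


def normalize_booleans(value):
    """Recursively replace "true"/"false" strings under BOOLEAN_KEYS with booleans."""
    if isinstance(value, dict):
        result = {}
        for key, item in value.items():
            is_quoted_bool = (
                key in BOOLEAN_KEYS
                and isinstance(item, str)
                and item.lower() in ("true", "false")
            )
            if is_quoted_bool:
                result[key] = item.lower() == "true"
            else:
                result[key] = normalize_booleans(item)
        return result
    if isinstance(value, list):
        return [normalize_booleans(item) for item in value]
    return value


# Shortened version of the configure payload from above.
raw = '{"intents":[{"enable":"false","intentId":"de.fhem:ConfirmAction"}],"siteId":"motox"}'
fixed = json.dumps(normalize_booleans(json.loads(raw)), separators=(",", ":"))
print(fixed)
```

The same approach should carry over to Perl by walking the decoded structure before encoding it again; which JSON module options are available there depends on the FHEM setup.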

And here’s one “dialogue” example without spoken output (picoTTS):

hermes/intent/de.fhem:GetNumeric {"input": "temperature ist es im wohnzimmer", "intent": {"intentName": "de.fhem:GetNumeric", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [{"entity": "Type", "value": {"kind": "Unknown", "value": "temperature"}, "slotName": "Type", "rawValue": "warm", "confidence": 1.0, "range": {"start": 0, "end": 11, "rawStart": 0, "rawEnd": 4}}, {"entity": "de.fhem.Room", "value": {"kind": "Unknown", "value": "wohnzimmer"}, "slotName": "Room", "rawValue": "wohnzimmer", "confidence": 1.0, "range": {"start": 22, "end": 32, "rawStart": 15, "rawEnd": 25}}], "sessionId": "4a052e48-ba67-cea3-a396-67e8faed35bd", "customData": null, "asrTokens": [[{"value": "temperature", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 11, "time": null}, {"value": "ist", "confidence": 1.0, "rangeStart": 12, "rangeEnd": 15, "time": null}, {"value": "es", "confidence": 1.0, "rangeStart": 16, "rangeEnd": 18, "time": null}, {"value": "im", "confidence": 1.0, "rangeStart": 19, "rangeEnd": 21, "time": null}, {"value": "wohnzimmer", "confidence": 1.0, "rangeStart": 22, "rangeEnd": 32, "time": null}]], "asrConfidence": null, "rawInput": "warm ist es im wohnzimmer", "wakewordId": null, "lang": null}
hermes/dialogueManager/continueSession {"intentFilter":["de.fhem:ChoiceDevice","de.fhem:CancelAction"],"sendIntentNotRecognized":"false","sessionId":"4a052e48-ba67-cea3-a396-67e8faed35bd","siteId":"motox","text":"Es kommen mehrere Geräte in Frage, bitte wähle zwischen raumfühler heizkörper südost oder heizkörper südwest"}
hermes/dialogueManager/endSession {"sessionId":"4a052e48-ba67-cea3-a396-67e8faed35bd","siteId":"motox","text":"Tut mir leid, da hat etwas zu lange gedauert"}
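
For comparison, this is a rough Python sketch of how a continueSession payload with the optional fields could be assembled so that sendIntentNotRecognized ends up as a real boolean. The function name continue_session_payload is hypothetical. The field names are the ones that appear in the payloads in this post, and treating customData as a plain string is an assumption.

```python
import json


def continue_session_payload(session_id, text, intent_filter=None,
                             custom_data=None, send_intent_not_recognized=False):
    """Build the JSON body for hermes/dialogueManager/continueSession."""
    payload = {
        "sessionId": session_id,
        "text": text,
        # bool() guarantees a JSON true/false instead of a quoted string.
        "sendIntentNotRecognized": bool(send_intent_not_recognized),
    }
    if intent_filter:
        payload["intentFilter"] = list(intent_filter)
    if custom_data is not None:
        # Assumption: customData is passed through as one string.
        payload["customData"] = str(custom_data)
    # ensure_ascii=False keeps umlauts readable in the published text.
    return json.dumps(payload, ensure_ascii=False)


body = continue_session_payload(
    "4a052e48-ba67-cea3-a396-67e8faed35bd",
    "Es kommen mehrere Geräte in Frage",
    intent_filter=["de.fhem:ChoiceDevice", "de.fhem:CancelAction"],
)
print(body)
```

Leaving out intentFilter and customData when they are not needed keeps the payload close to the variant that worked with nanoTTS, which may help to narrow down which field triggers the problem.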

And here is one “chaotic dialogue” with spoken response(s); it even seems that a second session is opened:


hermes/intent/de.fhem:GetNumeric {"input": "temperature ist es im wohnzimmer", "intent": {"intentName": "de.fhem:GetNumeric", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [{"entity": "Type", "value": {"kind": "Unknown", "value": "temperature"}, "slotName": "Type", "rawValue": "warm", "confidence": 1.0, "range": {"start": 0, "end": 11, "rawStart": 0, "rawEnd": 4}}, {"entity": "de.fhem.Room", "value": {"kind": "Unknown", "value": "wohnzimmer"}, "slotName": "Room", "rawValue": "wohnzimmer", "confidence": 1.0, "range": {"start": 22, "end": 32, "rawStart": 15, "rawEnd": 25}}], "sessionId": "d09c2612-aad1-f4ce-1942-7e6dcc27edb0", "customData": null, "asrTokens": [[{"value": "temperature", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 11, "time": null}, {"value": "ist", "confidence": 1.0, "rangeStart": 12, "rangeEnd": 15, "time": null}, {"value": "es", "confidence": 1.0, "rangeStart": 16, "rangeEnd": 18, "time": null}, {"value": "im", "confidence": 1.0, "rangeStart": 19, "rangeEnd": 21, "time": null}, {"value": "wohnzimmer", "confidence": 1.0, "rangeStart": 22, "rangeEnd": 32, "time": null}]], "asrConfidence": null, "rawInput": "warm ist es im wohnzimmer", "wakewordId": null, "lang": null}
hermes/dialogueManager/continueSession {"sessionId":"d09c2612-aad1-f4ce-1942-7e6dcc27edb0","siteId":"motox","text":"Es kommen mehrere Geräte in Frage, bitte wähle zwischen heizkörper südwest heizkörper südost oder raumfühler"}
hermes/intent/de.fhem:GetNumeric {"input": "temperature ist es im wohnzimmer", "intent": {"intentName": "de.fhem:GetNumeric", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [{"entity": "Type", "value": {"kind": "Unknown", "value": "temperature"}, "slotName": "Type", "rawValue": "warm", "confidence": 1.0, "range": {"start": 0, "end": 11, "rawStart": 0, "rawEnd": 4}}, {"entity": "de.fhem.Room", "value": {"kind": "Unknown", "value": "wohnzimmer"}, "slotName": "Room", "rawValue": "wohnzimmer", "confidence": 1.0, "range": {"start": 22, "end": 32, "rawStart": 15, "rawEnd": 25}}], "sessionId": "d09c2612-aad1-f4ce-1942-7e6dcc27edb0", "customData": null, "asrTokens": [[{"value": "temperature", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 11, "time": null}, {"value": "ist", "confidence": 1.0, "rangeStart": 12, "rangeEnd": 15, "time": null}, {"value": "es", "confidence": 1.0, "rangeStart": 16, "rangeEnd": 18, "time": null}, {"value": "im", "confidence": 1.0, "rangeStart": 19, "rangeEnd": 21, "time": null}, {"value": "wohnzimmer", "confidence": 1.0, "rangeStart": 22, "rangeEnd": 32, "time": null}]], "asrConfidence": null, "rawInput": "warm ist es im wohnzimmer", "wakewordId": null, "lang": null}
hermes/dialogueManager/continueSession {"sessionId":"d09c2612-aad1-f4ce-1942-7e6dcc27edb0","siteId":"motox","text":"Es kommen mehrere Geräte in Frage, bitte wähle zwischen heizkörper südwest heizkörper südost oder raumfühler"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/intent/de.fhem:SetTimer {"input": "auf", "intent": {"intentName": "de.fhem:SetTimer", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [], "sessionId": "017689b3-1e57-4814-9393-10130364410e", "customData": null, "asrTokens": [[{"value": "auf", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 3, "time": null}]], "asrConfidence": null, "rawInput": "auf", "wakewordId": null, "lang": null}
hermes/intent/de.fhem:SetTimer {"input": "auf", "intent": {"intentName": "de.fhem:SetTimer", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [], "sessionId": "017689b3-1e57-4814-9393-10130364410e", "customData": null, "asrTokens": [[{"value": "auf", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 3, "time": null}]], "asrConfidence": null, "rawInput": "auf", "wakewordId": null, "lang": null}
hermes/dialogueManager/endSession {"sessionId":"017689b3-1e57-4814-9393-10130364410e","siteId":"motox","text":"Tut mir leid, ich habe die Dauer nicht verstanden"}
hermes/dialogueManager/endSession {"sessionId":"017689b3-1e57-4814-9393-10130364410e","siteId":"motox","text":"Tut mir leid, ich habe die Dauer nicht verstanden"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/dialogueManager/endSession {"sessionId":"d09c2612-aad1-f4ce-1942-7e6dcc27edb0","siteId":"motox","text":"Tut mir leid, da hat etwas zu lange gedauert"}
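
To check whether a second session really shows up in such a capture, the mosquitto_sub -v output can be grouped by sessionId. The following Python sketch assumes one "topic payload" pair per line, as printed by the -v option. The function name group_by_session is made up, and the sample lines are shortened stand-ins for the output above.

```python
import json
from collections import defaultdict


def group_by_session(lines):
    """Map each sessionId to the list of topics seen for it, in order."""
    sessions = defaultdict(list)
    for line in lines:
        # mosquitto_sub -v prints the topic, a space, then the payload.
        topic, _, body = line.strip().partition(" ")
        if not body:
            continue
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # skip anything that is not JSON
        if isinstance(payload, dict) and payload.get("sessionId"):
            sessions[payload["sessionId"]].append(topic)
    return dict(sessions)


# Shortened sample lines modelled on the capture above.
sample = [
    'hermes/intent/de.fhem:GetNumeric {"siteId": "motox", "sessionId": "d09c2612-aad1-f4ce-1942-7e6dcc27edb0"}',
    'hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}',
    'hermes/intent/de.fhem:SetTimer {"siteId": "motox", "sessionId": "017689b3-1e57-4814-9393-10130364410e"}',
    'hermes/dialogueManager/endSession {"sessionId":"d09c2612-aad1-f4ce-1942-7e6dcc27edb0","siteId":"motox"}',
]
sessions = group_by_session(sample)
for session_id, topics in sessions.items():
    print(session_id, topics)
```

More than one key in the result for a single spoken dialogue would confirm that a second session was opened.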

As one can at least see from the chaotic one, intentFilter might be a good idea in combination with picoTTS, as this works much better than using the default nanoTTS.
So the first thing would be to fix the payload/intentFilter syntax, if there’s any problem hidden in it.



My personal system currently consists of Rhasspy 2.5.10 installed as a Debian package and an external MQTT server. All Rhasspy settings, including TTS, are set to the recommended values (in particular, TTS is set to nanoTTS). For audio input and output, the rhasspy-mobile-app is used. The system basically works as intended.


So any hints on how to get the job finally done will be appreciated!

By now, I’ve added some postprocessing code to get “proper” boolean values in the submitted JSON payload. This now seems to be accepted by Rhasspy and no longer confuses nanoTTS, but unfortunately it still doesn’t work as intended; see the console output at the end (shortened due to forum limits).

To some extent, one of the problems involved here might be that the dialogue was initiated by pressing the shortcut button in rhasspy-mobile-app, which seems to cause

[WARNING:2021-06-16 07:47:42,571] rhasspydialogue_hermes: Ignoring unknown session [
]

finally leading to the intent filtering still not working as intended. Or am I misinterpreting the identification of the disabled intent here:

[DEBUG:2021-06-16 07:47:52,936] rhasspynlu_hermes: → NluIntent(input=‘mach das radio on’, intent=Intent(intent_name=‘de.fhem:SetOnOff’, [
]

Will do some additional tests, but validation of the “DialogueConfigureIntent” output would be appreciated.
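
To find the session-related warnings in a long console log more quickly, a small filter like the following Python sketch may help. It assumes the log line layout shown in this post ("[LEVEL:timestamp] logger: message"). The names LOG_LINE, SESSION_ID and session_warnings are made up for illustration.

```python
import re

# Matches lines such as "[WARNING:2021-06-16 07:47:42,571] logger: message".
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]+):(?P<time>[^\]]+)\] (?P<logger>[^:]+): (?P<message>.*)$"
)
# Session ids in the log have the usual 8-4-4-4-12 hexadecimal layout.
SESSION_ID = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def session_warnings(lines):
    """Return (logger, session_id or None, message) for every WARNING line."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if not match or match["level"] != "WARNING":
            continue
        ids = SESSION_ID.findall(match["message"])
        found.append((match["logger"], ids[0] if ids else None, match["message"]))
    return found


log = [
    "[DEBUG:2021-06-16 07:47:40,507] rhasspyasr_kaldi_hermes: Enabled",
    "[WARNING:2021-06-16 07:47:42,571] rhasspydialogue_hermes: "
    "Ignoring unknown session e5e7accc-ac46-9dd1-772b-375d4d18eba4",
]
warnings_found = session_warnings(log)
for logger, session_id, message in warnings_found:
    print(logger, session_id, message)
```

Comparing the ids reported here with the sessionId values in the MQTT payloads shows which session the dialogue manager did not know about.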

sudo rhasspy -p de --user-profiles /opt/rhasspy/profiles
Starting up...
DEBUG:rhasspysupervisor:Namespace(debug=True, docker_compose='', local_mqtt_port=12183, mosquitto_path='mosquitto', profile='de', supervisord_conf='supervisord.conf', system_profiles=None, user_profiles=PosixPath('/opt/rhasspy/profiles'))
[...]
[DEBUG:2021-06-16 07:46:10,414] rhasspydialogue_hermes: Namespace(debug=True, group_separator=None, host='192.168.2.72', log_format='[%(levelname)s:%(asctime)s] %(name)s: %(message)s', min_asr_confidence=0.0, no_sound=['base'], password='password', port=1884, say_chars_per_second=33.0, session_timeout=30.0, site_id=['base', 'motox', 'buero', 'büro', 'Küche'], sound=[['wake', '/opt/rhasspy/profiles/de/sounds/start_of_input.wav'], ['recorded', '/opt/rhasspy/profiles/de/sounds/end_of_input.wav'], ['error', '/opt/rhasspy/profiles/de/sounds/error.wav']], tls=False, tls_ca_certs=None, tls_cert_reqs='CERT_REQUIRED', tls_certfile=None, tls_ciphers=None, tls_keyfile=None, tls_version=None, username='xyz', volume=1.0, wakeword_id=None)
[...]
[...]
[DEBUG:2021-06-16 07:46:17,061] rhasspyprofile.profile: Loading /opt/rhasspy/profiles/de/profile.json
[DEBUG:2021-06-16 07:46:17,062] rhasspyprofile.profile: Loading default profile settings from /usr/lib/rhasspy/rhasspy-profile/rhasspyprofile/profiles/defaults.json
[DEBUG:2021-06-16 07:46:17,066] rhasspyserver_hermes: Starting core
[...]
[DEBUG:2021-06-16 07:46:17,082] rhasspyserver_hermes: Starting web server at http://0.0.0.0:12101
Running on 0.0.0.0:12101 over http (CTRL + C to quit)
[DEBUG:2021-06-16 07:46:49,076] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ConfirmAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceRoom', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=False)], site_id='motox')
[DEBUG:2021-06-16 07:46:49,090] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ConfirmAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceRoom', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=False)], site_id='buero')
[DEBUG:2021-06-16 07:46:49,103] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ConfirmAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceRoom', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=False)], site_id='büro')
[DEBUG:2021-06-16 07:46:49,107] rhasspydialogue_hermes: Removed default intent filter
[DEBUG:2021-06-16 07:46:49,109] rhasspydialogue_hermes: Removed default intent filter
[DEBUG:2021-06-16 07:46:49,111] rhasspydialogue_hermes: Removed default intent filter
[DEBUG:2021-06-16 07:46:49,124] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ConfirmAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceRoom', enable=False), DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=False)], site_id='Küche')
[DEBUG:2021-06-16 07:46:49,127] rhasspydialogue_hermes: Removed default intent filter
[DEBUG:2021-06-16 07:47:40,319] rhasspyasr_kaldi_hermes: Receiving audio
[DEBUG:2021-06-16 07:47:40,322] rhasspywake_porcupine_hermes: Receiving audio
[DEBUG:2021-06-16 07:47:40,506] rhasspyasr_kaldi_hermes: <- AsrToggleOn(site_id='motox', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
[DEBUG:2021-06-16 07:47:40,507] rhasspyasr_kaldi_hermes: Enabled
[DEBUG:2021-06-16 07:47:40,515] rhasspyasr_kaldi_hermes: <- AsrStartListening(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id=None, intent_filter=None)
[DEBUG:2021-06-16 07:47:40,517] rhasspyasr_kaldi_hermes: Creating new transcriber session e5e7accc-ac46-9dd1-772b-375d4d18eba4
[DEBUG:2021-06-16 07:47:40,519] rhasspyasr_kaldi.transcribe: Using kaldi at /usr/lib/rhasspy/lib/kaldi
[DEBUG:2021-06-16 07:47:40,521] rhasspyasr_kaldi_hermes: Starting listening (session_id=e5e7accc-ac46-9dd1-772b-375d4d18eba4)
[DEBUG:2021-06-16 07:47:40,526] rhasspyasr_kaldi.transcribe: Creating FIFO at /tmp/tmp_wljguu9/chunks.fifo
[DEBUG:2021-06-16 07:47:40,528] rhasspyasr_kaldi.transcribe: ['/usr/lib/rhasspy/lib/kaldi/online2-cli-nnet3-decode-faster-confidence', '--config=/opt/rhasspy/profiles/de/kaldi/model/online/conf/online.conf', '--frame-subsampling-factor=3', '--max-active=7000', '--lattice-beam=8.0', '--acoustic-scale=1.0', '--beam=24.0', '/opt/rhasspy/profiles/de/kaldi/model/model/final.mdl', '/opt/rhasspy/profiles/de/kaldi/model/graph/HCLG.fst', '/opt/rhasspy/profiles/de/kaldi/model/graph/words.txt', '/tmp/tmp_wljguu9/chunks.fifo']
[DEBUG:2021-06-16 07:47:40,543] rhasspyasr_kaldi_hermes: Receiving audio
/usr/lib/rhasspy/lib/kaldi/online2-cli-nnet3-decode-faster-confidence --config=/opt/rhasspy/profiles/de/kaldi/model/online/conf/online.conf --frame-subsampling-factor=3 --max-active=7000 --lattice-beam=8.0 --acoustic-scale=1.0 --beam=24.0 /opt/rhasspy/profiles/de/kaldi/model/model/final.mdl /opt/rhasspy/profiles/de/kaldi/model/graph/HCLG.fst /opt/rhasspy/profiles/de/kaldi/model/graph/words.txt /tmp/tmp_wljguu9/chunks.fifo 
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:Collapse():nnet-utils.cc:1488) Added 1 components, removed 2
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.0358882 seconds in looped compilation.
[DEBUG:2021-06-16 07:47:41,304] rhasspyasr_kaldi.transcribe: ready
[DEBUG:2021-06-16 07:47:41,304] rhasspyasr_kaldi.transcribe: Decoder started
[DEBUG:2021-06-16 07:47:42,470] rhasspyasr_kaldi_hermes: -> AsrRecordingFinished(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:42,473] rhasspyasr_kaldi_hermes: Publishing 72 bytes(s) to rhasspy/asr/recordingFinished
[DEBUG:2021-06-16 07:47:42,475] rhasspyasr_kaldi.transcribe: Finished stream. Getting transcription.
[DEBUG:2021-06-16 07:47:42,527] rhasspyasr_kaldi.transcribe: 0.0707423 warm 0.972215 0 0.6 ist 1 0.601667 0.695001 es 0.992238 0.695039 0.695039 im 0.992589 0.695039 0.8178 wohnzimmer 1 0.819533 2.04
[DEBUG:2021-06-16 07:47:42,528] rhasspyasr_kaldi_hermes: Transcription result: Transcription(text='warm ist es im wohnzimmer', likelihood=0.9292577, transcribe_seconds=1.2230689683929086, wav_seconds=2.048, tokens=[TranscriptionToken(token='warm', start_time=0.0, end_time=0.6, likelihood=0.972215), TranscriptionToken(token='ist', start_time=0.601667, end_time=0.695001, likelihood=1.0), TranscriptionToken(token='es', start_time=0.695039, end_time=0.695039, likelihood=0.992238), TranscriptionToken(token='im', start_time=0.695039, end_time=0.8178, likelihood=0.992589), TranscriptionToken(token='wohnzimmer', start_time=0.819533, end_time=2.04, likelihood=1.0)])
[DEBUG:2021-06-16 07:47:42,546] rhasspyasr_kaldi_hermes: -> AsrTextCaptured(text='warm ist es im wohnzimmer', likelihood=0.9292577, seconds=1.2230689683929086, site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, asr_tokens=[[AsrToken(value='warm', confidence=0.972215, range_start=0, range_end=5, time=AsrTokenTime(start=0.0, end=0.6)), AsrToken(value='ist', confidence=1.0, range_start=5, range_end=9, time=AsrTokenTime(start=0.601667, end=0.695001)), AsrToken(value='es', confidence=0.992238, range_start=9, range_end=12, time=AsrTokenTime(start=0.695039, end=0.695039)), AsrToken(value='im', confidence=0.992589, range_start=12, range_end=15, time=AsrTokenTime(start=0.695039, end=0.8178)), AsrToken(value='wohnzimmer', confidence=1.0, range_start=15, range_end=26, time=AsrTokenTime(start=0.819533, end=2.04))]], lang=None)
[DEBUG:2021-06-16 07:47:42,548] rhasspyasr_kaldi_hermes: Publishing 801 bytes(s) to hermes/asr/textCaptured
[DEBUG:2021-06-16 07:47:42,549] rhasspyasr_kaldi_hermes: -> AsrAudioCaptured(56684 byte(s)) to rhasspy/asr/motox/motox/audioCaptured
[DEBUG:2021-06-16 07:47:42,570] rhasspydialogue_hermes: <- AsrTextCaptured(text='warm ist es im wohnzimmer', likelihood=0.9292577, seconds=1.2230689683929086, site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, asr_tokens=[[AsrToken(value='warm', confidence=0.972215, range_start=0, range_end=5, time=AsrTokenTime(start=0.0, end=0.6)), AsrToken(value='ist', confidence=1.0, range_start=5, range_end=9, time=AsrTokenTime(start=0.601667, end=0.695001)), AsrToken(value='es', confidence=0.992238, range_start=9, range_end=12, time=AsrTokenTime(start=0.695039, end=0.695039)), AsrToken(value='im', confidence=0.992589, range_start=12, range_end=15, time=AsrTokenTime(start=0.695039, end=0.8178)), AsrToken(value='wohnzimmer', confidence=1.0, range_start=15, range_end=26, time=AsrTokenTime(start=0.819533, end=2.04))]], lang=None)
[WARNING:2021-06-16 07:47:42,571] rhasspydialogue_hermes: Ignoring unknown session e5e7accc-ac46-9dd1-772b-375d4d18eba4
[DEBUG:2021-06-16 07:47:42,596] rhasspyasr_kaldi_hermes: <- AsrStopListening(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:42,597] rhasspyasr_kaldi_hermes: Stopping listening (session_id=e5e7accc-ac46-9dd1-772b-375d4d18eba4)
[DEBUG:2021-06-16 07:47:42,606] rhasspynlu_hermes: <- NluQuery(input='warm ist es im wohnzimmer', site_id='motox', id=None, intent_filter=None, session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, lang=None, custom_data=None, asr_confidence=None, custom_entities=None)
[DEBUG:2021-06-16 07:47:42,608] rhasspynlu_hermes: Loading /opt/rhasspy/profiles/de/intent_graph.pickle.gz
[DEBUG:2021-06-16 07:47:42,736] rhasspynlu_hermes: -> NluIntentParsed(input='temperature ist es im wohnzimmer', intent=Intent(intent_name='de.fhem:GetNumeric', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='Type', value={'kind': 'Unknown', 'value': 'temperature'}, slot_name='Type', raw_value='warm', confidence=1.0, range=SlotRange(start=0, end=11, raw_start=0, raw_end=4)), Slot(entity='de.fhem.Room', value={'kind': 'Unknown', 'value': 'wohnzimmer'}, slot_name='Room', raw_value='wohnzimmer', confidence=1.0, range=SlotRange(start=22, end=32, raw_start=15, raw_end=25))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:42,737] rhasspynlu_hermes: Publishing 618 bytes(s) to hermes/nlu/intentParsed
[DEBUG:2021-06-16 07:47:42,751] rhasspynlu_hermes: -> NluIntent(input='temperature ist es im wohnzimmer', intent=Intent(intent_name='de.fhem:GetNumeric', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='Type', value={'kind': 'Unknown', 'value': 'temperature'}, slot_name='Type', raw_value='warm', confidence=1.0, range=SlotRange(start=0, end=11, raw_start=0, raw_end=4)), Slot(entity='de.fhem.Room', value={'kind': 'Unknown', 'value': 'wohnzimmer'}, slot_name='Room', raw_value='wohnzimmer', confidence=1.0, range=SlotRange(start=22, end=32, raw_start=15, raw_end=25))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', custom_data=None, asr_tokens=[[AsrToken(value='temperature', confidence=1.0, range_start=0, range_end=11, time=None), AsrToken(value='ist', confidence=1.0, range_start=12, range_end=15, time=None), AsrToken(value='es', confidence=1.0, range_start=16, range_end=18, time=None), AsrToken(value='im', confidence=1.0, range_start=19, range_end=21, time=None), AsrToken(value='wohnzimmer', confidence=1.0, range_start=22, range_end=32, time=None)]], asr_confidence=None, raw_input='warm ist es im wohnzimmer', wakeword_id=None, lang=None)
[DEBUG:2021-06-16 07:47:42,751] rhasspynlu_hermes: Publishing 1190 bytes(s) to hermes/intent/de.fhem:GetNumeric
[DEBUG:2021-06-16 07:47:42,801] rhasspydialogue_hermes: <- NluIntent(input='temperature ist es im wohnzimmer', intent=Intent(intent_name='de.fhem:GetNumeric', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='Type', value={'kind': 'Unknown', 'value': 'temperature'}, slot_name='Type', raw_value='warm', confidence=1.0, range=SlotRange(start=0, end=11, raw_start=0, raw_end=4)), Slot(entity='de.fhem.Room', value={'kind': 'Unknown', 'value': 'wohnzimmer'}, slot_name='Room', raw_value='wohnzimmer', confidence=1.0, range=SlotRange(start=22, end=32, raw_start=15, raw_end=25))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', custom_data=None, asr_tokens=[[AsrToken(value='temperature', confidence=1.0, range_start=0, range_end=11, time=None), AsrToken(value='ist', confidence=1.0, range_start=12, range_end=15, time=None), AsrToken(value='es', confidence=1.0, range_start=16, range_end=18, time=None), AsrToken(value='im', confidence=1.0, range_start=19, range_end=21, time=None), AsrToken(value='wohnzimmer', confidence=1.0, range_start=22, range_end=32, time=None)]], asr_confidence=None, raw_input='warm ist es im wohnzimmer', wakeword_id=None, lang=None)
[WARNING:2021-06-16 07:47:42,803] rhasspydialogue_hermes: No session for id e5e7accc-ac46-9dd1-772b-375d4d18eba4. Dropping recognition.
[DEBUG:2021-06-16 07:47:42,830] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=True), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=True)], site_id='motox')
[DEBUG:2021-06-16 07:47:42,832] rhasspydialogue_hermes: Default intent filter set: ['de.fhem:ChoiceDevice', 'de.fhem:CancelAction']
[DEBUG:2021-06-16 07:47:42,838] rhasspydialogue_hermes: <- DialogueContinueSession(session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', custom_data='heizkörper südwest,heizkörper südost,raumfühler', text='Es kommen mehrere Geräte in Frage, bitte wähle zwischen heizkörper südwest heizkörper südost oder raumfühler', intent_filter=['de.fhem:ChoiceDevice', 'de.fhem:CancelAction'], send_intent_not_recognized=False, slot=None, lang=None)
[WARNING:2021-06-16 07:47:42,840] rhasspydialogue_hermes: No session for id e5e7accc-ac46-9dd1-772b-375d4d18eba4. Cannot continue.
[DEBUG:2021-06-16 07:47:42,902] rhasspytts_cli_hermes: <- TtsSay(text='Es kommen mehrere Geräte in Frage, bitte wähle zwischen heizkörper südwest heizkörper südost oder raumfühler', site_id='motox', lang=None, id='40798866-34ad-6f37-c4c2-c60eff70ca19', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', volume=None)
[DEBUG:2021-06-16 07:47:42,905] rhasspytts_cli_hermes: ['nanotts', '-v', 'de-DE', '-o', '/tmp/tmpc9jnbq44.wav']
Using Lingware directory: /usr/lib/rhasspy/lib/nanotts/pico/lang
read: 115 bytes from stdin
using lang: de-DE
[DEBUG:2021-06-16 07:47:42,938] rhasspyasr_kaldi_hermes: <- AsrStopListening(site_id='motox', session_id='null')
[DEBUG:2021-06-16 07:47:42,939] rhasspyasr_kaldi_hermes: Stopping listening (session_id=null)
[DEBUG:2021-06-16 07:47:42,941] rhasspyasr_kaldi_hermes: <- AsrToggleOff(site_id='motox', reason=<AsrToggleReason.TTS_SAY: 'ttsSay'>)
[DEBUG:2021-06-16 07:47:42,942] rhasspyasr_kaldi_hermes: Disabled (AsrToggleReason.TTS_SAY)
wrote "/tmp/tmpc9jnbq44.wav" (230572 bytes)
[DEBUG:2021-06-16 07:47:43,083] rhasspytts_cli_hermes: Got 230572 byte(s) of WAV data
[DEBUG:2021-06-16 07:47:43,090] rhasspytts_cli_hermes: -> AudioPlayBytes(230572 byte(s)) to hermes/audioServer/motox/playBytes/40798866-34ad-6f37-c4c2-c60eff70ca19
[DEBUG:2021-06-16 07:47:43,092] rhasspytts_cli_hermes: Waiting for play finished (timeout=7.454)
[WARNING:2021-06-16 07:47:50,554] rhasspytts_cli_hermes: Did not receive playFinished before timeout
[DEBUG:2021-06-16 07:47:50,557] rhasspytts_cli_hermes: -> TtsSayFinished(site_id='motox', id='40798866-34ad-6f37-c4c2-c60eff70ca19', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:50,558] rhasspytts_cli_hermes: Publishing 118 bytes(s) to hermes/tts/sayFinished
[DEBUG:2021-06-16 07:47:50,562] rhasspydialogue_hermes: <- TtsSayFinished(site_id='motox', id='40798866-34ad-6f37-c4c2-c60eff70ca19', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:51,021] rhasspytts_cli_hermes: <- AudioPlayFinished(id='40798866-34ad-6f37-c4c2-c60eff70ca19', session_id='')
[DEBUG:2021-06-16 07:47:51,021] rhasspydialogue_hermes: <- AudioPlayFinished(id='40798866-34ad-6f37-c4c2-c60eff70ca19', session_id='')
[DEBUG:2021-06-16 07:47:51,028] rhasspyasr_kaldi_hermes: <- AsrToggleOn(site_id='motox', reason=<AsrToggleReason.TTS_SAY: 'ttsSay'>)
[DEBUG:2021-06-16 07:47:51,032] rhasspyasr_kaldi_hermes: Enabled
[DEBUG:2021-06-16 07:47:51,035] rhasspyasr_kaldi_hermes: <- AsrStartListening(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id=None, intent_filter=None)
[DEBUG:2021-06-16 07:47:51,036] rhasspyasr_kaldi_hermes: Creating new transcriber session e5e7accc-ac46-9dd1-772b-375d4d18eba4
[DEBUG:2021-06-16 07:47:51,037] rhasspyasr_kaldi.transcribe: Using kaldi at /usr/lib/rhasspy/lib/kaldi
[DEBUG:2021-06-16 07:47:51,037] rhasspyasr_kaldi_hermes: Starting listening (session_id=e5e7accc-ac46-9dd1-772b-375d4d18eba4)
[DEBUG:2021-06-16 07:47:51,039] rhasspyasr_kaldi_hermes: Receiving audio
[DEBUG:2021-06-16 07:47:51,039] rhasspyasr_kaldi.transcribe: Creating FIFO at /tmp/tmp3kjanxbu/chunks.fifo
[DEBUG:2021-06-16 07:47:51,041] rhasspyasr_kaldi.transcribe: ['/usr/lib/rhasspy/lib/kaldi/online2-cli-nnet3-decode-faster-confidence', '--config=/opt/rhasspy/profiles/de/kaldi/model/online/conf/online.conf', '--frame-subsampling-factor=3', '--max-active=7000', '--lattice-beam=8.0', '--acoustic-scale=1.0', '--beam=24.0', '/opt/rhasspy/profiles/de/kaldi/model/model/final.mdl', '/opt/rhasspy/profiles/de/kaldi/model/graph/HCLG.fst', '/opt/rhasspy/profiles/de/kaldi/model/graph/words.txt', '/tmp/tmp3kjanxbu/chunks.fifo']
/usr/lib/rhasspy/lib/kaldi/online2-cli-nnet3-decode-faster-confidence --config=/opt/rhasspy/profiles/de/kaldi/model/online/conf/online.conf --frame-subsampling-factor=3 --max-active=7000 --lattice-beam=8.0 --acoustic-scale=1.0 --beam=24.0 /opt/rhasspy/profiles/de/kaldi/model/model/final.mdl /opt/rhasspy/profiles/de/kaldi/model/graph/HCLG.fst /opt/rhasspy/profiles/de/kaldi/model/graph/words.txt /tmp/tmp3kjanxbu/chunks.fifo 
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:ComputeDerivedVars():ivector-extractor.cc:183) Computing derived variables for iVector extractor
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:ComputeDerivedVars():ivector-extractor.cc:204) Done.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 1 orphan nodes.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 2 orphan components.
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:Collapse():nnet-utils.cc:1488) Added 1 components, removed 2
LOG (online2-cli-nnet3-decode-faster-confidence[5.5]:CompileLooped():nnet-compile-looped.cc:345) Spent 0.039757 seconds in looped compilation.
[DEBUG:2021-06-16 07:47:51,732] rhasspyasr_kaldi.transcribe: ready
[DEBUG:2021-06-16 07:47:51,733] rhasspyasr_kaldi.transcribe: Decoder started
[DEBUG:2021-06-16 07:47:52,714] rhasspyasr_kaldi_hermes: -> AsrRecordingFinished(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:52,717] rhasspyasr_kaldi_hermes: Publishing 72 bytes(s) to rhasspy/asr/recordingFinished
[DEBUG:2021-06-16 07:47:52,718] rhasspyasr_kaldi.transcribe: Finished stream. Getting transcription.
[DEBUG:2021-06-16 07:47:52,832] rhasspyasr_kaldi.transcribe: 0.654803 mach 0.672026 0 0.479009 das 0.992737 0.479009 0.690026 radio 1 0.690186 0.990489 on 0.680434 0.990489 1.77
[DEBUG:2021-06-16 07:47:52,832] rhasspyasr_kaldi_hermes: Transcription result: Transcription(text='mach das radio on', likelihood=0.345197, transcribe_seconds=1.0986130237579346, wav_seconds=1.792, tokens=[TranscriptionToken(token='mach', start_time=0.0, end_time=0.479009, likelihood=0.672026), TranscriptionToken(token='das', start_time=0.479009, end_time=0.690026, likelihood=0.992737), TranscriptionToken(token='radio', start_time=0.690186, end_time=0.990489, likelihood=1.0), TranscriptionToken(token='on', start_time=0.990489, end_time=1.77, likelihood=0.680434)])
[DEBUG:2021-06-16 07:47:52,843] rhasspyasr_kaldi_hermes: -> AsrTextCaptured(text='mach das radio on', likelihood=0.345197, seconds=1.0986130237579346, site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, asr_tokens=[[AsrToken(value='mach', confidence=0.672026, range_start=0, range_end=5, time=AsrTokenTime(start=0.0, end=0.479009)), AsrToken(value='das', confidence=0.992737, range_start=5, range_end=9, time=AsrTokenTime(start=0.479009, end=0.690026)), AsrToken(value='radio', confidence=1.0, range_start=9, range_end=15, time=AsrTokenTime(start=0.690186, end=0.990489)), AsrToken(value='on', confidence=0.680434, range_start=15, range_end=18, time=AsrTokenTime(start=0.990489, end=1.77))]], lang=None)
[DEBUG:2021-06-16 07:47:52,845] rhasspyasr_kaldi_hermes: Publishing 678 bytes(s) to hermes/asr/textCaptured
[DEBUG:2021-06-16 07:47:52,860] rhasspyasr_kaldi_hermes: -> AsrAudioCaptured(46124 byte(s)) to rhasspy/asr/motox/motox/audioCaptured
[DEBUG:2021-06-16 07:47:52,882] rhasspyasr_kaldi_hermes: <- AsrStopListening(site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:52,884] rhasspyasr_kaldi_hermes: Stopping listening (session_id=e5e7accc-ac46-9dd1-772b-375d4d18eba4)
[DEBUG:2021-06-16 07:47:52,884] rhasspydialogue_hermes: <- AsrTextCaptured(text='mach das radio on', likelihood=0.345197, seconds=1.0986130237579346, site_id='motox', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, asr_tokens=[[AsrToken(value='mach', confidence=0.672026, range_start=0, range_end=5, time=AsrTokenTime(start=0.0, end=0.479009)), AsrToken(value='das', confidence=0.992737, range_start=5, range_end=9, time=AsrTokenTime(start=0.479009, end=0.690026)), AsrToken(value='radio', confidence=1.0, range_start=9, range_end=15, time=AsrTokenTime(start=0.690186, end=0.990489)), AsrToken(value='on', confidence=0.680434, range_start=15, range_end=18, time=AsrTokenTime(start=0.990489, end=1.77))]], lang=None)
[WARNING:2021-06-16 07:47:52,885] rhasspydialogue_hermes: Ignoring unknown session e5e7accc-ac46-9dd1-772b-375d4d18eba4
[DEBUG:2021-06-16 07:47:52,883] rhasspynlu_hermes: <- NluQuery(input='mach das radio on', site_id='motox', id=None, intent_filter=None, session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', wakeword_id=None, lang=None, custom_data=None, asr_confidence=None, custom_entities=None)
[DEBUG:2021-06-16 07:47:52,925] rhasspynlu_hermes: -> NluIntentParsed(input='mach das radio on', intent=Intent(intent_name='de.fhem:SetOnOff', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='de.fhem.Device', value={'kind': 'Unknown', 'value': 'radio'}, slot_name='Device', raw_value='radio', confidence=1.0, range=SlotRange(start=9, end=14, raw_start=9, raw_end=14)), Slot(entity='OnOffValue', value={'kind': 'Unknown', 'value': 'on'}, slot_name='Value', raw_value='on', confidence=1.0, range=SlotRange(start=15, end=17, raw_start=15, raw_end=17))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:52,926] rhasspynlu_hermes: Publishing 592 bytes(s) to hermes/nlu/intentParsed
[DEBUG:2021-06-16 07:47:52,936] rhasspynlu_hermes: -> NluIntent(input='mach das radio on', intent=Intent(intent_name='de.fhem:SetOnOff', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='de.fhem.Device', value={'kind': 'Unknown', 'value': 'radio'}, slot_name='Device', raw_value='radio', confidence=1.0, range=SlotRange(start=9, end=14, raw_start=9, raw_end=14)), Slot(entity='OnOffValue', value={'kind': 'Unknown', 'value': 'on'}, slot_name='Value', raw_value='on', confidence=1.0, range=SlotRange(start=15, end=17, raw_start=15, raw_end=17))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', custom_data=None, asr_tokens=[[AsrToken(value='mach', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='das', confidence=1.0, range_start=5, range_end=8, time=None), AsrToken(value='radio', confidence=1.0, range_start=9, range_end=14, time=None), AsrToken(value='on', confidence=1.0, range_start=15, range_end=17, time=None)]], asr_confidence=None, raw_input='mach das radio on', wakeword_id=None, lang=None)
[DEBUG:2021-06-16 07:47:52,937] rhasspynlu_hermes: Publishing 1056 bytes(s) to hermes/intent/de.fhem:SetOnOff
[DEBUG:2021-06-16 07:47:52,977] rhasspydialogue_hermes: <- NluIntent(input='mach das radio on', intent=Intent(intent_name='de.fhem:SetOnOff', confidence_score=1.0), site_id='motox', id=None, slots=[Slot(entity='de.fhem.Device', value={'kind': 'Unknown', 'value': 'radio'}, slot_name='Device', raw_value='radio', confidence=1.0, range=SlotRange(start=9, end=14, raw_start=9, raw_end=14)), Slot(entity='OnOffValue', value={'kind': 'Unknown', 'value': 'on'}, slot_name='Value', raw_value='on', confidence=1.0, range=SlotRange(start=15, end=17, raw_start=15, raw_end=17))], session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', custom_data=None, asr_tokens=[[AsrToken(value='mach', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='das', confidence=1.0, range_start=5, range_end=8, time=None), AsrToken(value='radio', confidence=1.0, range_start=9, range_end=14, time=None), AsrToken(value='on', confidence=1.0, range_start=15, range_end=17, time=None)]], asr_confidence=None, raw_input='mach das radio on', wakeword_id=None, lang=None)
[WARNING:2021-06-16 07:47:52,979] rhasspydialogue_hermes: No session for id e5e7accc-ac46-9dd1-772b-375d4d18eba4. Dropping recognition.
[DEBUG:2021-06-16 07:47:52,982] rhasspydialogue_hermes: <- DialogueEndSession(session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', text='Gerne!', custom_data=None)
[WARNING:2021-06-16 07:47:52,983] rhasspydialogue_hermes: No session for id e5e7accc-ac46-9dd1-772b-375d4d18eba4. Cannot end.
[DEBUG:2021-06-16 07:47:53,021] rhasspytts_cli_hermes: <- TtsSay(text='Gerne!', site_id='motox', lang=None, id='6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', volume=None)
[DEBUG:2021-06-16 07:47:53,024] rhasspytts_cli_hermes: ['nanotts', '-v', 'de-DE', '-o', '/tmp/tmpc_uvf9n7.wav']
Using Lingware directory: /usr/lib/rhasspy/lib/nanotts/pico/lang
read: 6 bytes from stdin
using lang: de-DE
[DEBUG:2021-06-16 07:47:53,062] rhasspyasr_kaldi_hermes: <- AsrStopListening(site_id='motox', session_id='null')
[DEBUG:2021-06-16 07:47:53,062] rhasspyasr_kaldi_hermes: Stopping listening (session_id=null)
wrote "/tmp/tmpc_uvf9n7.wav" (29868 bytes)
[DEBUG:2021-06-16 07:47:53,069] rhasspytts_cli_hermes: Got 29868 byte(s) of WAV data
[DEBUG:2021-06-16 07:47:53,071] rhasspytts_cli_hermes: -> AudioPlayBytes(29868 byte(s)) to hermes/audioServer/motox/playBytes/6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6
[DEBUG:2021-06-16 07:47:53,073] rhasspytts_cli_hermes: Waiting for play finished (timeout=1.182)
[WARNING:2021-06-16 07:47:54,258] rhasspytts_cli_hermes: Did not receive playFinished before timeout
[DEBUG:2021-06-16 07:47:54,262] rhasspytts_cli_hermes: -> TtsSayFinished(site_id='motox', id='6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:54,262] rhasspytts_cli_hermes: Publishing 118 bytes(s) to hermes/tts/sayFinished
[DEBUG:2021-06-16 07:47:54,269] rhasspydialogue_hermes: <- TtsSayFinished(site_id='motox', id='6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6', session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4')
[DEBUG:2021-06-16 07:47:54,315] rhasspytts_cli_hermes: <- AudioPlayFinished(id='6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6', session_id='')
[DEBUG:2021-06-16 07:47:54,315] rhasspydialogue_hermes: <- AudioPlayFinished(id='6f5d80dc-8a4e-a24d-23b8-9ce318e8cba6', session_id='')
[DEBUG:2021-06-16 07:48:02,026] rhasspydialogue_hermes: <- DialogueEndSession(session_id='e5e7accc-ac46-9dd1-772b-375d4d18eba4', text='Tut mir leid, da hat etwas zu lange gedauert', custom_data=None)
[DEBUG:2021-06-16 07:48:02,032] rhasspydialogue_hermes: <- DialogueConfigure(intents=[DialogueConfigureIntent(intent_id='de.fhem:ChoiceDevice', enable=False), DialogueConfigureIntent(intent_id='de.fhem:CancelAction', enable=False)], site_id='motox')
[WARNING:2021-06-16 07:48:02,033] rhasspydialogue_hermes: No session for id e5e7accc-ac46-9dd1-772b-375d4d18eba4. Cannot end.
[DEBUG:2021-06-16 07:48:02,035] rhasspydialogue_hermes: Removed default intent filter

The more I test, the more questions arise


Atm I’m fiddling around, now also using the wakeword functionality. To my surprise, an intentFilter is not limited to the current session, but seems to affect all further intent recognition attempts - or better said:
The intent is recognized, but an “intentNotRecognized” message is sent out and the session is finished. The intentFilter is only reset when Rhasspy is restarted, and the filter doesn’t apply if the session is started not by wakeword but by pressing the shortcut button.

So here’s some MQTT traffic: first comes a “normal wakeword initiated dialogue”. There’s no audio output, but the recognized content is shown in the app.

hermes/hotword/bumblebee_linux/detected {"modelId": "/usr/lib/rhasspy/usr/local/lib/python3.7/site-packages/pvporcupine/resources/keyword_files/linux/bumblebee_linux.ppn", "modelVersion": "", "modelType": "personal", "currentSensitivity": 0.5, "siteId": "motox", "sessionId": null, "sendAudioCaptured": null, "lang": null, "customEntities": null}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/dialogueManager/sessionStarted {"sessionId": "motox-bumblebee_linux-0af4de6c-d6fc-4414-b937-08ca4a80bd7c", "siteId": "motox", "customData": "bumblebee_linux", "lang": null}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "dialogueSession"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "dialogueSession"}
hermes/hotword/toggleOff {"siteId": "motox", "reason": "playAudio"}
hermes/hotword/toggleOn {"siteId": "motox", "reason": "playAudio"}
hermes/dialogueManager/sessionEnded {"termination": {"reason": "intentNotRecognized"}, "sessionId": "motox-bumblebee_linux-0af4de6c-d6fc-4414-b937-08ca4a80bd7c", "siteId": "motox", "customData": "bumblebee_linux"}
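
For anyone handling this on the application side: a minimal sketch of reading the termination reason out of such a sessionEnded payload. The function name is made up for illustration; only the payload fields are taken from the traffic above.

```python
import json


def termination_reason(payload: str) -> str:
    """Return the termination reason of a sessionEnded payload ('' if absent)."""
    message = json.loads(payload)
    # "termination" may be missing or null, so fall back to an empty dict.
    return (message.get("termination") or {}).get("reason", "")


ended = (
    '{"termination": {"reason": "intentNotRecognized"}, '
    '"sessionId": "motox-bumblebee_linux-0af4de6c-d6fc-4414-b937-08ca4a80bd7c", '
    '"siteId": "motox", "customData": "bumblebee_linux"}'
)
print(termination_reason(ended))
```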

Submitting the recognized and shown text by use of the “send” button in the app later results in:

hermes/hotword/toggleOn {"siteId": "motox", "reason": "dialogueSession"}
hermes/intent/de.fhem:GetTime {"input": "wie spÀt ist es", "intent": {"intentName": "de.fhem:GetTime", "confidenceScore": 1.0}, "siteId": "motox", "id": null, "slots": [], "sessionId": "motox-bumblebee_linux-0af4de6c-d6fc-4414-b937-08ca4a80bd7c", "customData": null, "asrTokens": [[{"value": "wie", "confidence": 1.0, "rangeStart": 0, "rangeEnd": 3, "time": null}, {"value": "spÀt", "confidence": 1.0, "rangeStart": 4, "rangeEnd": 8, "time": null}, {"value": "ist", "confidence": 1.0, "rangeStart": 9, "rangeEnd": 12, "time": null}, {"value": "es", "confidence": 1.0, "rangeStart": 13, "rangeEnd": 15, "time": null}]], "asrConfidence": null, "rawInput": "wie spÀt ist es", "wakewordId": null, "lang": null}
hermes/dialogueManager/endSession {"sessionId": "motox-bumblebee_linux-0af4de6c-d6fc-4414-b937-08ca4a80bd7c","siteId": "motox","text": "Es ist 19 Uhr 43"}

Imo: Strange. Seems I’m missing something essential

Could someone please explain how to correctly use intentFilter (and/or dialogueManager/configure)?

Glad to report some progress on my project. It’s still not finished, but esp. “Start Conversation with TTS and start listening” helped me to publish messages that are accepted by Rhasspy and to get a better understanding of how intentFilter seems to work:

Additionally - contrary to what I expected from reading the docs - an intentFilter stays active even after the session has ended. To work around this, setting it to “null” at session end seems to be necessary.

Wrt. hermes/dialogueManager/configure (#dialogue-manager), my summary up to now is:

  • settings done there will not survive restarts
  • there seems to be some interaction with intentFilter (which I do not completely understand yet)
  • using this option may not really disable the recognition of disabled intents either, but only lead to “not recognized” results?

So atm. resetting configure to some meaningful defaults looks like a good idea.
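
A minimal sketch of what such a reset could look like, building the JSON payload for hermes/dialogueManager/configure. The helper name and the split into "enable" and "disable" lists are my own; the field names (siteId, intents, intentId, enable) follow the DialogueConfigure message visible in the log above. Publishing is left to whatever MQTT client is in use.

```python
import json

CONFIGURE_TOPIC = "hermes/dialogueManager/configure"


def build_configure(site_id, enable=(), disable=()):
    """Build a JSON payload for hermes/dialogueManager/configure."""
    intents = [{"intentId": name, "enable": True} for name in enable]
    intents += [{"intentId": name, "enable": False} for name in disable]
    return json.dumps({"siteId": site_id, "intents": intents})


# Defaults: everyday intents on, dialogue-only intents off.
payload = build_configure(
    "motox",
    enable=["de.fhem:SetOnOff", "de.fhem:GetTime"],
    disable=["de.fhem:ChoiceDevice", "de.fhem:CancelAction"],
)
print(CONFIGURE_TOPIC, payload)
```

Since the settings do not survive restarts, the same payload would have to be sent again whenever Rhasspy comes back up.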

For further progress, I’ll first have to do some code review on the Perl side.

Some clarification on the findings so far would be highly appreciated :slightly_smiling_face:.

I am currently not deep enough into the details to give feedback, but I wanted to compliment you on your persistence with this and on sharing it with the community :slight_smile:

Thanks a lot for your feedback!

Unfortunately, I’m not familiar with Python. So finding the right spots in the Rhasspy code and getting a proper idea of how it really works is more or less impossible for me (and most likely out of my league).

Wrt. the understanding of intentFilter and configure, I now also found (might be of interest for anyone finding these posts in the future):

Atm., I’m quite optimistic that my recent code changes will get the thing fully functional, but imo there’s “too much” MQTT traffic generated that would not be necessary if

  • intentFilter were reset after the session is closed;
  • intentFilter did not just lead to “not recognized” results, but to the intent being fully ignored (as configure should (and might? I had to do some yet untested rework to get messages sent out in the right order));
  • configure survived restarts (or a message were sent out when Rhasspy restarts - or is there even one? - so that configure can be reset from the external application (FHEM) side)

Hi @rejoe2, thanks for all of your work! I’ve been on vacation, so I haven’t had a chance to reply until today.

I’d like to work with you to get the dialogue system in better shape :slight_smile:

As I’m sure you know, there are two intent filters: one from the configure message, and another from startSession and continueSession messages. The session filters should only last for one turn of a session, while the configure will always be active until it’s disabled.
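
To make the distinction concrete, here is a toy model of the two filters as described above. It is only a sketch of the described semantics, not Rhasspy's actual code: the class and method names are made up, and both the "positive list" behaviour of configure and the precedence of the session filter over the global one are assumptions based on what was observed in this thread.

```python
class IntentFilters:
    """Toy model: a persistent global filter plus a one-turn session filter."""

    def __init__(self):
        self.default_filter = set()  # from configure; empty means "no filter"
        self.session_filter = None   # from startSession / continueSession

    def configure(self, intent, enable):
        """Global setting: stays active until changed again."""
        if enable:
            self.default_filter.add(intent)
        else:
            self.default_filter.discard(intent)

    def set_session_filter(self, intents):
        """Session setting: meant to apply to the next turn only."""
        self.session_filter = set(intents) if intents else None

    def filter_for_next_turn(self):
        """Return the allowed intents for one turn, or None for 'no restriction'."""
        if self.session_filter is not None:
            active = self.session_filter
            self.session_filter = None  # consumed after a single turn
            return active
        return set(self.default_filter) or None


filters = IntentFilters()
filters.configure("de.fhem:SetOnOff", True)
filters.set_session_filter(["de.fhem:ConfirmAction"])
print(filters.filter_for_next_turn())  # session filter, used once
print(filters.filter_for_next_turn())  # back to the global filter
```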

Rather than have the configure intent filter survive restarts, I think it would be better to add some message for when Rhasspy starts up (or a “ping” message you can use to poll). What do you think?

Thanks for picking this up. Indeed, it was helpful to have some time to learn some things on my own and to search the forum here to get a deeper insight into how Rhasspy handles some things internally :wink:.
Having a “rhasspy started” message would be fine, jens-schiffke already provided some code here (I didn’t test it nor am I familiar with Python coding).

At this point in time, everything seems to work as intended, as I finally sorted out how configure and intentFilter work on the Rhasspy side and when to send out which kind of message. Hope you don’t mind me addressing some points that could be improved imo - but please keep in mind, I’m not an IT professional but more kind of an advanced hobbyist:

  1. configure (or the global intent filter) only works as a positive list that has to be activated first. So atm. FHEM first collects all available intents and activates everything but the dialogue-specific intents. With just that single application doing so, that’s ok, but what will happen if there’s a second client doing the same? Imo, a global intent filter activating all intents should be in effect as long as no specific filter is set. Then each application could just deactivate what’s “conditional stuff”.
    Not sure if there are more options to get similar effects, e.g. allowing applications to query “false” filters set by (other) applications, but imo administration of the true/false filter list should not have to be done by each application.
  2. Both mechanisms seem to be consolidated in the end into a single filter effective for the respective session. This leads to “not recognized” results, regardless of whether it’s a global or a session-specific filter. Imo, at least the global filter (esp. if set for all clients using siteId “null”) should really act as “nothing happens” in the background.
    I’m quite aware this might need some really deep rework in the framework, but it would allow a more “natural speaking” experience for the end user. Atm., the sentences I built for the “dialogue intents” include some artificial keywords to allow a kind of pre-sorting.

Please let me know, if more info (esp. MQTT traffic, sentences.ini) is needed, these words might be a little abstract?

Besides these findings, there are also some slight differences in the way a session is handled when using the app:

  • with hotword-initiated sessions, intent filtering is working as expected (resulting in “no one
” afai remember)
  • using the microphone “button” or the shortcut, no filter seems to be applied. As FHEM will give some appropriate feedback, this is not a real problem. Nevertheless this gives some kind of “uneven” impression.

Don’t know if this is related to the session initialization (might be improved on the app side by simulation of a hotword detection?) or due to other restrictions, just wanted to mention it.

Thank you in advance for taking care of our questions / requests.

Sorry if I keep it short but my English is extremely bad and so G
g.e has to help me translate.
The aim of the dialogue process is a question and answer game.
The configure variant is a common means of achieving this goal. I would prefer if we could do without configure.
The following idea and corresponding suggestions:

  1. Rhasspy also learns the content of slots that do not have to be assigned an intent.
    $ConfirmCancel
    (yes) {Value: on}
    (no) {Value: off}

    $Locations
    Citys = (London | Paris | Berlin)
    <Citys>{City}

  2. An intent with content is created in order to have a beginning for the dialogue
    [Trip]
    Please book a trip

  3. An empty intent [Dialogue] is created

The dialog runs as follows:
Human: “Computer, please book a trip”
Rhasspy: “Where do you want to start?”
Human: “In Berlin”
Rhasspy: “Start is Berlin - is that right?”
Human: “yes”
Rhasspy: “Where do you want to go?”
Human: “To Paris, please.”
Rhasspy: “The goal is Paris - is that right?”
Human: “yes”
Rhasspy: “Ok, I’m looking for a connection 
”

In the Hermes reference https://docs.snips.ai/reference/dialogue#outbound-message-1 I found continueSession and understood the following (maybe wrong).
sessionId: current, generated by sessionStarted after [Trip] has been recognized.
text: The question “Where do you want to start?”
intentFilter: [Dialogue]
customData: String which can be used to save the given answers.
sendIntentNotRecognized: true
slot: Locations - Here the slot $Locations is assigned to the intent [Dialogue]. The answer is only searched for in this slot.

In the next step, the slot $ConfirmCancel is assigned to the intent [Dialogue].

Is such an implementation conceivable?
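
Put as code, the message described above could be assembled roughly like this. It is a sketch only: the helper name is made up, the field names are the ones listed in the post, and whether Rhasspy honours the Snips "slot" key at all is an open question here, so it is only added when explicitly given.

```python
import json

CONTINUE_TOPIC = "hermes/dialogueManager/continueSession"


def build_continue_session(session_id, text, intent_filter=None,
                           custom_data=None, send_intent_not_recognized=True,
                           slot=None):
    """Build a continueSession payload from the fields listed above."""
    payload = {
        "sessionId": session_id,
        "text": text,
        "intentFilter": intent_filter,
        "customData": custom_data,
        "sendIntentNotRecognized": send_intent_not_recognized,
    }
    if slot is not None:
        # Assumption: "slot" comes from the Snips reference; Rhasspy may ignore it.
        payload["slot"] = slot
    return json.dumps(payload, ensure_ascii=False)


message = build_continue_session(
    "0815", "Where do you want to start?",
    intent_filter=["Dialogue"], slot="Locations",
)
print(CONTINUE_TOPIC, message)
```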

Would you mind explaining why not using configure (understood as kind of globally predefined intent filter) should be preferred?

Atm. this is the existing logic as introduced by snips (which I never used). My personal approach is more like: if there’s already an existing logic, better think twice before breaking it. If it doesn’t fit my own logic, most likely I did sth. wrong and/or didn’t get the basic idea behind the existing implementation.
On the FHEM (application) side, we atm. only focused on a “constructive” dialogue and expected the user to do just the next step on our predefined path - so esp. messages to the topic hermes/nlu/intentNotRecognized are completely ignored. We may review this to elaborate a “user wants something else?” side-path?

Additionally, resetting configure (at least for the involved siteId) in our “silent cancellation case” may be a good idea.

Atm. I don’t see why this is superior to using one or two “yes” or “no” intents + (global) filter - besides the limiting fact that the former snips implementation only allowed the slot key if filtering was restricted to a single intent.
Imo this could also be solved by just adding the “ConfirmAction/CancelAction” sentences to the two Choice
 intents and adding some (small) pieces of logic to the intent handling code to sort that out first.

To some extent, this is close to what we do in the ChoiceRoom and ChoiceDevice intents: provide a word list containing a maximum of possible choices. In whichever way such a word list is activated: somewhere in the program logic one has to decide whether the word list should be used or not. Or even more: whether only a subset of the word list may be chosen?
Afai understood, the latter would require “training at runtime” - which may only be feasible in voice2json, as Rhasspy is too slow.
Perhaps some kind of (additional) internal filtering could also be provided, but - besides the code to do so - this would require additional info to be exchanged on the MQTT side (like the former “slot” (?) key used by snips?).
The “problem” with the snips slot key seems to be that it was only accepted while being restricted to a single intent, see above.

My personal understanding of some of the mentioned keys is slightly different:

Afai understood, any program/participant in the MQTT data exchange may initiate a session and just has to provide a (if possible: unique) sessionId. Only if rhasspy-session-manager is asked to do so, the sessionId will be provided by “rhasspy”. (I suspect this mechanism to be the root cause of the different experience the Android app gives depending on the way the microphone was activated: most likely it opens the session itself (and thus claims the role of session management for this session) when not activated by the hotword.)

customData is some kind of arbitrary place to store some data. But its capabilities are limited (esp.: no structured data, just flat strings) and any participant in the MQTT traffic may change it to suit its needs. Imo we should avoid trying to exchange info about the expected behaviour of central parts of the ecosystem via whatever may be stored here.
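
If one nevertheless wants to carry a bit of state in customData, the flat-string limitation can be worked around by serializing to JSON. A minimal sketch with made-up helper names; the defensive unpacking reflects the point above that other participants may put anything there (e.g. the plain hotword name "bumblebee_linux" seen in the traffic earlier).

```python
import json


def pack_custom_data(state: dict) -> str:
    """Serialize dialogue state into a flat string for customData."""
    return json.dumps(state, sort_keys=True, ensure_ascii=False)


def unpack_custom_data(custom_data) -> dict:
    """Read state back; anything that is not a JSON object yields {}."""
    if not custom_data:
        return {}
    try:
        value = json.loads(custom_data)
    except ValueError:
        # Someone else wrote a plain string into customData.
        return {}
    return value if isinstance(value, dict) else {}


packed = pack_custom_data({"start": "Berlin", "confirmed": True})
print(packed)
print(unpack_custom_data(packed))
print(unpack_custom_data("bumblebee_linux"))
```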

So atm. the most important thing to clarify next imo is whether rewriting the FHEM part to be capable of accepting esp. “cancel” within the “Choice” intents is a good idea?
Besides that, I’ll have a look on the “user wants something else?” path.

Is there a possibility to (de) activate Intents by “default” - adequate to SNIPS?
https://docs.snips.ai/articles/console/actions/set-intents#give-your-intent-a-clear-name-and-description
Then there would be no need for a start trigger.

I think so, yes, but maybe a little different than what you have written.

Rather than an empty [Dialogue] intent, you would have an MQTT service that responds to the [Trip] intent with a series of continueSession messages to fill in the details. Each continueSession message would set the intentFilter so that it fits the prompt (“Where do you want to start?” could filter for [StartLocation] or just [Location] intents).

When your MQTT service has enough information to proceed, it can just send endSession and start taking an action.
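
A rough sketch of the decision logic such a service might use, without the MQTT plumbing. The function, the prompt table and the closing sentence are illustrative assumptions; the [Location] intent name and the prompts are taken from the posts above.

```python
import json

# Ordered list of details the service still needs, with the prompt for each.
PROMPTS = [
    ("start", "Where do you want to start?"),
    ("goal", "Where do you want to go?"),
]


def next_message(session_id, answers):
    """Return (topic, payload) for the next step of the trip dialogue."""
    for key, prompt in PROMPTS:
        if key not in answers:
            payload = {
                "sessionId": session_id,
                "text": prompt,
                "intentFilter": ["Location"],  # only location answers fit here
            }
            return "hermes/dialogueManager/continueSession", json.dumps(payload)
    # Everything collected: close the session and start the real action.
    text = "Ok, I'm looking for a connection from {start} to {goal}".format(**answers)
    payload = {"sessionId": session_id, "text": text}
    return "hermes/dialogueManager/endSession", json.dumps(payload)


print(next_message("0815", {}))
print(next_message("0815", {"start": "Berlin"}))
print(next_message("0815", {"start": "Berlin", "goal": "Paris"}))
```

The service would call this after each recognized [Trip] or [Location] intent and publish the returned payload to the returned topic.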

To some extent, this actually is how the intents ChoiceRoom and ChoiceDevice work (it’s just like choosing one of the cities).

But, to mention this, there are two aspects of this solution that aren’t really to my personal satisfaction:

  1. Filtering an intent doesn’t mean completely deactivating it. So having a “one-word sentence” with just “London” or “Berlin” might raise conflicts with longer sentences containing the same keywords (esp. if they contain only optional additional context). Atm. we especially have some trouble with “nein” (no) being recognized as ConfirmCancel intent when nothing is spoken (silence will be recognized as “nein”, which just seems to be the shortest option amongst all).
    Depending on the context (dialogueManager/hotword vs. microphone toggle in the app), we will either get a “no one managed
” info or a “silent response” (the latter generated by FHEM).
    So atm. we have to use some kind of workaround and use sentences with additional “fill words” like “i would like to go to $Locations” to distinguish that intent from others

  2. When starting in London, London isn’t really a meaningful choice in the second step. But afaik, atm. it’s not possible to restrict $Locations to just Paris and Berlin at the (second) recognition level (and at least try to match the recognition to these two if at all reasonable). We have to sort out what might not be a good choice later in the process, which is much more difficult than just asking Rhasspy to limit the choice to named variants (which we already know to be ok).
    As we atm. use lists containing all “main device names” or “main device rooms”, it’s even possible for the user to choose any of the possible choices stored in these slots
 (but (unfortunately only) to some extent he will get an answer including the device info. We might enhance that, but this would lead to longer answers, which is - in most cases - not convincing either).
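
As long as recognition itself cannot be restricted, the sorting out described in point 2 has to happen on the application side. A minimal sketch of such a post-filter, with hypothetical function names and the city list from the earlier example:

```python
def remaining_choices(all_choices, already_chosen):
    """Return the choices that are still meaningful at this step."""
    taken = {choice.lower() for choice in already_chosen}
    return [choice for choice in all_choices if choice.lower() not in taken]


def accept_choice(value, all_choices, already_chosen):
    """Return the canonical choice, or None if unknown or already taken."""
    for candidate in remaining_choices(all_choices, already_chosen):
        if candidate.lower() == value.lower():
            return candidate
    return None


cities = ["London", "Paris", "Berlin"]
print(remaining_choices(cities, ["London"]))
print(accept_choice("london", cities, ["London"]))
print(accept_choice("paris", cities, ["London"]))
```

A rejected value (None) would then lead to another continueSession prompt instead of an action.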

I’ll provide some MQTT traffic in the coming days; this might help you to get an impression of the way our implementation works by now. Seems the (FHEM) application side needs many more looping options to gather the finally needed info than are implemented atm. Perhaps you have some ideas on how to organize things like that; to be honest, I’ve more or less zero experience in building that kind of application and am atm. just beginning to see how the existing pieces fit together


Excuse the mess.
rejoe2 is the programmer of the Perl module 10_RHASSPY.pm, which we use in our home automation (FHEM).
I am a user who would like to support him in this.
Even if the goal is basically the same, we have different approaches. Ultimately, rejoe2 decides on the implementation.
My idea is that there are intents [confirm], [cancel], etc. that are deactivated by “default”, including slots, which can be activated in the dialog using the intentFilter. Thus, the words “yes”, “no”, etc. contained there should not be recognized in normal operation.

Again to “slot” in continueSession.

What is the function of “slot” in relation to the intentFilter? An example:

existing Intent [day of the week]
Is it $Wday today?

existing Slot $Wday
Monday
Tuesday


Here the sample-dialogue:

hermes/dialogueManager/continueSession {"customData": "alexa", "intentFilter": ["day of the week"], "slot": "Wday", "sendIntentNotRecognized": true, "sessionId": "0815", "siteId": "default", "text": "When do you want to start?"}

An existing intent is only searched for a value from the $Wday slot. The keywords “Is it 
 today” are not required and are ignored.
Could that be carried over into the Rhasspy reference?

Short update on this topic:

  • existing procedures on how to continue a session and how to set intentFilters seem to be understood quite well from our side, only one little piece of understanding (or feature?) is missing (see below);
  • atm.
    – we are about to test more intensively the most recent FHEM “app”, which allows asking e.g. for confirmation on more or less all controller-side actions targeting any IoT gadget (on/off, colour setting, 
). See e.g. one example of what’s happening on the MQTT side here (also showing some trouble with silence detection);
    – there’s one example “addon” provided by @jens-schiffke allowing to spell any arbitrary word - it’s kind of a demo app for how an “infinite” dialogue could be realized (@jens-schiffke : Might be interesting for others to see the MQTT traffic for e.g. spelling of a short word - this might be a good base to further discuss “good usage” of continueSession/intentFilter/customIntent);
  • As the deb for 2.5.11-preview seems to be coming soon, I’ll try to update to this, but I’m not sure if there’s much news wrt. dialogues. Especially Vosk seems to be a good thing to have a closer look at


Next step from the FHEM “app” side may be to “leave” the “intentNotRecognized” path when an intended change of direction seems likely. Besides the “rawInput” value, there seems to be no information available about what’s next, so my approach would be to (re-)feed the rawInput into the intent recognition system (after/while resetting the intentFilter).

Some guidance on how to do that would be highly appreciated! (Sorry, but I’m a more or less complete noob wrt. Rhasspy’s internal mechanisms!) Submitting one or two MQTT messages might do the job, but atm. I have no clue about either the topic or the payload format.

(EDIT: Topic seems to be hermes/nlu/query, payload is JSON-encoded as described in Reference - Rhasspy (Natural Language Understanding))
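
A minimal sketch of such a re-query, assuming the JSON keys mirror the NluQuery fields visible in the log further up (input, siteId, sessionId, intentFilter). The helper name is made up, and leaving intentFilter at null to lift the restriction is an assumption.

```python
import json

NLU_QUERY_TOPIC = "hermes/nlu/query"


def build_nlu_query(raw_input, site_id, session_id=None, intent_filter=None):
    """Build a payload that feeds already-transcribed text back into the NLU."""
    return json.dumps(
        {
            "input": raw_input,
            "siteId": site_id,
            "sessionId": session_id,
            "intentFilter": intent_filter,  # None -> null, i.e. no session filter
        },
        ensure_ascii=False,
    )


query = build_nlu_query("wie spÀt ist es", "motox", session_id="0815")
print(NLU_QUERY_TOPIC, query)
```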


Some additional remarks wrt. to the findings in #13:

  • intentFilter might have two modes:
    – completely disable an intent (until it’s needed and actively switched on). This might apply especially to intents “we” named “ConfirmAction”, “CancelAction” or “ChoiceXy”.
    – simple “notRecognized” mode - allowing the user to change the “direction” of the entire dialogue
    So this is the left over question wrt. to intentFilter: Not sure, if complete “disable” is already implemented when using the “configure” Topic (from my understanding of the respective code I suppose: it’s just kind of predefined filter layer).

If not, this might be something to be discussed as feature request?

  • still, some kind of “restriction” (or “priority”?) option for “choices” might really be useful (see above example: flying from Rome to Rome without any stop in between in most cases is not possible).

Thanks in advance for reading!


Btw.: voice2json seems to offer on-the-fly reconfiguration of the entire system. I really doubt that changing horses would be a good idea, but just to be sure: did I get it right that voice2json would offer more options to restrict possible intents and choices?

Hi @synesthesiam, it seems the configure message doesn’t disable intents as expected. I’m trying to figure out how to disable the Hello intent in my case. After updating to 2.5.11 and testing through audio input rather than the web interface, it seems to work as expected.
But another question came up: how to get the full list of intents via the API?

I will explain a little. Like others, I’m looking for a simple confirmation dialogue managed by Home Assistant. The plan is:

  1. Create “ConfirmAction”, “CancelAction”, “ChoiceXy” intents with one or two words each.
  2. On Rhasspy start, disable the “ConfirmAction”, “CancelAction”, “ChoiceXy” intents.
  3. Manage the dialogue by setting intentFilter via hermes/dialogueManager/continueSession.

To disable certain intents, I have to enable all others via hermes/dialogueManager/configure. But first I need to get a list of intents.
As far as I can see, we currently can’t get the list of intents through the API. Any advice?

Send a request to the local Rhasspy: http://rhasspy:12101/api/intents
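
A small sketch of how the intent list could be fetched and split into "enable" and "disable" groups for a later configure message. It assumes the endpoint returns a JSON object keyed by intent name, which I have not verified; the function names and the sample response are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_intents(url="http://rhasspy:12101/api/intents"):
    """Fetch the raw JSON text from the intents endpoint (needs a running Rhasspy)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def split_intents(api_response, dialogue_only):
    """Split intent names into (enable, disable) lists."""
    # Assumption: the response is a JSON object whose keys are intent names.
    names = sorted(json.loads(api_response))
    enable = [name for name in names if name not in dialogue_only]
    disable = [name for name in names if name in dialogue_only]
    return enable, disable


# Offline example with a made-up response; use fetch_intents() against a live system.
sample = '{"GetTime": [], "ConfirmAction": [], "SetOnOff": [], "CancelAction": []}'
print(split_intents(sample, {"ConfirmAction", "CancelAction"}))
```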

@jens-schiffke thank you! I looked at https://rhasspy.readthedocs.io/en/latest/reference and see no mention of it.

Could someone please point me to the docs on why/when a “no one managed to handle 
 intent” message is generated?

Some background:
The FHEM “app” (and its short “dialogues” to add either room or device information if there are several ways to interpret the original data set derived from intent recognition) works quite well by now (afai can see, that’s also what a couple of other users seem to think).

I then started with some experiments to also allow other intents in case the user decided to go in a different direction. It turned out to be rather hard to derive a suitable kind of reaction, so I more or less gave up on that.
By now, I’m about to add two other extensions to the FHEM-App.

One is to hand over messenger data (text) to Rhasspy - this is working surprisingly well. There’s been some discussions on a similar topic here - non-voice-interaction-chat.

Second is - and that is the reason why I’m asking - the option to not close each session once a final action has been derived, but to keep the microphone open by publishing to the “continueSession” topic (and setting a new intentFilter).
This also works surprisingly well - but unfortunately not all of the time. I then get a lot of these “no one managed 
” messages shown in rhasspy-mobile-app (my main input system).
Or should each session be terminated once an action is taken? Opening up a new session could be done as well

What’s the best choice for such a “continuous scenario”?

EDIT: Seems the “no one managed” may be just related to the usage of the rhasspy mobile app: