Making Rhasspy interact with APIs and speak out information

You have the Text To Speech, Audio Playing and Dialogue Management set to Hermes MQTT.
Why?

  • set Text To Speech to some engine, like eSpeak. You can switch to a better one later
  • set Audio Playing to another system, for example aplay if you want to play sounds locally. The setting you have now publishes sounds to hermes/audioServer/default/playBytes/<someID>, and unless some system that can play audio is subscribed to that topic, you will never hear anything.
  • set Dialogue Management to Rhasspy
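
As a rough sketch of the three settings above: the snippet below lists which profile sections are still handed off to Hermes MQTT. The key names (`text_to_speech`, `sounds`, `dialogue`) and the values (`hermes`, `espeak`, `aplay`, `rhasspy`) are my assumption of how a Rhasspy 2.5 `profile.json` spells them, so check them against your own file.

```python
import json

# Assumed section names in profile.json; verify against your own profile.
SECTIONS = ("text_to_speech", "sounds", "dialogue")


def sections_on_hermes(profile: dict) -> list:
    """Return the sections whose system is 'hermes', i.e. left to an external MQTT service."""
    return [
        name for name in SECTIONS
        if profile.get(name, {}).get("system") == "hermes"
    ]


# What the original poster had: everything delegated to Hermes MQTT.
current = {name: {"system": "hermes"} for name in SECTIONS}

# The suggested local setup (values are assumptions, see above).
suggested = {
    "text_to_speech": {"system": "espeak"},
    "sounds": {"system": "aplay"},
    "dialogue": {"system": "rhasspy"},
}

if __name__ == "__main__":
    print("Still on Hermes MQTT:", sections_on_hermes(current))
    print(json.dumps(suggested, indent=2))
```

Any section this reports only works if another service on the MQTT broker actually handles it.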

Hi @romkabouter, thanks for your reply! I had those settings earlier but rhasspy wasn’t speaking out the words then so I tried changing them. Any other ideas as to why the output doesn’t work?

What was your setting for audio play and what was the error in the logs?

Rhasspy logs:

[INFO:2020-08-17 12:58:21,772] rhasspyserver_hermes: Started
[DEBUG:2020-08-17 12:58:21,772] rhasspyserver_hermes: Starting web server at http://0.0.0.0:12101
Running on 0.0.0.0:12101 over http (CTRL + C to quit)
1597683631: New connection from 10.0.2.15 on port 12183.
1597683631: New client connected from 10.0.2.15 as auto-31C3784A-DA08-3142-DF55-D4CB4DB0D8E8 (p2, c1, k60).
1597683640: Socket error on client auto-31C3784A-DA08-3142-DF55-D4CB4DB0D8E8, disconnecting.
1597683646: New connection from 10.0.2.15 on port 12183.
1597683646: New client connected from 10.0.2.15 as auto-A4765CFB-B51F-6B13-17F5-52BE9CEFE4DA (p2, c1, k60).
[DEBUG:2020-08-17 13:01:07,201] rhasspyprofile.download: speech_to_text.system pocketsphinx pocketsphinx = True
[DEBUG:2020-08-17 13:01:07,227] rhasspyprofile.download: Skipping base_dictionary.txt (/home/achintya/.config/rhasspy/profiles/en/base_dictionary.txt)
[DEBUG:2020-08-17 13:01:07,238] rhasspyprofile.download: Skipping g2p.fst (/home/achintya/.config/rhasspy/profiles/en/g2p.fst)
[DEBUG:2020-08-17 13:01:07,243] rhasspyprofile.download: Skipping g2p.corpus (/home/achintya/.config/rhasspy/profiles/en/g2p.corpus)
[DEBUG:2020-08-17 13:01:07,246] rhasspyprofile.download: Skipping acoustic_model/feat.params (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/feat.params)
[DEBUG:2020-08-17 13:01:07,247] rhasspyprofile.download: Skipping acoustic_model/feature_transform (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/feature_transform)
[DEBUG:2020-08-17 13:01:07,247] rhasspyprofile.download: Skipping acoustic_model/mdef (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/mdef)
[DEBUG:2020-08-17 13:01:07,251] rhasspyprofile.download: Skipping acoustic_model/means (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/means)
[DEBUG:2020-08-17 13:01:07,251] rhasspyprofile.download: Skipping acoustic_model/mixture_weights (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/mixture_weights)
[DEBUG:2020-08-17 13:01:07,256] rhasspyprofile.download: Skipping acoustic_model/noisedict (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/noisedict)
[DEBUG:2020-08-17 13:01:07,258] rhasspyprofile.download: Skipping acoustic_model/transition_matrices (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/transition_matrices)
[DEBUG:2020-08-17 13:01:07,259] rhasspyprofile.download: Skipping acoustic_model/variances (/home/achintya/.config/rhasspy/profiles/en/acoustic_model/variances)
[DEBUG:2020-08-17 13:01:07,259] rhasspyprofile.download: speech_to_text.system kaldi pocketsphinx = False
[DEBUG:2020-08-17 13:01:07,262] rhasspyprofile.download: speech_to_text.system deepspeech pocketsphinx = False
[DEBUG:2020-08-17 13:01:07,267] rhasspyprofile.download: speech_to_text.pocketsphinx.open_transcription True False = False
[DEBUG:2020-08-17 13:01:07,269] rhasspyprofile.download: speech_to_text.kaldi.open_transcription True False = False
[DEBUG:2020-08-17 13:01:07,270] rhasspyprofile.download: speech_to_text.deepspeech.open_transcription True False = False
[DEBUG:2020-08-17 13:01:07,270] rhasspyprofile.download: speech_to_text.pocketsphinx.mix_weight >0 0 = False
[DEBUG:2020-08-17 13:01:07,282] rhasspyprofile.download: speech_to_text.kaldi.mix_weight >0 0 = False
[DEBUG:2020-08-17 13:01:07,284] rhasspyprofile.download: speech_to_text.deepspeech.mix_weight >0 0 = False
[DEBUG:2020-08-17 13:01:10,253] rhasspyserver_hermes: Waiting for transcription (session_id=e751cd4d-efd9-4128-8eba-a063e8f25d92)
[DEBUG:2020-08-17 13:01:10,255] rhasspyserver_hermes: Subscribed to hermes/error/asr
[DEBUG:2020-08-17 13:01:10,407] rhasspyserver_hermes: -> AsrStartListening(site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id=None, intent_filter=None)
[DEBUG:2020-08-17 13:01:10,414] rhasspyserver_hermes: Publishing 180 bytes(s) to hermes/asr/startListening
[DEBUG:2020-08-17 13:01:10,424] rhasspyasr_pocketsphinx_hermes: <- AsrStartListening(site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id=None, intent_filter=None)
[DEBUG:2020-08-17 13:01:10,425] rhasspyasr_pocketsphinx_hermes: Starting listening (session_id=e751cd4d-efd9-4128-8eba-a063e8f25d92)
[DEBUG:2020-08-17 13:01:10,446] rhasspyasr_pocketsphinx_hermes: Receiving audio
[DEBUG:2020-08-17 13:01:13,615] rhasspyasr_pocketsphinx_hermes: Voice command recorded for session e751cd4d-efd9-4128-8eba-a063e8f25d92 (55680 byte(s))
[DEBUG:2020-08-17 13:01:13,615] rhasspyasr_pocketsphinx_hermes: Transcribing 55724 byte(s) of audio data
INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /home/achintya/.config/rhasspy/profiles/en/acoustic_model/feat.params
Current configuration:
[NAME]			[DEFLT]		[VALUE]
-agc			none		none
-agcthresh		2.0		2.000000e+00
-allphone				
-allphone_ci		yes		yes
-alpha			0.97		9.700000e-01
-ascale			20.0		2.000000e+01
-aw			1		1
-backtrace		no		no
-beam			1e-48		1.000000e-48
-bestpath		yes		yes
-bestpathlw		9.5		9.500000e+00
-ceplen			13		13
-cmn			live		current
-cmninit		40,3,-1		40,3,-1
-compallsen		no		no
-dict					/home/achintya/.config/rhasspy/profiles/en/dictionary.txt
-dictcase		no		no
-dither			no		no
-doublebw		no		no
-ds			1		1
-fdict					
-feat			1s_c_d_dd	1s_c_d_dd
-featparams				
-fillprob		1e-8		1.000000e-08
-frate			100		100
-fsg					
-fsgusealtpron		yes		yes
-fsgusefiller		yes		yes
-fwdflat		yes		yes
-fwdflatbeam		1e-64		1.000000e-64
-fwdflatefwid		4		4
-fwdflatlw		8.5		8.500000e+00
-fwdflatsfwin		25		25
-fwdflatwbeam		7e-29		7.000000e-29
-fwdtree		yes		yes
-hmm					/home/achintya/.config/rhasspy/profiles/en/acoustic_model
-input_endian		little		little
-jsgf					
-keyphrase				
-kws					
-kws_delay		10		10
-kws_plp		1e-1		1.000000e-01
-kws_threshold		1e-30		1.000000e-30
-latsize		5000		5000
-lda					
-ldadim			0		0
-lifter			0		22
-lm					/home/achintya/.config/rhasspy/profiles/en/language_model.txt
-lmctl					
-lmname					
-logbase		1.0001		1.000100e+00
-logfn					
-logspec		no		no
-lowerf			133.33334	1.300000e+02
-lpbeam			1e-40		1.000000e-40
-lponlybeam		7e-29		7.000000e-29
-lw			6.5		6.500000e+00
-maxhmmpf		30000		30000
-maxwpf			-1		-1
-mdef					
-mean					
-mfclogdir				
-min_endfr		0		0
-mixw					
-mixwfloor		0.0000001	1.000000e-07
-mllr					
-mmap			yes		yes
-ncep			13		13
-nfft			512		512
-nfilt			40		25
-nwpen			1.0		1.000000e+00
-pbeam			1e-48		1.000000e-48
-pip			1.0		1.000000e+00
-pl_beam		1e-10		1.000000e-10
-pl_pbeam		1e-10		1.000000e-10
-pl_pip			1.0		1.000000e+00
-pl_weight		3.0		3.000000e+00
-pl_window		5		5
-rawlogdir				
-remove_dc		no		no
-remove_noise		yes		yes
-remove_silence		yes		yes
-round_filters		yes		yes
-samprate		16000		1.600000e+04
-seed			-1		-1
-sendump				
-senlogdir				
-senmgau				
-silprob		0.005		5.000000e-03
-smoothspec		no		no
-svspec					
-tmat					
-tmatfloor		0.0001		1.000000e-04
-topn			4		4
-topn_beam		0		0
-toprule				
-transform		legacy		dct
-unit_area		yes		yes
-upperf			6855.4976	6.800000e+03
-uw			1.0		1.000000e+00
-vad_postspeech		50		50
-vad_prespeech		20		20
-vad_startspeech	10		10
-vad_threshold		3.0		3.000000e+00
-var					
-varfloor		0.0001		1.000000e-04
-varnorm		no		no
-verbose		no		no
-warp_params				
-warp_type		inverse_linear	inverse_linear
-wbeam			7e-29		7.000000e-29
-wip			0.65		6.500000e-01
-wlen			0.025625	2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='batch', VARNORM='no', AGC='none'
INFO: acmod.c(152): Reading linear feature transformation from /home/achintya/.config/rhasspy/profiles/en/acoustic_model/feature_transform
INFO: mdef.c(518): Reading model definition: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/mdef
INFO: bin_mdef.c(181): Allocating 142124 * 8 bytes (1110 KiB) for CD tree
INFO: tmat.c(149): Reading HMM transition probability matrices: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/transition_matrices
INFO: acmod.c(113): Attempting to use PTM computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/means
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/variances
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(304): 813 variance values floored
INFO: ptm_mgau.c(803): Number of codebooks exceeds 256: 5138
INFO: acmod.c(115): Attempting to use semi-continuous computation module
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/means
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/variances
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(304): 813 variance values floored
INFO: acmod.c(117): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/means
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(127): Reading mixture gaussian parameter: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/variances
INFO: ms_gauden.c(242): 5138 codebook, 1 feature, size: 
INFO: ms_gauden.c(244):  32x36
INFO: ms_gauden.c(304): 813 variance values floored
INFO: ms_senone.c(149): Reading senone mixture weights: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/mixture_weights
INFO: ms_senone.c(200): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(207): Not transposing mixture weights in memory
INFO: ms_senone.c(268): Read mixture weights for 5138 senones: 1 features x 32 codewords
INFO: ms_senone.c(320): Mapping senones to individual codebooks
INFO: ms_mgau.c(144): The value of topn: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 4114 * 32 bytes (128 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /home/achintya/.config/rhasspy/profiles/en/dictionary.txt
INFO: dict.c(213): Dictionary size 9, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(336): 9 words read
INFO: dict.c(358): Reading filler dictionary: /home/achintya/.config/rhasspy/profiles/en/acoustic_model/noisedict
INFO: dict.c(213): Dictionary size 18, allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 9 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 46^3 * 2 bytes (190 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 51152 bytes (49 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 51152 bytes (49 KiB) for single-phone word triphones
INFO: ngram_model_trie.c(354): Trying to read LM in trie binary format
INFO: ngram_model_trie.c(365): Header doesn't match
INFO: ngram_model_trie.c(177): Trying to read LM in arpa format
INFO: ngram_model_trie.c(193): LM of order 3
INFO: ngram_model_trie.c(195): #1-grams: 9
INFO: ngram_model_trie.c(195): #2-grams: 10
INFO: ngram_model_trie.c(195): #3-grams: 8
INFO: lm_trie.c(474): Training quantizer
INFO: lm_trie.c(482): Building LM trie
INFO: ngram_search_fwdtree.c(74): Initializing search tree
INFO: ngram_search_fwdtree.c(101): 9 unique initial diphones
INFO: ngram_search_fwdtree.c(186): Creating search channels
INFO: ngram_search_fwdtree.c(323): Max nonroot chan increased to 133
INFO: ngram_search_fwdtree.c(333): Created 9 root, 5 non-root channels, 9 single-phone words
INFO: ngram_search_fwdflat.c(157): fwdflat: min_ef_width = 4, max_sf_win = 25
[DEBUG:2020-08-17 13:01:18,924] rhasspyasr_pocketsphinx.transcribe: Successfully loaded decoder in 5.308273933999999 second(s)
INFO: cmn.c(133): CMN: 57.16 14.67 -6.79  7.85 -3.14 -5.00 -12.12  7.95  3.40 -18.73  5.88 -15.83 11.41 
INFO: ngram_search_fwdtree.c(1550):      700 words recognized (5/fr)
INFO: ngram_search_fwdtree.c(1552):    25725 senones evaluated (166/fr)
INFO: ngram_search_fwdtree.c(1556):    14442 channels searched (93/fr), 1325 1st, 12555 last
INFO: ngram_search_fwdtree.c(1559):     1323 words for which last channels evaluated (8/fr)
INFO: ngram_search_fwdtree.c(1561):      537 candidate words for entering last phone (3/fr)
INFO: ngram_search_fwdtree.c(1564): fwdtree 0.15 CPU 0.097 xRT
INFO: ngram_search_fwdtree.c(1567): fwdtree 0.41 wall 0.265 xRT
INFO: ngram_search_fwdflat.c(302): Utterance vocabulary contains 11 words
INFO: ngram_search_fwdflat.c(948):      663 words recognized (4/fr)
INFO: ngram_search_fwdflat.c(950):    36871 senones evaluated (238/fr)
INFO: ngram_search_fwdflat.c(952):    22515 channels searched (145/fr)
INFO: ngram_search_fwdflat.c(954):     2165 words searched (13/fr)
INFO: ngram_search_fwdflat.c(957):      686 word transitions (4/fr)
INFO: ngram_search_fwdflat.c(960): fwdflat 0.11 CPU 0.071 xRT
INFO: ngram_search_fwdflat.c(963): fwdflat 0.37 wall 0.239 xRT
[DEBUG:2020-08-17 13:01:19,723] rhasspyasr_pocketsphinx.transcribe: Decoded audio in 0.7921424759999809 second(s)
INFO: ngram_search.c(1250): lattice start node <s>.0 end node </s>.128
INFO: ngram_search.c(1276): Eliminated 3 nodes before end node
INFO: ngram_search.c(1381): Lattice has 207 nodes, 285 links
INFO: ps_lattice.c(1374): Bestpath score: -3045
INFO: ps_lattice.c(1378): Normalizer P(O) = alpha(</s>:128:153) = -156673
INFO: ps_lattice.c(1435): Joint P(O,S) = -191866 P(S|O) = -35193
INFO: ngram_search.c(872): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(875): bestpath 0.00 wall 0.002 xRT
INFO: ngram_search.c(1027): bestpath 0.00 CPU 0.000 xRT
INFO: ngram_search.c(1030): bestpath 0.00 wall 0.000 xRT
[DEBUG:2020-08-17 13:01:19,740] rhasspyasr_pocketsphinx_hermes: Transcription(text='what time is it', likelihood=0.029608080097991655, transcribe_seconds=0.7921424759999809, wav_seconds=1.74, tokens=[TranscriptionToken(token='<s>', start_time=0.0, end_time=0.02, likelihood=1.000100016593933), TranscriptionToken(token='what(2)', start_time=0.03, end_time=0.38, likelihood=0.5231039337814566), TranscriptionToken(token='<sil>', start_time=0.39, end_time=0.41, likelihood=0.44357240840395556), TranscriptionToken(token='time', start_time=0.42, end_time=0.71, likelihood=1.0), TranscriptionToken(token='is', start_time=0.72, end_time=0.96, likelihood=1.0), TranscriptionToken(token='it', start_time=0.97, end_time=1.27, likelihood=1.0), TranscriptionToken(token='</s>', start_time=1.28, end_time=1.53, likelihood=1.0)])
[DEBUG:2020-08-17 13:01:19,757] rhasspyasr_pocketsphinx_hermes: -> AsrTextCaptured(text='what time is it', likelihood=0.029608080097991655, seconds=0.7921424759999809, site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', wakeword_id=None, asr_tokens=[[AsrToken(value='<s>', confidence=1.000100016593933, range_start=0, range_end=4, time=AsrTokenTime(start=0.0, end=0.02)), AsrToken(value='what(2)', confidence=0.5231039337814566, range_start=4, range_end=12, time=AsrTokenTime(start=0.03, end=0.38)), AsrToken(value='<sil>', confidence=0.44357240840395556, range_start=12, range_end=18, time=AsrTokenTime(start=0.39, end=0.41)), AsrToken(value='time', confidence=1.0, range_start=18, range_end=23, time=AsrTokenTime(start=0.42, end=0.71)), AsrToken(value='is', confidence=1.0, range_start=23, range_end=26, time=AsrTokenTime(start=0.72, end=0.96)), AsrToken(value='it', confidence=1.0, range_start=26, range_end=29, time=AsrTokenTime(start=0.97, end=1.27)), AsrToken(value='</s>', confidence=1.0, range_start=29, range_end=34, time=AsrTokenTime(start=1.28, end=1.53))]], lang=None)
[DEBUG:2020-08-17 13:01:19,763] rhasspyasr_pocketsphinx_hermes: Publishing 1029 bytes(s) to hermes/asr/textCaptured
[DEBUG:2020-08-17 13:01:19,779] rhasspyasr_pocketsphinx_hermes: -> AsrAudioCaptured(55724 byte(s)) to rhasspy/asr/default/default/audioCaptured
[DEBUG:2020-08-17 13:01:19,809] rhasspyserver_hermes: Handling AsrTextCaptured (topic=hermes/asr/textCaptured, id=887b822f-90b1-44e1-8a7f-fe88f1c4efd8)
[DEBUG:2020-08-17 13:01:19,827] rhasspyserver_hermes: Waiting for intent (session_id=e751cd4d-efd9-4128-8eba-a063e8f25d92)
[DEBUG:2020-08-17 13:01:19,828] rhasspyserver_hermes: Subscribed to hermes/error/nlu
[DEBUG:2020-08-17 13:01:19,832] rhasspyserver_hermes: -> AsrStopListening(site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92')
[DEBUG:2020-08-17 13:01:19,834] rhasspyserver_hermes: Publishing 74 bytes(s) to hermes/asr/stopListening
[DEBUG:2020-08-17 13:01:19,835] rhasspyserver_hermes: -> NluQuery(input='what time is it', site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', intent_filter=None, session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', wakeword_id=None, lang=None)
    [DEBUG:2020-08-17 13:01:19,837] rhasspydialogue_hermes: <- AsrTextCaptured(text='what time is it', likelihood=0.029608080097991655, seconds=0.7921424759999809, site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', wakeword_id=None, asr_tokens=[[AsrToken(value='<s>', confidence=1.000100016593933, range_start=0, range_end=4, time=AsrTokenTime(start=0.0, end=0.02)), AsrToken(value='what(2)', confidence=0.5231039337814566, range_start=4, range_end=12, time=AsrTokenTime(start=0.03, end=0.38)), AsrToken(value='<sil>', confidence=0.44357240840395556, range_start=12, range_end=18, time=AsrTokenTime(start=0.39, end=0.41)), AsrToken(value='time', confidence=1.0, range_start=18, range_end=23, time=AsrTokenTime(start=0.42, end=0.71)), AsrToken(value='is', confidence=1.0, range_start=23, range_end=26, time=AsrTokenTime(start=0.72, end=0.96)), AsrToken(value='it', confidence=1.0, range_start=26, range_end=29, time=AsrTokenTime(start=0.97, end=1.27)), AsrToken(value='</s>', confidence=1.0, range_start=29, range_end=34, time=AsrTokenTime(start=1.28, end=1.53))]], lang=None)
    [DEBUG:2020-08-17 13:01:19,838] rhasspyserver_hermes: Publishing 204 bytes(s) to hermes/nlu/query
    [DEBUG:2020-08-17 13:01:19,845] rhasspynlu_hermes: <- NluQuery(input='what time is it', site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', intent_filter=None, session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', wakeword_id=None, lang=None)
    [DEBUG:2020-08-17 13:01:19,845] rhasspynlu_hermes: Loading /home/achintya/.config/rhasspy/profiles/en/intent_graph.pickle.gz
    [WARNING:2020-08-17 13:01:19,851] rhasspydialogue_hermes: Ignoring unknown session e751cd4d-efd9-4128-8eba-a063e8f25d92
    [DEBUG:2020-08-17 13:01:19,856] rhasspynlu_hermes: -> NluIntentParsed(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', slots=[], session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92')
    [DEBUG:2020-08-17 13:01:19,861] rhasspynlu_hermes: Publishing 222 bytes(s) to hermes/nlu/intentParsed
    [DEBUG:2020-08-17 13:01:19,895] rhasspynlu_hermes: -> NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', slots=[], session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id=None, lang=None)
    [DEBUG:2020-08-17 13:01:19,898] rhasspynlu_hermes: Publishing 683 bytes(s) to hermes/intent/GetTime
    [DEBUG:2020-08-17 13:01:19,922] rhasspydialogue_hermes: <- NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', slots=[], session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id=None, lang=None)
    [DEBUG:2020-08-17 13:01:19,935] rhasspyserver_hermes: <- NluIntent(input='what time is it', intent=Intent(intent_name='GetTime', confidence_score=1.0), site_id='default', id='e751cd4d-efd9-4128-8eba-a063e8f25d92', slots=[], session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', custom_data=None, asr_tokens=[[AsrToken(value='what', confidence=1.0, range_start=0, range_end=4, time=None), AsrToken(value='time', confidence=1.0, range_start=5, range_end=9, time=None), AsrToken(value='is', confidence=1.0, range_start=10, range_end=12, time=None), AsrToken(value='it', confidence=1.0, range_start=13, range_end=15, time=None)]], asr_confidence=None, raw_input='what time is it', wakeword_id=None, lang=None)
    [DEBUG:2020-08-17 13:01:19,944] rhasspyserver_hermes: Handling NluIntent (topic=hermes/intent/GetTime, id=67a41e0a-3c5d-4a32-bb79-f65170a55fd8)
    [DEBUG:2020-08-17 13:01:19,945] rhasspydialogue_hermes: <- DialogueEndSession(session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', site_id='default', text="It's 13 01", custom_data=None)
    [ERROR:2020-08-17 13:01:19,946] asyncio: Task exception was never retrieved
    future: <Task finished coro=<HermesClient.publish_all() done, defined at rhasspy-hermes/rhasspyhermes/client.py:368> exception=AssertionError('No session')>
    Traceback (most recent call last):
      File "rhasspy-hermes/rhasspyhermes/client.py", line 370, in publish_all
      File "rhasspy-dialogue-hermes/rhasspydialogue_hermes/__init__.py", line 668, in on_message
      File "rhasspy-dialogue-hermes/rhasspydialogue_hermes/__init__.py", line 375, in handle_end
    AssertionError: No session
    [DEBUG:2020-08-17 13:01:19,956] rhasspyserver_hermes: Sent 370 char(s) to websocket
    [DEBUG:2020-08-17 13:01:20,007] rhasspyasr_pocketsphinx_hermes: <- AsrStopListening(site_id='default', session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92')
    [DEBUG:2020-08-17 13:01:20,008] rhasspyasr_pocketsphinx_hermes: Received a total of 309616 byte(s) for WAV data for session e751cd4d-efd9-4128-8eba-a063e8f25d92
    [DEBUG:2020-08-17 13:01:20,010] rhasspyasr_pocketsphinx_hermes: Stopping listening (session_id=e751cd4d-efd9-4128-8eba-a063e8f25d92)

Logs from the python script:

   python3 examples/time_app.py --port 12183 --host 10.0.2.15 --debug
[DEBUG:2020-08-17 13:00:46,170] HermesApp: Namespace(debug=True, host='10.0.2.15', log_format='[%(levelname)s:%(asctime)s] %(name)s: %(message)s', password=None, port=12183, site_id=None, tls=False, tls_ca_certs=None, tls_cert_reqs='CERT_REQUIRED', tls_certfile=None, tls_ciphers=None, tls_keyfile=None, tls_version=None, username=None)
[DEBUG:2020-08-17 13:00:46,170] asyncio: Using selector: EpollSelector
[DEBUG:2020-08-17 13:00:46,171] HermesApp: Connecting to 10.0.2.15:12183
[DEBUG:2020-08-17 13:00:46,174] asyncio: Using selector: EpollSelector
[DEBUG:2020-08-17 13:00:46,174] TimeApp: Connected to MQTT broker
[DEBUG:2020-08-17 13:00:46,174] TimeApp: Subscribed to hermes/intent/GetTime
[DEBUG:2020-08-17 13:01:19,939] TimeApp: -> DialogueEndSession(session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', site_id='default', text="It's 13 01", custom_data=None)
[DEBUG:2020-08-17 13:01:19,943] TimeApp: Publishing 116 bytes(s) to hermes/dialogueManager/endSession

and here is a screenshot of my settings:

It seems like everything is working fine but I can’t hear any responses :confused:

Please post the whole log file when you post something to your TTS

I’m sorry, I’ve edited my previous response to add more logs. The logs say that there was no session, maybe that is the problem? Thanks so much for your help!

Does aplay function from the command line with an example wav?

Yes, it does. I’ve tried playing a wave file both before and after Rhasspy recognized my intent and it works fine, but I still don’t hear the time returned by the handler.

Can you clarify your actions from beginning to end?

Because I see one session: e751cd4d-efd9-4128-8eba-a063e8f25d92
But there is no DialogueStartSession in your logs, so when your script sends DialogueEndSession at the end, you get these messages:

[DEBUG:2020-08-17 13:01:19,945] rhasspydialogue_hermes: <- DialogueEndSession(session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', site_id='default', text="It's 13 01", custom_data=None)
    [ERROR:2020-08-17 13:01:19,946] asyncio: Task exception was never retrieved
    future: <Task finished coro=<HermesClient.publish_all() done, defined at rhasspy-hermes/rhasspyhermes/client.py:368> exception=AssertionError('No session')>
    Traceback (most recent call last):
      File "rhasspy-hermes/rhasspyhermes/client.py", line 370, in publish_all
      File "rhasspy-dialogue-hermes/rhasspydialogue_hermes/__init__.py", line 668, in on_message
      File "rhasspy-dialogue-hermes/rhasspydialogue_hermes/__init__.py", line 375, in handle_end
    AssertionError: No session
    [DEBUG:2020-08-17 13:01:19,956] rhasspyserver_hermes: Sent 370 char(s) to websocket

The dialogue manager has no record of that session, so it raises the "No session" error instead of passing the text on to your TTS system. That is why no audio is played.
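
If you want to check a long log for this pattern, something along these lines might help. It is only a sketch: it assumes that a properly started session shows up in the log as a `DialogueSessionStarted(...)` line carrying the same `session_id`, and the function name is made up for this post.

```python
import re

# Matches session_id='...' as it appears in the debug log lines above.
SESSION_RE = re.compile(r"session_id='([0-9a-f-]+)'")


def sessions_ended_without_start(log_text: str) -> set:
    """Return session IDs that were ended but never reported as started."""
    started, ended = set(), set()
    for line in log_text.splitlines():
        match = SESSION_RE.search(line)
        if match is None:
            continue
        session_id = match.group(1)
        # Assumption: the dialogue manager logs DialogueSessionStarted
        # for every session it knows about.
        if "DialogueSessionStarted(" in line:
            started.add(session_id)
        elif "DialogueEndSession(" in line:
            ended.add(session_id)
    return ended - started


# One line taken from the log in this thread.
SAMPLE_LOG = (
    "[DEBUG:2020-08-17 13:01:19,945] rhasspydialogue_hermes: <- "
    "DialogueEndSession(session_id='e751cd4d-efd9-4128-8eba-a063e8f25d92', "
    "site_id='default', text=\"It's 13 01\", custom_data=None)"
)

if __name__ == "__main__":
    print(sessions_ended_without_start(SAMPLE_LOG))
```

Every ID it returns is a session the dialogue manager was asked to end without ever having started it.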


Thanks so much for your reply @romkabouter! After installing Rhasspy I ran rhasspy --profile en, then opened another terminal and ran python3 examples/time_app.py --port 12183 --host 10.0.2.15 --debug. Once both were up and running, I opened the web interface on localhost and spoke to Rhasspy, and it recognized my intent. Is there a step that I’m missing? Again, thanks so much for helping, you’re a lifesaver!

Did you speak to Rhasspy after saying the wake word? I don’t think there’s a session started without a wake word.

I click the Wake Up button on the web interface and then speak my command. I thought the Wake Up button acts like a wake word. Should I set up a wake word first?

I’m not sure this will make a difference, but the whole voice stack setup is fairly finicky, so yes, I suggest you set up a wake word first and verify that it all works before you start with intent handling. Debugging is easier when we know that everything else is already in a working state.


I suggest setting up the wake word as well; you should then hear confirmation sounds.
If you do hear them, you know that the aplay command is working.


I set up the wake word and it worked!! Thanks so much for all your help @koan and @romkabouter, you made someone really happy.


Additionally: you need to send a DialogueStartSession message when you want to start a session manually :slight_smile:
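
For illustration, a start-session message could be built roughly like this. The topic and the `siteId` / `init` fields follow the Hermes protocol as I understand it; the helper name is hypothetical, and the actual publishing is left to whatever MQTT client you use.

```python
import json

# Hermes topic the dialogue manager listens on for new sessions.
START_SESSION_TOPIC = "hermes/dialogueManager/startSession"


def build_start_session(site_id: str = "default", text: str = "") -> str:
    """Build the JSON payload for a DialogueStartSession of type 'action'.

    An 'action' session makes the dialogue manager listen for a voice
    command, much like a wake word detection would.
    """
    init = {"type": "action", "canBeEnqueued": False}
    if text:
        # Optional sentence to speak before listening.
        init["text"] = text
    return json.dumps({"siteId": site_id, "init": init})


if __name__ == "__main__":
    print(START_SESSION_TOPIC)
    print(build_start_session())
```

Publish that payload to the topic on the same broker your app connects to (10.0.2.15, port 12183 in this thread). Session IDs are assigned by the dialogue manager, so there is none in the payload.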


Hello! One last question: what is the best way to share this with other people so they can work on it? Should I commit all the files to GitHub or build a Docker image?

Creating a GitHub project is probably better if you want to attract contributions.

So should I upload all the files produced after installation? I want them to have my profile and the sentences I added with my settings.

Your sentences.ini and the contents of the slots directory should suffice.
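
If it helps, here is a small sketch for collecting just those files into a folder you can commit. The profile path in the usage part is the one from the logs above; the target folder name `rhasspy-profile` and the function name are made up, and `dirs_exist_ok` needs Python 3.8 or newer.

```python
import shutil
from pathlib import Path


def export_profile(profile_dir: Path, repo_dir: Path) -> list:
    """Copy sentences.ini and the slots directory into repo_dir.

    Returns the names of the items that were actually copied.
    """
    copied = []
    repo_dir.mkdir(parents=True, exist_ok=True)

    sentences = profile_dir / "sentences.ini"
    if sentences.is_file():
        shutil.copy2(sentences, repo_dir / "sentences.ini")
        copied.append("sentences.ini")

    slots = profile_dir / "slots"
    if slots.is_dir():
        # Overwrite an earlier export instead of failing on an existing folder.
        shutil.copytree(slots, repo_dir / "slots", dirs_exist_ok=True)
        copied.append("slots")

    return copied


if __name__ == "__main__":
    profile = Path.home() / ".config" / "rhasspy" / "profiles" / "en"
    print(export_profile(profile, Path("rhasspy-profile")))
```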
