Truncated text from Google Speech to Text

Arnau_Dunjo · December 12, 2022, 11:56pm

Hi everybody.

I’ve managed to use Google Speech to Text to send what I say to Rhasspy as text, so that it can process it and run the relevant intent, but it’s only picking up the first two words I say. I think this is related to the error that happens just before, but I don’t know whether that error is a cause or a consequence. I’m pasting the relevant part of the log in case some charitable soul wants to help me:

[DEBUG:2022-12-12 23:31:32,054] rhasspyremote_http_hermes: Received 55724 byte(s) of WAV data
[DEBUG:2022-12-12 23:31:32,055] rhasspyremote_http_hermes: [‘/GCloudSpeech/run.sh’, ‘--language’, ‘ca-ES’]
[DEBUG:2022-12-12 23:31:46,022] rhasspyremote_http_hermes: → AsrTextCaptured(text=‘Posa el Canal 3\n’, likelihood=0.0, seconds=13.965966224000113, site_id=‘Servidor’, session_id=‘Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6’, wakeword_id=None, asr_tokens=None, lang=None)
[DEBUG:2022-12-12 23:31:46,023] rhasspyremote_http_hermes: Publishing 221 bytes(s) to hermes/asr/textCaptured
[DEBUG:2022-12-12 23:31:46,024] rhasspyremote_http_hermes: → AsrAudioCaptured(55724 byte(s)) to rhasspy/asr/Servidor/Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6/audioCaptured
[DEBUG:2022-12-12 23:31:46,027] rhasspydialogue_hermes: ← AsrTextCaptured(text=‘Posa el Canal 3\n’, likelihood=0.0, seconds=13.965966224000113, site_id=‘Servidor’, session_id=‘Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6’, wakeword_id=None, asr_tokens=None, lang=None)
[DEBUG:2022-12-12 23:31:46,028] rhasspydialogue_hermes: Playing sound /profiles/ca/sons/val.wav
[DEBUG:2022-12-12 23:31:46,030] rhasspydialogue_hermes: → HotwordToggleOff(site_id=‘Servidor’, reason=<HotwordToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,030] rhasspydialogue_hermes: Publishing 45 bytes(s) to hermes/hotword/toggleOff
[DEBUG:2022-12-12 23:31:46,032] rhasspydialogue_hermes: → AsrToggleOff(site_id=‘Servidor’, reason=<AsrToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,032] rhasspydialogue_hermes: Publishing 45 bytes(s) to hermes/asr/toggleOff
[DEBUG:2022-12-12 23:31:46,034] rhasspydialogue_hermes: → AudioPlayBytes(114812 byte(s)) to hermes/audioServer/Servidor/playBytes/7844ee06-b07c-41e8-a693-991bc62724b3
[DEBUG:2022-12-12 23:31:46,035] rhasspydialogue_hermes: Waiting for playFinished (id=7844ee06-b07c-41e8-a693-991bc62724b3, timeout=1.5512244897959184)
[DEBUG:2022-12-12 23:31:46,038] rhasspyremote_http_hermes: ← AsrToggleOff(site_id=‘Servidor’, reason=<AsrToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,039] rhasspyremote_http_hermes: ASR disabled
[DEBUG:2022-12-12 23:31:46,040] rhasspyspeakers_cli_hermes: ← AudioPlayBytes(114812 byte(s))
[DEBUG:2022-12-12 23:31:46,041] rhasspyspeakers_cli_hermes: [‘aplay’, ‘-q’, ‘-t’, ‘wav’]
[DEBUG:2022-12-12 23:31:46,052] rhasspywake_raven_hermes: ← HotwordToggleOff(site_id=‘Servidor’, reason=<HotwordToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,052] rhasspywake_raven_hermes: Disabled
[DEBUG:2022-12-12 23:31:46,791] rhasspyspeakers_cli_hermes: → AudioPlayFinished(id=‘7844ee06-b07c-41e8-a693-991bc62724b3’, session_id=‘7844ee06-b07c-41e8-a693-991bc62724b3’)
[DEBUG:2022-12-12 23:31:46,791] rhasspyspeakers_cli_hermes: Publishing 99 bytes(s) to hermes/audioServer/Servidor/playFinished
[DEBUG:2022-12-12 23:31:46,795] rhasspydialogue_hermes: ← AudioPlayFinished(id=‘7844ee06-b07c-41e8-a693-991bc62724b3’, session_id=‘7844ee06-b07c-41e8-a693-991bc62724b3’)
[DEBUG:2022-12-12 23:31:46,795] rhasspytts_wavenet_hermes: ← AudioPlayFinished(id=‘7844ee06-b07c-41e8-a693-991bc62724b3’, session_id=‘7844ee06-b07c-41e8-a693-991bc62724b3’)
[DEBUG:2022-12-12 23:31:46,796] rhasspydialogue_hermes: → HotwordToggleOn(site_id=‘Servidor’, reason=<HotwordToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,797] rhasspydialogue_hermes: Publishing 45 bytes(s) to hermes/hotword/toggleOn
[DEBUG:2022-12-12 23:31:46,798] rhasspydialogue_hermes: → AsrToggleOn(site_id=‘Servidor’, reason=<AsrToggleReason.PLAY_AUDIO: ‘playAudio’>)
[DEBUG:2022-12-12 23:31:46,799] rhasspydialogue_hermes: Publishing 45 bytes(s) to hermes/asr/toggleOn
[DEBUG:2022-12-12 23:31:46,800] rhasspydialogue_hermes: Received text: Posa el Canal 3

[DEBUG:2022-12-12 23:31:46,801] rhasspydialogue_hermes: → AsrStopListening(site_id=‘Servidor’, session_id=‘Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6’)
[DEBUG:2022-12-12 23:31:46,801] rhasspydialogue_hermes: Publishing 89 bytes(s) to hermes/asr/stopListening
[DEBUG:2022-12-12 23:31:46,803] rhasspydialogue_hermes: → HotwordToggleOn(site_id=‘Servidor’, reason=<HotwordToggleReason.DIALOGUE_SESSION: ‘dialogueSession’>)
[DEBUG:2022-12-12 23:31:46,804] rhasspydialogue_hermes: Publishing 51 bytes(s) to hermes/hotword/toggleOn
[DEBUG:2022-12-12 23:31:46,806] rhasspydialogue_hermes: → NluQuery(input=‘Posa el Canal 3\n’, site_id=‘Servidor’, id=None, intent_filter=None, session_id=‘Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6’, wakeword_id=‘GINA’, lang=None, custom_data=‘GINA’, asr_confidence=0.0, custom_entities=None)
[DEBUG:2022-12-12 23:31:46,806] rhasspydialogue_hermes: Publishing 257 bytes(s) to hermes/nlu/query
[DEBUG:2022-12-12 23:31:46,810] rhasspynlu_hermes: ← NluQuery(input=‘Posa el Canal 3\n’, site_id=‘Servidor’, id=None, intent_filter=None, session_id=‘Servidor-GINA-cbd2a979-c12f-41a1-aa51-8ba9abe013a6’, wakeword_id=‘GINA’, lang=None, custom_data=‘GINA’, asr_confidence=0.0, custom_entities=None)
[ERROR:2022-12-12 23:31:46,811] rhasspynlu_hermes: handle_query
Traceback (most recent call last):
  File "/usr/lib/rhasspy/rhasspy-nlu-hermes/rhasspynlu_hermes/__init__.py", line 99, in handle_query
    query.input = " ".join(words)
  File "/usr/lib/rhasspy/rhasspy-nlu/rhasspynlu/numbers.py", line 38, in replace_numbers
    for number_word in number_to_words(n, language=language):
  File "/usr/lib/rhasspy/rhasspy-nlu/rhasspynlu/numbers.py", line 25, in number_to_words
    num2words(number, lang=language).replace("-", " ").replace(",", "").strip()
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/num2words/__init__.py", line 81, in num2words
    raise NotImplementedError()
NotImplementedError

It could be due to a bad configuration of the “Voice Command Settings”. (Another problem is the delay between the moment I finish the voice command and the moment I hear the sound that confirms the end of the recording.) I left the default settings, although I tried a few combinations without success.

Thanks a lot for your time.

Arnau_Dunjo · December 13, 2022, 8:34pm

I “solved” the problem. It seems that Hermes has trouble “translating” numbers into words in my language (Catalan): the traceback ends in num2words raising NotImplementedError. If I use a sentence without a number, it works well.
I have also changed the method that detects silence, but it doesn’t work well. I will lower the mic level and maybe put sox in the middle for better results.
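
For anyone hitting the same NotImplementedError, here is a minimal sketch of the idea behind the workaround: try to turn digits into words, and keep the digits untouched when the language is not supported instead of letting the exception abort the query. This is not Rhasspy’s actual code. `replace_numbers_safely` and `spoken_number_stub` are hypothetical names invented for this illustration, and the stub only imitates a converter that raises NotImplementedError for a language it does not know.

```python
import re
from typing import Callable


def spoken_number_stub(number: int, lang: str) -> str:
    """Hypothetical stand-in for a number-to-words converter.

    It only knows a tiny English table and raises NotImplementedError
    for any other language, imitating the failure in the traceback.
    """
    tables = {"en": {3: "three"}}
    if lang not in tables:
        raise NotImplementedError()
    return tables[lang].get(number, str(number))


def replace_numbers_safely(
    text: str,
    lang: str,
    to_words: Callable[[int, str], str] = spoken_number_stub,
) -> str:
    """Replace digit tokens with words, keeping digits on failure."""
    out = []
    for token in text.split():
        if re.fullmatch(r"[0-9]+", token):
            try:
                # A converter may return several words, e.g. "twenty one".
                out.extend(to_words(int(token), lang).split())
            except NotImplementedError:
                # Unsupported language: leave the number as digits.
                out.append(token)
        else:
            out.append(token)
    return " ".join(out)


print(replace_numbers_safely("Posa el Canal 3\n", "ca"))
print(replace_numbers_safely("channel 3", "en"))
```

With the num2words package installed, the stub could presumably be swapped for `lambda n, lang: num2words(n, lang=lang)`. Whether the resulting text then matches the trained sentences depends on how the numbers are written in sentences.ini, so treat this as a starting point only.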