Hello!
I’m having a problem with the transcription step of speech recognition. Whenever I press the wake button or use the hotword to capture a command, nothing is transcribed. I confirmed that the speech is recorded by playing back the last recorded WAV file. It does capture my voice, but the audio is somehow not transcribed, so the intent is not recognized either.
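Since the recording plays back fine, one thing that can still be checked is its format. Below is a minimal sketch using Python's standard `wave` module to print the sample rate, sample width and channel count of a recorded file. The target of 16 kHz, 16-bit, mono is an assumption based on my understanding of what Rhasspy expects, and the helper names are only illustrative.

```python
import wave

# Assumed target format (16 kHz, 16-bit, mono); adjust if your setup differs.
EXPECTED = {"rate": 16000, "width_bytes": 2, "channels": 1}


def describe_wav(path):
    """Read basic format information from a WAV file header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "rate": rate,
            "width_bytes": wav.getsampwidth(),
            "channels": wav.getnchannels(),
            "seconds": frames / rate if rate else 0.0,
        }


def format_mismatches(info, expected=EXPECTED):
    """Return {field: (actual, expected)} for every field that differs."""
    return {
        key: (info[key], want)
        for key, want in expected.items()
        if info[key] != want
    }
```

Calling `format_mismatches(describe_wav(path))` on the downloaded recording returns an empty dict when the file matches the assumed format. A very short `seconds` value would suggest the recording is cut off before the command is spoken.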
Setup overview:
Raspberry Pi 3B+ running Raspbian Buster
Rhasspy 2.5.9 on Docker
intent: fsticuffs
speech_to_text: kaldi
text_to_speech: nanotts
hotword: porcupine
One more thing: if I type a sentence and hit Recognize, it works, but when I use the hotword or the wake button it does not, as described above.
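Because text recognition works and voice does not, the speech-to-text step can be tested on its own by sending the recorded WAV file straight to Rhasspy's HTTP API. This is a rough sketch, assuming the web server is reachable on the default port 12101 and that `/api/speech-to-text` accepts WAV data in a POST body. The function names and the base URL are placeholders for illustration.

```python
import urllib.request


def build_stt_request(wav_bytes, base_url="http://localhost:12101"):
    """Build a POST request carrying WAV data for the speech-to-text endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/speech-to-text",
        data=wav_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )


def transcribe(wav_path, base_url="http://localhost:12101"):
    """Send a WAV file to the server and return the response body as text."""
    with open(wav_path, "rb") as wav_file:
        request = build_stt_request(wav_file.read(), base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")
```

If `transcribe(...)` returns an empty string for a recording that clearly contains speech, the problem is likely in the Kaldi step itself rather than in the hotword or dialogue handling.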
Please see the logs below for more detail:
1. When I try to give a voice command
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOn
rhasspyserver_hermes: -> AsrToggleOn(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOn
rhasspyserver_hermes: -> HotwordToggleOn(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Handling AudioPlayFinished (topic=hermes/audioServer/default/playFinished, id=6a62736f-1eea-4ffb-aa3c-3a8ef298925b)
rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/default/playBytes/a1d3ae2a-c0f3-4d53-b4fd-b10af397b735, id=80e2cac5-ea21-422d-89ea-8ce8209a3626)
rhasspyserver_hermes: -> AudioPlayBytes(52844 byte(s))
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOff
rhasspyserver_hermes: -> AsrToggleOff(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOff
rhasspyserver_hermes: -> HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Playing 52844 byte(s)
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOn
rhasspyserver_hermes: -> AsrToggleOn(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOn
rhasspyserver_hermes: -> HotwordToggleOn(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Handling AudioPlayFinished (topic=hermes/audioServer/default/playFinished, id=1083dc75-b7ab-4052-af68-162bcf10404b)
rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/default/playBytes/e902dbfe-4d65-4270-bdfd-995cb31d0462, id=80e2cac5-ea21-422d-89ea-8ce8209a3626)
rhasspyserver_hermes: -> AudioPlayBytes(52844 byte(s))
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/asr/toggleOff
rhasspyserver_hermes: -> AsrToggleOff(site_id='default', reason=<AsrToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Publishing 44 bytes(s) to hermes/hotword/toggleOff
rhasspyserver_hermes: -> HotwordToggleOff(site_id='default', reason=<HotwordToggleReason.PLAY_AUDIO: 'playAudio'>)
rhasspyserver_hermes: Playing 52844 byte(s)
rhasspyserver_hermes: Handling AudioPlayBytes (topic=hermes/audioServer/default/playBytes/3bf4cb1e-2f60-4115-b117-d13c610673fc, id=80e2cac5-ea21-422d-89ea-8ce8209a3626)
rhasspyserver_hermes: Sent 260 char(s) to websocket
rhasspyserver_hermes: Handling NluIntentNotRecognized (topic=hermes/nlu/intentNotRecognized, id=0619ff67-1fb4-4a59-8c54-cdda0b752018)
rhasspyserver_hermes: Handling NluIntentNotRecognized (topic=hermes/nlu/intentNotRecognized, id=9c3bf6c8-fc31-4340-a93f-e63d3e8916ed)
rhasspyserver_hermes: <- NluIntentNotRecognized(input='', site_id='default', id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a', custom_data=None, session_id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a')
rhasspyserver_hermes: Publishing 189 bytes(s) to hermes/nlu/query
rhasspyserver_hermes: -> NluQuery(input='', site_id='default', id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a', intent_filter=None, session_id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a', wakeword_id=None, lang=None)
rhasspyserver_hermes: Publishing 74 bytes(s) to hermes/asr/stopListening
rhasspyserver_hermes: -> AsrStopListening(site_id='default', session_id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a')
rhasspyserver_hermes: Waiting for intent (session_id=55a62638-7a5e-479b-98b9-a0d9e94e2d5a)
rhasspyserver_hermes: Handling AsrTextCaptured (topic=hermes/asr/textCaptured, id=9918d31e-b55d-46fe-bb16-7e031b9bbdb4)
rhasspyserver_hermes: Publishing 180 bytes(s) to hermes/asr/startListening
rhasspyserver_hermes: -> AsrStartListening(site_id='default', session_id='55a62638-7a5e-479b-98b9-a0d9e94e2d5a', lang=None, stop_on_silence=True, send_audio_captured=True, wakeword_id=None, intent_filter=None)
rhasspyserver_hermes: Waiting for transcription (session_id=55a62638-7a5e-479b-98b9-a0d9e94e2d5a)
Here we can see that the sentence isn’t recognized (input is an empty string), so all the values (text, tokens, intent, etc.) are empty.
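To spot this quickly in a longer log, the empty `input=''` fields can be searched for. The following is a small sketch that scans log lines for `NluQuery`, `NluIntent` and `NluIntentNotRecognized` messages with an empty input. The regular expression is inferred from the log excerpt above, not from any documented log format, and the sample lines are shortened copies of lines from this post.

```python
import re

# Pattern inferred from the log excerpt in this post (an assumption,
# not a documented format): "<MessageType>(input='<text>'".
MESSAGE_RE = re.compile(
    r"(NluQuery|NluIntentNotRecognized|NluIntent)\(input='([^']*)'"
)


def find_empty_inputs(log_lines):
    """Return (line_number, message_type) for each message with empty input."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = MESSAGE_RE.search(line)
        if match and match.group(2) == "":
            hits.append((number, match.group(1)))
    return hits


# Shortened copies of two log lines from this post.
sample = [
    "rhasspyserver_hermes: -> NluQuery(input='', site_id='default'",
    "rhasspyserver_hermes: -> NluQuery(input='do we have any meeting', site_id='default'",
]
print(find_empty_inputs(sample))
```

Only the first sample line is reported, which matches the voice command case where the query reaches the intent recognizer with no text.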
2. When I type the sentence manually
rhasspyserver_hermes: Publishing 21 bytes(s) to rhasspy/handle/toggleOn
rhasspyserver_hermes: -> HandleToggleOn(site_id='default')
rhasspyserver_hermes: Sent 404 char(s) to websocket
rhasspyserver_hermes: Handling NluIntent (topic=hermes/intent/Meeting, id=a92c0bf3-d033-45b2-ba31-832a6e2e0cec)
rhasspyserver_hermes: <- NluIntent(input='Do we have any meeting', intent=Intent(intent_name='Meeting', confidence_score=1.0), site_id='default', id='ffce67eb-0d3d-4ef8-ae54-3cdbd05c0315', slots=[], session_id='ffce67eb-0d3d-4ef8-ae54-3cdbd05c0315', custom_data=None, asr_tokens=[[AsrToken(value='Do', confidence=1.0, range_start=0, range_end=2, time=None), AsrToken(value='we', confidence=1.0, range_start=3, range_end=5, time=None), AsrToken(value='have', confidence=1.0, range_start=6, range_end=10, time=None), AsrToken(value='any', confidence=1.0, range_start=11, range_end=14, time=None), AsrToken(value='meeting', confidence=1.0, range_start=15, range_end=22, time=None)]], asr_confidence=None, raw_input='do we have any meeting', wakeword_id=None, lang=None)
rhasspyserver_hermes: Publishing 211 bytes(s) to hermes/nlu/query
rhasspyserver_hermes: -> NluQuery(input='do we have any meeting', site_id='default', id='ffce67eb-0d3d-4ef8-ae54-3cdbd05c0315', intent_filter=None, session_id='ffce67eb-0d3d-4ef8-ae54-3cdbd05c0315', wakeword_id=None, lang=None)
rhasspyserver_hermes: Publishing 21 bytes(s) to rhasspy/handle/toggleOff
rhasspyserver_hermes: -> HandleToggleOff(site_id='default')
Here the intent is recognized when I type the sentence manually.
Help in any form would be appreciated.
Thanks in advance.