Problems with speech understanding & intent detection

Mondmonarch · April 7, 2020, 7:34pm

Hello everyone,

Somehow Rhasspy has a hard time understanding my commands. I’m using the English version because I feel it works better than the German one (even though I’m not a native English speaker).

1.) When I speak a command like “open blind one”, Rhasspy outputs the following via MQTT:

rhasspy/en/transition/SnowboyWakeListener listening
rhasspy/en/transition/ARecordAudioRecorder recording
rhasspy/en/transition/SnowboyWakeListener loaded
rhasspy/en/transition/ARecordAudioRecorder started
rhasspy/en/transition/WebrtcvadCommandListener listening
rhasspy/en/transition/ARecordAudioRecorder recording
rhasspy/en/transition/WebrtcvadCommandListener loaded
rhasspy/en/transition/ARecordAudioRecorder started
rhasspy/speech-to-text/transcription on blind one
hermes/asr/textCaptured {"siteId": "default", "text": "on blind one", "likelihood": 1, "seconds": 0}
hermes/nlu/intentNotRecognized {"sessionId": "", "input": ""}
rhasspy/en/transition/SnowboyWakeListener listening
rhasspy/en/transition/ARecordAudioRecorder recording

Instead of the command, it understands something like “on blind one”. This strange detection also happens with other commands I’ve recently added. The old commands for my lights and so on still work properly. I also added the words “blind”, “open”, etc. as custom words. Sometimes the correct sentence even shows up in hermes/asr/textCaptured {"siteId": "default", "text": but the result is still intentNotRecognized: hermes/nlu/intentNotRecognized {"sessionId": "", "input": ""}
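To make the mismatch easier to spot, here is a rough sketch (Python, standard library only) that scans captured MQTT lines and reports every transcription that showed up in hermes/asr/textCaptured but was followed by a hermes/nlu/intentNotRecognized message with an empty "input". The function name find_lost_transcriptions is made up for this illustration, and it assumes each line has the form "topic payload" as in the dump above. It only surfaces the symptom; it says nothing about the cause.

```python
import json


def find_lost_transcriptions(lines):
    """Return transcriptions that were captured by ASR but followed by an
    intentNotRecognized message whose "input" field is empty.

    Hypothetical helper: assumes each line is "<topic> <payload>".
    """
    lost = []
    last_text = None
    for line in lines:
        topic, _, payload = line.strip().partition(" ")
        if topic == "hermes/asr/textCaptured":
            # Remember the most recent transcription
            last_text = json.loads(payload).get("text", "")
        elif topic == "hermes/nlu/intentNotRecognized":
            nlu_input = json.loads(payload).get("input", "")
            if last_text and not nlu_input:
                lost.append(last_text)
            last_text = None
    return lost


# Lines copied from the MQTT output above
sample = [
    'rhasspy/speech-to-text/transcription on blind one',
    'hermes/asr/textCaptured {"siteId": "default", "text": "on blind one", "likelihood": 1, "seconds": 0}',
    'hermes/nlu/intentNotRecognized {"sessionId": "", "input": ""}',
]
print(find_lost_transcriptions(sample))
```

Lines on other topics (the rhasspy/en/transition/... messages) are simply skipped, so the whole dump can be pasted in unchanged.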

The log output (newest entries first):

[DEBUG:167636427] ARecordAudioRecorder: Recording from microphone (arecord)
[DEBUG:167636424] ARecordAudioRecorder: ['arecord', '-q', '-r', '16000', '-f', 'S16_LE', '-c', '1', '-t', 'raw', '-D', 'hw:CARD=MATRIXIOSOUND,DEV=0']
[DEBUG:167636422] ARecordAudioRecorder: started -> recording
[DEBUG:167636421] SnowboyWakeListener: loaded -> listening
[DEBUG:167636419] DialogueManager: ready -> asleep
[INFO:167636417] DialogueManager: Automatically listening for wake word
[DEBUG:167636413] DialogueManager: handling -> ready
[DEBUG:167636412] HermesMqtt: Published intent to hermes/nlu/intentNotRecognized
[DEBUG:167636411] WebSocketObserver: {"text": "", "intent": {"name": "", "confidence": 0}, "entities": [], "speech_confidence": 1, "slots": {}}
[DEBUG:167636410] DialogueManager: recognizing -> handling
[DEBUG:167636409] DialogueManager: {'text': '', 'intent': {'name': '', 'confidence': 0}, 'entities': [], 'speech_confidence': 1}
[ERROR:167636403] FsticuffsRecognizer: in_loaded
Traceback (most recent call last):
  File "/usr/share/rhasspy/rhasspy/intent.py", line 208, in in_loaded
    assert recognitions, "No intent recognized"
AssertionError: No intent recognized
[DEBUG:167636398] DialogueManager: decoding -> recognizing
[DEBUG:167636397] DialogueManager: on blind one (confidence=1)
[DEBUG:167630436] KaldiDecoder: ['bash', '/profiles/en/kaldi/model/decode.sh', '/opt/kaldi', '/profiles/en/kaldi/model', '/profiles/en/kaldi/model/graph', '/tmp/tmp3suswlvy.wav']
[DEBUG:167630386] ARecordAudioRecorder: Stopped recording from microphone (arecord)
[DEBUG:167630380] ARecordAudioRecorder: recording -> started
[DEBUG:167630351] APlayAudioPlayer: ['aplay', '-q', '-D', 'sysdefault:CARD=Device', '/profiles/en/sounds/end_of_input.wav']
[DEBUG:167630348] DialogueManager: awake -> decoding
[DEBUG:167630345] WebrtcvadCommandListener: listening -> loaded
[DEBUG:167630344] WebrtcvadCommandListener: Voice command finished
[DEBUG:167628333] WebrtcvadCommandListener: Voice command started

This error does not always appear. Does anyone have a clue how I can fix it?
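In case it helps with reading the log: the numbers in brackets decrease from top to bottom, so the entries appear to be listed newest first. Below is a small sketch (Python, standard library only) that flips such a log into chronological order while keeping multi-line entries, like the traceback, attached to the line they belong to. The name chronological and the regular expression for the "[LEVEL:number]" prefix are my own assumptions based on the lines above, not anything provided by Rhasspy.

```python
import re

# Assumed prefix format, taken from the log lines above: "[DEBUG:167636427] ..."
ENTRY_START = re.compile(r"^\[[A-Z]+:\d+\] ")


def chronological(lines):
    """Reverse a newest-first log, keeping continuation lines
    (e.g. a traceback) together with the entry that precedes them."""
    entries = []
    for line in lines:
        if ENTRY_START.match(line) or not entries:
            entries.append([line])      # a new log entry starts here
        else:
            entries[-1].append(line)    # continuation of the previous entry
    entries.reverse()
    return [line for entry in entries for line in entry]


# A few lines copied from the log above (newest first)
sample_log = [
    "[DEBUG:167636410] DialogueManager: recognizing -> handling",
    "[ERROR:167636403] FsticuffsRecognizer: in_loaded",
    "Traceback (most recent call last):",
    "AssertionError: No intent recognized",
    "[DEBUG:167636398] DialogueManager: decoding -> recognizing",
]
for line in chronological(sample_log):
    print(line)
```

Read in that order, the excerpt goes: decoding -> recognizing, then the "No intent recognized" error, then recognizing -> handling.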

2.) In general: is there any way to improve the recognition of spoken language?

3.) Because of the poor recognition rate, I have tended to write out complete sentences instead of using slots. Does this make any difference to Rhasspy? (Maybe it just felt like it was doing better :D)

My setup:
WakeWord: Snowboy
Voice Detection: webrtcvad
Speech Recognition: kaldi
Intent Recognition: OpenFST
TTS: pico-tts

Thank you and stay healthy