Training for unknown words


I was able to get a basic setup working: Rhasspy controls Home Assistant and responds to basic commands, e.g. “What is the temperature?”

I am planning to extend this so that Rhasspy can do a Wikipedia search for a spoken word, play a song, or play an internet radio station.

Option 1) Have a Python script subscribe to the MQTT event XXXX.
Have the following in sentences.ini for the same intent XXXX:
find (apple | microsoft){title} in wiki
find wiki for (apple | microsoft){title}

When the voice command is “find apple in wiki”, Rhasspy does the voice-to-text and text-to-intent conversion correctly; the Python script gets the intent, performs the Wikipedia search, and plays the result back via text-to-speech.
But this doesn’t work when any word other than apple or microsoft is spoken.
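For concreteness, the slot-extraction part of the Option 1 handler could look like the sketch below. It is stdlib-only; in practice this function would be called from an MQTT client callback (e.g. paho-mqtt) subscribed to the intent topic. The topic name and payload shape here just mirror the DialogueManager event format shown later in this thread and are assumptions, not a definitive API.

```python
import json

def extract_slot(payload: bytes, slot: str):
    """Pull a named slot value out of a Rhasspy intent JSON payload.

    Assumes the payload carries an "entities" list of
    {"entity": ..., "value": ...} dicts, as seen in the
    DialogueManager debug log in this thread.
    """
    event = json.loads(payload)
    for entity in event.get("entities", []):
        if entity.get("entity") == slot:
            return entity.get("value")
    return None

# In a real handler this would run inside an MQTT on_message callback
# subscribed to the intent topic; here we just feed it a payload
# shaped like the intent event.
example = json.dumps({
    "text": "find apple in wiki",
    "intent": {"name": "XXXX", "confidence": 1.0},
    "entities": [{"entity": "title", "value": "apple"}],
}).encode()

print(extract_slot(example, "title"))  # prints "apple"
```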

Is there a way to write sentences.ini so that intent handling still works when a spoken word is not listed in it?

Option 2) Have the script subscribe to the MQTT topic hermes/nlu/intentNotRecognized. In this case the MQTT event is published, but the speech-to-text conversion is not very accurate. Most of the time the words after “find wiki for” are not transcribed correctly, so the Wikipedia search does not work.
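The Option 2 handler only needs the raw transcription from the intentNotRecognized payload. A minimal sketch, assuming a Hermes-style payload that carries the transcription in an "input" field (the field name and the prefix-stripping approach are assumptions to verify against your Rhasspy version):

```python
import json

PREFIX = "find wiki for "  # must match the spoken command

def wiki_query(payload: bytes):
    """Extract the search term from an intentNotRecognized event.

    Assumes the payload carries the transcription in an "input"
    field; returns None when the text doesn't start with the
    expected prefix.
    """
    text = json.loads(payload).get("input", "")
    if text.startswith(PREFIX):
        return text[len(PREFIX):]
    return None

example = json.dumps(
    {"input": "find wiki for bill gates", "siteId": "default"}
).encode()

print(wiki_query(example))  # prints "bill gates"
```

Of course, this only helps if the transcription after the prefix is accurate in the first place, which is exactly the problem described above.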

Just decided to try your idea…
For better results I’m using Kaldi with open transcription mode enabled to recognize arbitrary words.
But what I’ve found is that in the event, _text and _raw_text are the same…

The first line in my log shows the full sentence please turn my desk lamp on, but in the event _raw_text contains only the words from sentences.ini, ignoring the unknown words please, desk and my: 'raw_text': 'lamp on'… weird.

[DEBUG:950613] DialogueManager: {'text': 'lamp on', 'intent': {'name': 'TurnOn', 'confidence': 0.9}, 'entities': [{'entity': 'device', 'value': 'lamp', 'raw_value': 'lamp', 'start': 0, 'raw_start': 0, 'end': 4, 'raw_end': 4}], 'raw_text': 'lamp on', 'tokens': ['lamp', 'on'], 'raw_tokens': ['lamp', 'on'], 'speech_confidence': 1, 'wakeId': 'snowboy/snowboy.umdl', 'siteId': 'default'}
[INFO:950029] quart.serving: GET / 1.1 200 1029 92220
[DEBUG:946375] DialogueManager: decoding -> recognizing
[DEBUG:946295] DialogueManager: please turn my desk lamp on (confidence=1)
[DEBUG:946181] KaldiDecoder: please turn my desk lamp on

Does anyone know whether this is a bug or expected behavior?

According to the documentation, it looks like a bug.

But can someone confirm that I haven’t missed anything? If so, I will open an issue on GitHub.

Some more details:

find wiki for (apple | microsoft){title}

When the voice command is “find wiki for microsoft”, the MQTT event rhasspy/intent/GetWikiSearch is triggered with the expected intent.
When the voice command is “find wiki for bill gates”, the MQTT event hermes/nlu/intentNotRecognized is raised, but the event data is not useful.

Adding Bill Gates to sentences.ini produces the expected results.

What speech-to-text engine do you use?
If Kaldi, is your mic sensitive enough?

I tried both PocketSphinx and Kaldi. The mic is a Matrix Creator card.

Hmmm… If Kaldi can’t recognize words, we are out of luck here :confused:
But if you record a WAV file from your mic, is the audio good? Can you hear your voice clearly?

*Opened an issue on GitHub regarding _raw_text.