Almost there but just can't deploy :(

I’ve been playing with Rhasspy on a Pi 3B+ and 2-mic HAT for a month or so now. I wanted to use it to replace an Alexa so we get offline voice assist.

It seems that no matter what I try, I just can’t get it good enough to deploy. Take today for example: I wanted to turn the Sky box on (which also turns the TV on). The sentence is “turn sky on”; there is also a sentence “turn plex on”, which turns our media player on.

1st attempt gave no match, 2nd attempt gave the affirmative beep but then proceeded to turn plex on instead of sky :slight_smile:

No matter what I try, it seems impossible to set the ASR confidence score high enough to block false positives while still allowing it to work. It needs to be up around 0.95, but a sentence like “set a timer for 25 minutes” will always fail as it’s down around 0.55 - the actual intent is…
[SetTimer]
minutes = (1){min} minute | (2..59){min} minutes
seconds = (1){sec} second | (2..59){sec} seconds
set [a] timer for

It also cannot recognise “bedtime” and just opts for “what’s the time”, giving me the time instead of turning everything off :slight_smile:

Please don’t take this as a dig. I think Rhasspy is a brilliant project and something that really is needed today, but I just cannot get it to work well enough to deploy it and turn off Alexa. I’m not sure what else I can try next; I’m not that skilled with Pi code, so I’m more of a cut-n-paste user.

Have any of you guys got it good enough to use in the real-world???

Maybe it helps if you rephrase sky into skybox and bedtime into something with sleep?

Yeah, I can play around a bit, but I chose those phrases as that’s what we have used on Alexa for over a year, so it’s sort of stuck :slight_smile:

It does seem to default to time which is odd.

That could be because it weights between all the sentences it knows, and the timer has tons of sentences because it generates different combinations of minutes and seconds (maybe even all of them) for training. Sentences with fewer slots are therefore represented less, so it defaults to the timer if it has no idea. I had the same problem a while ago and I think I fixed it by switching the language model type for Kaldi to text FST, but that should be the default now. You can check that and see if switching helps.
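To get a feel for how lopsided that can be, here is a rough count. It is only a sketch: it assumes the timer sentence accepts minutes only, seconds only, or minutes and seconds together, and that every combination is expanded for training. Neither assumption is confirmed for this setup.

```shell
#!/bin/bash
# Alternatives per rule, from the 1..59 ranges in the [SetTimer] intent.
minutes=59
seconds=59

# Assumed sentence shapes (hypothetical): minutes only, seconds only, or both.
timer_bodies=$(( minutes + seconds + minutes * seconds ))

# The optional [a] in "set [a] timer for" doubles every sentence.
timer_sentences=$(( 2 * timer_bodies ))

echo "timer sentences: $timer_sentences"
echo "'turn sky on' sentences: 1"
```

Under those assumptions the timer intent contributes thousands of training sentences against one each for the short commands.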

Yes, it is on text FST, so I guess it defaults to that.

It never chooses the timer; its most favoured error choice is to just tell me the time :slight_smile: In fact, with ASR confidence above 0.7 or so it’s impossible to set a timer, as it scores way lower.

Well, mine defaults to the weather, but then it doesn’t know much more than that at this point.

You could try switching around the order of the intents and see if that helps, or play around with the wording of the timer or the time intent; maybe they are just too similar. You could also save recordings of yourself asking for each intent and compare them. It could be that your pronunciation of “timer” is closer to what the model expects for “time”; after all, the model wasn’t trained for you specifically. I had that issue with the default wake word “computer” that exists for Precise and had to train my own for it to understand me. To test for that, maybe use TTS as the mic input and see if it still happens, or temporarily use open transcription and see what it actually understands when it is not matching to the closest intent.

That’s a good idea, I’ll try open transcription and see what goes on.

I tried turning on open transcription. It downloaded some files, but it seems to make no difference??

Where is the transcribed text output???
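One way to see the raw transcription is to post a recorded WAV file to Rhasspy’s HTTP API and read back the text it returns. This is a minimal sketch: the base URL (localhost on port 12101) and the helper name transcribe are assumptions, so adjust them for your install.

```shell
#!/bin/bash
# Base URL of the Rhasspy web server (assumed default; change if yours differs).
RHASSPY_URL="${RHASSPY_URL:-http://localhost:12101}"

# transcribe FILE.wav
# Sends the WAV file to the speech-to-text endpoint and prints the recognised text.
transcribe() {
  curl -s -X POST \
    -H 'Content-Type: audio/wav' \
    --data-binary @"$1" \
    "${RHASSPY_URL}/api/speech-to-text"
}
```

After sourcing the file, something like transcribe bedtime.wav should print what the speech recogniser heard, before any intent matching.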

If you are experiencing poor recognition, have you recorded some audio and listened to the result? If your microphone isn’t delivering good quality audio, then you’ve failed to clear the first hurdle.
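A quick way to do that from the Pi’s shell is with arecord and aplay. This is a sketch only: the card name is borrowed from the 2-mic HAT used in this thread, the device string and the helper names are assumptions, and the recording may fail with a busy device while Rhasspy is holding the microphone.

```shell
#!/bin/bash
# ALSA card name (assumed: the 2-mic HAT; check yours with "arecord -l").
CARD="${CARD:-seeed2micvoicec}"

# record_sample FILE.wav [SECONDS]
# Records 16 kHz, 16-bit, mono audio for the given duration (default 5 s).
record_sample() {
  arecord -D "plughw:CARD=${CARD},DEV=0" -f S16_LE -r 16000 -c 1 -d "${2:-5}" "$1"
}

# play_sample FILE.wav
# Plays the recording back through the default output.
play_sample() {
  aplay "$1"
}
```

For example, record_sample test.wav 5 followed by play_sample test.wav, repeated at different distances from the mic, gives a rough idea of what the recogniser has to work with.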

It tends to sound muffled unless I’m within 1m of it. I have the 2-mic HAT but no idea if there are better mics - it’s also handy as it drives the speaker directly.

The thing with the ReSpeaker 2-mic HAT is that you really have to tune the settings to get better performance, especially to improve the far-field capabilities.
You actually have to manually turn on things like the noise gate and the automatic gain control, as they are not on by default. Afterwards, tune settings like attack, decay and gain by recording and listening back a ton.
If you google the spec sheet for the audio chip it uses, it actually tells you what does what and in which increments.

Really?

That sounds like an advanced course in one go - I have no idea how to do any of that :frowning:

You can do everything in alsamixer. Just type alsamixer into your shell, then press F6 to choose the right card and F4 to show only the capture settings.
For a starting point you can also create a bash file somewhere on your Rhasspy system, for example settings.sh, with this content:

#!/bin/bash

amixer -c "seeed2micvoicec" cset numid=1 0,0
amixer -c "seeed2micvoicec" cset numid=10 235,235
amixer -c "seeed2micvoicec" cset numid=26 3
amixer -c "seeed2micvoicec" cset numid=27 3
amixer -c "seeed2micvoicec" cset numid=28 3
amixer -c "seeed2micvoicec" cset numid=29 1
amixer -c "seeed2micvoicec" cset numid=30 0
amixer -c "seeed2micvoicec" cset numid=32 7
amixer -c "seeed2micvoicec" cset numid=33 7
amixer -c "seeed2micvoicec" cset numid=34 31
amixer -c "seeed2micvoicec" cset numid=35 on
alsactl store

and then run it with sudo bash settings.sh.
This is my current starting point for the 2-mic HAT. The script uses amixer’s cset to set some parameters directly, like turning AGC on, the noise gate, or the target gain.
When you run it, the shell output will show you what did what.
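If you want to know which control a given numid refers to before changing it, amixer can list the controls and show a single one. A small sketch, assuming the same card name as in the script above; the helper names are made up for illustration.

```shell
#!/bin/bash
# ALSA card name (assumed: same card as in settings.sh).
CARD="${CARD:-seeed2micvoicec}"

# list_controls
# Prints every control on the card with its numid and name.
list_controls() {
  amixer -c "$CARD" controls
}

# show_control NUMID
# Prints the name, value range and current value of one control.
show_control() {
  amixer -c "$CARD" cget numid="$1"
}
```

For example, show_control 35 prints the control that the script switches on, so you can match each numid to a name before tuning it.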
For some exciting reading here is the spec sheet for the chip that the 2 mic uses:

Thanks, those code snippets gave me this setting in alsamixer. I presume it worked?

I’ll do some testing later today

It’s a tiny bit better, but I downloaded a WAV or two, and in QuickTime Player on the Mac the audio is barely there, extremely quiet.

I don’t have Audacity yet as the Mac needs an OS update, so I can’t do any serious analysis, but it is very, very quiet.

In my mixer pic above, should “capture” be on zero???

If AGC is on, the capture setting will not do anything. You will have to tune ALC target gain and ALC max gain for more loudness. But even with the settings above it’s not quiet for me at all.