STT speeds on local Pi setup

I just finished setting up my environment, including Home Assistant, Pocketsphinx, and a deCONZ interface. However, the Pocketsphinx decoder's decoding times are around three to five seconds, while everything else takes milliseconds. Terminating Home Assistant and deCONZ on the Pi doesn't seem to change much.
I know I could offload these computations to a home server for a speedup, but since I'm trying to do everything on one Pi, I was wondering whether there are any ways to speed up the process.

Are my times average or unusually high? What are your computation times?
Running on a Pi 3B+ btw…

In my experience, I get similar times with a Raspberry Pi 3B+, but it could depend on the microphone, environment noise, and the size of the WAV recording (more seconds, more size).
I am trying two different setups:

  • ReSpeaker v1 (low cost) microphone: 2–3 seconds
  • ReSpeaker v2 (optimized) microphone: 3–5 seconds (probably slowed by the built-in algorithms or USB)

I get similar times with a Raspberry Pi 2, but I will try with a Raspberry Pi 4.

It often happens that I stop speaking but the recording is still active because of environment noise. To resolve this, I tried decreasing the timeout in the “command” section.
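For anyone looking for that setting: in Rhasspy the silence-detection options live under the “command” section of profile.json. A hedged example is below; the key names (vad_mode, silence_sec, timeout_sec) are my recollection of the webrtcvad settings and the values are illustrative, so check them against your Rhasspy version's profile documentation:

```json
{
  "command": {
    "system": "webrtcvad",
    "webrtcvad": {
      "vad_mode": 3,
      "silence_sec": 0.5,
      "timeout_sec": 15
    }
  }
}
```

A higher vad_mode makes the voice activity detector more aggressive about classifying audio as non-speech, which can help end recordings sooner in noisy rooms.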

Suggestions are appreciated :slight_smile:

PS: for a fair comparison we should specify our environments; I am using Docker.

I get the same with my RPi 3B. It takes three to four seconds…

I’m on an RPi 3B running Hass.io. As a mic I’m using the PS3 camera. Decode times are 1.5–2.5 seconds.

Regarding decoding time: are you using Kaldi? Using a smaller acoustic model (TDNN-250 instead of TDNN-F) reduced decoding times for me without any impact on accuracy.

@synesthesiam Maybe the smaller acoustic model could be provided as the default?

Online decoding should also help to speed things up when Rhasspy supports it natively.

How do I change the acoustic model?

On my RPi 3B+ with Kaldi I have to wait 4 seconds…
But here is some good news :nerd_face:

Until @synesthesiam updates the models provided for the Kaldi profiles with smaller ones (as I think he will :wink: ), and if your language has such a model (English, German, and French do), you can simply replace the following files in the {profile_dir}/kaldi/model/model folder with the TDNN-250 model files:

  • cmvn_opts
  • den.fst
  • final.mdl
  • normalization.fst
  • tree

The models should be available here:

Hope this helps.
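The swap described above can be sketched as a small script. The profile and model paths are placeholders you must point at your own setup, and the backup step is my own addition (not part of the original instructions), so the swap can be undone:

```python
import shutil
from pathlib import Path

# Files to replace with their TDNN-250 counterparts (list from the post above).
MODEL_FILES = ["cmvn_opts", "den.fst", "final.mdl", "normalization.fst", "tree"]


def swap_model(profile_dir: Path, model_src: Path) -> None:
    """Copy TDNN-250 files over the profile's current acoustic model.

    profile_dir: the Rhasspy profile folder (placeholder; adjust for your setup)
    model_src:   folder holding the unpacked TDNN-250 files (assumed flat layout)
    """
    dest = profile_dir / "kaldi" / "model" / "model"
    for name in MODEL_FILES:
        target = dest / name
        backup = target.parent / (target.name + ".bak")
        # One-time backup of the original TDNN-F file (my addition).
        if target.exists() and not backup.exists():
            shutil.copy2(target, backup)
        shutil.copy2(model_src / name, target)
```

After swapping, retrain the profile in the Rhasspy UI to be safe (the post above suggests a retrain may be needed).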


It does. Thanks a lot, @fastjack!

Maybe you could create a new topic showing how to install Kaldi and set it up in Rhasspy?
I'm not sure I have it all :wink:

There is not much to do, I'm afraid… Once your profile is set up using the Kaldi Rhasspy profile, simply download the model from the link and swap the mentioned files. No config changes are necessary (maybe a retrain, though?).

I'm pretty sure @synesthesiam will include these models natively in the near future, as the narrow, specific language model trained by Rhasspy mitigates the “lightness” of the acoustic model.

To install Kaldi, you just do that, then?
https://kaldi-asr.org/doc/install.html

Ah, I see what you mean… I did not build Kaldi, as I'm using the Docker image, so Kaldi is already installed with Rhasspy. The model files are downloaded by Rhasspy into the profile folder.

If you are not using Docker, you will have to build Kaldi using the link you provided (which can be pretty complex, though, as Kaldi does not build easily on ARM).

Ah, I didn't know that! I use Docker, so I will give it a try! Thanks :beers:
Does the built-in number range (1…100) work with Kaldi?

Everything works with Kaldi :wink:
Much better than Pocketsphinx.
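For reference, Rhasspy's sentences.ini templates support number ranges directly. A minimal example follows; the intent name, optional words, and slot tag are mine, and the (1..100) range syntax is from Rhasspy's training documentation:

```ini
[SetVolume]
set [the] volume to (1..100){volume}
```

The {volume} tag puts the recognized number into the intent's slots, so it works the same way regardless of whether Pocketsphinx or Kaldi does the transcription.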


Thanks again, now I know what to do this evening :rofl:

Hi @fastjack
Thanks for sharing… I will try it.
But do I need to replace the Kaldi files after every Rhasspy update?

I'm afraid so… If you download the Kaldi profile using the web UI, you'll have to replace the files again. This is a quick fix to speed up ASR transcription.


Does Kaldi support the Italian language?

Not at the moment, it seems: https://rhasspy.readthedocs.io/en/latest/reference/