Is it possible to use Google Speech-to-Text in Rhasspy?

Is it possible to use the Google Speech to text with Rhasspy?
I’ve tried to add a remote Rhasspy server for speech recognition in the settings with this:
https://speech.googleapis.com/v1/speech:recognize?key=[MY_API_CREDENTIAL_KEY] but it does not work.
Am I doing it wrong or is Rhasspy not compatible with Google STT?

Rhasspy currently does not work very well with Swedish, which is why I want to use Google STT with Rhasspy to make custom voice commands. There is currently no way to use custom voice commands in Swedish with Google Home, IFTTT or other services.

Yes, I think it should be possible, but not the way you are trying it.

First, set the Speech To Text to “command”:
https://rhasspy.readthedocs.io/en/latest/speech-to-text/#command

Create a program (see the speech2text.sh example at the link above).

Follow the documentation of Google Speech To Text here:
https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize

Rhasspy uses 16-bit 16 kHz mono audio, so you have to configure that in the RecognitionConfig and AudioEncoding.
The program should process the SpeechRecognitionResult and print the text transcription to standard out.

This is the path I would follow if I wanted to do this, but I have not tried it.

The remote Rhasspy server setting does not work here, because you need to specify the audio settings.
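To make the steps above more concrete, here is a minimal, untested-against-Google sketch of such a program in Python. It assumes Rhasspy passes the WAV data on standard in (as the linked "command" documentation describes) and that an API key is used as in the URL from the question. The environment variable name `GOOGLE_API_KEY`, the default language `sv-SE` and the function names are my own choices for illustration, not anything Rhasspy or Google prescribes.

```python
#!/usr/bin/env python3
"""Hypothetical Rhasspy "command" STT program: WAV on stdin, text on stdout."""
import base64
import io
import json
import os
import sys
import urllib.request
import wave

API_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_request(wav_bytes, language_code="sv-SE"):
    """Build the JSON body (RecognitionConfig + RecognitionAudio)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        sample_rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())  # raw PCM, WAV header removed
    return {
        "config": {
            "encoding": "LINEAR16",           # 16-bit signed PCM
            "sampleRateHertz": sample_rate,   # 16000 for Rhasspy audio
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(frames).decode("ascii")},
    }


def extract_transcript(response):
    """Join the top alternative of every SpeechRecognitionResult."""
    parts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            parts.append(alternatives[0].get("transcript", "").strip())
    return " ".join(part for part in parts if part)


def main():
    api_key = os.environ.get("GOOGLE_API_KEY")  # assumed variable name
    if not api_key:
        print("GOOGLE_API_KEY is not set", file=sys.stderr)
        return
    body = json.dumps(build_request(sys.stdin.buffer.read())).encode("utf-8")
    request = urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        print(extract_transcript(json.load(reply)))


if __name__ == "__main__":
    main()
```

Only the transcription goes to standard out; the error message goes to standard error so it is not mistaken for recognized text.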

Thanks for helping me, I appreciate it. I’ve downloaded speech2text.sh and set Speech To Text to “command”, pointing to the file. But I’m not sure how to configure RecognitionConfig and AudioEncoding in the program. Should I just copy the JSON containing RecognitionConfig and AudioEncoding into the speech2text.sh?

No, you cannot use speech2text.sh as it is. That is just an example program.
You need to write one yourself, assuming you can.

Oh. I’m not really good with coding, so that won’t work…
Well, thanks anyway.
If you or anyone else could make a custom speech2text.sh that works with Google STT and Rhasspy, I would really appreciate it. It would mean a lot.

Hey @solid!

I’m a little bit late but I created what you wished for. I did not test it extensively but you might want to give it a try: https://github.com/NullEnt1ty/GCloudSpeech

Make sure that you’re providing valid authentication credentials in order to use the Google Cloud API. Head over to https://cloud.google.com/docs/authentication/getting-started for more information.

Hit me up if you find any issues! :slight_smile:

Thank you so much for this. I’ll give it a try!

FYI a PR recently made its way into Rhasspy 2.4. Unfortunately it hasn’t been ported to 2.5 yet, so I guess that’s up to me :slight_smile: I’ll need it sooner or later anyway so…

Yeah I saw that. Guess that makes my solution obsolete now :smile:

Thanks for your contribution for 2.5 btw!

Hi.
Sorry for reviving this thread. I would like to use Google for speech to text. I think the simplest solution for me is to copy the Python code from the Google example (I have the credentials, a virtual environment…), but I don’t know which file I have to send for transcription (the .wav where Rhasspy saves the sound).

Could you help me?

Thanks for your contribution :wink:

Thanks.

I finally managed to send the audio to Google, but I think Rhasspy is truncating my audio. I explain the problem in a new thread: Truncated text from Google Speech to Text - Help - Rhasspy Voice Assistant

Thanks.

Hello,

I have installed Rhasspy on an RPi in Docker.

I want to change STT to Google STT, i.e. I want to call Google from my own Python script.
The problem is that I don’t understand how to correctly write the path where my script is stored.
My Python script is stored in: /home/mirek/.config/rhasspy/profiles/en/stt.py

I tried putting exactly this path in the settings, but it does not work.

Can you please help me with this?

Thank you,
Mirek

Hi.

You always have to look at the paths inside the container. Inside the container the profiles path isn’t .config/rhasspy/profiles, it’s /profiles, so the path you must put in the Rhasspy configuration should be /profiles/en/stt.py :wink:
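If it helps to see the mapping spelled out, here is a tiny sketch in Python. It assumes the host profiles directory from the question is the one mounted at /profiles in the container; the function name `to_container_path` is just made up for this example.

```python
from pathlib import PurePosixPath


def to_container_path(host_path, host_root, container_root="/profiles"):
    """Rewrite a path under the host profiles directory to its container path."""
    # relative_to raises ValueError if host_path is not below host_root
    relative = PurePosixPath(host_path).relative_to(host_root)
    return str(PurePosixPath(container_root) / relative)


print(to_container_path(
    "/home/mirek/.config/rhasspy/profiles/en/stt.py",
    "/home/mirek/.config/rhasspy/profiles",  # assumed host side of the mount
))
# prints /profiles/en/stt.py
```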

See you soon.

Hi,
thank you, it works!

I’m not familiar with Linux. Unfortunately, I’m not able to solve the next problem.
I followed this repository: GitHub - NullEnt1ty/GCloudSpeech: Transcribe voice data to text using Google Cloud Speech-to-Text. It has an example of how to test whether the repository works:

$ cat podcast.wav | ./run.sh --language en-US
For my own recorded WAV and my language I used:
$ cat zvuk.wav | ./run.sh --language cs-CZ

Outside of Docker it works. The result is the sentence: Ahoj jak se máš Já se mám dobře Dal bych si pivo (marked with a red 1 in the picture)

But when I switch to Docker and run the command, it does not work: ModuleNotFoundError: No module named “google” (marked with a green 2 in the picture)

Can you please help me run this repository in the Rhasspy Docker container? I have stored all the repository files in /profiles/en/.

Thank you,
Mirek

Hi Miroslav_22
That project is archived and no longer maintained. I’m sure it worked 3 years ago, but now I found some errors, especially in the Python environment. I uploaded my own version (works for me): GitHub - naudor/rhasspy_Google_STT: Rhasspy with a addon for use Google Speech to Text.
It’s my first Docker image on GitHub, so maybe I have forgotten something or done something wrong. It’s an image for arm64, but if you are working on another system you only need to change the Dockerfile to refer to another Rhasspy Docker image.
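To see whether the Python environment inside the container is really what is missing, a quick check like the sketch below can be run with the container's own interpreter. The module name `google` comes from the error message; `google.cloud.speech` is only my assumption about which package the script needs, and `is_importable` is a made-up helper name.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if this interpreter can locate the given module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # a missing parent package raises instead of returning None
        return False


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    for name in ("google", "google.cloud.speech"):  # second name is an assumption
        print(name, "->", "found" if is_importable(name) else "missing")
```

If the modules are found on the host but reported missing inside the container, the packages were installed for a different interpreter than the one the container uses.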

I must warn you that I only found one “easy” way to work with Rhasspy. I tried some microphones, but it was impossible to set them up well in Rhasspy, because they were of bad quality and Rhasspy could not tell when the voice command had finished (maybe it would improve with an omnidirectional microphone). If you get better results, please share the configuration parameters ;-).

In the end, the only solution was to install the Rhasspy Android client on my cellphone. This works perfectly, better than Google Home for example, but you need to open the app, click once to start talking and click again to end the voice command.

Best regards.

Regarding the mic, I have had a good experience with the Jabra 510. It works with Rhasspy and Mycroft. Sensitivity is very good for a room of 40 m².

I’m sure the Jabra works very well, but it’s too expensive for putting a satellite in several rooms (Jabra + Raspberry). Thanks for sharing.

Have you tried the Docker image?

Best regards.