Is it possible to use Google Speech-to-Text in Rhasspy?

Is it possible to use Google Speech-to-Text with Rhasspy?
I’ve tried to add a remote Rhasspy server for speech recognition in the settings with this:
https://speech.googleapis.com/v1/speech:recognize?key=[MY_API_CREDENTIAL_KEY] but it does not work.
Am I doing it wrong or is Rhasspy not compatible with Google STT?

Rhasspy currently does not work very well with Swedish, which is why I want to use Google STT with Rhasspy to make custom voice commands. There is currently no way to use custom voice commands in Swedish with Google Home, IFTTT or other services.

Yes, I think it should be possible, but not the way you are trying it.

First, set the Speech To Text to “command”:

Create a program (find speech2text.sh in the link above)

Follow the documentation of Google Speech To Text here:

Rhasspy uses 16-bit, 16 kHz mono audio, so you have to configure that in the RecognitionConfig and AudioEncoding.
The program should process the SpeechRecognitionResult and print the text transcription to standard out.
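
For anyone who wants to try this route, here is a rough, untested sketch in Python of what such a program could look like. It assumes that Rhasspy pipes the recorded WAV to the program on standard input, that the `speech:recognize` REST endpoint is called with an API key (as in the URL from the first post), and that the key is read from an environment variable named `GOOGLE_API_KEY`, which is a name made up for this example. `sv-SE` is the language code for Swedish.

```python
import base64
import io
import json
import os
import sys
import urllib.request
import wave

API_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_request_body(wav_bytes, language_code="sv-SE"):
    """Turn WAV bytes into the JSON body for speech:recognize."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Rhasspy is expected to deliver 16-bit mono audio.
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected 16-bit mono audio")
        sample_rate = wav.getframerate()
        pcm = wav.readframes(wav.getnframes())  # raw samples, no WAV header
    return {
        "config": {
            "encoding": "LINEAR16",          # AudioEncoding for 16-bit PCM
            "sampleRateHertz": sample_rate,  # 16000 for Rhasspy audio
            "languageCode": language_code,
        },
        "audio": {"content": base64.b64encode(pcm).decode("ascii")},
    }


def extract_transcript(response):
    """Join the top alternative of every SpeechRecognitionResult."""
    parts = [
        result["alternatives"][0]["transcript"].strip()
        for result in response.get("results", [])
        if result.get("alternatives")
    ]
    return " ".join(parts)


def recognize(wav_bytes, api_key, language_code="sv-SE"):
    """POST the audio to Google and return the transcription text."""
    body = json.dumps(build_request_body(wav_bytes, language_code))
    request = urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_transcript(json.load(reply))


def main():
    # GOOGLE_API_KEY is a hypothetical variable name chosen for this sketch.
    api_key = os.environ["GOOGLE_API_KEY"]
    wav_bytes = sys.stdin.buffer.read()  # WAV audio piped in by Rhasspy
    print(recognize(wav_bytes, api_key))
```

To use it as the “command” program, you would still have to call `main()` at the end of the file and make the script executable. The WAV header is stripped with the `wave` module so that only raw samples are sent, which matches the `LINEAR16` encoding.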

This is the path I would follow if I wanted to do this, but have not tried it.

The remote Rhasspy server setting does not work here, because the Google API also needs you to specify the audio settings in the request.

Thanks for helping me, I appreciate it. I’ve downloaded speech2text.sh and set Speech To Text to “command”, pointing to the file. But I’m not sure how to configure RecognitionConfig and AudioEncoding in the program. Should I just copy the JSON containing RecognitionConfig and AudioEncoding into the speech2text.sh?

No, you cannot use speech2text.sh as it is. That is just an example program.
You need to write one yourself, assuming you can.

Oh. I’m not really good at coding, so that won’t work…
Well, thanks anyway.
If you or anyone else could make a custom speech2text.sh that works with Google STT and Rhasspy, I would really appreciate it. It would mean a lot.

Hey @solid!

I’m a little bit late but I created what you wished for. I did not test it extensively but you might want to give it a try: https://github.com/NullEnt1ty/GCloudSpeech

Make sure that you’re providing valid authentication credentials in order to use the Google Cloud API. Head over to https://cloud.google.com/docs/authentication/getting-started for more information.
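
If you are unsure whether your credentials are set up, a quick check like the one below can help. It is only a sketch and assumes the usual Google Cloud setup, where the environment variable `GOOGLE_APPLICATION_CREDENTIALS` points to a service account key file in JSON format; whether that matches your setup is an assumption. The function name `check_credentials` is made up for this example.

```python
import json
import os


def check_credentials(env=os.environ):
    """Return the service account e-mail if the key file looks usable."""
    path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    with open(path, encoding="utf-8") as handle:
        key = json.load(handle)
    # Fields that a service account key file normally contains.
    required = ("type", "client_email", "private_key")
    missing = [field for field in required if field not in key]
    if missing:
        raise RuntimeError(f"key file is missing fields: {', '.join(missing)}")
    return key["client_email"]
```

Calling `check_credentials()` before starting Rhasspy raises an error with a readable message if the variable or the key file is wrong, instead of failing later during recognition.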

Hit me up if you find any issues! :slight_smile:

Thank you so much for this. I’ll give it a try!

FYI a PR recently made its way into Rhasspy 2.4. Unfortunately it hasn’t been ported to 2.5 yet, so I guess that’s up to me :slight_smile: I’ll need it sooner or later anyway so…

Yeah I saw that. Guess that makes my solution obsolete now :smile:

Thanks for your contribution for 2.5 btw!
