Hi all, I am using an external program to perform the STT because I found that selecting “open transcription mode” for DeepSpeech does not work via the Rhasspy interface. I am running into an error where a Python script launched from Rhasspy cannot find its imported modules.
So, my python script consists of this:
```python
#!/usr/bin/env python3

import deepspeech
import wave
import numpy as np
import sys

model = deepspeech.Model('deepspeech-0.9.3-models.tflite')
model.enableExternalScorer('deepspeech-0.9.3-models.scorer')
w = wave.open('temp_file.wav', 'r')
#wav = sys.stdin.read()
data16 = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
text = model.stt(data16)
print(text)
```
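The commented-out `sys.stdin` line is where I'm headed eventually, since Rhasspy delivers the WAV data to the command over stdin. Here's a rough sketch of the parsing step I have in mind, turning WAV bytes into int16 samples; it uses an in-memory WAV so it runs standalone, but with Rhasspy the bytes would come from `sys.stdin.buffer.read()`:

```python
import io
import wave
import numpy as np

def wav_bytes_to_int16(wav_bytes):
    # Parse a complete WAV byte string into 16-bit PCM samples.
    with wave.open(io.BytesIO(wav_bytes), 'rb') as w:
        return np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

# Stand-in for the bytes Rhasspy would pipe in: a tiny 16 kHz mono WAV.
buf = io.BytesIO()
with wave.open(buf, 'wb') as w:
    w.setnchannels(1)        # mono
    w.setsampwidth(2)        # 16-bit
    w.setframerate(16000)
    w.writeframes(np.zeros(160, dtype=np.int16).tobytes())

samples = wav_bytes_to_int16(buf.getvalue())
print(len(samples))
```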
I haven’t wrangled with the wav file yet, so I am just leaving in a temporary wav file that I pass in. Ignore any errors from that (errors should produce running from Rhasspy but not manually, you’ll see why below).
My profile contains this:
```json
"speech_to_text": {
    "command": {
        "program": "/profiles/en/sttpy.py"
    },
    "system": "command"
},
```
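Since `"system": "command"` makes Rhasspy execute the configured program directly, I made sure the script has a shebang line and the executable bit set. A quick throwaway demo of what "directly executable" means here (this uses a temp file, not my real script path):

```shell
# A script run directly (the way Rhasspy runs the configured "program")
# needs a shebang line and the executable bit:
tmp=$(mktemp)
printf '#!/usr/bin/env python3\nprint("ok")\n' > "$tmp"
chmod +x "$tmp"
"$tmp"            # executes via the shebang interpreter
rm -f "$tmp"
```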
When I run this via Rhasspy (taken from `docker logs rhasspy`):
```
[DEBUG:2022-01-06 11:09:04,374] rhasspyremote_http_hermes: Traceback (most recent call last):
  File "/profiles/en/sttpy.py", line 8, in <module>
    model = deepspeech.Model('deepspeech-0.9.3-models.tflite')
AttributeError: module 'deepspeech' has no attribute 'Model'
```
Because I am passing in a file, I expect an error, but not this one.
When I run this locally:
```
pi@raspberrypi:~/.config/rhasspy/profiles/en $ ./sttpy.py < temp_file.wav
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
why should one halt on the way
pi@raspberrypi:~/.config/rhasspy/profiles/en $ python3
Python 3.7.3 (default, Jan 22 2021, 20:04:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import deepspeech
>>> model = deepspeech.Model('deepspeech-0.9.3-models.tflite')
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
```
I hit a similar error in my intent recognition program, which previously worked: it imports a module called `text2digits` and used to run fine, but it now gives a `ModuleNotFoundError`. Which is super weird.
Anyone have any thoughts as to why running something from Rhasspy won’t recognize a python module?
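In case it helps narrow things down, the quickest check I know of when the same import works in one place and fails in another is to dump the interpreter and module search path from inside each environment and compare them:

```python
# If two environments resolve imports differently, they are usually
# running different interpreters or looking at different site-packages.
import sys

print(sys.executable)   # which python binary is actually running
for p in sys.path:      # where that interpreter searches for modules
    print(p)
```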
EDIT
I think this may be a problem with how Rhasspy invokes these files? I also tried a shell script and got errors there too:
```
pi@raspberrypi:~/.config/rhasspy/profiles/en $ cat stt.sh
#!/bin/bash
# WAV data is available via STDIN
wav_file=$(mktemp)
trap "rm -f $wav_file" EXIT
cat | sox -t wav - -r 16000 -e signed-integer -b 16 -c 1 -t wav - > "$wav_file"
deepspeech --model "deepspeech-0.9.3-models.tflite" --scorer "deepspeech-0.9.3-models.scorer" --audio "$wav_file"
```
```
pi@raspberrypi:~/.config/rhasspy/profiles/en $ docker logs rhasspy
[DEBUG:2022-01-06 11:39:28,516] rhasspyremote_http_hermes: ['/profiles/en/stt.sh']
[DEBUG:2022-01-06 11:39:28,678] rhasspyremote_http_hermes: /profiles/en/stt.sh: line 8: deepspeech: command not found
```
```
pi@raspberrypi:~/.config/rhasspy/profiles/en $ deepspeech --model deepspeech-0.9.3-models.tflite --scorer deepspeech-0.9.3-models.scorer --audio temp_file.wav
Loading model from file deepspeech-0.9.3-models.tflite
TensorFlow: v2.3.0-6-g23ad988
DeepSpeech: v0.9.3-0-gf2e9c85
Loaded model in 0.0717s.
Loading scorer from files deepspeech-0.9.3-models.scorer
Loaded scorer in 0.0163s.
Running inference.
why should one halt on the way
Inference took 9.948s for 2.735s audio file.
```
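Since Rhasspy runs in its Docker container while all my manual tests run on the host, I'm starting to suspect the script executes with the container's environment rather than the Pi's. These are the commands I plan to use to compare the two (container name `rhasspy` as in the logs; the `docker exec` lines are commented since they only make sense on a machine with that container running):

```shell
# On the host: which python3 and deepspeech my manual runs are using.
python3 -c 'import sys; print(sys.executable)'
command -v deepspeech || echo "deepspeech not on PATH"

# Inside the container (assumes a running container named "rhasspy"):
# docker exec rhasspy python3 -c 'import deepspeech; print(deepspeech.__file__)'
# docker exec rhasspy sh -c 'command -v deepspeech'
```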
All in all, I’m confused as to why this isn’t working.