Hello,
I would like to save a recording of a spoken command to a file and access it later. I noticed I can play the last spoken command by clicking the play button in the web interface.
Is there a way to save that to a file and access it? Does Rhasspy already save it anywhere? I don’t mind changing a script somewhere, but I cannot find where the audio is acquired in the first place.
I am using Rhasspy 2.5.0 on a Raspberry Pi 4 with the latest Raspberry Pi OS Lite, and I am triggering the recognition remotely using the HTTP API endpoints api/start-recording and api/stop-recording.
Any help would be greatly appreciated!
If you look at https://github.com/rhasspy/rhasspy-server-hermes/blob/master/rhasspyserver_hermes/main.py you can see in line 1527 that it plays the last captured audio from memory. You could change that script and add a new endpoint to download the audio file.
But I think you can also get the audio through MQTT (not sure).
I’ll be extending the /api/play-recording endpoint to handle this soon. Right now, you POST to it to get the last recorded command to play. With the next version, a GET will pull the WAV data.
Another option, as @DanielW suggested, would be to listen for the rhasspy/asr/<siteId>/<sessionId>/audioCaptured MQTT message.
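For the MQTT route, here is a minimal sketch of the file-saving half only. It assumes the message payload is the captured audio as WAV data, which is worth checking against the Rhasspy documentation. The function name save_audio_captured and the <siteId>_<sessionId>.wav naming scheme are made up for illustration.

```python
import re
from pathlib import Path
from typing import Optional

# Topic layout quoted above: rhasspy/asr/<siteId>/<sessionId>/audioCaptured
_AUDIO_CAPTURED = re.compile(r"^rhasspy/asr/([^/]+)/([^/]+)/audioCaptured$")


def save_audio_captured(topic: str, payload: bytes, out_dir: str) -> Optional[Path]:
    """Write payload to <out_dir>/<siteId>_<sessionId>.wav if the topic matches.

    Returns the written path, or None when the topic is some other message.
    """
    match = _AUDIO_CAPTURED.match(topic)
    if match is None:
        return None
    site_id, session_id = match.groups()
    out_path = Path(out_dir) / f"{site_id}_{session_id}.wav"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: the payload is the WAV file content, so it is written unchanged.
    out_path.write_bytes(payload)
    return out_path
```

With an MQTT client library such as paho-mqtt, you would subscribe to rhasspy/asr/+/+/audioCaptured and call this function from the message callback with the message's topic and payload.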
@DanielW thank you so much for the pointer, that’s the section of the code I was looking for.
@synesthesiam that sounds great! It would be exactly what I need! Right before you posted, I was thinking of doing something along these lines (please forgive my terrible coding skills):
# The file name is passed as a query parameter (e.g. ?filename=_MyFileName.wav)
@app.route("/api/save-recording", methods=["GET"])
async def api_save_recording():
    assert core is not None
    _filename = str(request.args.get("filename"))
    if core.last_audio_captured:
        wav_bytes = core.last_audio_captured.wav_bytes
        _LOGGER.debug("Saving audio file, size: %s byte(s)", len(wav_bytes))
        # buffer_to_wav_file is a regular (non-async) function, so no await here
        buffer_to_wav_file(os.path.join("/home/pi", _filename), wav_bytes)
    return "OK"
and adding something like this to utils.py:
import wave

def buffer_to_wav_file(filename, wav_buffer):
    # Write the buffer as 16 kHz, 16-bit, mono audio frames
    with wave.open(filename, mode="wb") as wf:
        wf.setframerate(16000)
        wf.setsampwidth(2)
        wf.setnchannels(1)
        wf.writeframes(wav_buffer)
Possibly not the most elegant solution… but do you think it could work?
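One thing to check with this approach: if wav_bytes already holds a complete WAV file (header included), passing it to writeframes would store that header as if it were audio samples. That seems likely here, since the name suggests WAV data, but it is an assumption. The sketch below is a hypothetical helper, save_wav, that writes the bytes unchanged when they start with a RIFF/WAVE header and only builds a header for raw PCM. The 16 kHz, 16-bit, mono defaults are taken from the snippet above.

```python
import wave


def save_wav(filename: str, audio: bytes, rate: int = 16000,
             width: int = 2, channels: int = 1) -> None:
    """Save audio to filename, adding a WAV header only if one is missing."""
    if audio[:4] == b"RIFF" and audio[8:12] == b"WAVE":
        # Already a complete WAV file: write it unchanged.
        with open(filename, "wb") as f:
            f.write(audio)
        return
    # Raw PCM samples: let the wave module build the header.
    with wave.open(filename, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(width)
        wf.setframerate(rate)
        wf.writeframes(audio)
```

Calling save_wav("/home/pi/command.wav", wav_bytes) would then work in either case. The path is only an example.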