Playing Wav or Mp3 Audio on Rhasspy

rlongfield · November 19, 2021, 2:15pm

I was wondering if it is possible to have Rhasspy play audio, either an mp3 or wav (I prefer the former), as part of a response to a spoken query.

Specifically I have a Good Morning routine in Node-Red that when I say Good Morning it creates a response that says ‘Good morning, it is the of . Here is the forecast for today .’
I would like to add “and now for the news” at the end and then play the hourly news which is available online in mp3 format.

Is it possible to mix text to speech and audio in this manner?

I am connecting Rhasspy and Node-Red via websockets.

romkabouter · November 19, 2021, 2:32pm

No MP3, but WAV file yes.

For a WAV file, POST the WAV to /api/play-wav. This can be done in three modes:

If the Content-Type header is set to audio/wav, the data is interpreted as WAV audio (otherwise, it’s decoded as a string)
If the string starts with a / then it’s interpreted as a file path
Otherwise, the string is interpreted as a URL
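
The three modes above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the base URL (host and port) is a placeholder to adjust for your own install, the helper names are made up for this example, and text/plain is just one choice of a Content-Type that is not audio/wav.

```python
import urllib.request

# Assumption: Rhasspy's web server is reachable at this address; adjust it.
RHASSPY_URL = "http://localhost:12101"


def build_play_wav_request(payload, base_url=RHASSPY_URL):
    """Build (but do not send) a POST request for /api/play-wav.

    bytes -> raw WAV audio, sent with Content-Type: audio/wav
    str   -> sent as a plain string; a leading "/" means a file path on the
             Rhasspy machine, anything else is treated as a URL
    """
    url = base_url.rstrip("/") + "/api/play-wav"
    if isinstance(payload, bytes):
        data, content_type = payload, "audio/wav"
    else:
        # Assumption: any Content-Type other than audio/wav selects string mode.
        data, content_type = payload.encode("utf-8"), "text/plain"
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": content_type}, method="POST"
    )


def play_wav(payload, base_url=RHASSPY_URL):
    """Send the request to Rhasspy and return the HTTP status code."""
    request = build_play_wav_request(payload, base_url)
    with urllib.request.urlopen(request) as response:
        return response.status
```

For example, play_wav("/share/file.wav") would use the file-path mode, while play_wav(open("news.wav", "rb").read()) would send the audio itself.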

rlongfield · November 19, 2021, 3:35pm

@romkabouter

So I will need to grab the mp3, convert to wav and store it locally. Then I should be able to send the wav file to Rhasspy.
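
One possible way to do the grab-and-convert step outside Node-Red is sketched below. It assumes the ffmpeg command-line tool is installed on the machine, and the function names, URL and paths are hypothetical placeholders, not anything confirmed in this thread.

```python
import subprocess
import urllib.request


def ffmpeg_command(mp3_path, wav_path):
    """Return the ffmpeg argument list that converts an MP3 to a WAV file."""
    # -y overwrites the output, so each run replaces the previous bulletin.
    return ["ffmpeg", "-y", "-i", mp3_path, wav_path]


def fetch_and_convert(mp3_url, mp3_path, wav_path):
    """Download the MP3, convert it with ffmpeg, and return the WAV path."""
    urllib.request.urlretrieve(mp3_url, mp3_path)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(ffmpeg_command(mp3_path, wav_path), check=True)
    return wav_path
```

A call such as fetch_and_convert("https://example.com/news.mp3", "/share/news.mp3", "/share/news.wav") would leave a WAV file where Rhasspy can read it, provided that directory is visible to Rhasspy.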

I’m not sure how I would accomplish option 1, but I think I know how to do option 2.
For option 2, does the file need to be located locally for Rhasspy?

romkabouter · November 19, 2021, 5:32pm

Yes, so if you are using the Add-on, /share/file.wav should be OK.

rlongfield · November 19, 2021, 8:25pm

Turns out this was way simpler than I thought. Combined with node-red-contrib-media-utils, I was able to get nearly real-time playback.

The only issue is that I’m sending the text to be converted to speech to /api/text-to-speech, while the wav, as you said, is sent to /api/play-wav, so these are two separate requests.
That said, I might split the news out into its own request, since keeping it in the ‘Good Morning’ routine might make the experience too long.
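
For anyone doing this outside Node-Red, the two-request sequence could look roughly like the sketch below. The base URL and function names are assumptions for illustration, and it is also an assumption that the text-to-speech request only returns once speaking has finished, so check that the news does not start over the greeting.

```python
import urllib.request

# Assumption: Rhasspy's web server is reachable at this address; adjust it.
RHASSPY_URL = "http://localhost:12101"


def build_routine_requests(text, wav_path, base_url=RHASSPY_URL):
    """Return the two POST requests in order: speak the text, then play the WAV."""
    base = base_url.rstrip("/")
    headers = {"Content-Type": "text/plain"}
    speak = urllib.request.Request(
        base + "/api/text-to-speech",
        data=text.encode("utf-8"), headers=headers, method="POST",
    )
    # A string starting with "/" is treated by /api/play-wav as a file path.
    play = urllib.request.Request(
        base + "/api/play-wav",
        data=wav_path.encode("utf-8"), headers=headers, method="POST",
    )
    return [speak, play]


def run_routine(text, wav_path, base_url=RHASSPY_URL):
    """Send both requests one after the other."""
    for request in build_routine_requests(text, wav_path, base_url):
        with urllib.request.urlopen(request) as response:
            response.read()
```

Splitting the news into its own request would simply mean sending only the second of these when asked.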