Stream audio to Snapcast server

Hi,

I have a main server running Mopidy and a Snapcast server; on the same network I have a couple of Raspberry Pis with Snapcast clients.
These Pis also run Rhasspy, but the thing is that I don’t know how to listen to music and to Rhasspy’s audio output simultaneously.

I thought about piping the audio from Rhasspy to the Snapcast server, so that it can be heard on all Snapcast clients. This is what I’m already doing with Mopidy, which sends its output to the Snapcast server.

I defined a snapcast stream using a TCP server:

# /etc/snapserver.conf
stream = tcp://127.0.0.1:3333?name=mopidy_tcp 
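For reference, snapserver’s stream URI can also carry the sample format explicitly; if it is omitted, the server assumes its default (48000:16:2). A sketch of a more explicit version of the line above; the sampleformat and codec values are assumptions to verify against your snapcast version:

```
# /etc/snapserver.conf (sketch; parameter values are assumptions)
stream = tcp://127.0.0.1:3333?name=mopidy_tcp&sampleformat=48000:16:2&codec=pcm
```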

The Mopidy configuration has the following:

# /etc/mopidy/mopidy.conf
output = audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! tcpclientsink host=127.0.0.1 port=3333

Then on the RPi satellites I tried the following custom commands for Rhasspy’s audio output:

  • Use netcat to pipe the stdin audio to the snapcast server:
nc 192.168.IP.SERVER 3333 -
  • Use gstreamer (as used in mopidy) to pipe the audio:
gst-launch-1.0 -v fdsrc ! audio/x-raw,format=S16BE,channels=1,rate=8000 ! audioconvert ! audio/x-raw,format=S16LE,channels=1,rate=8000 ! wavenc ! tcpclientsink host=192.168.IP.SERVER port=3333

But both give unsuccessful results: I only hear noise on the Snapcast clients. I think it might be related to the audio encoding. GStreamer is difficult to debug.
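The noise is consistent with a raw-PCM format mismatch: the server plays whatever bytes it receives at its own configured format, so a stream sent at a different rate or channel count comes out as garbage. A rough sketch of the mismatch in numbers (8000 Hz mono comes from the pipeline above; 48000 Hz stereo is what the snapserver stream is assumed to expect):

```python
# Byte rates for raw signed 16-bit PCM: rate * channels * 2 bytes per sample.
def pcm_byte_rate(rate_hz, channels, bytes_per_sample=2):
    return rate_hz * channels * bytes_per_sample

sent = pcm_byte_rate(8000, 1)       # what the gst pipeline above produces
expected = pcm_byte_rate(48000, 2)  # what the snapserver stream is set up for

print(sent, expected)       # 16000 vs 192000 bytes per second
print(expected // sent)     # 12: the server consumes 12x more bytes per
                            # second than the client sends, and reads mono
                            # samples as interleaved stereo -> noise
```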

I saw that this has been discussed here: Stream music or radio

But there is still no concrete answer.

If you hear noise, then the piping works.
So I think the key here is to find out whether the input actually is audio/x-raw,format=S16BE,channels=1,rate=8000

If you set the TTS to GoogleWavenet, you can set the rate to 22050.
The output will be S16LE mono at a 22050 Hz sample rate.

You could also try saving the audio to a file and inspecting it with VLC or some other software.
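The header check can also be scripted; a minimal sketch with Python’s standard wave module (the filename is just an example):

```python
import wave

# Inspect a WAV header: sample rate, channel count, sample width in bytes.
def wav_params(path):
    with wave.open(path, "rb") as w:
        return w.getframerate(), w.getnchannels(), w.getsampwidth()

# Example: write a tiny 8000 Hz mono 16-bit file, then read it back.
with wave.open("capture.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)                   # 2 bytes = S16
    w.setframerate(8000)
    w.writeframes(b"\x00\x00" * 8000)   # one second of silence

print(wav_params("capture.wav"))        # (8000, 1, 2)
```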

Thanks for your answer. I used GStreamer to dump the audio with the custom command:

gst-launch-1.0 -v fdsrc ! filesink location=captureRaw.wav

Then I can see the properties:

Playing /captureRaw.wav.
libavformat version 58.45.100 (external)
Audio only file format detected.
Load subtitles in /home/arch/
==========================================================================
Opening audio decoder: [pcm] Uncompressed PCM audio decoder
AUDIO: 16000 Hz, 1 ch, s16le, 256.0 kbit/100.00% (ratio: 32000->32000)
Selected audio codec: [pcm] afm: pcm (Uncompressed PCM)
==========================================================================

I uploaded the audio here, just in case: https://ufile.io/517tidlt

When I try simply piping the raw captured file to the snapcast server I only hear noise, and I get the following message on the snapcast server:

(AsioStream) Error reading message: End of file, length: 2092

I think this might be related to the format: snapcast probably expects audio/x-raw,rate=48000,channels=2,format=S16LE. I’m trying to build the conversion pipeline, but it’s not easy with GStreamer.
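As an aside, for this specific case the conversion is a simple integer job: 16000 Hz to 48000 Hz repeats each sample three times, and mono to stereo duplicates it onto both channels. A toy Python sketch (nearest-neighbour, no filtering; a real pipeline should use a proper resampler like audioresample):

```python
import struct

def mono16k_to_stereo48k(pcm):
    """Crude S16LE conversion: 16 kHz mono -> 48 kHz stereo.

    Repeats every input sample 3x (16000 * 3 = 48000) and writes it to
    both channels. Good enough as a sanity check, audibly worse than a
    proper resampler.
    """
    samples = struct.unpack("<%dh" % (len(pcm) // 2), pcm)
    out = []
    for s in samples:
        out.extend([s, s] * 3)  # 3 repeats of (left, right)
    return struct.pack("<%dh" % len(out), *out)

src = struct.pack("<3h", 100, -200, 300)  # three 16 kHz mono samples
dst = mono16k_to_stereo48k(src)
print(len(dst))  # 3 samples * 3 repeats * 2 channels * 2 bytes = 36
```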

I just found a working pipeline:

gst-launch-1.0 -v fdsrc ! wavparse ! audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! wavenc ! tcpclientsink host=192.168.IP.SERVER port=3333

Great! GStreamer is indeed very hard to get right.
Good work.
I tried it as well in the settings but I got errors. Can you share a screenshot of your Audio Playing settings? Maybe I am entering the wrong parameters…

Thanks for the pipeline. I have adapted it to use PulseAudio, including the ducking module, in order to lower the music volume while an announcement is playing. The rest of the setup seems similar; I also use Snapcast to distribute the stream to several satellites.

Pipeline

gst-launch-1.0 -v fdsrc ! wavparse ! audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! pulsesink server=192.168.x.xx device=xx stream-properties="props,media.role=announcement"

Since I did not manage to enter this via the Rhasspy web interface, I decided to put it into a script and just set the script name as the program in the audio playing section. (I simply mounted the script into the Docker container: $PWD/pulse_tts.sh:/bin/pulse_tts.sh)

/edit
In order to make pulsesink work I had to install the GStreamer PulseAudio plugin (docker exec -u 0 -it Rhasspy apt install gstreamer1.0-pulseaudio)
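The ducking module referred to above is presumably PulseAudio’s module-role-ducking, which lowers streams of the configured roles while a trigger-role stream is playing. A sketch of the corresponding default.pa line; the role names and the volume value are assumptions to adapt:

```
# /etc/pulse/default.pa (sketch; role names and volume are assumptions)
load-module module-role-ducking trigger_roles=announcement ducking_roles=music volume=30%
```

Streams are matched by their media.role property, which is why the pipeline above sets media.role=announcement; the music player’s stream would need media.role=music set in a similar way.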


Here is a screenshot of my audio playing settings. I have been using this for a while and it’s working nicely, with a Snapcast meta source to give more priority to Rhasspy announcements.
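For anyone curious, the snapcast meta source combines already-defined streams and plays the first active one in its list, which is how announcements can take priority over music. A sketch of what that can look like in snapserver.conf; the rhasspy stream name and its port are assumptions:

```
# /etc/snapserver.conf (sketch; the rhasspy stream name/port are assumptions)
stream = tcp://127.0.0.1:3334?name=rhasspy
stream = tcp://127.0.0.1:3333?name=mopidy_tcp
stream = meta:///rhasspy/mopidy_tcp?name=combined
```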


Thanks, I will check what I have been doing :smiley:

Hi @pip ,

would you mind sharing your pulse_tts.sh script? I’m having trouble with mine.
Result when using speak:

AudioServerException: Command '['/tmp/pulse_tts.sh']' returned non-zero exit status 1.

Script

#!/bin/bash
gst-launch-1.0 -v fdsrc ! wavparse ! audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! pulsesinkserver=redacted stream-properties=props,media.role=announcement

Trying to play a local wav file works:

 gst-launch-1.0 -v filesrc location=/home/pi/piano2.wav ! wavparse ! audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! pulsesink server=redacted stream-properties=props,media.role=announcement

Not that I have tried it, but if the snapcast server and the rhasspy client are both running PulseAudio, could you not just create a tunnel and use that as the output from Rhasspy?
load-module module-tunnel-sink-new server=192.168.0.1 sink_name=Remote channels=2 rate=44100
I grabbed that from PulseAudio: Sound over the network | manurevah, which looks like a good tutorial.

I don’t know why Snapcast doesn’t have PulseAudio sources, but I guess the pulse plugin for ALSA would fix that?

Arch Linux always has good info: PulseAudio - ArchWiki

Then it’s just a matter of setting up an ALSA source in Snapcast? I guess this way your sinks would show up in Rhasspy; it’s just an asound.conf for the pulse plugin on the Snapcast side and setting up the source. I thought I would mention it since @pip mentioned ducking, which lowers the music for a Rhasspy announcement.
Also, if session-id or default-id could be expanded to cover rooms/zones, then PulseAudio is actually a really good tool for switching and connecting to sinks/sources, even while running.
I wouldn’t say the above is any good for audio streaming, I’d leave that to specific tools such as Snapcast or squeezelite, but for announcements that can connect remotely, combined with the ducking module, it’s almost perfect for the job.

I guess you could also modprobe snd_aloop, choose a loopback channel, and install Snapcast in reverse so that the latency compensation runs both ways?
If you install a Snapcast server in Rhasspy and a Snapcast client there too, another loopback could be the source for that Snapcast server?
There is also the meta source. I have never tried it, but if announcements get higher priority when combined with music (it didn’t strike me as a real ‘mix’, it sounds more like a priority list), then using snapcast/configuration.md at master · badaix/snapcast · GitHub you could also make a ‘ducking’-style option?
snapcast/player_setup.md at master · badaix/snapcast · GitHub

#!/bin/bash
gst-launch-1.0 -v fdsrc ! wavparse ! audioresample ! audioconvert ! audio/x-raw,rate=48000,channels=2,format=S16LE ! pulsesink server=192.168.x.xx device=xx stream-properties="props,media.role=announcement"
