Simultaneous audio output from Rhasspy and PyGame

Hello!
I am using a Python app (with rhasspy-hermes-app) to respond to my intents, and one of the things my application does is stream sound. I have not been able to get this to work, as the aplay command that Rhasspy uses seems to want exclusive use of the audio device.

When I start multiple pygame instances that all play sound, that does work fine.

My client application is playing sound through PyGame, which uses SDL2. Rhasspy uses aplay which plays through ALSA. I have seen some solutions posted that propose creating a software mixer in my ALSA config, but since I am playing my other sound through SDL, I am not sure this will work (I did try, it didn’t work, but then I might need to do more debugging).

I am wondering if anyone has tips on how to approach this? I think I can take a few approaches:

  • implement a python script that reads WAV from stdin and plays it through PyGame
  • set Rhasspy to play through MQTT, and handle those MQTT messages inside my app, playing sounds with PyGame. I have not been able to find any docs around how to respond to these messages.
  • something else?

Any advice here would be highly appreciated!

One more thing: I have been trying to write a script to play sounds using PyGame, but I’m getting stuck specifying the path to this script in Rhasspy. I am trying “/home/pi/voice-clocky/playsound.py” as a path, which obviously points to my host machine, not the Rhasspy container. Unsurprisingly, I am getting a file not found error when I try to play sound… How should I use the command audio output?

Copying my script to play sound using pygame to the profile directory and setting its path to ${RHASSPY_PROFILE_DIR}/playsound.py

still gives me:
AudioServerException: [Errno 2] No such file or directory: '${RHASSPY_PROFILE_DIR}/playsound.py': '${RHASSPY_PROFILE_DIR}/playsound.py'

I realised the profile path points to the base path, and there are subfolders in there. I put my script in /en, so the path should be
${RHASSPY_PROFILE_DIR}/en/playsound.py

I no longer get a file not found error, now I get:

[ERROR:2021-04-21 10:44:34,508] rhasspyserver_hermes: 
Traceback (most recent call last):
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1821, in full_dispatch_request
result = await self.dispatch_request(request_context)
  File "/usr/lib/rhasspy/.venv/lib/python3.7/site-packages/quart/app.py", line 1869, in dispatch_request
return await handler(**request_.view_args)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 1692, in api_text_to_speech
results = await asyncio.gather(*aws)
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__main__.py", line 1678, in speak
say_chars_per_second=say_chars_per_second,
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 616, in speak_sentence
handle_finished(), messages, message_types, timeout_seconds=timeout_seconds
  File "/usr/lib/rhasspy/rhasspy-server-hermes/rhasspyserver_hermes/__init__.py", line 994, in publish_wait
result_awaitable, timeout=timeout_seconds
  File "/usr/lib/python3.7/asyncio/tasks.py", line 423, in wait_for
raise futures.TimeoutError()
concurrent.futures._base.TimeoutError
[DEBUG:2021-04-21 10:44:04,472] rhasspyserver_hermes: Publishing 132 bytes(s) to hermes/tts/say

The source of the play script is pretty simple, but might be incorrect as I’m not sure how to test it (i.e. pipe a raw wave stream to it outside of Rhasspy):

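    #!/usr/bin/python3
    import sys

    import pygame

    # Read the audio as bytes; sys.stdin.read() would return decoded text.
    rawsample = sys.stdin.buffer.read()
    pygame.mixer.pre_init(frequency=44100, size=-16, channels=1)
    pygame.init()
    # buffer= treats the data as raw samples in the mixer's format.
    sample = pygame.mixer.Sound(buffer=rawsample)
    channel = sample.play()
    # play() returns immediately, so wait for playback before quitting.
    while channel.get_busy():
        pygame.time.wait(50)
    pygame.quit()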
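
For testing a script like this outside of Rhasspy, one option is to generate a small WAV stream and pipe it in. The following is only a sketch: the file name make_tone.py, the function make_tone_wav and its defaults are made up for illustration, and it assumes the player expects a complete mono 16-bit WAV file on stdin.

```python
#!/usr/bin/python3
import io
import math
import struct
import sys
import wave


def make_tone_wav(frequency=440.0, seconds=1.0, rate=16000):
    """Return a complete mono 16-bit WAV file (header included) as bytes."""
    frame_count = int(seconds * rate)
    # Sine wave at 30% of full scale, packed as little-endian 16-bit samples.
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * frequency * i / rate)),
        )
        for i in range(frame_count)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(rate)
        wav_file.writeframes(frames)
    return buf.getvalue()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        # Write the WAV bytes to stdout so they can be piped to a player.
        sys.stdout.buffer.write(make_tone_wav(rate=int(sys.argv[1])))
    else:
        print("usage: make_tone.py RATE | ./playsound.py", file=sys.stderr)
```

Running it once with 16000 and once with 44100 as the rate, piped into the play script, should show quickly whether the player copes with both sample rates.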

It looks like playing sound using an external command is quite a hassle to get working, because the command is run from within the Docker container. The container lacks the dependencies I need to play sound (pygame or ffplay, both seem to work).

So I think playing based on MQTT messages would be a better approach, that way I can embed playing these sounds into my intent handling application.

However, I can’t seem to find any documentation on handling messages related to playing audio. Does anyone know of any examples? Or even docs of the messages that would be triggered?

I’ve made some more progress. I am now using MQTT for audio playing.

In my MQTT client (based off rhasspy-hermes-app) I respond to playBytes messages. I am receiving data, and I am able to play sound from the payload I receive!
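
For anyone else looking for the message format: as far as I understand the Hermes protocol, the audio arrives on the topic hermes/audioServer/<siteId>/playBytes/<requestId> with the WAV file as the binary payload, and the player is expected to answer on hermes/audioServer/<siteId>/playFinished with a JSON payload carrying the request id. Below is a rough sketch of that routing, independent of any particular MQTT library. The helper names, and the play and publish callables, are hypothetical stand-ins for whatever your client provides.

```python
import json
import re

# Topic layout as I understand it from the Hermes protocol.
PLAY_BYTES_TOPIC = re.compile(r"^hermes/audioServer/([^/]+)/playBytes/([^/]+)$")


def parse_play_bytes_topic(topic):
    """Return (site_id, request_id), or None if this is not a playBytes topic."""
    match = PLAY_BYTES_TOPIC.match(topic)
    if match is None:
        return None
    return match.group(1), match.group(2)


def play_finished_message(site_id, request_id):
    """Build the (topic, payload) pair that reports playback as finished."""
    topic = f"hermes/audioServer/{site_id}/playFinished"
    payload = json.dumps({"id": request_id})
    return topic, payload


def handle_message(topic, payload, play, publish):
    """Play a playBytes payload and acknowledge it.

    play(wav_bytes) and publish(topic, payload) are hypothetical callables
    supplied by the application. Returns True if the message was handled.
    """
    parsed = parse_play_bytes_topic(topic)
    if parsed is None:
        return False
    site_id, request_id = parsed
    play(payload)
    publish(*play_finished_message(site_id, request_id))
    return True
```

Keeping play and publish as plain callables means the same handler can be wired to whichever MQTT client is in use and exercised without a broker.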

The only thing not working well is that the sounds that Rhasspy sends through have varying sample rates. The wake sounds seem to be 44 kHz, and the voice messages generated by NanoTTS are 16 kHz. This is an issue, because playing those through the same mixer will not work: when I set my mixer to 44 kHz, the TTS samples play way too fast!
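
One way to confirm what is actually arriving is to read the header of each payload before playing it. This is a small sketch using only the standard library. It assumes each payload is a complete WAV file, and the function name wav_format is made up for illustration.

```python
import io
import wave


def wav_format(wav_bytes):
    """Return (sample_rate_hz, channels, bits_per_sample) from a WAV payload."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return (
            wav_file.getframerate(),
            wav_file.getnchannels(),
            wav_file.getsampwidth() * 8,  # sample width is reported in bytes
        )
```

Logging this tuple for every received message makes it easy to see which sounds differ from the mixer settings.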

I am wondering if anyone has an idea on how to address this, although I realise that at this point, my problem has little to do with Rhasspy :)

Have you tried creating your own ALSA config?
I had the same issue when playing Snapcast and Rhasspy at the same time.
I ended up creating “virtual” devices so that both audio streams can play at the same time.
Works like a charm.

This is the asound.conf I currently have on one of my Pis. It uses the 3.5mm output for both Snapcast and TTS.

# The IPC key of dmix or dsnoop plugin must be unique
# If 555555 or 666666 is used by other processes, use another one

# use samplerate to resample as speexdsp resample is bad
defaults.pcm.rate_converter "samplerate"

pcm.!default {
    type asym
    playback.pcm "playback"
    capture.pcm "ac108"
}

pcm.playback {
    type plug
    slave.pcm "hw:ALSA"
}

# pcm.dmixed {
#     type dmix
#     slave.pcm "hw:0,0"
#     ipc_key 555555 
# }

pcm.ac108 {
    type plug
    slave.pcm "hw:seeed4micvoicec"
}

# pcm.multiapps {
#     type dsnoop
#     ac108-slavepcm "hw:1,0"
#     ipc_key 666666
# }

Actually, when I look back at this config now, I don’t understand anymore what I was doing. I know that I was playing around with the slave pcm and the dmix type to make it work, but I forgot exactly what made it work in the end. Hope this is helpful anyway…

You can manually resample the wake sounds, they are in the profile folders of Rhasspy.

I’ve managed to fix the issue and it was simpler than I had thought. I was sending the buffer straight into PyGame, and that caused the issues. Instead I had to read the buffer into a BytesIO object and create a sound from that, like so:

    from io import BytesIO

    f = BytesIO(buffer)
    snd = pygame.mixer.Sound(f)

That removed both the frequency issues I had, as well as some clipping I heard.
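
My understanding of why this helps: when pygame.mixer.Sound is given a file-like object, the WAV header is parsed and the audio is converted to the mixer's format, whereas buffer= treats every byte (header included) as raw samples at the mixer's rate. Here is a slightly fuller sketch of the same approach. The function name play_wav_bytes and the optional mixer parameter are my own additions so the function can be exercised without audio hardware. By default it uses pygame.mixer.

```python
from io import BytesIO


def play_wav_bytes(wav_bytes, mixer=None):
    """Start playing a complete WAV file held in memory.

    mixer defaults to pygame.mixer; passing another object with the same
    interface is an illustration-only hook for testing.
    Returns (sound, channel) so the caller can wait on channel.get_busy().
    """
    if mixer is None:
        import pygame  # imported lazily so the module loads without pygame

        mixer = pygame.mixer
    if mixer.get_init() is None:
        mixer.init()
    # Wrapping the bytes in a file-like object lets the mixer decode the WAV
    # header instead of treating the whole buffer as raw samples.
    sound = mixer.Sound(BytesIO(wav_bytes))
    channel = sound.play()
    return sound, channel
```

Since play() returns immediately, the caller should keep a reference to the returned channel if it needs to know when playback has ended.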

Thanks for the responses everyone!
