Rebranded the Matrix Voice to esp32-rhasspy-satellite

Well, the Matrix Voice can only play audio at a 44100 Hz sample rate.
Receiving audio at that rate does not work well, and you will hear hissing very often.
I therefore recommend a sample rate no higher than 22050 Hz; the software resamples it for playback on the Matrix Voice. It does not do a very good job of that, however.
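
To illustrate what resampling 22050 Hz audio for a 44100 Hz output involves, here is a minimal sketch using plain linear interpolation on 16-bit mono samples. It is not the firmware's actual resampler; the function name `upsample_linear` and the sample values are made up for illustration. A simple approach like this has no filtering, which is one reason a cheap resampler can sound rough.

```python
from array import array


def upsample_linear(samples, src_rate=22050, dst_rate=44100):
    """Resample 16-bit mono PCM by linear interpolation (no filtering)."""
    if not samples:
        return array("h")
    n_out = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    out = array("h")
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        left = int(pos)                    # source sample at or before pos
        right = min(left + 1, last)        # next sample, clamped at the end
        frac = pos - left                  # how far pos lies between the two
        value = samples[left] * (1 - frac) + samples[right] * frac
        out.append(int(round(value)))
    return out


# Hypothetical input: four samples at 22050 Hz become eight at 44100 Hz.
source = array("h", [0, 1000, 2000, 1000])
print(list(upsample_linear(source)))
# → [0, 500, 1000, 1500, 2000, 1500, 1000, 1000]
```

Sending audio at 22050 Hz halves the data the device has to receive, at the cost of this extra interpolation step before playback.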

The Matrix Voice is a nice device, but its poor support for audio playback makes me say that it might be better to get an AudioKit or an M5 Atom Echo. Both of them are much cheaper. I do not own an AudioKit, but it has the same I2S support as the M5 Atom Echo.

If you have no need to play audio, then I think the Matrix Voice might be better. Although much bigger, it has shiny LEDs :smiley:
The M5 Atom Echo, on the other hand, is much more of a finished device, coming in a nice little case and all.
The AudioKit does not have a case, and neither does the Matrix Voice.

Basically it boils down to, as always, “it depends”.
If you want great sound quality, you can build a device yourself with a good speaker powered by an esp32 running this software. I will accept pull requests for new devices :slight_smile:

Still good @romkabouter, as https://uk.banggood.com/ESP32-Aduio-Kit-WiFi-bluetooth-Module-ESP32-Serial-to-WiFi-Audio-Development-Board-with-ESP32-A1S-p-1449256.html is £10 and has an AC101 codec, which should give pretty good audio quality.

The AudioKit has bugged me for a while: for me, the ESP32 and the audio are great, but the rest of the dev board is redundant.

I cannot find a simple small dev kit anywhere, so I will let you know how soldering those goes. To be honest, I could just solder the audio inputs and 3.3 V directly, but those adapter boards are so cheap that I thought I would give it a go.

I got x2 with 2x A1S for £10, so I will let you know how they get on after the slow boat from China.

I also have what is hopefully a killer KWS. So that you don’t get trapped by the obsolescence of one system, from the keyword onwards the devices will broadcast until they get an MQTT message to stop, and that is it: no Rhasspy specifics, as a simple server-side app will have to act as a bridge/relay.
I haven’t found VAD apart from the one in the ADF, and I haven’t checked how well that works; if it does not, a server can still run VAD on the incoming chunks.

@romkabouter Thank you for your fast reply. By audio quality I meant the quality of the recorded audio that is sent to Rhasspy. So how is the speech recognition performance with Rhasspy? Is the quality good enough to cover one room? Which one do you think is best?
Thank you!

Nice, I’d like to check it when you’re done :slight_smile:

I think that is fine; I had no problems with Rhasspy with it. I was in a room of about 30 m², but your mileage may vary. It also depends on your surroundings.

Small update: I have switched the cores. The default core for tasks is 1.
The audio task should therefore not run on core 1 but on core 0 for better performance.
I was getting dropouts in the messages.

Please check release 7.1

@romkabouter I now noticed the same behaviour when using my laptop as a satellite, but just once. So I don’t think it’s an issue in your code: it’s just that it’s triggered much more frequently with the Atom Echo’s lower-quality microphone and/or speaker.

ok great, thank you for the feedback :slight_smile:

The WiFi code runs on core 0, and I think it consumes a lot of that core’s capacity, depending on what it is doing.
It should be OK, but apparently you need to be careful, as it’s quite easy to set off a core 0 panic.

The task priority is set to 3, so the WiFi task should be able to handle it.
Setting the audiostream task to core 1 put too much pressure on core 1 (since that is the default core for Arduino code, if I am not mistaken).

In any case, with the streamtask pinned to core 1 the audio flow was flaky.
With the task running on core 0, it works well.

Yeah, Arduino code runs on core 1, as core 0 is running FreeRTOS and the networking stack.

PS: I got the 2x Ai Thinker A1S modules for £4 each, with an AC101 audio codec onboard; they make the new Raspberry Pi Pico look a poor choice.
The breakout boards were for the standard ESP32, so I may just solder directly to the back, but with my MS eyes and hands that might be optimism. I will just have to be patient.

Yeah, the pico does not cut it I guess. Good luck soldering!

Not for audio or WiFi/BT, but the Pico has USB, which the ESP32 doesn’t; then again, the ESP32 also runs at 240 MHz.

I have an A1S AudioDevKit to test on, as I am not sure whether I will get some small boards built or solder directly.

@romkabouter
Thank you so much for the rewrite. I have successfully compiled the code and flashed it on my M5 Echo. I can trigger recording via the button and Rhasspy successfully recognizes and handles the intent :slight_smile:
Somehow I cannot make the remote hotword detection work. I understood that local hotword detection was removed, but remote should work, right?

In the Rhasspy log I have:
1611951144: New client connected from 192.168.x.x as satellite_kitchenAudio (c1, k15, u'pip').
1611951144: New client connected from 192.168.x.x as satellite_kitchen (c1, k15, u'pip').

I made sure that the M5 is set to remote hotword via the web interface.

Rhasspy itself:
AudioRecording: Hermes MQTT
WakeWord: Porcupine - Satellite id “satellite_kitchen” is listed (I have also tried to add satellite_kitchenAudio as well)

I have a second satellite set up using the Android App which works fine (Hotword via UDP Audio).

Do you have any idea why it does not work with the M5 Echo?

/edit
After reboot hotword detection does not work at all for my setup anymore, hmm

Ok, does recognition etc still work?
Because then the AudioSettings are correct for the M5.
Only satellite_kitchen is needed by the way.

I believe satellite_kitchen should be on all settings as well

Yes, with the button everything works fine with the M5 (LEDs, speech to text, recognition, …). I have disabled Audio Recording; now the hotword works again for the Android satellite, but for the M5 only the button works. Both satellites are listed in all active Rhasspy server settings (all but Audio Recording). Hmm, I will experiment a little bit… maybe it is something related to the UDP streaming from the other satellite :slight_smile:

for hotword detection only hotword would be needed, right? Audio Recording on Server is not needed?

Well, the software publishes audio to Hermes MQTT, so Audio Recording on your server should be set to Hermes MQTT.
Can you post some screenshots from your settings?

I think it has to do with the UDP settings somehow.

okay, that’s what I thought.
Here is my profile.json
{
  "command": {
    "webrtcvad": {
      "before_sec": "0.5",
      "max_sec": "7",
      "min_sec": "1",
      "silence_sec": "0.5"
    }
  },
  "dialogue": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "rhasspy"
  },
  "handle": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen"
  },
  "intent": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "fsticuffs"
  },
  "microphone": {
    "system": "hermes"
  },
  "mqtt": {
    "site_id": "Central"
  },
  "sounds": {
    "command": {
      "play_arguments": "",
      "play_program": "pulse_tts.sh"
    },
    "recorded": "",
    "system": "command"
  },
  "speech_to_text": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "kaldi"
  },
  "text_to_speech": {
    "nanotts": {
      "language": "en-US"
    },
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "nanotts",
    "wavenet": {
      "sample_rate": "44100"
    }
  },
  "wake": {
    "porcupine": {
      "keyword_path": "computer_linux.ppn",
      "sensitivity": "1.0",
      "udp_audio": "172.17.0.2:20000:satellite_cell"
    },
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "porcupine"
  }
}

satellite_cell works as intended and M5 (satellite_kitchen) only via button

/edit
Hmm, maybe it works after all, but only very badly (maybe due to noise or something). I just managed to trigger it via the hotword on the M5 once.

Did you try with version 7.1 or earlier?
I suggest using 22050 as the sample rate for Google WaveNet, by the way. Higher than that will probably cause static (hissing) on the M5.

Also on wakeword I see udp_audio set, might also be an issue. I do not know, never tried.
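
For anyone who wants to double-check a profile like the one posted above, here is a small sketch that lists which sections name a given satellite and which site IDs appear in the wake word's `udp_audio` setting. The helper names are made up, the embedded profile is a trimmed copy of the posted one, and the `host:port:siteId` reading of `udp_audio` is an assumption based on how the posted value looks. It only reports what is configured; it does not tell you whether `udp_audio` is actually the cause.

```python
import json

# Trimmed copy of the profile.json posted above (illustration only).
PROFILE_JSON = """
{
  "dialogue": {"satellite_site_ids": "satellite_cell,satellite_kitchen",
               "system": "rhasspy"},
  "microphone": {"system": "hermes"},
  "wake": {
    "porcupine": {"udp_audio": "172.17.0.2:20000:satellite_cell"},
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "porcupine"
  }
}
"""


def sections_listing(profile, site_id):
    """Return the top-level sections whose satellite_site_ids include site_id."""
    found = []
    for name, section in profile.items():
        if not isinstance(section, dict):
            continue
        ids = section.get("satellite_site_ids", "")
        if site_id in [s.strip() for s in ids.split(",") if s.strip()]:
            found.append(name)
    return sorted(found)


def udp_audio_site_ids(profile):
    """Return site IDs from wake.<system>.udp_audio.

    Assumes comma-separated "host:port:siteId" entries; other forms are skipped.
    """
    wake = profile.get("wake", {})
    raw = wake.get(wake.get("system", ""), {}).get("udp_audio", "")
    site_ids = []
    for entry in raw.split(","):
        parts = entry.strip().split(":")
        if len(parts) == 3:
            site_ids.append(parts[2])
    return site_ids


profile = json.loads(PROFILE_JSON)
print(sections_listing(profile, "satellite_kitchen"))  # → ['dialogue', 'wake']
print(udp_audio_site_ids(profile))                     # → ['satellite_cell']
```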

Yes, version 7.1. I just managed to use the wakeword on the M5. It seems to work, but quite badly. I had to put the M5 in a box to kind of isolate it, and then after a few tries it picked up the wakeword. Only the wakeword is that bad; the command itself is picked up without any problems (when using the button).
Anyway, in general everything seems to work. I will try to find out what’s interfering.
Thanks for your help and your great work on the code, I really appreciate it.

Maybe try some different wakeword and/or systems.
Since the commands are picked up ok that might be the issue.

The device does nothing more than stream audio.
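
As a rough picture of what "streaming audio" means on the Hermes side, the sketch below wraps a block of raw PCM in a WAV header and builds the `hermes/audioServer/<siteId>/audioFrame` topic it would be published on. This is not the firmware's code: the function names, the 256-sample chunk size and the 16000 Hz, 16-bit mono format are assumptions for illustration, and the actual MQTT publish is left out because it needs a client library.

```python
import io
import wave


def audio_frame_topic(site_id):
    """Hermes topic on which a satellite's audio frames are published."""
    return f"hermes/audioServer/{site_id}/audioFrame"


def wav_chunk(pcm_bytes, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container, one chunk per MQTT message."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)   # bytes per sample (2 = 16-bit)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm_bytes)
    return buf.getvalue()


# Hypothetical chunk: 256 samples of 16-bit silence.
silence = bytes(2 * 256)
payload = wav_chunk(silence)

print(audio_frame_topic("satellite_kitchen"))
print(payload[:4], len(payload) > len(silence))
```

Since the wake word service only ever sees these chunks, microphone quality and background noise show up directly in how reliably the hotword triggers.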