Rebranded the Matrix Voice to esp32-rhasspy-satellite

romkabouter · January 20, 2021, 10:45pm

Well, the Matrix Voice can only play audio at a 44100 Hz sample rate.
Receiving audio at that rate does not work well, and you will hear hissing sounds very often.
I therefore recommend a sample rate no higher than 22050; the software resamples it for playback on the Matrix Voice. It does not do a very good job at that, however.
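
To give an idea of what resampling from 22050 to 44100 involves, here is a minimal sketch in Python. It assumes plain linear interpolation, which is not necessarily what the firmware does, and the function name is made up for illustration.

```python
def upsample_linear(samples, src_rate, dst_rate):
    """Resample a list of PCM samples by linear interpolation."""
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        # Blend the two neighbouring input samples.
        out.append(round(samples[left] * (1 - frac) + samples[right] * frac))
    return out


doubled = upsample_linear([0, 100, 200, 300], 22050, 44100)
print(doubled)
```

Interpolating like this is cheap, but it does not filter anything, which is one reason a simple resampler can sound rough.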

The Matrix Voice is a nice device, but that lack of support for audio playback makes me say that it might be better to get an AudioKit or an M5 Atom Echo. Both of them are much cheaper. I do not own an AudioKit, but it has the same I2S support as the M5 Atom Echo.

If you have no need to play audio, then I think the Matrix Voice might be better. Although much bigger, it has shiny LEDs.
The M5 Atom Echo, on the other hand, is much more of a finished device, coming in a nice little case and all.
The AudioKit does not have a case, and neither does the Matrix Voice.

Basically it boils down to, as always, “it depends”.
If you want great sound quality, you can build a device yourself with a good speaker, powered by an ESP32 running this software. I will accept pull requests for new devices.

rolyan_trauts · January 21, 2021, 12:27am

Still good @romkabouter as https://uk.banggood.com/ESP32-Aduio-Kit-WiFi-bluetooth-Module-ESP32-Serial-to-WiFi-Audio-Development-Board-with-ESP32-A1S-p-1449256.html is £10 and has an AC101 codec, which should give pretty good audio quality.

The AudioKit has bugged me for a while: for me, the ESP32 and the audio are great, but the rest of the dev board is redundant.

I cannot find a simple, small dev kit anywhere, so I will let you know how soldering those goes. To be honest, I could just solder the audio inputs and 3.3 V directly, but those adapter boards are so cheap that I thought I would give it a go.

I got 2 of them with 2x A1S for £10, so I will let you know how they get on after the slow boat from China.

I also have what is hopefully a killer KWS. So that you don’t get trapped by the obsolescence of a system, the idea is this: from the keyword onwards, the devices will broadcast until they get an MQTT message to stop, and that is it. There are no Rhasspy specifics, as a simple server-side app will have to act as a bridge/relay.
I haven’t found VAD apart from the one in the ADF, and I haven’t checked how well that works, so if it does not, a server can still run VAD on the incoming chunks.
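
As a rough illustration of running VAD server-side on incoming chunks, here is a sketch in Python. It is a crude energy threshold, not webrtcvad or the VAD from the ADF, and it assumes 16-bit little-endian mono PCM; the function name and the threshold value are invented for the example.

```python
import math
import struct


def is_speech(chunk: bytes, threshold: float = 500.0) -> bool:
    """Return True when the RMS level of a 16-bit PCM chunk exceeds threshold."""
    count = len(chunk) // 2
    if count == 0:
        return False
    # Decode the chunk as signed 16-bit little-endian samples.
    samples = struct.unpack("<%dh" % count, chunk[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return rms > threshold


quiet = struct.pack("<4h", 0, 3, -2, 1)
loud = struct.pack("<4h", 4000, -4000, 3500, -3800)
print(is_speech(quiet), is_speech(loud))
```

A real VAD would also smooth the decision over several chunks, so that a single click does not count as speech.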

alex4444 · January 21, 2021, 7:33am

@romkabouter Thank you for your fast reply. By audio quality I meant the quality of the recorded audio that is sent to Rhasspy. So how is the speech recognition performance with Rhasspy? Is the quality good enough to cover one room? Which one do you think is the best?
Thank you!

romkabouter · January 21, 2021, 8:25am

Nice, I’d like to check it when you’re done.

I think that is fine; I had no problems with Rhasspy with it. I was in a room of about 30 m², but your mileage may vary. It also depends on your surroundings.

romkabouter · January 21, 2021, 7:43pm

Small update: I had the cores switched. The default core for tasks is 1.
The audio task should therefore run on core 0 rather than core 1 for better performance.
I was getting dropouts in the messages.

Please check release 7.1

koan · January 26, 2021, 10:29am

@romkabouter I now noticed the same behaviour when using my laptop as a satellite, but just once. So I don’t think it’s an issue in your code: it’s just that it’s triggered much more frequently with the Atom Echo’s lower-quality microphone and/or speaker.

romkabouter · January 26, 2021, 11:26am

ok great, thank you for the feedback

rolyan_trauts · January 26, 2021, 12:17pm

The WiFi code runs on core 0, and I think it consumes a lot of the core’s capacity, depending on what it is doing.
It should be OK, but apparently you need to be careful, as it’s quite easy to set off a core 0 panic.

romkabouter · January 26, 2021, 7:08pm

The task priority is set to 3, so the WiFi task should be able to handle it.
Pinning the audio stream task to core 1 put too much pressure on core 1 (since that is the default core for Arduino code, if I am not mistaken).

In any case, with the stream task pinned to core 1 the audio flow was flaky.
With the task running on core 0, it works well.

rolyan_trauts · January 27, 2021, 12:00am

Yeah, Arduino code runs on core 1, as core 0 is running FreeRTOS and the networking stack.

PS I got the 2x Ai Thinker A1S modules for £4 each, with an AC101 audio codec onboard; they make the new Raspberry Pi Pico look like a poor choice.
The breakout boards were for the standard ESP32, so I may just solder directly to the back, but with my MS eyes and hands that might be optimism. I will just have to be patient.

romkabouter · January 27, 2021, 11:41am

Yeah, the pico does not cut it I guess. Good luck soldering!

rolyan_trauts · January 27, 2021, 2:06pm

Not for audio or WiFi/BT, but the Pico has USB, which the ESP32 doesn’t. Then again, the ESP32 also runs at 240 MHz.

I have an A1S AudioDevKit to test on, as I am not sure whether I will get some small boards built or solder directly.

pip · January 29, 2021, 8:25pm

@romkabouter
Thank you so much for the rewrite. I have successfully compiled the code and flashed it to my M5 Echo. I can trigger recording via the button, and Rhasspy successfully recognizes and handles the intent.
Somehow I cannot make the remote hotword detection work. I understood that local hotword detection was removed, but remote should work, right?

In the Rhasspy log I have:
1611951144: New client connected from 192.168.x.x as satellite_kitchenAudio (c1, k15, u’pip’).,
1611951144: New client connected from 192.168.x.x as satellite_kitchen (c1, k15, u’pip’).

I made sure that the M5 is set to remote hotword via the web interface.

Rhasspy itself:
AudioRecording: Hermes MQTT
WakeWord: Porcupine - Satellite id “satellite_kitchen” is listed (I have also tried to add satellite_kitchenAudio as well)

I have a second satellite set up using the Android App which works fine (Hotword via UDP Audio).

Do you have any idea why it does not work with the M5 Echo?

/edit
After a reboot, hotword detection does not work at all for my setup anymore, hmm.

romkabouter · January 30, 2021, 10:26am

Ok, does recognition etc still work?
Because then the AudioSettings are correct for the M5.
Only satellite_kitchen is needed by the way.

I believe satellite_kitchen should be on all settings as well

pip · January 30, 2021, 11:04am

Yes, with the button everything works fine with the M5 (LEDs, speech to text, recognition, …). I have disabled Audio Recording; now the hotword works again for the Android satellite, but for the M5 only the button works. Both satellites are listed in all active Rhasspy server settings (all but Audio Recording). Hmm, I will experiment a little bit… maybe it is something related to the UDP streaming from the other satellite.

For hotword detection only the hotword setting would be needed, right? Audio Recording on the server is not needed?

romkabouter · January 30, 2021, 11:53am

Well, the software publishes audio to Hermes MQTT, so Audio Recording on your server should be set to Hermes MQTT.
Can you post some screenshots from your settings?
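
For reference, publishing audio to Hermes MQTT roughly means sending small WAV chunks to a per-site topic. The Python sketch below only builds the topic and the payload and leaves out the MQTT client. The topic layout follows the Hermes audioFrame convention as I understand it, 16 kHz 16-bit mono is assumed, and the helper names are made up for illustration.

```python
import io
import wave


def audio_frame_topic(site_id: str) -> str:
    """Topic a satellite publishes its audio chunks to."""
    return f"hermes/audioServer/{site_id}/audioFrame"


def wav_chunk(pcm: bytes, sample_rate: int = 16000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV header so it can be sent as one frame."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()


topic = audio_frame_topic("satellite_kitchen")
payload = wav_chunk(b"\x00\x00" * 256)  # 256 samples of silence
print(topic, len(payload))
```

With Audio Recording on the server set to Hermes MQTT, the server listens for frames like these, one stream per site ID.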

I think it has to do with the UDP settings somehow.

pip · January 30, 2021, 1:09pm

okay, that’s what I thought.
Here is my profile.json
{
  "command": {
    "webrtcvad": {
      "before_sec": "0.5",
      "max_sec": "7",
      "min_sec": "1",
      "silence_sec": "0.5"
    }
  },
  "dialogue": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "rhasspy"
  },
  "handle": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen"
  },
  "intent": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "fsticuffs"
  },
  "microphone": {
    "system": "hermes"
  },
  "mqtt": {
    "site_id": "Central"
  },
  "sounds": {
    "command": {
      "play_arguments": "",
      "play_program": "pulse_tts.sh"
    },
    "recorded": "",
    "system": "command"
  },
  "speech_to_text": {
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "kaldi"
  },
  "text_to_speech": {
    "nanotts": {
      "language": "en-US"
    },
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "nanotts",
    "wavenet": {
      "sample_rate": "44100"
    }
  },
  "wake": {
    "porcupine": {
      "keyword_path": "computer_linux.ppn",
      "sensitivity": "1.0",
      "udp_audio": "172.17.0.2:20000:satellite_cell"
    },
    "satellite_site_ids": "satellite_cell,satellite_kitchen",
    "system": "porcupine"
  }
}

satellite_cell works as intended and M5 (satellite_kitchen) only via button
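
A quick way to double-check that a site ID is listed in every section is a small script like the one below. It is only a sketch: the function name is made up, and the dict is a trimmed-down copy of the profile above rather than the real file.

```python
def sections_missing_site(profile: dict, site_id: str) -> list:
    """List the profile sections whose satellite_site_ids lacks site_id."""
    missing = []
    for name, section in profile.items():
        # Sections without a satellite_site_ids key are not checked.
        if not isinstance(section, dict) or "satellite_site_ids" not in section:
            continue
        ids = [s.strip() for s in section["satellite_site_ids"].split(",")]
        if site_id not in ids:
            missing.append(name)
    return missing


# Trimmed-down copy of the profile, for illustration only.
profile = {
    "microphone": {"system": "hermes"},
    "wake": {
        "satellite_site_ids": "satellite_cell,satellite_kitchen",
        "system": "porcupine",
    },
    "speech_to_text": {
        "satellite_site_ids": "satellite_cell,satellite_kitchen",
        "system": "kaldi",
    },
}
print(sections_missing_site(profile, "satellite_kitchen"))
```

An empty list means the site ID appears in every section that has a satellite_site_ids entry.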

/edit
Hmm, maybe it works after all, but only very badly (maybe due to noise or something). I just managed to trigger the M5 via the hotword once.

romkabouter · January 30, 2021, 1:25pm

Did you try with version 7.1 or earlier?
By the way, I suggest using 22050 as the sample rate for Google WaveNet. Anything higher than that will probably cause static (hissing) on the M5.

Also, I see udp_audio set under the wake word settings, which might also be an issue. I do not know, I have never tried it.

pip · January 30, 2021, 1:35pm

Yes, version 7.1. I just managed to use the wake word on the M5. It seems to work, but quite badly. I had to put the M5 in a box to kind of isolate it, and then after a few tries it picked up the wake word. Only the wake word is that bad; the command itself is picked up without any problems (when using the button).
Anyway, in general everything seems to work. I will try to find out what is interfering.
Thanks for your help and your great work on the code, I really appreciate it.

romkabouter · January 30, 2021, 1:38pm

Maybe try a different wake word and/or wake word system.
Since the commands are picked up OK, that might be the issue.

The device does nothing more than stream audio.