Well, the Matrix Voice can only play audio at a 44100 Hz sample rate.
Receiving audio at that rate does not work well, and you will hear hissing sounds very often.
I therefore recommend sending audio at no more than a 22050 Hz sample rate; the software resamples it for playback on the Matrix Voice. It does not do a very good job of that, however.
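To give an idea of what that resampling step involves, here is a minimal Python sketch of going from 22050 Hz to 44100 Hz by linear interpolation. This is not the code the software actually uses; the function name `upsample_2x_linear` is made up for illustration, and I am assuming signed integer PCM samples.

```python
def upsample_2x_linear(samples):
    """Double the sample rate of a list of integer PCM samples.

    Each input sample is followed by the midpoint between it and the
    next sample. The last sample is simply repeated, since it has no
    successor to interpolate towards.
    """
    out = []
    for i, current in enumerate(samples):
        # Use the next sample if there is one, otherwise hold the last value.
        nxt = samples[i + 1] if i + 1 < len(samples) else current
        out.append(current)
        # Integer midpoint keeps the output in the same integer PCM range.
        out.append((current + nxt) // 2)
    return out


if __name__ == "__main__":
    # Three samples at 22050 Hz become six samples at 44100 Hz.
    print(upsample_2x_linear([0, 100, 200]))
```

Simple interpolation like this has no proper low-pass filtering, so it tends to leave audible artifacts, which is one reason cheap resampling can sound worse than the original audio.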
The Matrix Voice is a nice device, but its limited support for audio playback makes me think you might be better off with an AudioKit or an M5 Atom Echo. Both of them are much cheaper. I do not own an AudioKit, but it has the same I2S support as the M5 Atom Echo.
If you have no need to play audio, then I think the Matrix Voice might be the better choice. Although much bigger, it has shiny LEDs.
The M5 Atom Echo, on the other hand, is much more of a finished device, coming in a nice little case and all.
The AudioKit does not come with a case, and neither does the Matrix Voice.
Basically it boils down to, as always, “it depends”.
If you want great sound quality, you can build a device yourself with a good speaker, driven by an ESP32 running this software. I will accept pull requests for new devices.