Hi,
I'm trying to use speech to text via the MQTT protocol, but I don't get a result.
According to the docs (Speech to Text - Rhasspy),
I should first send hermes/asr/startListening
with a unique sessionId,
then the audio data in chunks to hermes/audioServer/<siteId>/audioFrame
(I also tried hermes/audioServer/<siteId>/<sessionId>/audioSessionFrame),
and to finish up hermes/asr/stopListening.
What I tried ("default" is the name of my base station):
topic: "hermes/asr/startListening"
payload: "{"sessionId":"0cfcb630-df3b-46c5-882b-0476d7102de5","stopOnSilence":false,"sendAudioCaptured":true,"siteId":"default"}"
(I chose a payload size of 8236 because I've seen Rhasspy using this)
topic: "hermes/audioServer/default/audioFrame"
payload: buffer[8236]
topic: "hermes/asr/stopListening"
payload: "{"sessionId":"0cfcb630-df3b-46c5-882b-0476d7102de5","siteId":"default"}"
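To make the order of the messages clearer, here is a minimal Python sketch of the sequence above. It only builds the (topic, payload) pairs and does not talk to a broker. The function name `build_asr_messages` is just something I made up for illustration, and the chunk contents are a placeholder. Publishing each pair would be done with whatever MQTT client you use.

```python
import json
import uuid


def build_asr_messages(site_id, session_id, wav_chunks):
    """Return (topic, payload) pairs in the order they are published.

    Hypothetical helper for illustration only, not part of Rhasspy.
    """
    start = {
        "sessionId": session_id,
        "siteId": site_id,
        "stopOnSilence": False,
        "sendAudioCaptured": True,
    }
    stop = {"sessionId": session_id, "siteId": site_id}

    messages = [("hermes/asr/startListening", json.dumps(start))]
    for chunk in wav_chunks:
        # Binary payload: one audio chunk per audioFrame message
        messages.append((f"hermes/audioServer/{site_id}/audioFrame", chunk))
    messages.append(("hermes/asr/stopListening", json.dumps(stop)))
    return messages


session_id = str(uuid.uuid4())
# Placeholder chunk: 8236 zero bytes standing in for real audio data
messages = build_asr_messages("default", session_id, [b"\x00" * 8236])
for topic, payload in messages:
    print(topic, len(payload))
```

The same sessionId and siteId are used in the start and stop messages here, which is how I understand the docs.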
I also tried publishing recordingFinished:
topic: "rhasspy/asr/recordingFinished"
payload: "{"sessionId":"415449cf-f11a-461b-8cff-a4980ba15662","siteId":"default"}"
Afterwards the base station doesn't show any logs and doesn't post anything on either hermes/asr/textCaptured or hermes/error/asr.
I'm not sure whether the siteId in the MQTT messages is correct. I should use the ID of the base station that is supposed to transcribe the speech into text, right?
My other question: does every data chunk need a WAV header at the beginning?
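To make the header question concrete, here is a minimal sketch of what "a WAV header on every chunk" would look like, using only Python's standard `wave` module. The audio format (16 kHz, 16-bit, mono) and the helper name `pcm_to_wav_chunk` are my assumptions, not something I have confirmed for my setup. One thing I noticed: 8192 bytes of raw PCM plus the 44-byte header that `wave` writes comes to exactly 8236, so that might be where the size I saw comes from.

```python
import io
import wave


def pcm_to_wav_chunk(pcm_bytes, sample_rate=16000, sample_width=2, channels=1):
    """Wrap raw PCM bytes in a standalone WAV container.

    Hypothetical helper; the default audio format is an assumption.
    """
    with io.BytesIO() as buffer:
        with wave.open(buffer, "wb") as wav_file:
            wav_file.setframerate(sample_rate)
            wav_file.setsampwidth(sample_width)
            wav_file.setnchannels(channels)
            wav_file.writeframes(pcm_bytes)
        # wave does not close a file object it did not open itself,
        # so the buffer is still readable here
        return buffer.getvalue()


raw_pcm = b"\x00" * 8192  # placeholder: 8192 bytes of silence
chunk = pcm_to_wav_chunk(raw_pcm)
print(len(chunk))  # 8236 = 8192 bytes of audio + 44-byte header
```

If this is the expected format, each audioFrame payload would be a small, complete WAV file rather than a slice of one big file.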
It would be very nice if someone could help me!
@synesthesiam please take a look at this.