I have built a new Wyoming ASR, building on the whisper version…
I am ready to test… it starts (using script/run), but the handler isn't started…
The source for wyoming-faster-whisper doesn't respond to TCP connections (or WebSocket). From its README:
Run a server anyone can connect to:
script/run --model tiny-int8 --language en --uri 'tcp://0.0.0.0:10300' --data-dir /data --download-dir /data
As I modeled my ASR on this, it also doesn't respond…
I added debug logging right before the server.run() call:
INFO:__main__:Ready
INFO:__main__:Info(asr=[AsrProgram(name='faster-whisper', attribution=Attribution(name='Guillaume Klein', url='https://github.com/guillaumekln/faster-whisper/'), installed=True, description='Faster Whisper transcription with CTranslate2', models=[AsrModel(name='tiny-int8', attribution=Attribution(name='rhasspy', url='https://github.com/rhasspy/models/'), installed=True, description='tiny-int8', languages=['af', 'am', 'ar', 'as', 'az', 'ba', 'be', 'bg', 'bn', 'bo', 'br', 'bs', 'ca', 'cs', 'cy', 'da', 'de', 'el', 'en', 'es', 'et', 'eu', 'fa', 'fi', 'fo', 'fr', 'gl', 'gu', 'ha', 'haw', 'he', 'hi', 'hr', 'ht', 'hu', 'hy', 'id', 'is', 'it', 'ja', 'jw', 'ka', 'kk', 'km', 'kn', 'ko', 'la', 'lb', 'ln', 'lo', 'lt', 'lv', 'mg', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'ne', 'nl', 'nn', 'no', 'oc', 'pa', 'pl', 'ps', 'pt', 'ro', 'ru', 'sa', 'sd', 'si', 'sk', 'sl', 'sn', 'so', 'sq', 'sr', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'tk', 'tl', 'tr', 'tt', 'uk', 'ur', 'uz', 'vi', 'yi', 'yo', 'zh'])])], tts=[], handle=[], wake=[])
No other source files changed.
Shouldn't I be able to send a {"type": "describe"} and get a response?
(Or any of the three non-audio events: describe, transcribe, and audioStop.)
POST and PUT time out.
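For anyone else hitting this: Wyoming is not HTTP, so POST/PUT will never get an answer; the protocol is newline-delimited JSON events over a plain TCP socket. Here is a minimal stdlib-only probe sketch, assuming the server accepts a bare JSON header line with inline data (the host/port defaults and function names are my own):

```python
import json
import socket


def frame_event(event_type, data=None):
    """Encode one Wyoming event as a single JSON line (no binary payload)."""
    header = {"type": event_type}
    if data:
        header["data"] = data
    return (json.dumps(header) + "\n").encode("utf-8")


def probe_describe(host="127.0.0.1", port=10300):
    """Send a describe event over raw TCP and return the server's info line."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(frame_event("describe"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline()
```

Running `probe_describe()` against a live server should return the same Info line that shows up in the server's debug log.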
If I have to test by using HA, how do I get my ASR into the selection list for the assistant config?
In my ASR I added a debugging clause to the end of the
async def handle_event(self, event: Event) -> bool: function, so if it is called with an unknown event it will log it… and I get nothing…
The handler.py is never started… I added debug there too.
SO… I cleaned up some of the Python and now my ASR handler fires…
but… neither it nor faster-whisper responds to the transcribe event…
I added debugging, and it's not handled correctly:
DEBUG:__main__:Info(asr=[AsrProgram(name='google-streaming', attribution=Attribution(name='Sam Detweiler', url='https://github.com/sdetweil/google-streaming'), installed=True, description='google cloud streaming asr', models=[AsrModel(name='google-streaming', attribution=Attribution(name='rhasspy', url='https://github.com/rhasspy/models/'), installed=True, description='google cloud streaming asr', languages=['af', 'am', 'ar', 'as', 'az', 'ba', 'be', 'bg', 'bn', 'bo', 'br', 'bs', 'ca', 'cs', 'cy', 'da', 'de', 'el', 'en', 'es', 'et', 'eu', 'fa', 'fi', 'fo', 'fr', 'gl', 'gu', 'ha', 'haw', 'he', 'hi', 'hr', 'ht', 'hu', 'hy', 'id', 'is', 'it', 'ja', 'jw', 'ka', 'kk', 'km', 'kn', 'ko', 'la', 'lb', 'ln', 'lo', 'lt', 'lv', 'mg', 'mi', 'mk', 'ml', 'mn', 'mr', 'ms', 'mt', 'my', 'ne', 'nl', 'nn', 'no', 'oc', 'pa', 'pl', 'ps', 'pt', 'ro', 'ru', 'sa', 'sd', 'si', 'sk', 'sl', 'sn', 'so', 'sq', 'sr', 'su', 'sv', 'sw', 'ta', 'te', 'tg', 'th', 'tk', 'tl', 'tr', 'tt', 'uk', 'ur', 'uz', 'vi', 'yi', 'yo', 'zh'])])], tts=[], handle=[], wake=[])
I sent a describe like this:
{ "type": "describe"}
and got this in the handler log:
DEBUG:wyoming_google.handler:Sent info
Then I sent a transcribe:
{"type":"transcribe","data":{"language":"en","name":"foo"}}
and got my extra debugging output (from the end of the handler):
INFO:wyoming_google.handler:unknown event=transcribe name=foo language=en
_LOGGER.info("unknown event="+event.type+" name="+event.data["name"]+" language="+event.data["language"])
return True
But the code here https://github.com/rhasspy/wyoming/blob/master/wyoming/asr.py
says it's 'transcribe'.
I changed the handler code from
if Transcribe.is_type(event.type):
to
if event.type == "transcribe":
and it works…
SO the Transcribe.is_type() is failing for some reason
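For context, is_type() in the library is just a string comparison against a module-level constant, so a typo in that constant would produce exactly this behavior. A stdlib-only illustration of the failure mode (the misspelled string below is hypothetical, purely for demonstration, not the library's actual constant):

```python
# Hypothetical reproduction: if the module constant is misspelled, the
# correctly spelled wire string can never match it.
_TRANSCRIBE_TYPE = "transcibe"  # deliberately misspelled for illustration


class Transcribe:
    @staticmethod
    def is_type(event_type):
        # Mirrors the pattern of comparing the event type to a constant.
        return event_type == _TRANSCRIBE_TYPE


assert not Transcribe.is_type("transcribe")  # the real event falls through
assert Transcribe.is_type("transcibe")       # only the typo would match
```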
The design of the handler appears to support persistent data across events…
The whisper handler does:
if AudioChunk.is_type(event.type):
    if not self.audio:
        _LOGGER.debug("Receiving audio")
    chunk = AudioChunk.from_event(event)
    chunk = self.audio_converter.convert(chunk)
    self.audio += chunk.audio  # adding on to the buffer
but in my handler, the self.xxx variables are not persistent…
(dummy transcription)
On audioStop, I set self.text = "something".
It gets printed out and returned (although the doc says you need a transcript event to get the results).
The wyoming-faster-whisper handler does not handle transcript events; it returns the data only on audioStop.
Anyhow, I added a transcript event to return the text… and it's null… oops…
My init sets it that way, the same as the whisper one sets self.audio = bytes().
So, what is wrong?
Just like the whisper ASR, I am 'counting' on data persistence between events… Google streaming supports intermediate results, so it can produce results faster than the existing ASRs.
There was a misspelling of "transcribe" in an earlier version of Wyoming. It's fixed now, but this is why Transcribe.is_type was failing for you.
A new handler is spun up for each TCP connection. So if you're breaking the connection between each message, the self variables will not persist.
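That lifecycle can be sketched in plain Python (the class and field names here are made up to mirror the pattern, not the actual wyoming API):

```python
class ConnectionHandler:
    """One instance per TCP connection, mirroring how the Wyoming server
    constructs a fresh event handler for every client that connects."""

    def __init__(self):
        # Set in __init__, so it persists across events *within* one
        # connection, but is rebuilt whenever the client reconnects.
        self.audio = bytes()
        self.text = None

    def handle_chunk(self, payload):
        self.audio += payload  # accumulates only while the socket is open


first = ConnectionHandler()
first.handle_chunk(b"\x00\x01")
first.handle_chunk(b"\x02")
second = ConnectionHandler()  # simulates a reconnect between messages

assert first.audio == b"\x00\x01\x02"  # state survived within the connection
assert second.audio == b""             # and was reset on reconnect
```

So a test client has to hold one connection open for the whole describe/transcribe/audio sequence, rather than reconnecting per event.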
Thanks… I suspected as much on the variables… and I had 1.4.2; 1.5.0 works better.
BUT there was a breaking change in the audio event names… wtf… they now have embedded dashes:
audio-stop
AND the package doesn't have an updated CHANGELOG; it shows 1.3 but master is 1.5.
And the answer on testing is that nothing existed… I created a new test app to send two WAV files to the ASR to verify the recognition…
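In case it helps anyone else, here is the shape of such a test client as a stdlib-only sketch. It streams one WAV file as transcribe / audio-start / audio-chunk / audio-stop events over a single connection and reads back the transcript line. The framing (inline "data" plus a "payload_length" field in the header line) is my assumption from what worked against my server; check wyoming/event.py for the authoritative wire format:

```python
import json
import socket
import wave


def frame(event_type, data=None, payload=b""):
    """One Wyoming event: a JSON header line, then optional binary payload.
    Assumes inline 'data' and a 'payload_length' field in the header."""
    header = {"type": event_type}
    if data:
        header["data"] = data
    if payload:
        header["payload_length"] = len(payload)
    return (json.dumps(header) + "\n").encode("utf-8") + payload


def transcribe_wav(path, host="127.0.0.1", port=10300, chunk_frames=1024):
    """Stream one WAV file to the ASR and return the raw response line."""
    with wave.open(path, "rb") as wav, \
            socket.create_connection((host, port), timeout=30) as sock:
        fmt = {
            "rate": wav.getframerate(),
            "width": wav.getsampwidth(),
            "channels": wav.getnchannels(),
        }
        sock.sendall(frame("transcribe", {"language": "en"}))
        sock.sendall(frame("audio-start", fmt))
        while True:
            frames = wav.readframes(chunk_frames)
            if not frames:
                break
            sock.sendall(frame("audio-chunk", fmt, frames))
        sock.sendall(frame("audio-stop"))
        with sock.makefile("r", encoding="utf-8") as reader:
            return reader.readline()  # expect a transcript event
```

Note the whole sequence goes over one connection, which is exactly why the per-connection handler state works here.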