Yet Another ASR WeNet

rolyan_trauts · November 24, 2022, 11:52pm

Looks quite interesting. It is fairly new and currently only has models for En and Cn.
Acknowledge

We borrowed a lot of code from ESPnet for transformer based modeling.
We borrowed a lot of code from Kaldi for WFST based decoding for LM integration.
We referred EESEN for building TLG based graph for LM integration.
We referred to OpenTransformer for python batch inference of e2e models.

As they state, they have started afresh by borrowing a lot of code from older ASR projects and tried to create a best-of. I haven't checked it out yet, but it's on the radar.
OpenAI's Whisper seems king of the hill, but they only give the model and not the training routines.

It also has an Android APK that you can test on a phone if you have one. That should give an approximate reference for what it would be like on Pi-like hardware, depending on your phone.

Probably the easiest x86 demo also:
Clone the repo
Download the GigaSpeech model: https://wenet-1256283475.cos.ap-shanghai.myqcloud.com/models/gigaspeech/20210728_u2pp_conformer_libtorch.tar.gz
and extract it into the repo folder
The extracted folder name is slightly different from the one stated, so just copy the name of the folder you actually extracted
run the docker command
As it says, open runtime/libtorch/web/templates/index.html in the browser directly
You're up and running. It seems extremely light on my x86 machine, but I would have to test on a Pi.
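
If you would rather script the download and extract steps, here is a rough Python sketch. It only uses the standard library and the model URL from above. The helper names (top_level_names, extract_model) and the "wenet" clone path are my own assumptions, not anything from the repo. It prints the folder name that actually came out of the archive, since that is the one you need to copy for the docker command.

```python
import tarfile
import urllib.request
from pathlib import Path, PurePosixPath

# Model URL as given in the post.
MODEL_URL = (
    "https://wenet-1256283475.cos.ap-shanghai.myqcloud.com/"
    "models/gigaspeech/20210728_u2pp_conformer_libtorch.tar.gz"
)


def top_level_names(archive: Path) -> list[str]:
    """Return the sorted top-level entry names inside a tar archive."""
    names = set()
    with tarfile.open(archive) as tar:
        for member in tar.getmembers():
            parts = PurePosixPath(member.name).parts
            if parts:  # skips a bare "." entry
                names.add(parts[0])
    return sorted(names)


def extract_model(archive: Path, dest: Path) -> list[Path]:
    """Extract the archive into dest and return the top-level paths created."""
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    return [dest / name for name in top_level_names(archive)]


if __name__ == "__main__":
    repo = Path("wenet")  # assumption: path to your clone of the repo
    archive = repo / MODEL_URL.rsplit("/", 1)[-1]
    urllib.request.urlretrieve(MODEL_URL, archive)  # the download itself
    for folder in extract_model(archive, repo):
        # This is the real folder name to use, whatever the docs say.
        print("extracted:", folder)
```

The docker command itself is left out on purpose, so take that from the repo's README and swap in the printed folder name.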

Really nice repo, simple and concise, with some very good SOTA ASR, if limited in languages.