How to pass a large numerical value without a huge recognition time

arpagor62970 · January 11, 2021, 6:37am

How can I pass a large numerical value without a huge recognition time?
Example: “one hundred and thirty million two hundred four thousand six hundred and eighty six”

Arpagor

rolyan_trauts · January 11, 2021, 7:10am

Recognition time comes down purely to processor speed. As a rough figure, with current DeepSpeech a Pi 4 runs just slightly faster than real time, and it can take streaming input.

So firstly, if you don’t stream, you have to wait for the end of the sentence before sending it, so the delay is sentence length + recognition length.
If you stream, you will have to check some perf data for the SoC and ASR you use, but there it would be max chunk length + recognition length.

You also then have NLU processing time and TTS processing time, and again it’s all dependent on the SoC you use.
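To make the estimate above concrete, here is a rough Python sketch of the two cases. The function names are hypothetical and every timing in the example is a made-up placeholder, not a measurement from any SoC or ASR.

```python
def delay_without_streaming(sentence_s, recognition_s, nlu_s=0.0, tts_s=0.0):
    """Audio is sent only once the sentence has ended, so the total is
    sentence length + recognition length (+ NLU and TTS time)."""
    return sentence_s + recognition_s + nlu_s + tts_s


def delay_with_streaming(max_chunk_s, recognition_s, nlu_s=0.0, tts_s=0.0):
    """Audio is sent chunk by chunk, so only the longest chunk has to be
    waited for: max chunk length + recognition length (+ NLU and TTS time)."""
    return max_chunk_s + recognition_s + nlu_s + tts_s


if __name__ == "__main__":
    # Placeholder values in seconds; replace them with your own perf data.
    print(delay_without_streaming(6.0, 5.0, nlu_s=0.5, tts_s=1.0))
    print(delay_with_streaming(0.5, 5.0, nlu_s=0.5, tts_s=1.0))
```

The only difference between the two is which audio duration you have to wait for before recognition can finish; the processing stages themselves still depend on the hardware.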

arpagor62970 · January 14, 2021, 6:10pm

I found a solution, for those who are interested.
Currently the latest version (2.5.8) uses Kaldi FST and allows intents with number ranges,
such as:
[testnumber]
test (1..10) {value}

But I’ve noticed that if you need a wide enough range, training takes a huge amount of time and the /profiles/en/kaldi/language_model.txt file expands exponentially.
So I looked for a faster and less greedy solution that uses numbers in text form.
I was inspired by other posts, and here is the result for 1 to 9999 (it can be extended to millions, etc.):
New sentences file -> name: Numbers

[strchiffres]
deux_a_neuf = ( deux | trois | quatre | cinq | six | sept | huit | neuf )
un_a_neuf = ( (un | et un) | (une | et une) | <deux_a_neuf> )
dix_dixneuf = ( dix | onze | douze | treize | quatorze | quinze | seize | dix sept | dix huit | dix neuf )
diz_simple = ( vingt | trente | quarante | cinquante | soixante )
diz_double = ( soixante dix | quatre vingt )
un_a_cent = ( <un_a_neuf> | <dix_dixneuf> | <diz_simple> [ <un_a_neuf> ] | <diz_double> [ ( <un_a_neuf> | <dix_dixneuf> ) ] )
cent = ( cent | cents )
mille = mille
# with millions
#million = ( million | millions )
#nombre = [ [ <un_a_neuf> ] <cent> ] [ [ <un_a_cent> ] <million> ] [ [ <deux_a_neuf> ] <cent> ] [ [ <un_a_cent> ] <mille> ] [ [ <deux_a_neuf> ] <cent> ] [ <un_a_cent> ]
# without millions
nombre = [ [ <deux_a_neuf> ] <cent> ] [ [ <un_a_cent> ] <mille> ] [ [ <deux_a_neuf> ] <cent> ] [ <un_a_cent> ]
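One way to see why the word-based rules stay small: they introduce only a few dozen distinct words, whereas a range such as (1..100000) stands for 100000 different values. The Python sketch below just counts both. It assumes (this is my assumption, not something confirmed in this thread) that each value in a range ends up as its own alternative in the generated language model; the helper name `vocabulary` is hypothetical.

```python
import re

# The word rules from the [strchiffres] section, copied as plain text.
RULES = """
deux_a_neuf = ( deux | trois | quatre | cinq | six | sept | huit | neuf )
un_a_neuf = ( (un | et un) | (une | et une) | <deux_a_neuf> )
dix_dixneuf = ( dix | onze | douze | treize | quatorze | quinze | seize | dix sept | dix huit | dix neuf )
diz_simple = ( vingt | trente | quarante | cinquante | soixante )
diz_double = ( soixante dix | quatre vingt )
un_a_cent = ( <un_a_neuf> | <dix_dixneuf> | <diz_simple> [ <un_a_neuf> ] | <diz_double> [ ( <un_a_neuf> | <dix_dixneuf> ) ] )
cent = ( cent | cents )
mille = mille
"""


def vocabulary(rules):
    """Collect the distinct spoken words used on the right-hand side of the rules."""
    words = set()
    for line in rules.strip().splitlines():
        _, body = line.split("=", 1)
        body = re.sub(r"<[^>]*>", " ", body)  # drop <rule> references
        words.update(re.findall(r"[a-z]+", body))
    return words


if __name__ == "__main__":
    print("distinct words in the rules:", len(vocabulary(RULES)))
    print("values in (1..100000):", len(range(1, 100001)))
```

With the rules as posted this counts 26 distinct words, against 100000 values for the range.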

The test sentence:

[testNumber]
see <strchiffres.nombre> {value}

It’s much faster, and the recognized text just needs to go through text2num to get the numeric value.
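For readers who want to see what that conversion step involves, here is a minimal hand-rolled sketch in Python. It is not the text2num library and does not use its API; `words_to_int` is a hypothetical helper that only understands the French words used in the rules above (plus "million"), and it does not check that the words come in a valid order.

```python
# Values for the simple number words used by the grammar.
SMALL = {
    "un": 1, "une": 1, "deux": 2, "trois": 3, "quatre": 4, "cinq": 5,
    "six": 6, "sept": 7, "huit": 8, "neuf": 9, "dix": 10, "onze": 11,
    "douze": 12, "treize": 13, "quatorze": 14, "quinze": 15, "seize": 16,
    "vingt": 20, "trente": 30, "quarante": 40, "cinquante": 50,
    "soixante": 60,
}
MULTIPLIERS = {"mille": 1_000, "million": 1_000_000, "millions": 1_000_000}


def words_to_int(text):
    """Convert French number words (as recognized text) to an integer."""
    total = 0      # groups already closed by "mille" or "million"
    current = 0    # value of the group being read (0..999)
    previous = ""
    for word in text.lower().replace("-", " ").split():
        if word == "et":
            pass  # "vingt et un": the "et" carries no value
        elif word in ("vingt", "vingts") and previous == "quatre":
            current += 76  # turns the 4 just added into 80 ("quatre vingt")
        elif word in SMALL:
            current += SMALL[word]
        elif word in ("cent", "cents"):
            current = max(current, 1) * 100  # bare "cent" means 100
        elif word in MULTIPLIERS:
            total += max(current, 1) * MULTIPLIERS[word]  # bare "mille" means 1000
            current = 0
        else:
            raise ValueError(f"unknown number word: {word!r}")
        previous = word
    return total + current


if __name__ == "__main__":
    print(words_to_int("quatre vingt dix neuf mille"))
```

For example, `words_to_int("quatre vingt dix neuf mille")` returns 99000, the value used in the timing test in this post. A real library will handle many more spellings and error cases than this sketch.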

Try this test and you will understand!

[testStr_Chiffre]
see <strchiffres.nombre> {value}

[testNum]
displays (1..100000) {value}

You will already notice how long training takes with (1..100000),
and then compare the response times:
see 99000: instantaneous
displays 99000: more than 13 seconds!

Arpagor