How can I pass a large numerical value without a huge recognition time?
Example: “one hundred and thirty million two hundred four thousand six hundred and eighty six”
Arpagor
Recognition time is purely a matter of processor speed. As a rough figure, with the current DeepSpeech a Pi 4 runs just faster than real time, and it can take a streaming input.
So firstly, if you don’t stream, you have to wait for the end of the sentence before sending the audio, so the total is sentence length + recognition length.
If you stream, you will have to check some performance data for the SoC and ASR you use, but there it would be at most chunk length + recognition length.
You then also have NLU processing time and TTS processing time, and again it all depends on the SoC you use.
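To put rough numbers on the two cases above, here is a minimal sketch of a simplified latency model. The function names, the real-time factor `rtf` (processing seconds per second of audio) and the example values are assumptions for illustration, not measurements; NLU and TTS time are left out.

```python
def total_time_without_streaming(sentence_s, rtf):
    """Time from start of speech to transcript when the audio is sent
    only after the sentence ends: sentence length + recognition length."""
    return sentence_s + rtf * sentence_s


def total_time_with_streaming(sentence_s, chunk_s, rtf):
    """Time from start of speech to transcript when audio is streamed in
    chunks. Assumes rtf <= 1, so earlier chunks are decoded while the user
    is still speaking and only the last chunk remains at sentence end."""
    if rtf > 1:
        raise ValueError("model assumes the recognizer keeps up (rtf <= 1)")
    return sentence_s + rtf * chunk_s


if __name__ == "__main__":
    # Assumed values: a 6 s sentence, 0.5 s chunks, rtf of 0.9
    # ("just faster than real time").
    print(total_time_without_streaming(6.0, 0.9))
    print(total_time_with_streaming(6.0, 0.5, 0.9))
```

With these assumed values the non-streaming case waits for almost a second full sentence length, while the streaming case adds only a fraction of a chunk.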
I found a solution, for those who are interested.
The current latest version (2.5.8) uses Kaldi FST and allows intents with number ranges, such as:
[testnumber]
test (1..10){value}
But I’ve noticed that if you need a wide enough range, training takes a huge amount of time and the /profiles/en/kaldi/language_model.txt file balloons in size.
So I looked for a faster, less greedy solution that uses numbers in text form.
I was inspired by other posts, and here is the result for 1 to 9999 (it can be extended to millions, etc.).
Create a new sentences file named Numbers:
[strchiffres]
deux_a_neuf = ( deux | trois | quatre | cinq | six | sept | huit | neuf )
un_a_neuf = ( (un | et un) | (une | et une) | <deux_a_neuf> )
dix_dixneuf = ( dix | onze | douze | treize | quatorze | quinze | seize | dix sept | dix huit | dix neuf )
diz_simple = ( vingt | trente | quarante | cinquante | soixante )
diz_double = ( soixante dix | quatre vingt )
un_a_cent = ( <un_a_neuf> | <dix_dixneuf> | <diz_simple> [ <un_a_neuf> ] | <diz_double> [ ( <un_a_neuf> | <dix_dixneuf> ) ] )
cent = ( cent | cents )
mille = mille
# with millions
#million = (million | millions)
#nombre = [ [ <un_a_neuf> ] ] [ [ <un_a_cent> ] ] [ [ <deux_a_neuf> ] ] [ [ <un_a_cent> ] ] [ [ <deux_a_neuf> ] ] [ <un_a_cent> ]
# without millions
nombre = [ [ <deux_a_neuf> ] ] [ [ <un_a_cent> ] ] [ [ <deux_a_neuf> ] ] [ <un_a_cent> ]
The test sentence:
[testNumber]
see <strchiffres.nombre>{value}
It’s much faster, and the recognized text just needs to go through text2num to get the numeric value.
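To illustrate that last conversion step, here is a minimal, self-contained sketch that turns the French number words accepted by the grammar into an integer. It is a hypothetical stand-in written for this post, not text2num's own code; the function name `french_words_to_int` and its simplified rules are assumptions, and in practice the text2num package would do this job.

```python
# Hypothetical stand-in for text2num (not the library's implementation).
# Values of the simple number words used by the grammar.
UNITS = {
    "un": 1, "une": 1, "deux": 2, "trois": 3, "quatre": 4, "cinq": 5,
    "six": 6, "sept": 7, "huit": 8, "neuf": 9, "dix": 10, "onze": 11,
    "douze": 12, "treize": 13, "quatorze": 14, "quinze": 15, "seize": 16,
    "vingt": 20, "trente": 30, "quarante": 40, "cinquante": 50,
    "soixante": 60,
}


def french_words_to_int(text):
    """Convert space- or hyphen-separated French number words to an int."""
    total = 0        # completed thousands / millions groups
    group = 0        # value of the group currently being read (0..999)
    previous = None  # previous number word, used to detect "quatre vingt"
    for word in text.lower().replace("-", " ").split():
        if word == "et":
            continue  # "vingt et un": the "et" carries no value
        if word in ("vingt", "vingts") and previous == "quatre":
            group += 76  # the 4 already added becomes 4 * 20 = 80
        elif word in ("cent", "cents"):
            group = (group or 1) * 100
        elif word == "mille":
            total += (group or 1) * 1000
            group = 0
        elif word in ("million", "millions"):
            total += (group or 1) * 1_000_000
            group = 0
        elif word in UNITS:
            group += UNITS[word]  # "dix sept", "soixante dix" are additive
        else:
            raise ValueError(f"unknown number word: {word!r}")
        previous = word
    return total + group


if __name__ == "__main__":
    print(french_words_to_int("quatre vingt dix neuf mille"))  # 99000
    print(french_words_to_int("neuf mille neuf cent quatre vingt dix neuf"))  # 9999
```

The sketch does no validation of word order, so it relies on the grammar only ever producing well-formed numbers.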
Run this test and you will understand!
[testStr_Chiffre]
see <strchiffres.nombre>{value}
[testNum]
displays (1..100000){value}
You will already see how long training takes for (1..100000), and then compare the response times:
see 99000: instantaneous
displays 99000: more than 13 seconds!
Arpagor