Unicode decode error when training voice2json

JGKK · April 28, 2020, 9:40am

Hello @synesthesiam, I'm seeing something strange with voice2json.
When I try to train, I get this error:

voice2json --profile /home/pi/de_kaldi-zamia-1.0 train-profile
Traceback (most recent call last):
  File "voice2json/__main__.py", line 1766, in <module>
  File "voice2json/__main__.py", line 353, in main
  File "voice2json/__main__.py", line 377, in train
  File "voice2json/train/__init__.py", line 112, in train_profile
  File "voice2json/train/__init__.py", line 598, in _get_intents
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 16: ordinal not in range(128)
[2197] Failed to execute script __main__

########################################
vocab_dict <stdout>:

This is with a newly downloaded profile and the example sentences.ini.
It worked before on the same machine.

It's on a Raspberry Pi 4 running Buster.

Any ideas?
Johannes

JGKK · April 28, 2020, 9:53am

OK, here is something curious, @synesthesiam: training works if I SSH into the Pi from my iPad using Shelly, but it doesn't if I SSH into it from my Mac terminal.
This seems to be a zsh problem, where zsh does something to the locale settings.
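
For reference, byte 0xc3 in the traceback is the first byte of the UTF-8 encoding of characters such as ä, ö and ü (ä is 0xc3 0xa4), which is what one would expect in a German profile. Python 3.6 falls back to the locale's preferred encoding when a file is opened without an explicit encoding, and under a C/POSIX locale that is ASCII. A minimal sketch of the failure, assuming the input is UTF-8 text; the sample word and the try_decode helper are made up for illustration and are not part of voice2json:

```python
import locale

# Sample German word (assumption, not taken from the profile).
# "\u00e4" is "ä"; its UTF-8 encoding starts with byte 0xc3.
data = "Ger\u00e4t".encode("utf-8")


def try_decode(raw, encoding):
    """Return the decoded text, or an error string if decoding fails."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError as err:
        return "error: {}".format(err)


ascii_result = try_decode(data, "ascii")
utf8_result = try_decode(data, "utf-8")

# The encoding Python uses for open() when none is given explicitly.
print("preferred encoding:", locale.getpreferredencoding(False))
print("ascii:", ascii_result)
# ascii() keeps the output printable even on an ASCII-only terminal.
print("utf-8:", ascii(utf8_result))
```

If the first line printed in the failing SSH session is an ASCII variant such as ANSI_X3.4-1968, the locale is the likely culprit; comparing the output of `locale` in the working and failing sessions should show the difference.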

JGKK · April 29, 2020, 7:13pm

The same happens when trying to train a German profile from the Docker container, as the Dockerfile doesn't set the locales either. I confirmed that I can use a profile trained on another installation with the Docker container, but I can't train in it.
I think the fix would be quite easy: add the locales to the Dockerfile.
It would be great if this could be fixed.
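
Separately from setting the locales in the container, code that reads profile files can be made independent of the environment by passing the encoding explicitly. This is only a sketch under the assumption that the files are UTF-8; I have not checked how voice2json actually opens them, and the file content below is a hypothetical stand-in:

```python
import os
import tempfile

# Hypothetical stand-in for a sentences.ini with German text.
content = "[ChangeLightState]\nschalte das Ger\u00e4t ein\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sentences.ini")

    # An explicit encoding means the locale's default is never consulted.
    with open(path, "w", encoding="utf-8") as ini_file:
        ini_file.write(content)

    with open(path, "r", encoding="utf-8") as ini_file:
        loaded = ini_file.read()

# ascii() keeps the output printable even on an ASCII-only terminal.
print(ascii(loaded))
```

With the encoding given, the read gives the same result whether the session's locale is C, POSIX or a UTF-8 one.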