Speaker identification

Hi all!
Has anyone ever had any experience with speaker recognition/identification/verification?

Kaldi provides example recipes for this using i-vectors and x-vectors, but it is a bit above my pay grade :sweat_smile:

Anyway, this could be an awesome addition to Rhasspy’s features: it would allow per-user control and keep the kids from wreaking havoc in the house.

Cheers :blush:


I’ve not tested this yet, but with Snips I have three wake words (same phonemes, but recorded with different voices), so I can run different actions depending on who is asking.

Using Snips or Snowboy personal models is indeed a way to do this, but I’m not comfortable using a solution that is not open source.

I’d like a self-trainable solution with no dependency on a remote third-party service that can disappear at any time. Snips is gone (even if their packages are still functional for now) and Snowboy does not seem to be maintained anymore. How long until their API is deprecated? And with the kids growing up, their voice prints are surely going to change, so retraining has to stay possible.

Kaldi’s recipe is worth a try but maybe someone with expertise in this field can help create a new Rhasspy service for this.

Wasn’t aware Snowboy wasn’t maintained!
Indeed, we need a viable, robust solution!

Hi,
Have a look at this

I tried it and was not able to make it work unfortunately… Have you succeeded in using it?

I also had a problem with it; I think the VAD (voice activity detection) is what messes it up.
In a recordings folder:
create a marie folder and put a WAV of Marie’s voice in it
create a polo folder and put a WAV of Polo’s voice in it
All of them without silence. With sox I do:
/usr/bin/sox -t alsa MONmic /home/poppy/MyProgram/tmp/marie.wav silence 1 0.1 1% 2 1.0 5% 4t
(replace MONmic with your own microphone device)
Then try it from Python with a test.wav.
Example to adapt:

import os  # plus whatever else your program imports (…)
from piwho import recognition

CURRENT_DIR_PATH = os.path.dirname(os.path.realpath(__file__))
DATA_DIR_PATH = os.path.join(CURRENT_DIR_PATH, 'recordings/')

def find_speaker():
    # Record 5 seconds from the default microphone and save it as a WAV file
    wavefile = os.path.join(DATA_DIR_PATH, 'test.wav')
    os.system('arecord -d 5 -f cd -t wav ' + wavefile)
    recog = recognition.SpeakerRecognizer()
    name = recog.identify_speaker(wavefile)
    print(name[0])  # Recognized speaker
    print(name[1])  # Second best speaker
    dictn = recog.get_speaker_scores()
    print(dictn)
    # e.g. {'ABC': '0.838262623', 'CDF': '1.939837286'}
    if name[0] == 'polo':
        print("Polo is speaking!")
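
If you want to refuse commands from voices that match nobody well, here is a rough sketch that works on a score dictionary shaped like the one in the comment above. It does not call piwho at all. The helper name best_speaker and the max_distance cutoff are made up for illustration, and it assumes that a lower score means a closer match, which you should check against your own recordings before relying on it.

```python
def best_speaker(scores, max_distance=None):
    """Return the name with the lowest score, or None if nobody matches.

    Assumption: lower score = closer match. max_distance is a hypothetical
    cutoff that you would have to tune by hand.
    """
    if not scores:
        return None
    # The scores in the example are strings, so convert before comparing.
    parsed = {name: float(value) for name, value in scores.items()}
    name = min(parsed, key=parsed.get)
    if max_distance is not None and parsed[name] > max_distance:
        return None  # best candidate is still too far away: treat as unknown
    return name


scores = {'ABC': '0.838262623', 'CDF': '1.939837286'}
print(best_speaker(scores))                    # ABC
print(best_speaker(scores, max_distance=0.5))  # None
```

With something like this you could run an action only when the returned name is not None, instead of trusting name[0] blindly.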

Thanks. I’ll try that :blush: