Newbie audio input quality question

parasaurolophus · June 30, 2020, 1:10am

Very new to Rhasspy. Have everything working intermittently end-to-end using the containerized distro of 2.5.0:

PyAudio -> Mycroft Precise -> Kaldi -> Fsticuffs -> MQTT

I have also experimented with Porcupine in place of Mycroft and Pocketsphinx in place of Kaldi with similar results.

The problem is that command recording fails more than it succeeds. Wake word triggering works flawlessly 100% of the time. However, while subsequent speech to text and intent processing work occasionally, most of the time the pipeline hangs immediately after the wake word. Looking at the logs makes it appear that the audio recording process thinks there is just silence or noise rather than an issue with understanding my pronunciation or the like. When it works it works every time. But most of the time it appears not even to try.

I have only been at this for a few hours, so I don’t have a lot of data. The pattern appears to be that end-to-end processing succeeds only when the environment is absolutely silent. If there is any background noise – even the subliminal hum of the central A/C fan – then the recording process is stymied.

If I am correct, this seems to be a configuration issue with how the speech to text stage processes the audio input? The hardware / software stack seems up to the challenge of picking out a voice from background noise (even when a ceiling fan and a TV are adding to the confusion). But the speech processing seems to fail from that point onward under any but the most (unrealistically) ideal conditions.
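One rough way to test the background-noise theory is to record a few seconds of the room with nobody speaking and look at how loud the "silence" actually is. The sketch below is only an illustration and is not how Rhasspy itself decides what counts as silence: `frame_rms` and `noise_floor` are hypothetical helper names, and it assumes 16-bit mono PCM at 16 kHz, so that 480 samples make a 30 ms frame.

```python
import array
import math
import sys


def frame_rms(pcm: bytes, frame_samples: int = 480):
    """Yield the RMS level of each full frame of 16-bit mono PCM audio."""
    samples = array.array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(pcm[: len(pcm) - len(pcm) % 2])
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    for start in range(0, len(samples) - frame_samples + 1, frame_samples):
        frame = samples[start:start + frame_samples]
        yield math.sqrt(sum(s * s for s in frame) / frame_samples)


def noise_floor(pcm: bytes, frame_samples: int = 480) -> float:
    """Estimate the noise floor as a low percentile of the frame levels."""
    levels = sorted(frame_rms(pcm, frame_samples))
    # Roughly the 10th percentile; an arbitrary choice for this sketch.
    return levels[len(levels) // 10] if levels else 0.0
```

The bytes can come from the standard `wave` module, for example `w = wave.open("room.wav")` followed by `w.readframes(w.getnframes())`, where `room.wav` is a placeholder name. Comparing the result with and without the A/C fan running would show whether the hum is a meaningful part of the signal.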

There seem to be very few tunable parameters in the audio and speech recognition sections of the web UI. Are there lower-level settings that might help?

Note that I have done my initial experiments using only the not-so-impressive mic built into a laptop. I am awaiting delivery of a schmancy ReSpeaker mic array that I plan to use in production. Is it possible that noise cancellation and similar pre-processing built into such a device will solve my problem?

Fingers crossed, since I would love to get this working and get Google Assistant out of my home!

Thanks in advance for any insights

FredTheFrog · July 2, 2020, 12:08am

You’re getting further than I am. My Jabra Speak 410 USB speaker/mic arrived today, so I quickly plugged it into the Pi, loaded the USB rules to recognize it properly on every boot, and placed the .asoundrc file as needed. I can now play WAV audio files from Raspbian and from within the Rhasspy Docker container. But I cannot for the life of me get the Porcupine wake word to be recognized. Now, I know the USB microphone is working. If I press the record button in the Rhasspy web GUI, it will record my voice, and the words are recognized rather well. It just does not respond to the ‘porcupine’ wake word as expected.

Well, this snippet from supervisord.log is informative:

2020-07-02 01:10:21,221 INFO exited: wake_word (exit status 1; not expected)
2020-07-02 01:10:21,247 INFO spawned: 'wake_word' with pid 5121
2020-07-02 01:10:21,862 INFO exited: wake_word (exit status 1; not expected)

It does this several times, then sort of fails the wake_word task altogether. That would certainly explain why it doesn’t seem to be working correctly.

And it turns out that I did not (or Rhasspy did not) download the necessary files for Porcupine. After using wget to grab the three necessary files from GitHub, I’ll see what happens next. Nope, still no bueno.

2020-07-02 01:33:54,820 INFO exited: wake_word (exit status 1; not expected)
2020-07-02 01:33:55,821 INFO gave up: wake_word entered FATAL state, too many start retries too quickly
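The restart loop in a log like this can be tallied with a few lines of Python. This is a sketch that assumes the log lines have exactly the shape shown in this thread; the regular expression and the `summarize` name are made up for illustration and are not part of supervisord.

```python
import re
from collections import Counter

# Matches lines such as:
#   2020-07-02 01:10:21,247 INFO spawned: 'wake_word' with pid 5121
EVENT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) INFO "
    r"(?P<event>exited|spawned|gave up): '?(?P<name>[\w-]+)'?"
)


def summarize(lines):
    """Count (process name, event) pairs found in supervisord log lines."""
    counts = Counter()
    for line in lines:
        match = EVENT.match(line.strip())
        if match:
            counts[(match.group("name"), match.group("event"))] += 1
    return counts
```

Passing it the open log file, e.g. `summarize(open("supervisord.log"))`, gives a quick view of how many times each process exited before supervisord gave up on it.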

FYI, the contents of my profile.json for ‘wake’:

    "wake": {
        "porcupine": {
            "keyword_path": "porcupine/porcupine_raspberry-pi.ppn",
            "library_path": "porcupine/libpv_porcupine.so",
            "model_path": "porcupine/porcupine_params.pv",
            "sensitivity": "0.65"
        },
        "system": "porcupine"
    },

I thought it was a file download problem with the libpv_porcupine.so file (it came down as HTML instead of binary) but corrected that issue and still get the restarts.
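A download that silently saves an HTML page instead of the binary is easy to miss, so a quick sanity check on all three files may be worth running. The following is a minimal sketch under two assumptions: that the paths in the `wake` section are relative to the profile directory, and that the `.so` file is an ELF shared object (as on Linux). The names `classify` and `check_porcupine` are hypothetical and not part of Rhasspy or Porcupine.

```python
from pathlib import Path

HTML_MARKERS = (b"<!doctype html", b"<html")


def classify(path: Path) -> str:
    """Return 'missing', 'html', 'not-elf' or 'ok' for one downloaded file."""
    if not path.is_file():
        return "missing"
    with path.open("rb") as handle:
        raw = handle.read(512)
    # An HTML page saved by mistake starts with a doctype or <html> tag.
    if raw.lstrip().lower().startswith(HTML_MARKERS):
        return "html"
    # ELF binaries, including Linux shared libraries, start with 0x7F 'ELF'.
    if path.suffix == ".so" and not raw.startswith(b"\x7fELF"):
        return "not-elf"
    return "ok"


def check_porcupine(profile: dict, profile_dir: Path) -> dict:
    """Classify the three Porcupine files named in a loaded profile."""
    settings = profile["wake"]["porcupine"]
    keys = ("keyword_path", "library_path", "model_path")
    return {key: classify(profile_dir / settings[key]) for key in keys}
```

With the profile loaded via `json.load`, `check_porcupine(profile, Path(profile_dir))` returns one status per file. Note that "ok" only means the file looks like a binary; it says nothing about whether the library was built for the CPU it is running on.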

It seems it MAY be an RPi4 error?
GitHub issue 1104898