Continue conversation and unknown sentences

Siparker · December 5, 2021, 4:28pm

I wanted to get a bit further with my Rhasspy install, and one feature I would like to add is a two-part conversational answer/reply. Once I understand how to do that, I’m sure I could expand on it.

So I wake Rhasspy and say “Search Google”.
Rhasspy recognises the intent with Fsticuffs and then needs to say via TTS “What would you like to search for?”,
restarting the listening so I can say “world’s tallest building”.

The bit I don’t quite understand at the moment: would I continue to use Home Assistant for the intent handling here? I would receive the rhasspy_searchgoogle event and then restart the session by publishing to hermes/dialogueManager/startSession with the site ID, as per this thread: Start Conversation with TTS and start listening

Or would I use the Rhasspy dialogue management to do it?

If I continue the dialogue through HA with startSession, would I pass across the original intent that was matched, so I know what the continuation relates to? Is this an HA thing, or should it be done within Rhasspy?

My last question: if I allow any sort of random Google search (how I parse and return the content is another story, but for this example imagine it can be done), how would this get back to Home Assistant? As Rhasspy only recognises its trained sentences, a potentially random search phrase would fail as not recognised, wouldn’t it?

Or am I missing something?

If anyone has any examples in Rhasspy, HA or Node-RED, I would welcome the knowledge.

romkabouter · December 5, 2021, 8:07pm

If you have an event in which you want to continue a session, you can publish to the topic hermes/dialogueManager/continueSession and pass the original sessionId.
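A minimal sketch of what that continueSession message could look like, assuming the payload keys listed in the Rhasspy MQTT reference (sessionId, text, intentFilter, sendIntentNotRecognized). The helper name build_continue_session and the example session ID are made up for illustration.

```python
import json

CONTINUE_TOPIC = "hermes/dialogueManager/continueSession"


def build_continue_session(session_id, text, intent_filter=None,
                           send_intent_not_recognized=True):
    """Return (topic, payload) for continuing an open dialogue session.

    Hypothetical helper; the payload keys are assumed to follow the
    Hermes protocol as documented in the Rhasspy reference.
    """
    message = {
        "sessionId": session_id,  # taken from the intent that opened the session
        "text": text,             # spoken via TTS before listening again
        "sendIntentNotRecognized": send_intent_not_recognized,
    }
    if intent_filter:
        # Optionally restrict which intents may match the follow-up
        message["intentFilter"] = list(intent_filter)
    return CONTINUE_TOPIC, json.dumps(message)


# Example with a placeholder session ID
topic, payload = build_continue_session(
    "example-session-id", "What would you like to search for?")
print(topic)
print(payload)
```

With a connected paho-mqtt client, the result would be sent with client.publish(topic, payload).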

Here is the mqtt api : Reference - Rhasspy
And here an example of someone who is using it.

That is not exactly what you require, but might help.
There is only one problem with Google search though: the text you want to search for is probably not in any intent, so after the first continueSession you will most likely get an IntentNotRecognized, just like you say.
You will have to find a way to transform that last bit into an actual search on google and process the response.
I think that will be the hardest part.

Siparker · December 6, 2021, 1:25pm

Hi @romkabouter, thanks for the quick reply.
I have just started to use Node-RED, which has opened up quite a lot of additional possibilities for me.
The current plan is to accept the usual intent through Home Assistant, with Node-RED listening for the specific event to start the Google search.
Node-RED then takes the session ID and site ID and continues the session, listening for the search phrase.
Figure out how to make Rhasspy allow plain audio streaming for this part, ignoring any intents and just sending the audio.
Again Node-RED gets the input, and finally…
An API call to the Google Assistant API through Python gets the result and sends it back to the site ID with TTS.

I guess I need to know how Rhasspy will deal with unknown audio, or whether I should even be using Rhasspy at that point rather than trying to bypass it. For the search part I just need the audio to be piped to somewhere Node-RED can access it.

romkabouter · December 6, 2021, 3:30pm

If you want to use Node-RED, you can just as well let NR handle that intent.
No need for HA to be involved.

Rhasspy has no audio passthrough, I believe, but when no intent is recognized the transcribed text will be in the MQTT message.
So you can let NR handle an IntentNotRecognized and let it do something with that text.
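To illustrate the idea outside Node-RED, here is a small Python sketch that pulls the transcribed text out of a not-recognized message. It assumes the transcription arrives in the "input" field of the JSON payload on hermes/nlu/intentNotRecognized; the function name and the example payload are hypothetical.

```python
import json

NOT_RECOGNIZED_TOPIC = "hermes/nlu/intentNotRecognized"


def extract_unrecognized_text(topic, payload):
    """Return (session_id, text) for a not-recognized message, else None.

    Assumption: the transcription is stored under the "input" key.
    """
    if topic != NOT_RECOGNIZED_TOPIC:
        return None
    try:
        message = json.loads(payload)
    except (ValueError, UnicodeDecodeError):
        return None  # payload is not valid JSON, ignore it
    if not isinstance(message, dict):
        return None
    text = (message.get("input") or "").strip()
    if not text:
        return None  # nothing was transcribed
    return message.get("sessionId"), text


# Made-up payload, shaped like the messages quoted in this thread
example = b'{"input": "worlds tallest building", "siteId": "room", "sessionId": "abc"}'
print(extract_unrecognized_text(NOT_RECOGNIZED_TOPIC, example))
```

The returned text could then be handed to whatever performs the actual search, and the session ID used to send the spoken answer back.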
Interesting topic, I’ll see if I can work something out. Just to see if it is possible

sajov · January 27, 2022, 5:30pm

@Siparker
Hi, did you get it working?
I also tried something like this, but instead of relying on not-recognized, I tried:
hotword → ask google (as sentence) → but here I struggled with starting a recording session and passing the audio stream via NR to Google Assistant.

jens-schiffke · January 27, 2022, 6:41pm

One option would be to subscribe to the speech audio only when needed. You can use continueSession for this. An intentNotRecognized then arrives. Setting sendIntentNotRecognized to true leaves the session open, so you can complete the session normally.

Rhasspy, search on Google.
…
hermes/dialogueManager/sessionStarted
hermes/nlu/intentParsed {"input": "search on google", "intent": {"intentName": "google_search", "confidenceScore": 1.0}, "siteId": "room",…
…
hermes/dialogueManager/continueSession {"customData": "rhasspy", "siteId": "room", "text": "what do you want to know?", "sendIntentNotRecognized": true, "intentFilter": ["google_search_action"…
if hermes/asr/startListening then subscribe hermes/audioServer/room/audioFrame
if hermes/asr/stopListening then unsubscribe hermes/audioServer/room/audioFrame
send received audio data to Google-Audio-Search
hermes/tts/say {"text": "$google-reply", "siteId": "room", "sessionId": "…
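
The subscribe/unsubscribe step in the flow above could be sketched roughly like this in Python. This is only an illustration: the function names are hypothetical, and it assumes the startListening/stopListening payloads carry a "siteId" field that defaults to "default" when absent.

```python
import json


def audio_frame_topic(site_id):
    """Build the audio topic for one satellite, e.g. site "room"."""
    return "hermes/audioServer/{}/audioFrame".format(site_id)


def subscription_action(topic, payload, site_id):
    """Map an ASR control message to ("subscribe" | "unsubscribe", topic).

    Returns None for unrelated topics or for other sites.
    """
    actions = {
        "hermes/asr/startListening": "subscribe",
        "hermes/asr/stopListening": "unsubscribe",
    }
    action = actions.get(topic)
    if action is None:
        return None
    message = json.loads(payload) if payload else {}
    # Assumption: a missing siteId means the "default" site
    if message.get("siteId", "default") != site_id:
        return None
    return action, audio_frame_topic(site_id)


print(subscription_action("hermes/asr/startListening",
                          b'{"siteId": "room"}', "room"))
```

Inside an MQTT on_message callback, the returned pair would be passed on to client.subscribe(...) or client.unsubscribe(...).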

Greetings, Jens

sajov · January 27, 2022, 7:29pm

@jens-schiffke
Thank you, i will try that

jens-schiffke · January 28, 2022, 8:56pm

Unfortunately, my attempt doesn’t work with audioFrame. It works with playBytes. What can be the reason? Is the recording stream different?

#!/usr/bin/env python
import io
import wave

import paho.mqtt.client as mqtt

MQTThost = '192.168.100.1'
MQTTport = 1883
AUDIO_TOPIC = "hermes/audioServer/+/audioFrame"

# Output file: 16 kHz, 16-bit, mono
soundfile = wave.open('/root/audio.wav', 'wb')
soundfile.setframerate(16000)
soundfile.setsampwidth(2)
soundfile.setnchannels(1)


def on_connect(client, userdata, flags, rc):
    print("Connected with result code " + str(rc))
    client.subscribe("hermes/asr/startListening")
    client.subscribe("hermes/asr/stopListening")


def on_message(client, userdata, msg):
    if msg.topic == "hermes/asr/startListening":
        # Only receive audio while the ASR is listening
        client.subscribe(AUDIO_TOPIC)
    elif msg.topic == "hermes/asr/stopListening":
        client.unsubscribe(AUDIO_TOPIC)
    elif mqtt.topic_matches_sub(AUDIO_TOPIC, msg.topic):
        # Each audioFrame payload is a small, complete WAV file:
        # drop its header and append only the raw frames.
        with io.BytesIO(msg.payload) as wav_buffer:
            with wave.open(wav_buffer, 'rb') as wav:
                audiodata = wav.readframes(wav.getnframes())
                soundfile.writeframes(audiodata)


client = mqtt.Client()  # paho-mqtt 1.x style constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect(MQTThost, MQTTport, 60)
try:
    client.loop_forever()
finally:
    # Closing patches the WAV header with the final length
    soundfile.close()

p.s. I revised it again and it works now. thanks to @romkabouter

romkabouter · January 28, 2022, 9:51pm

Yes, playBytes is one big WAV file.
audioFrame is a lot of small WAV files.
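
To make the difference concrete, here is a self-contained sketch using only the standard library. It fakes a few small WAV chunks (standing in for audioFrame payloads) and merges them into one WAV file by dropping each chunk's header and appending only the raw frames. The helper names and the chunk size are made up for the example.

```python
import io
import wave


def make_wav_chunk(pcm, rate=16000, width=2, channels=1):
    """Wrap raw PCM bytes in a WAV container, like one small audio chunk."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setframerate(rate)
        wav.setsampwidth(width)
        wav.setnchannels(channels)
        wav.writeframes(pcm)
    return buffer.getvalue()


def merge_wav_chunks(chunks):
    """Combine many small WAV files into a single WAV file (as bytes)."""
    output = io.BytesIO()
    writer = None
    for chunk in chunks:
        with wave.open(io.BytesIO(chunk), "rb") as reader:
            if writer is None:
                # Take the audio format from the first chunk
                writer = wave.open(output, "wb")
                writer.setparams(reader.getparams())
            writer.writeframes(reader.readframes(reader.getnframes()))
    if writer is not None:
        writer.close()  # finalises the header with the total length
    return output.getvalue()


# Three fake chunks of 256 silent 16-bit samples each
chunks = [make_wav_chunk(b"\x00\x00" * 256) for _ in range(3)]
merged = merge_wav_chunks(chunks)
with wave.open(io.BytesIO(merged), "rb") as wav:
    print(wav.getnframes(), wav.getframerate())
```

Simply concatenating the payloads would leave a WAV header in front of every chunk, which is why each one is opened and only its frames are copied.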

You can record from audioFrame with a script like this:

github.com

Romkabouter/ESP32-Rhasspy-Satellite/blob/voco/record.py

#!/usr/bin/python

import time
import sys
import os
import paho.mqtt.client as paho
import wave
import io
import pyaudio
broker = "192.168.43.79"

# folder = './tmp/'
# file_name = folder + 'wav_'

output = io.BytesIO()
p = pyaudio.PyAudio()
streamOut = p.open(format=pyaudio.paInt16,
                    channels=1,
                    rate=16000,
                    output=True)

This file has been truncated.