rhasspyBatcher : fine tune your intents / slots

KiboOst · December 25, 2019, 8:32pm

Hi,

It’s Christmas, so here is a little Python script that I originally wrote for Snips and have now converted to Rhasspy.

It lets you easily test dozens of sentences for intent recognition.

I have used it to fine-tune my intents / slots and got 100% matched over 52 test sentences.

I’m still waiting for number, datetime and duration slots before I can convert my last Snips intents to Rhasspy.

Load it in Sublime Text or any other editor, enter your Rhasspy IP and port in the main block, and choose singleTest() or batchTest().

Here is json format for batch test:

{
	"allume la lumière de la cuisine": "lightsTurnOnJeedom",
	"balance de la pop": "turnOnJeedom"
}
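
A batch file in this format can also be generated from Python, which avoids hand-editing mistakes such as a stray trailing comma (strict JSON parsers reject one). This is only a minimal sketch: write_batch and read_batch are made-up helper names, and the file is written to a temporary directory so the demo does not overwrite a real batch.json.

```python
import json
import os
import tempfile

def write_batch(path, expected):
    """Write a {sentence: expected intent name} mapping as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps accented characters readable in the file
        json.dump(expected, f, ensure_ascii=False, indent=4)

def read_batch(path):
    """Read the mapping back, decoding the file as UTF-8."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

expected = {
    "allume la lumière de la cuisine": "lightsTurnOnJeedom",
    "balance de la pop": "turnOnJeedom",
}

# Temporary location for the demo; use your own path in practice
demo_path = os.path.join(tempfile.mkdtemp(), "batch.json")
write_batch(demo_path, expected)
print(read_batch(demo_path) == expected)
```

Each key is the sentence to test and each value is the intent name it should resolve to.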

And the result:

[ MATCHED] lightsTurnOnJeedom | query: allume la lumière de la cuisine | confidence:1.0 | Slots: house_room : cuisine
[ MATCHED] turnOnJeedom | query: balance de la pop | confidence:1.0 | Slots: device_name : musique | music_genre : pop

matched: 2 | unmatched: 0 | total: 2

Script:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import json
import time
import urllib.parse
import urllib.request

# Short aliases used throughout the script
requestUrl = urllib.request
parseUrl = urllib.parse

_debug = False
#_debug = 3

class rhasspyBatcher():
	def __init__(self, _adrr='', _port=''):
		self._version = 0.12
		self._adrr = _adrr
		self._port = _port

		self._urlHost = self._adrr+':'+self._port
		self._reqHdl = None

		self.matched = 0
		self.unmatched = 0
		self.tested = 0

		if self.connect() == True:
			if _debug: debug(1, "__Rhasspy connected__")
	#

	def connect(self):
		answer = self.request('GET', self._urlHost, '/api/version')
		if _debug: debug(1, 'answer connect: %s'%answer)
		if answer[0] == 200 : return True
		return False
	#

	def testQuery(self, inputQuery):
		inputQuery = inputQuery.lower()
		answer = self.request('POST', self._urlHost, '/api/text-to-intent?nohass=true', inputQuery)
		if _debug: debug(6, "testQuery:answer %s"%answer)
		return answer
	#

	def showResult(self, nluInference, matchIntent='', query='', showOnlyUnmatched=False):
		self.tested += 1
		displayResult = ''
		showThis = True

		intentInput = nluInference
		intent = intentInput['intent']
		confidence = intent['confidence']
		intentName = intent['name']

		entities = nluInference['entities']
		slots = nluInference['slots']

		#no intent found:
		if intentName == '':
			displayResult = '[UNFOUND     ]'
			self.unmatched += 1
			if matchIntent:
				displayResult += ' should match: %s | query: %s'%(matchIntent, query)
			print(displayResult)
			return False

		if matchIntent != '':
			if matchIntent == intentName:
				if showOnlyUnmatched: showThis = False
				self.matched += 1
				displayResult += '[     MATCHED] %s | query: %s'%(matchIntent, query)
			else:
				self.unmatched += 1
				displayResult += '[---UNMATCHED] %s  should: %s | query: %s'%(intentName, matchIntent, query)

		displayResult += ' | confidence:%s'%round(confidence, 2)

		if showThis:
			# Join the slots as "name : value" pairs separated by " | "
			displaySlotResult = ' | '.join('%s : %s'%(slotName, slots[slotName]) for slotName in slots)
			if len(slots) == 0:
				displaySlotResult = 'No slot found'
			displayResult += ' | Slots: %s'%displaySlotResult
			print(displayResult)
	#

	def request(self, method, host, path='', jsonString=None, postinfo=None): #standard function handling all get/post request
		if self._reqHdl == None:
			self._reqHdl = requestUrl.build_opener()
			self._reqHdl.addheaders = [
						('User-agent', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:51.0) Gecko/20100101 Firefox/51.0'),
						('Connection', 'keep-alive'),
						('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8'),
						('Upgrade-Insecure-Requests', '1')
					]

		url = host+path
		if _debug: debug(5, 'request: %s method: %s postinfo: %s'%(url, method, postinfo))

		if method == 'GET':
			answer = self._reqHdl.open(url, timeout = 5)
		else:
			if jsonString != None:
				jsonBytes = jsonString.encode()
				req = urllib.request.Request(url, data=jsonBytes, headers={'Content-Type': 'application/json'})
				answer = self._reqHdl.open(req)

			if postinfo != None:
				# urlencode() returns a str; the POST body must be bytes, so keep the encoded result
				data = parseUrl.urlencode(postinfo).encode('utf-8')
				answer = self._reqHdl.open(url, data, timeout = 5)

		if jsonString != None:
			if _debug: debug(5, 'request info: %s'%answer.info())
			result = json.load(answer)
			return result

		return [answer.getcode(), answer.read()]
	#
#

def debug(level, text):
	if _debug >= level:
		print("--debug:", text)
#

def batchTest(jsonPath):
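	# json.load() no longer accepts an 'encoding' argument (removed in Python 3.9), so decode when opening the file
	with open(jsonPath, encoding='utf-8') as batchFile:
		batchJson = json.load(batchFile)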
	for query in batchJson:
		if _debug: debug(3, 'batch test: %s | %s'%(query, batchJson[query]))
		try:
			result = rhasspy.testQuery(query)
			rhasspy.showResult(result, batchJson[query], query)
		except Exception as e:
			print('Batch test ERROR: ', e)
			break
		time.sleep(0.4)
#

def singleTest(query, matchIntent=''):
	result = rhasspy.testQuery(query)
	rhasspy.showResult(result, matchIntent, query)
	time.sleep(0.4)
#


#testing purpose:
if __name__ == "__main__":
	_adrr = "http://192.168.0.140"
	_port = "12101"
	rhasspy = rhasspyBatcher(_adrr, _port)

	singleTest("allume la lumière", 'lightsTurnOnJeedom')
	#batchTest('batch.json')

	print()
	print('matched: %s | unmatched: %s | total: %s'%(rhasspy.matched, rhasspy.unmatched, rhasspy.tested))
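
showResult() only uses a few keys of the JSON returned by /api/text-to-intent: the intent name, the intent confidence and the slots. The sketch below builds one result line from a hand-written response with that shape. The values are illustrative, taken from the example output above, not captured from a real Rhasspy instance. format_result is a hypothetical stand-alone helper that roughly mirrors the script's logic and is not part of the script itself.

```python
def format_result(response, expected_intent, query):
    """Build one result line from a text-to-intent style response dict."""
    intent = response["intent"]
    name = intent["name"]

    # An empty intent name means nothing was recognized
    if name == "":
        return "[UNFOUND     ] should match: %s | query: %s" % (expected_intent, query)

    if name == expected_intent:
        line = "[     MATCHED] %s | query: %s" % (name, query)
    else:
        line = "[---UNMATCHED] %s  should: %s | query: %s" % (name, expected_intent, query)

    line += " | confidence:%s" % round(intent["confidence"], 2)

    # Slots are a simple {slot name: value} mapping
    slots = response.get("slots", {})
    slot_text = " | ".join("%s : %s" % (k, v) for k, v in slots.items())
    return line + " | Slots: " + (slot_text or "No slot found")

# Hand-written sample response (assumed shape, illustrative values)
sample = {
    "intent": {"name": "lightsTurnOnJeedom", "confidence": 1.0},
    "entities": [],
    "slots": {"house_room": "cuisine"},
}

print(format_result(sample, "lightsTurnOnJeedom", "allume la lumière de la cuisine"))
```

Working on a plain dict like this makes it possible to check the formatting without a running Rhasspy server.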

It is rather simple, but having a JSON file with the set of sentences the family actually uses helps validate the assistant after modifications or training. It helped me a lot to understand and debug Snips NLU, and it has just helped me a lot with Rhasspy. So if anyone finds it useful, feel free to use it.
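
The same idea can be used as a small regression check after each training. The sketch below is only an illustration: run_regression is a hypothetical helper, and the recognizer is passed in as a function so the example runs without a Rhasspy server. Here a plain dict lookup stands in for the real recognition call.

```python
def run_regression(expected, recognize):
    """Compare expected intents with what the recognizer returns.

    expected:  {sentence: expected intent name}
    recognize: callable taking a sentence and returning an intent name
    Returns (number of matches, list of (sentence, wanted, got) failures).
    """
    failures = []
    for sentence, wanted in expected.items():
        got = recognize(sentence)
        if got != wanted:
            failures.append((sentence, wanted, got))
    return len(expected) - len(failures), failures

expected = {
    "allume la lumière de la cuisine": "lightsTurnOnJeedom",
    "balance de la pop": "turnOnJeedom",
}

# Stand-in for the real recognizer: pretend the second sentence regressed
fake_answers = {
    "allume la lumière de la cuisine": "lightsTurnOnJeedom",
    "balance de la pop": "lightsTurnOnJeedom",
}

matched, failures = run_regression(expected, lambda s: fake_answers.get(s, ""))
print("matched: %s | unmatched: %s | total: %s" % (matched, len(failures), len(expected)))
for sentence, wanted, got in failures:
    print("  %s -> got %s, wanted %s" % (sentence, got, wanted))
```

With the script above, the recognizer could be something like a function returning rhasspy.testQuery(sentence)['intent']['name'], and a non-empty failures list could be turned into a non-zero exit code.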

After a few hundred tests, Rhasspy ends up reporting “can’t create new thread”. You then have to restart the Docker container or the venv to get it running properly again. I have done this a few times without problems, while doing tons of intent editing, training, etc.

KiboOst · January 30, 2020, 8:45pm

For those interested in this kind of batch testing, I’ve published an enhanced version that can be used from the command line or as a Python module.

Everything is available and explained here: Rhasspy-BatchTester

I still get a 100% match over 58 tests.

geoffrey · March 8, 2020, 7:32am

This is really helpful when mixing and matching sentences in another language like you do as well! Very well done.

Is there an option to prevent the intent from being handled? At the moment, when I have a sentence that tells Rhasspy to, e.g., turn on a light, it actually does so when the batch runs.

For lights this is not much of a problem, but testing an intent that opens a gate is another matter.

KiboOst · March 9, 2020, 8:59pm

Actually, it shouldn’t send the intent to the intent handler. The request to the HTTP API is made (with nohass=true) precisely so that it isn’t forwarded, and I never saw this happen on 2.4.17!
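
For reference, the script's testQuery() posts the sentence to /api/text-to-intent with nohass=true in the query string. Here is a small sketch of how that URL is put together, using the address and port from the script's main block. text_to_intent_url is a hypothetical helper, and the sketch says nothing about what else Rhasspy may emit (for example websocket events) when the request is processed.

```python
from urllib.parse import urlencode

def text_to_intent_url(host, port, nohass=True):
    """Build the text-to-intent URL, optionally with the nohass flag."""
    url = "%s:%s/api/text-to-intent" % (host, port)
    if nohass:
        # Same query parameter the script uses in testQuery()
        url += "?" + urlencode({"nohass": "true"})
    return url

print(text_to_intent_url("http://192.168.0.140", "12101"))
```

Printing the URL before running a batch is a quick way to confirm the flag is really present in the request.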

geoffrey · March 9, 2020, 11:12pm

I assume it still triggers an event on the websocket carrying the intent information, and my setup reacts to that using Node-RED.

How do you perform your intent handling?

KiboOst · March 10, 2020, 7:46am

With remote HTTP to the Jeedom plugin.