Intent Handling : remote / publish wakeword detected

KiboOst · December 29, 2019, 10:52am

Hi,

@synesthesiam, you added Intent Handling/remote to post the intent to a URL. On its own, this allows the Jeedom plugin to be notified of the intent and handle it.

What we are missing now is exactly the same thing for when the wakeword is detected.
Could you also post to the remote URL a JSON with the wakeword ID and the site ID that detected it?
{"wakeword": {"wakeId": "snowboy/hey_brigitte.pmdl", "siteId": "salle"}}

If you post such JSON to the remote URL, the plugin could mute any sound that is too loud, so ASR listening works a lot better.
We currently do that with the Snips plugin. The plugin knows the wakeword has been detected and sets a variable in Jeedom that starts a scenario to check and mute any music/sound device in the room immediately. So when we ask something, the sound in the room is muted, then the intent is handled and a scenario unmutes the sound in the room.
It works perfectly, and the only thing I am missing for the Rhasspy plugin is a wakeword-detected message with the wakeword ID and site ID.

Let me know if it’s possible for you to add this. Definitely a great feature to have. Maybe just a checkbox in the interface, “Also notify wakeword”, no need for another URL.

fastjack · December 29, 2019, 12:17pm

I don’t know how this fits with the Hermes protocol.

It seems to me that using events on websockets or MQTT messages would solve all of this HTTP POSTing. This is the main advantage of using a specific protocol that provides all events directly via an endpoint, instead of multiplying specific HTTP requests all over the code for particular events (with the multitude of additional config options that would require).

POST requests would also assume that Rhasspy waits for the response from the remote server, which I do not think is a good idea: it creates hard coupling between components, which is a big no-no in the microservice architecture Rhasspy is heading toward. Rhasspy should not have to wait for a remote server to respond to events. It should only notify any listeners of what it just did (wake word detected, ASR transcribed, NLU recognized, etc.) and handle commands in a non-blocking way.

Is it possible for a Jeedom plugin to run permanently and listen for the events already sent over websockets (or even better, an MQTT broker, as Rhasspy seems to be going the MQTT way) instead, as some kind of bridge daemon between Rhasspy and Jeedom?

As Rhasspy already handles the Hermes protocol, would it be possible to simply fork the Snips Jeedom plugin code with minor changes to the payloads?

What do you guys think?

KiboOst · December 29, 2019, 1:00pm

It is possible, yes, but I won’t do it. It is a lot harder to do, and can cause dependency headaches depending on the Jessie, Stretch, or Buster repos, etc. The Snips plugin works like that, and an update crashed every Jeedom on Jessie due to dependencies (currently you must have Jeedom on Stretch or the plugin won’t work!). It also needs tons of dependencies and runs constantly in the background just to catch a few events a day.

Actually, posting the event to a remote URL is perfect and lightning fast on both sides.
Wakeword detection is the only missing feature regarding post/remote. Adding all the hassle of MQTT and its dependencies to the plugin just for that would be total nonsense, even if MQTT is nice and would be a solution for base/satellite communication between Rhasspy devices.

And if Rhasspy can post to Home Assistant, why could it not post to a generic URL to allow any smart home solution, or anything else, to be fully compatible with Rhasspy?

Also, Rhasspy doesn’t have to wait for an answer at all when posting these intent/wakeword events to the remote URL. Just fire and forget. This takes only a few ms, and only if activated by the user in the settings.
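
A minimal sketch of the fire-and-forget idea, assuming a plain Python sender that uses only the standard library. The names `send_json` and `post_event`, the `sender` parameter, and the timeout are hypothetical and not taken from Rhasspy; the point is only that the caller returns immediately while the POST runs in a background thread and any error is ignored.

```python
import json
import threading
import urllib.request


def send_json(url, payload, timeout=2.0):
    """Blocking POST of a JSON payload (hypothetical helper, not Rhasspy code)."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        response.read()


def post_event(url, payload, sender=send_json):
    """Start the POST in a daemon thread and return without waiting for it."""

    def worker():
        try:
            sender(url, payload)
        except Exception:
            # Fire and forget: a slow or unreachable listener is ignored.
            pass

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The returned thread is only there so a caller (or a test) can join it; normal use would simply discard it.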

Honestly, the Snips plugin is the heaviest one, taking a lot of resources, and it caused many installation problems due to dependencies. All that just to get a JSON and start a scenario according to the intent name. I won’t go that route, sorry.

banderson · December 29, 2019, 3:58pm

I think that I need to agree with @fastjack here. I’m not sure that a bunch of HTTP endpoints for publishing events to clients is the way to go in the long run. In fact, I’m not convinced that HTTP API endpoints for commands are the way to go either. I’m not sure it scales well (the server needs to keep registered callback URLs, set up and tear down HTTP connections, etc.). Also, every new event needs yet another API endpoint defined. I believe that this is really what websockets were designed for: real-time bi-directional communication between client and server. As a case study, note that HASS eventually dropped the HTTP endpoints in favor of websockets only. As an example client, consider AppDaemon, which “hosts” Python “apps” that consume HASS events and produce HASS commands (via websockets). I believe that AppDaemon might also be architected in such a way that it is not limited to doing this just for HASS. It might be possible to teach it about a websocket event/command interface with a Rhasspy “hub”. I guess I am also implying here that a websocket API is a good way to go for satellites too.

KiboOst · December 29, 2019, 8:58pm

Ok, no problem if you want Rhasspy to integrate only with HASS/HA, just let me know. I won’t spend more time on Jeedom plugin development and Rhasspy testing until I know more about where and how Rhasspy wants to be in the future.
I was just asking for a POST request when the wakeword is detected, like the intent being posted when recognized, nothing more. And only if the user checks this option. But it seems that everything that handles a Rhasspy intent and actually does something will have to get MQTT installed and listen 24/7. No problem with that, I will keep an eye on this project anyway, as the foundation seems amazing. But if, apart from HA users, only pure programmers can do something with intents, that is sad.

koan · December 29, 2019, 9:17pm

Relax, @KiboOst, @synesthesiam has made it clear that Rhasspy is not meant only for Home Assistant. The question is just what the right architecture is. Taking the time now to do this right will save a lot of time later to fix things that were done hastily.

synesthesiam · December 29, 2019, 9:52pm

I think there’s value in Rhasspy being able to POST to endpoints on specific events. I was thinking of adding a “webhooks” section to the profile where you could specify endpoints and which events. Should be pretty easy to get something in if you don’t mind editing your profile JSON directly (web UI always takes me much longer).

I figured the “events” would mostly be the same ones as in Snips Land: wake detected, session start/stop, ASR, NLU, errors. I’d like to keep the webhooks pretty simple, since anything more complex than a POST with some json should probably be a Node-RED flow or something.

Thoughts?

KiboOst · December 29, 2019, 10:13pm

I was thinking about such a list of events/posts, but I’m not sure other events would be useful.
I’m all for keeping things simple, and wouldn’t multiply such POST requests at every single step. That will be handled by MQTT publishing in the next versions, I guess.
Wake detected and intent recognized are enough, I think, for an intent handler, be it a plugin for Jeedom or anything else. So having command and/or remote for these two events should cover 90% of needs. Anything more complex can subscribe to MQTT and get everything that happens in order to handle more.

So really, I’m not asking for every event to be posted, just these two, which are really important for any smart home solution that wants to support Rhasspy without installing tons of dependencies and risking the reliability of the solution itself.

Also, I guess posting everything to the same URL would be easier for Rhasspy maintenance (fewer settings). The endpoint can then sort events based on the JSON it receives.
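
As an illustration of a single endpoint sorting events by the JSON it receives, here is a small sketch. It assumes a wake event carries a top-level `wakeword` object, as in the example at the start of this thread, and that an intent event carries an `intent` object with a `name`. The function name `classify_event` and both key checks are assumptions for illustration, not a Rhasspy specification.

```python
import json


def classify_event(body):
    """Return ("wake", info), ("intent", name) or ("unknown", None) for a JSON body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        return ("unknown", None)
    # Shape proposed at the top of this thread:
    # {"wakeword": {"wakeId": ..., "siteId": ...}}
    if isinstance(payload.get("wakeword"), dict):
        return ("wake", payload["wakeword"])
    # Assumed shape of an intent event: {"intent": {"name": ...}, ...}
    intent = payload.get("intent")
    if isinstance(intent, dict) and "name" in intent:
        return ("intent", intent["name"])
    return ("unknown", None)
```

A plugin could then start its "mute" scenario on `"wake"` and its normal intent scenario on `"intent"`, all behind one URL.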

synesthesiam · December 29, 2019, 10:19pm

I’ll be using aiohttp going forward in Rhasspy, since quart is all about asyncio. So while the POST would block the current command (if you have any webhooks), it wouldn’t block other commands at least.

Multiple webhooks for the same event could also be run in parallel, though I’d really recommend a pub/sub system at that point (e.g., MQTT).
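
A rough sketch of running several webhooks for one event in parallel with asyncio. It deliberately avoids guessing at Rhasspy's internals or at specific aiohttp calls: the `post` coroutine is supplied by the caller, and `notify_all` and its parameters are hypothetical names. In a real setup, `post` could wrap an aiohttp client session.

```python
import asyncio


async def notify_all(urls, payload, post):
    """Call post(url, payload) for every URL concurrently.

    Returns a dict mapping each URL to its result, or to the exception it
    raised, so one failing webhook does not stop the others from being notified.
    """
    results = await asyncio.gather(
        *(post(url, payload) for url in urls),
        return_exceptions=True,
    )
    return dict(zip(urls, results))
```

Because `asyncio.gather` schedules all the coroutines at once, the total time is roughly that of the slowest webhook rather than the sum of all of them.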

banderson · December 30, 2019, 1:18am

@KiboOst: I certainly did not intend to imply that Rhasspy would be limited to HASS only. Quite the contrary. Sorry if that came across wrong, my apologies. Any references to HASS and AppDaemon are solely from an example point of view…examples of mechanisms that have proven themselves useful in the context in which they are deployed. Maybe they don’t apply to a future Rhasspy architecture…they are intended only as examples of how one might approach the problem. I am still trying to understand the Rhasspy architecture and use cases at a basic level. Accommodating many use cases is certainly what I am interested in. I had also assumed that your requests for callbacks were larger in scope than you apparently intended. So to be fair, these two “intent handler” callbacks do seem reasonable…but… Given that they are currently statically defined (in the profile), this isn’t necessarily what I would expect from a general-purpose mechanism. What if multiple “clients” want to be informed of intent handler events? Do they “register” dynamically with the “server”? That brings up a whole set of issues that we probably want to avoid. It seems as if websockets might be a better mechanism for accomplishing this. Still, I do believe there are issues even with this approach if we are truly going to move to a distributed services architecture.

It does occur to me that the current HTTP API is a reasonable choice for a monolithic “server” architecture. It is not so clear how this scales/applies if we are to move towards a distributed services architecture. Here is where HASS is not a good example, as it is a monolithic server, not a set of distributed services. In HASS, and in the current Rhasspy implementation, there is a single “server” endpoint. In a distributed architecture, there are potentially many endpoints (one for each service). What URL does the client use to issue HTTP requests (or websocket upgrades) to? Is there an HTTP “adapter” service that “aggregates” on behalf of the underlying services? Possibly. This could hide the underlying services architecture on behalf of those clients that prefer an HTTP or websocket API at a single base URL. Decoupling the underlying services from each other seems like a reasonable architectural goal, as has been previously discussed here and in other topics. Each service shouldn’t have to know what other service(s) are interested in the data or events it produces. Likewise, a particular service shouldn’t care what its clients are. MQTT is certainly a proven way to accomplish this decoupling between services. I’m sure we could come up with other ways, but pub/sub is definitely a proven pattern for this style of architecture. I would bet this was one of the driving issues behind Snips choosing the architecture that they did.

@synesthesiam: Hoping that we can continue to discuss these architectural issues (possibly in a new thread?) before you spend a lot of your time implementing other API mechanisms. It seems like it would be a good thing™ for you to spend your time on continuing to separate functionality into components that could eventually be deployed as services, as you know the internals of Rhasspy best. That, and adding more domain-specific stuff (which I don’t necessarily understand that well and in which you are obviously a domain expert). Maybe the current server-based architecture could continue with feature improvements and internal refactoring to support separate components whilst we consider a new architecture. Once that architecture is agreed upon, maybe we start a second (v2) implementation in a different branch and diverge from the current master until such time as v2 is ready? Just a thought. It’s your baby, man!

synesthesiam · December 31, 2019, 6:04pm

I’ve added a bare bones webhook for @KiboOst and the Jeedom plugin. We can consider different design options going forward, and I’d be interested in any feedback.

The webhook settings are not currently described in the docs or available in the web UI. But it’s very easy to add to your profile:

{
  "webhooks": {
    "awake": ["http://localhost:5000", ...]
  }
}

This will POST to all of the URLs in the list when Rhasspy wakes up. The data is JSON with these fields:

{
    "wakewordId": "...",
    "siteId": "..."
}

Hope it works for you!
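
For anyone writing the receiving side, a minimal sketch of an endpoint for the "awake" webhook, using only Python's standard library. The handler class, the `on_wake` callback, `make_server`, and the default host/port are hypothetical. The only things taken from the post above are that the request is a POST and that the JSON body has `wakewordId` and `siteId` fields.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def on_wake(wakeword_id, site_id):
    """Hypothetical hook: e.g. mute the audio devices in the room for site_id."""
    print(f"wake word {wakeword_id!r} detected on site {site_id!r}")


def parse_awake(body):
    """Extract (wakewordId, siteId) from the webhook body; missing fields become None."""
    payload = json.loads(body)
    return payload.get("wakewordId"), payload.get("siteId")


class AwakeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            wakeword_id, site_id = parse_awake(self.rfile.read(length))
        except (ValueError, AttributeError):
            self.send_response(400)  # body was not a JSON object
            self.end_headers()
            return
        on_wake(wakeword_id, site_id)
        self.send_response(204)  # no body needed; the sender only needs an ack
        self.end_headers()


def make_server(host="127.0.0.1", port=5000):
    """Build the server; port 5000 mirrors the example URL in the profile snippet."""
    return HTTPServer((host, port), AwakeHandler)
```

Calling `make_server().serve_forever()` would start listening; the handler answers quickly and leaves the real work (muting, scenarios) to `on_wake`.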

KiboOst · January 1, 2020, 11:53am

Many thanks @synesthesiam, I just tested it and it works perfectly!
I will publish a new beta of the Jeedom plugin. A great new feature that some users had already asked for.

synesthesiam · January 1, 2020, 2:24pm

Awesome, cheers!

MarcBouteiller · April 19, 2020, 12:22pm

Hi !
This is what I was looking for and works well with HA too

Thanks !

KiboOst · April 19, 2020, 12:27pm

Actually, it only works in 2.5 with the default siteId.

leoben7 · April 23, 2020, 5:48pm

Perfect… It was really useful to me as well…

KiboOst · December 12, 2020, 2:59pm

Hi. Sorry to revive an old subject, but it only POSTs to the last URL in the list.
I would not normally use several, but I did for testing purposes…