Dialogue Manager documentation?

Light · May 13, 2022, 3:35am

I have seen it mentioned in several places that it is better to use the dialogue manager to generate speech notifications, rather than posting to the topic hermes/tts/say. However, there seems to be a lack of documentation explaining how the dialogue manager works. For example, one simple question I have been wondering about: what happens if you start a new dialogue manager session in notification mode, and then start another one before the speech has finished? Does it queue up, or is it more appropriate to check somehow whether the first one has finished?

By the way, I’m not hating; I appreciate that writing documentation is not easy and is time-consuming.

romkabouter · May 13, 2022, 7:21am

Publishing a new message to startSession will do just that: start a new session. You can let it be queued by setting the canBeEnqueued property.
That can be read here: Reference - Rhasspy

There is nothing wrong with posting to tts/say for a notification or anything like that.
If you want some interaction with Rhasspy, it is best to use startSession, because with that you have an id to keep track of that interaction.

So it totally depends on your use case.
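
To make the two options concrete, here is a minimal Python sketch of the messages involved. The topic and field names follow my reading of the Rhasspy reference (Hermes protocol) and should be checked against it. The helper function names are made up for illustration. Actually publishing the result needs an MQTT client (for example paho-mqtt), which is left out here.

```python
import json

START_SESSION_TOPIC = "hermes/dialogueManager/startSession"
TTS_SAY_TOPIC = "hermes/tts/say"


def notification_session(text, site_id="default", custom_data=None):
    """Build (topic, payload) for a startSession message in notification mode.

    The notification init has no canBeEnqueued field in the reference.
    """
    payload = {"init": {"type": "notification", "text": text}, "siteId": site_id}
    if custom_data is not None:
        # customData is passed along with the session, handy for telling sessions apart
        payload["customData"] = custom_data
    return START_SESSION_TOPIC, json.dumps(payload)


def action_session(text, can_be_enqueued=True, site_id="default"):
    """Build (topic, payload) for a startSession message in action mode.

    canBeEnqueued only exists for the action init type.
    """
    payload = {
        "init": {"type": "action", "text": text, "canBeEnqueued": can_be_enqueued},
        "siteId": site_id,
    }
    return START_SESSION_TOPIC, json.dumps(payload)


def tts_say(text, site_id="default", request_id=""):
    """Build (topic, payload) for speaking directly through hermes/tts/say."""
    payload = {"text": text, "siteId": site_id, "id": request_id}
    return TTS_SAY_TOPIC, json.dumps(payload)
```

With startSession the dialogue manager assigns a session id that you can follow through the session messages; with tts/say you only have the id you pass in yourself.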

Light · May 13, 2022, 3:33pm

Thanks, you are prolific here. Yes, I have seen the reference; for example, there is no property on the notification settings for startSession indicating whether it can be queued. So that makes me wonder what the implied behavior is. Even if using text-to-speech directly, the question is still relevant. Naturally I can probably figure this out by doing some tests, but I was hoping to save some time by drawing on the expertise of those more knowledgeable than me.

rejoe2 · May 13, 2022, 3:56pm

You may be right with that question. In addition to that: it is also not mentioned what will happen when startSession is called with the “not to be enqueued” flag while another session (to my understanding: on the same satellite) is running.

Unfortunately, I don’t have an answer to either of them yet. At least as far as I’ve seen until now, there has not been any real need to clarify that; perhaps there have not been that many users trying to run several sessions in parallel until now.
In my personal experience, sticking as closely as possible to the provided functionality seems to be the way to avoid future trouble. So this is why I always prefer the “dialogue manager way” to addressing “say” directly: I just expect the responsible program to do it “best” (from the perspective of the entire ecosystem). So I’d bet on notifications always being marked as enqueueable and expect other behaviour to be a bug (or a missing detail in the docs, or a future feature that makes setting the flag possible)…

In general, I really appreciate the docs, they’re really great! They are already very detailed, and going into even more detail might lead to more confusion in the end. (Just my 2ct.)

romkabouter · May 13, 2022, 5:50pm

A notification is queued: a second notification will only start after the first one, so it is almost the same as tts/say.
For tts/say, that is also queued, because messages arrive one after another. There is no real need to explicitly make it queued or not.

If you want to see what the actual behaviour is, it is best to try it, I guess.
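
One way to try it: start two notification sessions back to back, each with its own customData tag, capture what the dialogue manager publishes, and check whether the sessions overlap. The sketch below covers only the analysis step. It assumes the sessionQueued, sessionStarted and sessionEnded topics and the customData field behave as described in the Hermes protocol; the function names are hypothetical, and the messages would come from whatever MQTT client you use to subscribe.

```python
import json

# Dialogue manager topics that report session state (Hermes protocol).
SESSION_TOPICS = {
    "hermes/dialogueManager/sessionQueued": "queued",
    "hermes/dialogueManager/sessionStarted": "started",
    "hermes/dialogueManager/sessionEnded": "ended",
}


def session_timeline(messages):
    """Turn captured (topic, payload) pairs into (state, customData) tuples.

    Messages are kept in arrival order; unrelated topics are ignored.
    """
    timeline = []
    for topic, payload in messages:
        state = SESSION_TOPICS.get(topic)
        if state is None:
            continue
        data = json.loads(payload)
        timeline.append((state, data.get("customData")))
    return timeline


def ran_sequentially(timeline):
    """Return True if no session started while another one was still running."""
    running = None
    for state, tag in timeline:
        if state == "started":
            if running is not None:
                return False  # a second session started before the first ended
            running = tag
        elif state == "ended" and tag == running:
            running = None
    return True
```

If the second session only shows up as started after the first one has ended, the notifications were effectively queued.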

Light · May 14, 2022, 2:57am

Great, thank you. As you suggest, I will give it a try.

Light · May 17, 2022, 3:32am

I finally got around to trying this out, and as @romkabouter suggested, both the text-to-speech and startSession notifications get queued by default.