Action Protocol
Estuary characters can trigger actions in your application by embedding XML-style tags in their text responses. This lets AI characters do more than just talk -- they can animate, navigate, play sounds, or control any aspect of your application.
How Actions Work
Actions are embedded directly in bot_response text using self-closing XML tags:
Sure, I'd love to wave at you! <action name="wave" />
When the SDK receives a bot_response containing action tags:
- The tags are parsed to extract the action name and parameters
- Action events are dispatched to your registered handlers
- The tags are stripped from the display text (the default; the original text remains available)
This means the user sees: "Sure, I'd love to wave at you!" while your code receives a wave action to trigger an animation.
Action Tag Format
Actions use self-closing XML tags with a required name attribute:
<action name="actionName" />
Actions can include additional parameters as attributes:
<action name="navigate" target="kitchen" speed="fast" />
<action name="emote" type="laugh" intensity="0.8" />
<action name="play_sound" clip="doorbell" volume="0.5" />
Rules
- The name attribute is required and identifies which handler to invoke
- All other attributes are optional parameters passed to the handler
- Attribute values are always strings (parse them to numbers/booleans in your handler)
- Tags are case-insensitive (<action> and <Action> are equivalent)
- Multiple actions can appear in a single response
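Because attribute values always arrive as strings, handlers usually need a small conversion step. The following is a minimal sketch of that step; `numberParam` and `booleanParam` are hypothetical helper names chosen for illustration, not part of any Estuary SDK, and the fallback behavior is an assumption.

```typescript
// Hypothetical helpers for illustration; not part of any Estuary SDK.
type ActionParameters = Record<string, string>;

/** Read a numeric parameter, falling back when it is missing or not a finite number. */
function numberParam(params: ActionParameters, key: string, fallback: number): number {
  const raw = params[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) ? value : fallback;
}

/** Read a boolean parameter; accepts "true" or "false" in any letter case. */
function booleanParam(params: ActionParameters, key: string, fallback: boolean): boolean {
  const raw = params[key]?.trim().toLowerCase();
  if (raw === "true") return true;
  if (raw === "false") return false;
  return fallback;
}

// <action name="emote" type="laugh" intensity="0.8" /> reaches the handler as strings:
const emoteParams: ActionParameters = { type: "laugh", intensity: "0.8" };
const intensity = numberParam(emoteParams, "intensity", 1.0); // 0.8 as a number
const loop = booleanParam(emoteParams, "loop", false);        // attribute absent, so the fallback is used
```

Supplying a fallback keeps a handler usable when the LLM omits an optional attribute or emits a value that does not parse.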
Defining Custom Actions
Actions are defined as part of your character's configuration in the Estuary Dashboard. When creating or editing a character, you specify which actions are available and what they do. The LLM uses these definitions to decide when and how to emit actions.
Example character action configuration:
Available actions:
- wave: Wave at the user. Use when greeting or saying goodbye.
- sit: Sit down. Use when the conversation is relaxed.
- navigate(target): Move to a location. Use when the user asks you to go somewhere.
- play_sound(clip): Play a sound effect. Use for emphasis or reactions.
The LLM uses these descriptions to judge when each action is appropriate and emits the corresponding tags as part of its responses.
Client-Side Handling
Parsing
SDKs include a built-in action parser that extracts action tags from bot_response text. Parsed actions contain:
- Name -- The action identifier (from the name attribute)
- Parameters -- A dictionary of all other attributes
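To make the parsed shape concrete, here is a minimal regex-based sketch of such a parser. It is not the SDK's built-in implementation: `ParsedAction` and `parseActions` are hypothetical names, and the sketch assumes double-quoted attribute values that contain no `>` character.

```typescript
// Hypothetical shape mirroring the description above: a name plus the remaining attributes.
interface ParsedAction {
  name: string;
  parameters: Record<string, string>;
}

function parseActions(text: string): ParsedAction[] {
  // Self-closing <action ... /> tags; the "i" flag makes the tag name case-insensitive.
  const tagPattern = /<action\b([^>]*?)\/>/gi;
  const found: ParsedAction[] = [];
  let tag: RegExpExecArray | null;
  while ((tag = tagPattern.exec(text)) !== null) {
    // key="value" pairs inside the tag (double-quoted values only).
    const attrPattern = /([\w-]+)\s*=\s*"([^"]*)"/g;
    const parameters: Record<string, string> = {};
    let name: string | undefined;
    let attr: RegExpExecArray | null;
    while ((attr = attrPattern.exec(tag[1])) !== null) {
      if (attr[1] === "name") name = attr[2];
      else parameters[attr[1]] = attr[2];
    }
    // The name attribute is required, so tags without one are skipped.
    if (name !== undefined) found.push({ name, parameters });
  }
  return found;
}

const example = parseActions(
  'On my way! <Action name="navigate" target="kitchen" speed="fast" /> <action name="wave" />'
);
// example[0]: { name: "navigate", parameters: { target: "kitchen", speed: "fast" } }
// example[1]: { name: "wave", parameters: {} }
```

A regular expression is enough for flat, self-closing tags like these; a real XML parser would only be needed if tags could nest or carry arbitrary escaped content.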
Handler Registration
Each SDK provides a way to register handlers for specific action names. When an action is parsed from a response, the matching handler is invoked.
Unity (C#):
// Using EstuaryActionManager component
// Configure action bindings in the Inspector, or register in code:
actionManager.OnAnyActionReceived += (action) =>
{
    Debug.Log($"Action: {action.Name}");
    switch (action.Name)
    {
        case "wave":
            animator.SetTrigger("Wave");
            break;
        case "navigate":
            var target = action.GetParameter("target");
            navAgent.SetDestination(GetWaypoint(target));
            break;
        case "play_sound":
            var clip = action.GetParameter("clip");
            audioSource.PlayOneShot(GetClip(clip));
            break;
    }
};
TypeScript:
client.on('action', (action) => {
  console.log(`Action: ${action.name}`);
  if (action.name === 'wave') {
    playAnimation('wave');
  } else if (action.name === 'navigate') {
    navigateTo(action.parameters.target);
  }
});
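As the number of actions grows, a lookup table can replace the if/else chain. The sketch below shows one way to do that; `ActionRegistry` is a hypothetical helper rather than an SDK class, and it assumes actions have the `name` and `parameters` fields used in the TypeScript example above.

```typescript
// Hypothetical registry for illustration; not an Estuary SDK class.
interface ParsedAction {
  name: string;
  parameters: Record<string, string>;
}
type ActionHandler = (action: ParsedAction) => void;

class ActionRegistry {
  private handlers = new Map<string, ActionHandler>();

  /** Register (or replace) the handler for one action name. */
  register(name: string, handler: ActionHandler): void {
    this.handlers.set(name, handler);
  }

  /** Invoke the matching handler; returns false when none is registered. */
  dispatch(action: ParsedAction): boolean {
    const handler = this.handlers.get(action.name);
    if (handler === undefined) return false;
    handler(action);
    return true;
  }
}

const registry = new ActionRegistry();
const log: string[] = []; // stands in for real animation and navigation calls
registry.register("wave", () => log.push("wave"));
registry.register("navigate", (action) => log.push(`navigate:${action.parameters.target}`));

registry.dispatch({ name: "navigate", parameters: { target: "kitchen" } }); // handled
registry.dispatch({ name: "dance", parameters: {} });                       // no handler registered
```

In an application the registry would typically be fed from the event shown above, for example `client.on('action', (action) => registry.dispatch(action))`. Returning `false` for unknown names lets the caller log actions the character emitted but the client does not yet support.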
Stripping Action Tags
By default, SDKs strip action tags from the text before displaying it to the user. The original text with tags is still available if needed.
| Original text | Display text |
|---|---|
| Let me wave! <action name="wave" /> | Let me wave! |
| <action name="sit" /> I'll sit down here. | I'll sit down here. |
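The rows above can be reproduced with a short sketch. `stripActionTags` is a hypothetical function, not the SDK's own stripping routine; collapsing the leftover spaces and trimming the result are assumptions about how the display text should look.

```typescript
// Hypothetical helper for illustration; the SDK's built-in stripping may differ.
function stripActionTags(text: string): string {
  return text
    .replace(/<action\b[^>]*?\/>/gi, "") // remove self-closing action tags, any letter case
    .replace(/[ \t]{2,}/g, " ")          // collapse the gap left by a tag in mid-sentence
    .trim();                             // drop spaces left at either end
}

const original = 'Let me wave! <action name="wave" />';
const display = stripActionTags(original);
// display is "Let me wave!"; original is untouched and still contains the tag
```

Because strings are immutable, the function returns a new display string and leaves the tagged original available for logging or re-parsing.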
Examples
Triggering Animations
<action name="emote" type="happy" />
<action name="gesture" animation="point_forward" />
<action name="idle" pose="thinking" />
Navigation
<action name="navigate" target="door" speed="walk" />
<action name="look_at" target="user" />
<action name="turn" direction="left" degrees="90" />
Sound Effects
<action name="play_sound" clip="notification" volume="0.7" />
<action name="play_music" track="ambient_forest" />
UI Control
<action name="show_image" url="photo.jpg" />
<action name="open_panel" panel="inventory" />
<action name="highlight" element="submit_button" />
Next Steps
- Conversation Protocol -- The full event reference
- For SDK-specific action implementation details, see your SDK's action system guide