Video Streaming
The EstuaryWebcam component streams camera video to the Estuary world model for spatial awareness. The character can see and describe the user's environment, detect objects, and track scene changes.
Streaming Modes
| Mode | Transport | Latency | Codec | Platform |
|---|---|---|---|---|
| LiveKit | WebRTC video track | Low | VP8/H264 | Desktop, Android, iOS |
| WebSocket | Socket.IO MJPEG | Higher | JPEG | All platforms |
LiveKit is the preferred mode. If LiveKit is unavailable, the component can automatically fall back to WebSocket streaming.
Setup
Add the Component
- Create a new GameObject (e.g., "Webcam")
- Add the Estuary Webcam component
- Configure the settings:
| Field | Default | Description |
|---|---|---|
| Stream Mode | LiveKit | LiveKit or WebSocket |
| Auto Fallback | true | Fall back to WebSocket if LiveKit is unavailable |
| Target Fps | 10 | Frames per second (lower = less bandwidth) |
| Target Width | 1280 | Capture resolution width in pixels |
| Target Height | 720 | Capture resolution height in pixels |
| Auto Start On Connect | false | Start streaming when EstuaryManager connects |
| Auto Subscribe Scene Graph | true | Subscribe to scene graph updates |
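To make the Target Fps trade-off concrete, here is a rough estimate of payload bandwidth for WebSocket mode, where frames are sent as base64-encoded JPEG. It is a sketch, not part of the SDK: the `StreamBandwidth` helper is hypothetical, the average JPEG size is a figure you would measure yourself, and Socket.IO framing overhead is ignored.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Estuary SDK.
public static class StreamBandwidth
{
    // Base64 turns every 3 input bytes into 4 output characters,
    // padding the final group, so the length is 4 * ceil(n / 3).
    public static long Base64Length(long rawBytes) => 4 * ((rawBytes + 2) / 3);

    // Approximate payload bitrate in kilobits per second (1 kbit = 1000 bits).
    // avgJpegBytes is an assumed, measured average size of one encoded frame.
    public static double EstimateKbps(long avgJpegBytes, int targetFps)
    {
        if (avgJpegBytes < 0) throw new ArgumentOutOfRangeException(nameof(avgJpegBytes));
        if (targetFps <= 0) throw new ArgumentOutOfRangeException(nameof(targetFps));
        return Base64Length(avgJpegBytes) * targetFps * 8 / 1000.0;
    }
}
```

With an assumed 60,000-byte frame, `StreamBandwidth.EstimateKbps(60000, 10)` gives 6400 kbps, and halving Target Fps to 5 halves that to 3200 kbps.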
Starting and Stopping
// Start streaming with a session ID
webcam.StartStreaming(sessionId);
// Stop streaming
webcam.StopStreaming();
// Auto-start: set autoStartOnConnect = true in Inspector
// The component will start streaming when LiveKit is ready
LiveKit Video
In LiveKit mode, the component:
- Captures frames from WebCamTexture via DirectWebcamVideoSource
- Publishes a LiveKit video track to the shared room (same room as voice)
- Notifies the backend to subscribe to the video track (enable_livekit_video)
The backend processes frames through the world model pipeline: object detection, scene understanding, and graph construction.
// Access the webcam texture for preview
RawImage preview = GetComponent<RawImage>();
preview.texture = webcam.WebcamTexture;
WebSocket Video (Fallback)
In WebSocket mode, the component:
- Captures frames from WebCamTexture
- Encodes frames as JPEG
- Sends frames as base64-encoded video_frame events at the target FPS
// Force WebSocket mode
webcam.StreamMode = WebcamStreamMode.WebSocket;
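The encode-and-pace steps listed above can be sketched in plain C#. This is only an illustration of the technique, not the component's actual implementation: `FrameThrottle` and `EncodeFrame` are hypothetical names, and the JPEG bytes are assumed to come from something like Unity's `Texture2D.EncodeToJPG()`.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Estuary SDK.
public sealed class FrameThrottle
{
    private readonly double _intervalSeconds;
    private double _nextSendTime; // starts at 0, so the first frame is always due

    public FrameThrottle(int targetFps)
    {
        if (targetFps <= 0) throw new ArgumentOutOfRangeException(nameof(targetFps));
        _intervalSeconds = 1.0 / targetFps;
    }

    // Returns true when a frame is due. nowSeconds is any monotonic clock
    // value in seconds (in Unity, Time.unscaledTime would be a natural choice).
    public bool ShouldSend(double nowSeconds)
    {
        if (nowSeconds < _nextSendTime) return false;
        _nextSendTime = nowSeconds + _intervalSeconds;
        return true;
    }

    // Base64-encodes an already JPEG-compressed frame for a text payload.
    public static string EncodeFrame(byte[] jpegBytes)
    {
        if (jpegBytes == null) throw new ArgumentNullException(nameof(jpegBytes));
        return Convert.ToBase64String(jpegBytes);
    }
}
```

Calling `ShouldSend` once per rendered frame drops any frames that arrive faster than the target rate, so the render loop can run at full speed while the stream stays at Target Fps.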
Scene Graph
The world model builds a scene graph from the video feed. Subscribe to updates:
webcam.OnSceneGraphUpdated += (SceneGraph graph) =>
{
Debug.Log($"Scene: {graph.EntityCount} entities");
Debug.Log($"Location: {graph.LocationType}");
Debug.Log($"Activity: {graph.UserActivity}");
foreach (var entity in graph.Entities)
{
Debug.Log($" {entity.ClassName}: {entity.Label} at {entity.Position}");
}
};
webcam.OnRoomIdentified += (RoomIdentified room) =>
{
Debug.Log($"Room: {room.RoomName} ({room.Status})");
};
Manual Subscription
// Subscribe
await webcam.SubscribeToSceneGraphAsync();
// Unsubscribe
await webcam.UnsubscribeFromSceneGraphAsync();
Camera Capture (On-Demand)
For on-demand image capture (not continuous streaming), the server can request a camera image:
// The server sends camera_capture events when it wants an image
// (e.g., when the character detects vision intent in user speech)
// Handle this in your code to capture and send an image
The camera_image event payload:
{
"image": "<base64 JPEG>",
"mime_type": "image/jpeg",
"text": "What do you see?"
}
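As a sketch of how such a payload could be assembled, the helper below builds the JSON from raw JPEG bytes and a prompt. `CameraImagePayload` is a hypothetical name, and `System.Text.Json` is an assumption chosen so the example runs on plain .NET; in a Unity project you may need a different serializer (for example `JsonUtility`). How the resulting string is sent to the server is not shown, because that depends on the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper for illustration; not part of the Estuary SDK.
public static class CameraImagePayload
{
    // Builds a JSON object with the three fields of the camera_image payload.
    public static string Build(byte[] jpegBytes, string text)
    {
        if (jpegBytes == null) throw new ArgumentNullException(nameof(jpegBytes));

        var payload = new Dictionary<string, string>
        {
            ["image"] = Convert.ToBase64String(jpegBytes), // base64 JPEG data
            ["mime_type"] = "image/jpeg",
            ["text"] = text ?? string.Empty,               // prompt for the character
        };
        return JsonSerializer.Serialize(payload);
    }
}
```

The serializer may escape some base64 characters (such as `+`) as `\u` sequences; the result is still valid JSON and decodes to the same string.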
Device Pose (AR)
For AR applications, send device pose data alongside video:
// Enable in the Inspector, or set from code
webcam.SendPose = true;
webcam.CameraTransform = arCamera.transform;
// Or send a pose manually
await webcam.SendPoseAsync(arCamera.transform.localToWorldMatrix);
Camera Selection
// List available cameras
var devices = webcam.AvailableDevices;
foreach (var device in devices)
{
Debug.Log($"{device.name} (front: {device.isFrontFacing})");
}
// Switch camera
webcam.SetDevice("HD Webcam");
// Use front-facing camera (mobile/AR)
webcam.UseFrontCamera = true;
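Device names differ between machines, so hard-coding a name such as "HD Webcam" can fail on other hardware. The sketch below picks a name by case-insensitive substring match and falls back to the first device. `CameraPicker` is a hypothetical helper, and it assumes you have already collected the `name` of each entry in `webcam.AvailableDevices` into a list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of the Estuary SDK.
public static class CameraPicker
{
    // Returns the first name containing `preferred` (case-insensitive),
    // otherwise the first available name, or null when the list is empty.
    public static string PickDeviceName(IReadOnlyList<string> deviceNames, string preferred)
    {
        if (deviceNames == null || deviceNames.Count == 0) return null;

        if (!string.IsNullOrEmpty(preferred))
        {
            foreach (var name in deviceNames)
            {
                if (name != null &&
                    name.IndexOf(preferred, StringComparison.OrdinalIgnoreCase) >= 0)
                {
                    return name;
                }
            }
        }
        return deviceNames[0];
    }
}
```

The returned name can then be passed to `webcam.SetDevice(...)` after checking it is not null.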
Next Steps
- Action System -- Trigger actions from AI responses
- API Reference: Input Components -- Full EstuaryWebcam reference