Camera Module

Enable AI vision capabilities by capturing and analyzing images from the device camera.

Overview

The Camera Module allows your AI characters to "see" what the user is looking at through the device camera. This enables powerful multimodal interactions:

  • Visual identification - "What breed of dog is this?"
  • Object analysis - "Is this fruit ripe enough to eat?"
  • Scene understanding - "What do you think of this painting?"
  • Reading assistance - "Can you read this sign for me?"
Camera Module Flow

User: "Hey, what do you think of this vase I'm looking at?"
                   │
                   ▼
┌──────────────────────────────────────┐
│ Server AI agent (LLM function call)  │
│ decides visual context is needed     │
└──────────────────┬───────────────────┘
                   │
                   ▼
┌──────────────────────────────────────┐
│ Server sends cameraCaptureRequest    │
│ event to the client SDK              │
└──────────────────┬───────────────────┘
                   │
                   ▼
┌──────────────────────────────────────┐
│ EstuaryCamera captures an image via  │
│ the Spectacles CameraModule          │
└──────────────────┬───────────────────┘
                   │
                   ▼
┌──────────────────────────────────────┐
│ EstuaryManager.sendCameraImage sends │
│ the Base64 image for AI analysis     │
└──────────────────┬───────────────────┘
                   │
                   ▼
AI: "That's a beautiful ceramic vase with blue floral
patterns! It appears to be hand-painted..."

How It Works

Vision in the Estuary SDK is server-driven. The server's AI agent has a camera_capture tool available to it. When the AI determines — based on conversation context — that it needs visual information, it invokes this tool. This works for both explicit and implicit visual requests:

Explicit requests:

  • "What am I looking at?"
  • "Describe what you see"
  • "Take a picture and tell me about it"

Implicit visual context:

  • "Hey what do you think of this vase I'm looking at?"
  • "Can you help me identify this plant?"
  • "Is this ripe enough to eat?"

The flow is:

  1. The server AI detects a need for visual context via its LLM function calling system
  2. Server sends a cameraCaptureRequest event to the client
  3. The client SDK captures an image from the device camera
  4. The client sends the image back via sendCameraImage()
  5. The AI analyzes the image and responds via botResponse and botVoice events

No client-side intent detection is needed — the server handles all visual intent analysis.
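The client-side half of this flow (steps 2 through 4) can be sketched with stand-ins for the SDK pieces. The field names follow the CameraCaptureRequest payload and sendCameraImage() signature documented later on this page; the stub capture function and the in-memory send recorder are illustrative only, not the real SDK implementation.

```typescript
// Sketch of the client-side flow: the server asks for a frame, the client
// captures it and sends the Base64 data back. SDK objects are stubbed so
// the shape of the exchange is visible.

interface CameraCaptureRequest {
  request_id: string;  // correlates the capture with the server's tool call
  text?: string;       // optional context from the conversation
}

interface OutgoingImage {
  imageBase64: string;
  mimeType: string;
  requestId: string;
}

// Stand-in for EstuaryManager.instance.sendCameraImage(): records payloads.
const sentImages: OutgoingImage[] = [];
function sendCameraImage(imageBase64: string, mimeType: string, requestId: string): void {
  sentImages.push({ imageBase64, mimeType, requestId });
}

// Stand-in for device capture; on Spectacles this would come from CameraModule.
function captureFrameAsBase64(): string {
  return "aGVsbG8=";  // placeholder Base64 data
}

// Steps 3 and 4: react to the server's request by capturing and replying.
function onCameraCaptureRequest(request: CameraCaptureRequest): void {
  const image = captureFrameAsBase64();
  sendCameraImage(image, "image/jpeg", request.request_id);
}

// Simulate the server emitting a capture request (step 2).
onCameraCaptureRequest({ request_id: "req-123", text: "vase on the table" });
console.log(sentImages[0].requestId); // "req-123"
```

In the real SDK, the capture and reply are wired up for you by the EstuaryCamera example component, as described below.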


Quick Start

Prerequisites

  • Estuary SDK set up with EstuaryVoiceConnection or EstuaryManager
  • EstuaryCredentials configured
  • Spectacles hardware (CameraModule is device-only)
  • Extended Permissions enabled in Project Settings (for development)

Step 1: Add EstuaryCamera Component

The EstuaryCamera example component handles camera capture. Copy it from the SDK's Examples/ folder to your project:

Examples/EstuaryCamera.ts

Then add it to a SceneObject in your scene.

Step 2: Configure Settings

EstuaryCamera (camera capture):

| Setting | Default | Description |
| --- | --- | --- |
| captureResolution | 512 | Image resolution (smaller dimension in pixels) |
| enableVisionAcknowledgment | true | Character says acknowledgment before analyzing |
| debugMode | true | Enable debug logging |

EstuaryCamera Example

The EstuaryCamera example component (in Examples/EstuaryCamera.ts) handles the actual image capture on Spectacles hardware. It automatically listens for cameraCaptureRequest events from the server and captures images in response.

Setup in Lens Studio

  1. Copy Examples/EstuaryCamera.ts to your project
  2. Create a SceneObject in your scene
  3. Add the EstuaryCamera script as a component
  4. Configure settings in the Inspector

Inspector Properties

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| debugMode | boolean | true | Enable debug logging |
| captureResolution | number | 512 | Camera resolution (smaller dimension) |
| enableVisionAcknowledgment | boolean | true | AI says acknowledgment before analyzing |

Vision Acknowledgment

When enableVisionAcknowledgment is enabled, the AI character will say a brief phrase (e.g., "Let me take a look!") immediately when the camera triggers. This provides instant feedback while the image is being captured and processed.

Resolution Guidelines

| Resolution | Quality | Use Case |
| --- | --- | --- |
| 256 | Low | Fastest transfer, basic recognition |
| 512 | Good | Recommended - balanced quality/speed |
| 768 | High | Detailed analysis needs |
| 1024 | Very High | Maximum detail, slower transfer |
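To see why higher resolutions slow transfer, note that Base64 encoding inflates binary data by a factor of 4/3 (4 output characters per 3 input bytes). The helper below estimates payload size for a square capture; the 0.3 bytes-per-pixel JPEG compression ratio is an illustrative assumption, since real JPEG sizes vary with scene content.

```typescript
// Rough payload estimate: JPEG bytes for a square capture, then the Base64
// expansion (4 output chars per 3 input bytes, rounded up to a whole group).
function base64Length(byteCount: number): number {
  return Math.ceil(byteCount / 3) * 4;
}

// The 0.3 bytes-per-pixel ratio is an illustrative assumption only.
function estimatePayloadChars(resolution: number, bytesPerPixel = 0.3): number {
  const jpegBytes = Math.round(resolution * resolution * bytesPerPixel);
  return base64Length(jpegBytes);
}

// A 1024px capture carries roughly 4x the payload of a 512px capture.
console.log(estimatePayloadChars(512));   // ~105k characters
console.log(estimatePayloadChars(1024));  // ~420k characters
```

Under these assumptions, doubling the resolution quadruples the bytes on the wire, which is why 512 is the recommended balance.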

Manual Capture

You can trigger a capture programmatically:

// Get reference to EstuaryCamera component
const estuaryCamera = /* your EstuaryCamera instance */;

// Trigger manual capture
estuaryCamera.manualCapture("What do you see?");

Integration Flow

Complete Event Flow

// 1. User speaks (handled by EstuaryVoiceConnection)
// "What do you think of this vase?"

// 2. Server AI decides it needs visual context
// (handled automatically by server-side LLM function calling)

// 3. Server sends cameraCaptureRequest to client
character.on('cameraCaptureRequest', (request: CameraCaptureRequest) => {
  print(`Server wants to see: ${request.request_id}`);
  // EstuaryCamera handles this automatically if added to scene
});

// 4. EstuaryCamera captures and sends image
// (automatic when subscribed to cameraCaptureRequest)

// 5. AI responds with image analysis
character.on('botResponse', (response) => {
  // "That's a beautiful ceramic vase with blue patterns..."
});

Event Flow Diagram

User Speaks → Server AI Processing → LLM Function Call
           │
           ▼
┌────────────────────────────┐
│ Server sends               │
│ cameraCaptureRequest event │
└──────────┬─────────────────┘
           │
           ▼
┌────────────────────────────┐
│ EstuaryCamera Component    │
│ captures image             │
└──────────┬─────────────────┘
           │
           ▼
┌────────────────────────────┐
│ Encode to Base64           │
└──────────┬─────────────────┘
           │
           ▼
┌────────────────────────────┐
│ sendCameraImage()          │
└──────────┬─────────────────┘
           │
           ▼
┌────────────────────────────┐
│ Server AI Analysis         │
│ → Bot Response + Voice     │
└────────────────────────────┘

EstuaryManager Camera Methods

The EstuaryManager provides methods for camera integration:

sendCameraImage

Send a captured image to the server for AI analysis:

import { EstuaryManager } from 'estuary-lens-studio-sdk';

EstuaryManager.instance.sendCameraImage(
  imageBase64,   // Base64-encoded image data
  'image/jpeg',  // MIME type
  requestId,     // Optional: request ID for correlation
  text,          // Optional: context text
  24000          // Optional: TTS sample rate
);

Practical Examples

Basic Setup

The camera system works automatically when you add the example components to your scene:

  1. Add EstuaryVoiceConnection (from Examples/EstuaryVoiceConnection.ts)
  2. Add EstuaryCamera (from Examples/EstuaryCamera.ts)

That's it! The components integrate automatically:

  • The server AI decides when it needs to see something
  • It sends a cameraCaptureRequest event to the client
  • EstuaryCamera listens for this event and captures the image
  • The captured image is sent back to the server for analysis

Manual Camera Capture

Trigger a capture programmatically from EstuaryCamera:

// Get reference to EstuaryCamera component
const estuaryCamera = /* your EstuaryCamera instance */;

// Trigger manual capture with a prompt
estuaryCamera.manualCapture("Describe what you see in detail.");

Custom Camera Handler

If you need more control, subscribe to cameraCaptureRequest directly:

import { EstuaryManager, CameraCaptureRequest } from 'estuary-lens-studio-sdk';

character.on('cameraCaptureRequest', (request: CameraCaptureRequest) => {
  // Custom capture logic
  const imageBase64 = captureMyCustomImage();

  EstuaryManager.instance.sendCameraImage(
    imageBase64,
    'image/jpeg',
    request.request_id,
    request.text
  );
});

Requirements & Limitations

Platform Requirements

| Requirement | Details |
| --- | --- |
| Hardware | Spectacles only (CameraModule is device-specific) |
| Permissions | Extended Permissions required for development |
| Internet | Required for AI analysis |

Development Notes

CameraModule Limitations
  • CameraModule APIs cannot be called in onAwake() - use OnStartEvent or later
  • In Lens Studio Preview, the camera captures the preview scene content (not a real camera feed)
  • Using CameraModule disables open internet for publicly released Lenses

Testing Workflow

  1. Development: Enable Extended Permissions in Project Settings
  2. Preview: Camera capture works but only captures the preview scene content
  3. Device: Deploy to Spectacles to capture real-world camera feed

Troubleshooting

Camera Not Capturing

  1. Check Lifecycle: Ensure camera is initialized after OnStartEvent
  2. Check Permissions: Extended Permissions must be enabled
  3. Check Connection: Verify EstuaryManager.isConnected is true
  4. Preview vs Device: In Preview, camera captures the preview scene; on Spectacles, it captures the real camera

Image Not Sending

  1. Check Encoding: Verify Base64 encoding is successful
  2. Check Size: Large images may time out - reduce captureResolution
  3. Check Connection: WebSocket must be connected
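The first two checks above can be automated with a small pre-send guard. The validateImagePayload helper and the 1,000,000-character threshold are illustrative assumptions, not part of the SDK; adjust the limit to whatever your server actually accepts.

```typescript
// Pre-send sanity check for a captured image: verifies the string is
// plausible Base64 and flags payloads large enough to risk a timeout.
// The threshold below is an illustrative assumption.
const MAX_PAYLOAD_CHARS = 1_000_000;

function validateImagePayload(imageBase64: string): { ok: boolean; reason?: string } {
  if (imageBase64.length === 0) {
    return { ok: false, reason: "empty payload - encoding likely failed" };
  }
  // Base64 alphabet with optional '=' padding at the end.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(imageBase64)) {
    return { ok: false, reason: "not valid Base64 - check the encoder" };
  }
  if (imageBase64.length > MAX_PAYLOAD_CHARS) {
    return { ok: false, reason: "payload too large - lower captureResolution" };
  }
  return { ok: true };
}

console.log(validateImagePayload("aGVsbG8=").ok);    // true
console.log(validateImagePayload("not base64!").ok); // false
```

Running this check before calling sendCameraImage() gives an immediate local failure reason instead of a silent server-side drop.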

AI Not Responding to Image

  1. Check Request ID: If responding to a server request, include the request_id from the CameraCaptureRequest
  2. Check Server Logs: Verify image was received and processed
  3. Check Vision Model: Ensure the character's VLM model supports vision

Best Practices

Optimize for Performance

// Use appropriate resolution for your use case
captureResolution: 512 // Good balance of quality/speed

// Enable acknowledgment for user feedback
enableVisionAcknowledgment: true

Handle Edge Cases

// Check camera availability before capture
if (!cameraModule) {
  print("Camera not available on this device");
  return;
}

// Handle capture failures gracefully
try {
  this.captureAndSend();
} catch (error) {
  print(`Capture failed: ${error}`);
  this.notifyUser("Sorry, I couldn't capture that image");
}

Provide User Feedback

// Show visual indicator during capture
character.on('cameraCaptureRequest', () => {
  this.captureIndicator.enabled = true;
  this.showProcessingUI();
});

SDK Structure

The camera functionality is split between core and example components:

estuary-lens-studio-sdk/
├── src/
│   ├── Components/
│   │   └── EstuaryManager.ts         ← Core: sendCameraImage()
│   └── Core/
│       └── EstuaryEvents.ts          ← CameraCaptureRequest type
└── Examples/
    ├── EstuaryVoiceConnection.ts     ← Example: voice + camera integration
    └── EstuaryCamera.ts              ← Example: camera capture implementation

Next Steps