# Camera Module

Enable AI vision capabilities by capturing and analyzing images from the device camera.

## Overview
The Camera Module allows your AI characters to "see" what the user is looking at through the device camera. This enables powerful multimodal interactions:
- Visual identification - "What breed of dog is this?"
- Object analysis - "Is this fruit ripe enough to eat?"
- Scene understanding - "What do you think of this painting?"
- Reading assistance - "Can you read this sign for me?"
```
┌──────────────────────────────────────────────────────────────────────┐
│                          Camera Module Flow                          │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  User: "Hey, what do you think of this vase I'm looking at?"         │
│                                                                      │
│  ┌──────────────────┐      ┌──────────────────────────────┐          │
│  │ Server AI Agent  │ ──→  │ LLM decides visual context   │          │
│  │ (Function Call)  │      │ is needed via tool use       │          │
│  └──────────────────┘      └───────────────┬──────────────┘          │
│                                            │                         │
│                                            ▼                         │
│  ┌──────────────────┐      ┌──────────────────────────────┐          │
│  │ cameraCaptureReq │ ←──  │ Server sends capture request │          │
│  │ Event to Client  │      │ to client SDK                │          │
│  └──────────────────┘      └───────────────┬──────────────┘          │
│                                            │                         │
│                                            ▼                         │
│  ┌──────────────────┐      ┌──────────────────────────────┐          │
│  │ EstuaryCamera    │ ──→  │ Captures image using         │          │
│  │ (Example)        │      │ Spectacles CameraModule      │          │
│  └──────────────────┘      └───────────────┬──────────────┘          │
│                                            │                         │
│                                            ▼                         │
│  ┌──────────────────┐      ┌──────────────────────────────┐          │
│  │ EstuaryManager   │ ──→  │ Sends Base64 image to        │          │
│  │ .sendCameraImage │      │ server for AI analysis       │          │
│  └──────────────────┘      └───────────────┬──────────────┘          │
│                                            │                         │
│                                            ▼                         │
│  ┌──────────────────────────────────────────────────────────────┐    │
│  │ AI: "That's a beautiful ceramic vase with blue floral        │    │
│  │      patterns! It appears to be hand-painted..."             │    │
│  └──────────────────────────────────────────────────────────────┘    │
│                                                                      │
└──────────────────────────────────────────────────────────────────────┘
```
## How It Works

Vision in the Estuary SDK is server-driven. The server's AI agent has a `camera_capture` tool available to it. When the AI determines — based on conversation context — that it needs visual information, it invokes this tool. This works for both explicit and implicit visual requests:
Explicit requests:
- "What am I looking at?"
- "Describe what you see"
- "Take a picture and tell me about it"
Implicit visual context:
- "Hey what do you think of this vase I'm looking at?"
- "Can you help me identify this plant?"
- "Is this ripe enough to eat?"
The flow is:

- The server AI detects a need for visual context via its LLM function-calling system
- The server sends a `cameraCaptureRequest` event to the client
- The client SDK captures an image from the device camera
- The client sends the image back via `sendCameraImage()`
- The AI analyzes the image and responds via `botResponse` and `botVoice` events
No client-side intent detection is needed — the server handles all visual intent analysis.
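The round trip above can be sketched with a minimal in-memory stub. The event name `cameraCaptureRequest` and the `request_id` field mirror the SDK; the `StubClient` transport and the fake image are hypothetical stand-ins for the real WebSocket and camera, used only to show the correlation pattern:

```typescript
// Minimal sketch of the server-driven vision round trip.
// The transport and captured image below are stubs, not SDK internals.

type CameraCaptureRequest = { request_id: string; text?: string };
type Handler = (req: CameraCaptureRequest) => void;

class StubClient {
  private handlers: Handler[] = [];
  sentImages: { base64: string; requestId: string }[] = [];

  // Mirrors character.on('cameraCaptureRequest', ...)
  onCaptureRequest(handler: Handler): void {
    this.handlers.push(handler);
  }

  // Mirrors EstuaryManager.instance.sendCameraImage(...)
  sendCameraImage(base64: string, requestId: string): void {
    this.sentImages.push({ base64, requestId });
  }

  // Simulates the server emitting a capture request
  emitCaptureRequest(req: CameraCaptureRequest): void {
    for (const h of this.handlers) h(req);
  }
}

const client = new StubClient();

// Client side: "capture" a frame and echo the request_id back for correlation
client.onCaptureRequest((req) => {
  const fakeImageBase64 = "QkFTRTY0"; // stand-in for a captured frame
  client.sendCameraImage(fakeImageBase64, req.request_id);
});

client.emitCaptureRequest({ request_id: "req-1", text: "What do you see?" });
```

The key detail is that the client echoes `request_id` back unchanged, which is how the server pairs the image with the pending tool call.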
## Quick Start

### Prerequisites

- Estuary SDK set up with `EstuaryVoiceConnection` or `EstuaryManager`
- `EstuaryCredentials` configured
- Spectacles hardware (CameraModule is device-only)
- Extended Permissions enabled in Project Settings (for development)
### Step 1: Add EstuaryCamera Component

The `EstuaryCamera` example component handles camera capture. Copy it from the SDK's `Examples/` folder to your project:

```
Examples/EstuaryCamera.ts
```

Then add it to a SceneObject in your scene.
### Step 2: Configure Settings

EstuaryCamera (camera capture):

| Setting | Default | Description |
|---|---|---|
| `captureResolution` | 512 | Image resolution (smaller dimension in pixels) |
| `enableVisionAcknowledgment` | true | Character says an acknowledgment before analyzing |
| `debugMode` | true | Enable debug logging |
## EstuaryCamera Example

The `EstuaryCamera` example component (in `Examples/EstuaryCamera.ts`) handles the actual image capture on Spectacles hardware. It automatically listens for `cameraCaptureRequest` events from the server and captures images in response.
### Setup in Lens Studio

- Copy `Examples/EstuaryCamera.ts` to your project
- Create a SceneObject in your scene
- Add the `EstuaryCamera` script as a component
- Configure settings in the Inspector
### Inspector Properties

| Property | Type | Default | Description |
|---|---|---|---|
| `debugMode` | boolean | true | Enable debug logging |
| `captureResolution` | number | 512 | Camera resolution (smaller dimension) |
| `enableVisionAcknowledgment` | boolean | true | AI says an acknowledgment before analyzing |
### Vision Acknowledgment

When `enableVisionAcknowledgment` is enabled, the AI character says a brief phrase (e.g., "Let me take a look!") immediately when the camera triggers. This provides instant feedback while the image is being captured and processed.
### Resolution Guidelines
| Resolution | Quality | Use Case |
|---|---|---|
| 256 | Low | Fastest transfer, basic recognition |
| 512 | Good | Recommended - balanced quality/speed |
| 768 | High | Detailed analysis needs |
| 1024 | Very High | Maximum detail, slower transfer |
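To make the trade-offs above concrete: `captureResolution` sets the smaller dimension, and Base64 encoding inflates binary data by roughly 4/3. The helpers below estimate the resulting dimensions and payload size; the bytes-per-pixel figure is an assumed JPEG ballpark for illustration, not an SDK constant:

```typescript
// Estimate capture dimensions and Base64 payload size.
// ASSUMPTION: ~0.1 bytes per pixel is a rough JPEG ballpark, not an SDK value.

/** Scale so the SMALLER dimension equals captureResolution, keeping aspect ratio. */
function scaledDimensions(
  srcW: number,
  srcH: number,
  captureResolution: number
): [number, number] {
  const scale = captureResolution / Math.min(srcW, srcH);
  return [Math.round(srcW * scale), Math.round(srcH * scale)];
}

/** Base64 encodes every 3 bytes as 4 characters, padded to a multiple of 4. */
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

const [w, h] = scaledDimensions(1920, 1080, 512); // landscape frame at 512
const jpegBytes = Math.round(w * h * 0.1);        // assumed compression ratio
console.log(`${w}x${h}, ~${base64Length(jpegBytes)} Base64 chars`);
```

Doubling the resolution roughly quadruples the pixel count, which is why 1024 transfers noticeably slower than 512.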
### Manual Capture

You can trigger a capture programmatically:

```typescript
// Get a reference to the EstuaryCamera component
const estuaryCamera = /* your EstuaryCamera instance */;

// Trigger a manual capture
estuaryCamera.manualCapture("What do you see?");
```
## Integration Flow

### Complete Event Flow

```typescript
// 1. User speaks (handled by EstuaryVoiceConnection)
//    "What do you think of this vase?"

// 2. Server AI decides it needs visual context
//    (handled automatically by server-side LLM function calling)

// 3. Server sends cameraCaptureRequest to the client
character.on('cameraCaptureRequest', (request: CameraCaptureRequest) => {
  print(`Server wants to see: ${request.request_id}`);
  // EstuaryCamera handles this automatically if added to the scene
});

// 4. EstuaryCamera captures and sends the image
//    (automatic when subscribed to cameraCaptureRequest)

// 5. AI responds with the image analysis
character.on('botResponse', (response) => {
  // "That's a beautiful ceramic vase with blue patterns..."
});
```
### Event Flow Diagram

```
User Speaks → Server AI Processing → LLM Function Call
                                            │
                                            ▼
                           ┌────────────────────────────┐
                           │ Server sends               │
                           │ cameraCaptureRequest event │
                           └──────────┬─────────────────┘
                                      │
                                      ▼
                           ┌────────────────────────────┐
                           │ EstuaryCamera Component    │
                           │ captures image             │
                           └──────────┬─────────────────┘
                                      │
                                      ▼
                           ┌────────────────────────────┐
                           │ Encode to Base64           │
                           └──────────┬─────────────────┘
                                      │
                                      ▼
                           ┌────────────────────────────┐
                           │ sendCameraImage()          │
                           └──────────┬─────────────────┘
                                      │
                                      ▼
                           ┌────────────────────────────┐
                           │ Server AI Analysis         │
                           │ → Bot Response + Voice     │
                           └────────────────────────────┘
```
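The "Encode to Base64" step is handled by the example component, but if you build a custom capture path you may need to encode raw bytes yourself. This is a generic RFC 4648 encoder shown for illustration, not the SDK's internal implementation:

```typescript
// Generic Base64 encoder for raw bytes (standard RFC 4648 alphabet).
// Not the SDK's internal code - shown for custom capture paths.
const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    // Pack up to 3 input bytes into a 24-bit group
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    // Emit 4 output characters, padding with '=' for missing bytes
    out += B64[b0 >> 2] + B64[((b0 & 3) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 63] : "=";
  }
  return out;
}

// "Man" encodes to "TWFu"
console.log(toBase64(new Uint8Array([77, 97, 110])));
```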
## EstuaryManager Camera Methods

The `EstuaryManager` provides methods for camera integration:

### sendCameraImage

Send a captured image to the server for AI analysis:

```typescript
import { EstuaryManager } from 'estuary-lens-studio-sdk';

EstuaryManager.instance.sendCameraImage(
  imageBase64,   // Base64-encoded image data
  'image/jpeg',  // MIME type
  requestId,     // Optional: request ID for correlation
  text,          // Optional: context text
  24000          // Optional: TTS sample rate
);
```
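If you call this from several places, it can help to assemble the arguments in one spot so the documented defaults stay consistent. The payload shape below is purely illustrative (the SDK's wire format is internal); only the parameter list and the 24000 default come from the signature above:

```typescript
// Collect sendCameraImage arguments with the documented defaults.
// ASSUMPTION: this object shape is illustrative, not the SDK's wire format.
interface CameraImagePayload {
  imageBase64: string;
  mimeType: string;
  requestId?: string;
  text?: string;
  ttsSampleRate: number;
}

function buildCameraImagePayload(
  imageBase64: string,
  mimeType = "image/jpeg",
  requestId?: string,
  text?: string,
  ttsSampleRate = 24000 // default TTS sample rate from the signature above
): CameraImagePayload {
  return { imageBase64, mimeType, requestId, text, ttsSampleRate };
}

const payload = buildCameraImagePayload("QkFTRTY0", "image/jpeg", "req-42");
console.log(payload.ttsSampleRate); // 24000
```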
## Practical Examples

### Basic Setup

The camera system works automatically when you add the example components to your scene:

- Add `EstuaryVoiceConnection` (from `Examples/EstuaryVoiceConnection.ts`)
- Add `EstuaryCamera` (from `Examples/EstuaryCamera.ts`)

That's it! The components integrate automatically:

- The server AI decides when it needs to see something
- It sends a `cameraCaptureRequest` event to the client
- `EstuaryCamera` listens for this event and captures the image
- The captured image is sent back to the server for analysis
### Manual Camera Capture

Trigger a capture programmatically from `EstuaryCamera`:

```typescript
// Get a reference to the EstuaryCamera component
const estuaryCamera = /* your EstuaryCamera instance */;

// Trigger a manual capture with a prompt
estuaryCamera.manualCapture("Describe what you see in detail.");
```
### Custom Camera Handler

If you need more control, subscribe to `cameraCaptureRequest` directly:

```typescript
import { EstuaryManager, CameraCaptureRequest } from 'estuary-lens-studio-sdk';

character.on('cameraCaptureRequest', (request: CameraCaptureRequest) => {
  // Custom capture logic
  const imageBase64 = captureMyCustomImage();

  EstuaryManager.instance.sendCameraImage(
    imageBase64,
    'image/jpeg',
    request.request_id,
    request.text
  );
});
```
## Requirements & Limitations

### Platform Requirements
| Requirement | Details |
|---|---|
| Hardware | Spectacles only (CameraModule is device-specific) |
| Permissions | Extended Permissions required for development |
| Internet | Required for AI analysis |
### Development Notes

- CameraModule APIs cannot be called in `onAwake()` - use `OnStartEvent` or later
- In Lens Studio Preview, the camera captures the preview scene content (not a real camera feed)
- Using CameraModule disables open internet access for publicly released Lenses
### Testing Workflow
- Development: Enable Extended Permissions in Project Settings
- Preview: Camera capture works but only captures the preview scene content
- Device: Deploy to Spectacles to capture real-world camera feed
## Troubleshooting

### Camera Not Capturing

- Check Lifecycle: Ensure the camera is initialized after `OnStartEvent`
- Check Permissions: Extended Permissions must be enabled
- Check Connection: Verify `EstuaryManager.isConnected` is true
- Preview vs Device: In Preview, the camera captures the preview scene; on Spectacles, it captures the real camera feed
### Image Not Sending

- Check Encoding: Verify that Base64 encoding succeeded
- Check Size: Large images may time out - reduce `captureResolution`
- Check Connection: The WebSocket must be connected
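A quick client-side sanity check can catch both failure modes before sending: validate the Base64 string and estimate its decoded size. This is a hypothetical pre-flight helper; the 1 MB threshold is an example value, not an SDK limit:

```typescript
// Pre-flight checks for an outgoing camera image.
// ASSUMPTION: the 1 MB threshold is an example value, not an SDK limit.
const MAX_DECODED_BYTES = 1024 * 1024;

/** A well-formed Base64 string: non-empty, length % 4 == 0, valid alphabet. */
function isValidBase64(s: string): boolean {
  return s.length > 0 && s.length % 4 === 0 && /^[A-Za-z0-9+/]*={0,2}$/.test(s);
}

/** Decoded byte count: 3 bytes per 4 chars, minus padding. */
function decodedSize(s: string): number {
  const padding = s.endsWith("==") ? 2 : s.endsWith("=") ? 1 : 0;
  return (s.length / 4) * 3 - padding;
}

/** Returns null if the payload looks sendable, otherwise a reason string. */
function checkImagePayload(imageBase64: string): string | null {
  if (!isValidBase64(imageBase64)) return "Base64 encoding failed";
  if (decodedSize(imageBase64) > MAX_DECODED_BYTES) {
    return "Image too large - reduce captureResolution";
  }
  return null; // ok to send
}

console.log(checkImagePayload("TWFu")); // null
```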
### AI Not Responding to Image

- Check Request ID: If responding to a server request, include the `request_id` from the `CameraCaptureRequest`
- Check Server Logs: Verify the image was received and processed
- Check Vision Model: Ensure the character's model supports vision (i.e., it is a VLM)
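When responses seem to go missing, it can help to track outstanding capture requests by `request_id` and flag stale ones. This is a hypothetical client-side debugging aid, not part of the SDK; expiry is driven by explicit timestamps so it works without timers:

```typescript
// Track outstanding capture requests by request_id, with manual expiry.
// Hypothetical debugging aid - not part of the SDK.
class PendingCaptures {
  private pending = new Map<string, number>(); // request_id -> sent-at (ms)

  /** Record that an image was sent for this request. */
  sent(requestId: string, nowMs: number): void {
    this.pending.set(requestId, nowMs);
  }

  /** Call when a botResponse arrives; returns false if the id was unknown. */
  resolved(requestId: string): boolean {
    return this.pending.delete(requestId);
  }

  /** Return and drop request ids older than timeoutMs. */
  expire(nowMs: number, timeoutMs: number): string[] {
    const stale: string[] = [];
    for (const [id, sentAt] of this.pending) {
      if (nowMs - sentAt > timeoutMs) stale.push(id);
    }
    for (const id of stale) this.pending.delete(id);
    return stale;
  }
}

const tracker = new PendingCaptures();
tracker.sent("req-1", 0);
tracker.sent("req-2", 5000);
console.log(tracker.expire(15000, 10000)); // ["req-1"]
```

Stale ids returned by `expire` are good candidates for a debug log line: they mean an image went out but no analysis came back.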
## Best Practices

### Optimize for Performance

```typescript
// Use a resolution appropriate for your use case
captureResolution: 512  // good balance of quality and speed

// Enable acknowledgment for user feedback
enableVisionAcknowledgment: true
```
### Handle Edge Cases

```typescript
// Check camera availability before capturing
if (!cameraModule) {
  print("Camera not available on this device");
  return;
}

// Handle capture failures gracefully
try {
  this.captureAndSend();
} catch (error) {
  print(`Capture failed: ${error}`);
  this.notifyUser("Sorry, I couldn't capture that image");
}
```
### Provide User Feedback

```typescript
// Show a visual indicator during capture
character.on('cameraCaptureRequest', () => {
  this.captureIndicator.enabled = true;
  this.showProcessingUI();
});
```
## SDK Structure

The camera functionality is split between core and example components:

```
estuary-lens-studio-sdk/
├── src/
│   ├── Components/
│   │   └── EstuaryManager.ts        ← Core: sendCameraImage()
│   └── Core/
│       └── EstuaryEvents.ts         ← CameraCaptureRequest type
└── Examples/
    ├── EstuaryVoiceConnection.ts    ← Example: voice + camera integration
    └── EstuaryCamera.ts             ← Example: camera capture implementation
```
## Next Steps
- API Reference: Camera Module - Complete API documentation
- Voice Connection - Audio setup for transcripts
- Action System - Trigger actions from AI responses