LiveKit Integration

The Estuary Unity SDK uses LiveKit for low-latency WebRTC voice and video. The two managers described here, LiveKitVoiceManager and LiveKitVideoManager, are created and used internally by EstuaryManager. You normally do not interact with them directly; they are documented for advanced use cases.

LiveKitVoiceManager

Manages the LiveKit voice room connection, microphone publishing, and bot audio subscription.

Namespace: Estuary.Core

Properties

| Property | Type | Access | Description |
| --- | --- | --- | --- |
| IsConnected | bool | get | true when connected to the LiveKit room |
| State | LiveKitConnectionState | get | Current connection state |
| RoomName | string | get | Name of the current LiveKit room |
| BotAudioSource | AudioSource | get | The AudioSource playing the bot's voice (created at runtime) |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| ConnectAsync(string token, string url, string roomName) | Task | Connect to a LiveKit room with the provided JWT token |
| DisconnectAsync() | Task | Leave the room and clean up |
| StartPublishingAsync() | Task | Publish the microphone track to the room |
| StopPublishingAsync() | Task | Unpublish the microphone track |
| MuteAsync() | Task | Mute the published microphone track |
| UnmuteAsync() | Task | Unmute the published microphone track |
| SignalInterruptAsync(string messageId) | Task | Signal an interrupt for the given message |
| MuteBotAudio() | void | Mute the bot's audio track (used during interrupts) |
| UnmuteBotAudio() | void | Unmute the bot's audio track |
| NotifyAudioChunk(string messageId, double timestamp) | void | Track incoming audio chunks for interrupt filtering |

Events

| Event | Signature | Description |
| --- | --- | --- |
| OnStateChanged | Action<LiveKitConnectionState> | Connection state changed |
| OnReady | Action<string> | Room is ready (room name) |
| OnBotAudioSourceCreated | Action<AudioSource> | Bot's AudioSource was created (for lip sync integration) |
| OnError | Action<string> | Connection or publishing error |
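
As a rough illustration of consuming these events, the sketch below defines a small handler class in plain C# so that it compiles without the SDK. VoiceStatusTracker and its members are hypothetical names, not part of the Estuary SDK. The wiring in the trailing comment assumes you already hold a reference to the LiveKitVoiceManager that EstuaryManager created; how that reference is obtained is not covered here.

```csharp
using System;

// Hypothetical helper (not part of the SDK): remembers the latest room name
// and the latest error message reported by the voice manager.
public sealed class VoiceStatusTracker
{
    public string RoomName { get; private set; }
    public string LastError { get; private set; }

    // Compatible with the documented OnReady signature: Action<string> (room name).
    public void HandleReady(string roomName)
    {
        RoomName = roomName;
        LastError = null; // a fresh Ready clears any earlier error
    }

    // Compatible with the documented OnError signature: Action<string>.
    public void HandleError(string message)
    {
        LastError = message;
    }
}

// Wiring, assuming `voice` is the LiveKitVoiceManager instance:
//   var tracker = new VoiceStatusTracker();
//   voice.OnReady += tracker.HandleReady;
//   voice.OnError += tracker.HandleError;
//   voice.OnStateChanged += state => UnityEngine.Debug.Log($"LiveKit state: {state}");
// Unsubscribe with -= when the owning object is destroyed.
```

Keeping the handlers in a plain class makes them easy to exercise outside the Unity editor; only the wiring lines depend on the SDK.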

Connection Flow

1. EstuaryManager calls EstuaryClient.RequestLiveKitTokenAsync()
2. The server responds with livekit_token, which carries the JWT, URL, and room name
3. EstuaryManager calls LiveKitVoiceManager.ConnectAsync(token, url, roomName)
4. LiveKitVoiceManager connects to the room; State → Connecting
5. EstuaryManager calls EstuaryClient.NotifyLiveKitJoinAsync()
6. The server joins the bot to the room and sends livekit_ready
7. State → WaitingForBot → Ready
8. LiveKitVoiceManager subscribes to the bot's audio track
9. OnBotAudioSourceCreated fires with the runtime AudioSource
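
The state progression in the flow above can be sketched as a tiny transition table. This is a self-contained illustration, not the SDK's implementation: SketchState is a stand-in for LiveKitConnectionState, only Connecting, WaitingForBot, and Ready are named in this document, and Disconnected is an assumed starting state.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the SDK's LiveKitConnectionState. Disconnected is an assumption;
// the other three values are the ones named in the connection flow.
public enum SketchState { Disconnected, Connecting, WaitingForBot, Ready }

public static class ConnectionFlow
{
    // Happy-path order: each state maps to the single state expected to follow it.
    private static readonly Dictionary<SketchState, SketchState> Next =
        new Dictionary<SketchState, SketchState>
        {
            { SketchState.Disconnected, SketchState.Connecting },
            { SketchState.Connecting, SketchState.WaitingForBot },
            { SketchState.WaitingForBot, SketchState.Ready },
        };

    // Returns true when moving from `from` to `to` follows the order above.
    public static bool IsValidStep(SketchState from, SketchState to)
    {
        return Next.TryGetValue(from, out var expected) && expected == to;
    }
}
```

A check like this can be useful in an OnStateChanged handler to flag unexpected jumps while debugging, for example a move straight from Connecting to Ready.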

Interrupt Handling

When an interrupt is signalled:

  1. SignalInterruptAsync sends client_interrupt to the server
  2. MuteBotAudio() is called to immediately silence the bot
  3. The server stops generating audio and sends an interrupt confirmation
  4. Audio chunks with message_id matching the interrupted message are filtered out
  5. Audio chunks with timestamps before the interrupt are also filtered
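
The two filtering rules in steps 4 and 5 can be sketched as follows. InterruptFilter is a hypothetical class written for illustration; the SDK's internal filtering logic is not shown in this document and may differ. The sketch assumes chunk timestamps and interrupt timestamps come from the same clock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical filter mirroring steps 4 and 5 above; not the SDK's internal code.
public sealed class InterruptFilter
{
    private readonly HashSet<string> _interruptedIds = new HashSet<string>();
    private double _lastInterruptAt = double.NegativeInfinity;

    // Record that `messageId` was interrupted at `interruptTimestamp`.
    public void MarkInterrupted(string messageId, double interruptTimestamp)
    {
        _interruptedIds.Add(messageId);
        _lastInterruptAt = Math.Max(_lastInterruptAt, interruptTimestamp);
    }

    // Returns true if the chunk should be played, false if it should be dropped.
    public bool ShouldPlay(string messageId, double chunkTimestamp)
    {
        if (_interruptedIds.Contains(messageId)) return false; // step 4: interrupted message
        if (chunkTimestamp < _lastInterruptAt) return false;   // step 5: predates the interrupt
        return true;
    }
}
```

The ShouldPlay parameters deliberately match the shape of NotifyAudioChunk(string messageId, double timestamp), so the same values could be passed to both.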

LiveKitVideoManager

Publishes webcam video as a LiveKit video track in the same room as voice.

Namespace: Estuary.Core

Properties

| Property | Type | Access | Description |
| --- | --- | --- | --- |
| IsPublishing | bool | get | true while video is being published |

Methods

| Method | Returns | Description |
| --- | --- | --- |
| StartPublishingAsync(Room room) | Task | Start publishing video to the LiveKit room |
| StopPublishingAsync() | Task | Stop publishing video |

How It Works

  1. EstuaryWebcam creates a DirectWebcamVideoSource from the WebCamTexture
  2. When the LiveKit room is ready, EstuaryWebcam calls LiveKitVideoManager.StartPublishingAsync(room)
  3. The manager publishes the video source as a local video track with configured encoding options
  4. EstuaryWebcam sends enable_livekit_video to notify the backend to subscribe to the track
  5. The backend processes frames through the world model pipeline

Video Encoding

The video track uses VP8 encoding by default. Resolution and frame rate are controlled by EstuaryWebcam's TargetWidth, TargetHeight, and TargetFps settings.


Dependencies

The LiveKit integration requires the io.livekit.livekit-sdk package (v1.3.3+), which is declared as a dependency in the Estuary SDK's package.json. It is automatically resolved by Unity Package Manager when you install the Estuary SDK.

The LiveKit Unity SDK provides:

  • Room -- Room connection management
  • LocalAudioTrack / LocalVideoTrack -- Publishing local media
  • RemoteAudioTrack -- Subscribing to remote audio
  • RtcAudioSource -- Native audio capture with acoustic echo cancellation (AEC)
  • AudioStream -- Audio playback from remote tracks
  • DirectWebcamVideoSource -- Webcam capture for video tracks