Overview
This reference documents the WebSocket events the avatar server sends to your client. These are primarily confirmations and status updates for your audio commands.
WebSocket Connection
WebSocket URL: wss://wss.agenthuman.com
Connect to this URL after creating a session. The server will send these event messages in response to your commands.
What You’ll Receive
The avatar server sends simple JSON events to confirm:
- ✅ Session initialization
- ✅ Audio received and queued
- ✅ Playback interrupted
- ❌ Errors
Important: Events are Confirmations Only
These WebSocket events are confirmations for your commands. Video appears in your Daily or LiveKit room automatically.
Event Flow:
- You send audio via WebSocket → Server confirms receipt
- Server generates video → Streams to Daily or LiveKit room
- You receive video from Daily or LiveKit (separate from WebSocket)
Event Format
All events are JSON objects with a type field that identifies the event.
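As a rough illustration of that format, the sketch below decodes one message and checks that it carries a string type field before anything else looks at it. The function name parse_event is hypothetical, and the sample payload is assembled only from names that appear in this reference (agent.speak.confirmed and audio_samples); real events may carry additional fields.

```python
import json
from typing import Any


def parse_event(raw: str) -> dict[str, Any]:
    """Decode one server message and require a string 'type' field."""
    event = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(event, dict):
        raise ValueError("event must be a JSON object")
    if not isinstance(event.get("type"), str):
        raise ValueError("event is missing a string 'type' field")
    return event


# Illustrative payload; the exact shape sent by the server may differ.
sample = '{"type": "agent.speak.confirmed", "audio_samples": 16000}'
event = parse_event(sample)
print(event["type"], event["audio_samples"])
```

Rejecting messages without a type early keeps later handlers from having to repeat the same check.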
Event Types
Connection Established
Sent after successful session initialization. Confirms that the server has loaded your session configuration and is ready to receive audio commands.
- session_id - Your session ID, echoed back from your session.init message.
Client Action: You can now send audio commands. The avatar server will join your video room and begin streaming video there.
Example Handlers:
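A minimal sketch of one such handler, assuming the event has already been decoded into a dict. SessionState, its method name, and the sample session ID are hypothetical and used only for illustration; the only detail taken from this reference is that the event echoes back session_id.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """Tracks whether the server has confirmed session initialization."""

    expected_session_id: str
    ready: bool = False

    def on_connection_established(self, event: dict) -> None:
        # The server echoes the session ID back. A mismatch suggests the
        # confirmation belongs to a different session, so refuse to proceed.
        echoed = event.get("session_id")
        if echoed != self.expected_session_id:
            raise ValueError(f"unexpected session_id: {echoed!r}")
        self.ready = True  # audio commands may be sent from here on


state = SessionState(expected_session_id="sess_123")  # hypothetical ID
state.on_connection_established({"session_id": "sess_123"})
print(state.ready)  # True
```

Gating audio sends on the ready flag avoids sending agent.speak before the server has loaded the session.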
Speak Confirmation
Sent after receiving and queuing audio for video generation. Confirms the audio was received and provides the sample count.
- audio_samples - Number of audio samples queued for processing (after any resampling to 16 kHz)
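Because audio_samples is counted after resampling to 16 kHz, it can be converted to a duration by dividing by 16,000. The sketch below shows that arithmetic, plus an approximate way to predict the count from what you sent. Both helper names are hypothetical, and the prediction assumes a plain rate conversion; the server's resampler may round differently.

```python
SERVER_SAMPLE_RATE_HZ = 16_000  # rate this reference gives for audio_samples


def queued_seconds(audio_samples: int) -> float:
    """Duration of the queued audio, in seconds."""
    return audio_samples / SERVER_SAMPLE_RATE_HZ


def expected_samples(sent_samples: int, sent_rate_hz: int) -> int:
    """Approximate audio_samples for a chunk sent at sent_rate_hz."""
    return round(sent_samples * SERVER_SAMPLE_RATE_HZ / sent_rate_hz)


print(queued_seconds(8_000))             # 0.5
print(expected_samples(24_000, 48_000))  # 8000
```

Comparing the confirmed count with the predicted one is a cheap sanity check that the sample_rate you declared matches the audio you actually sent.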
Interrupt Confirmation
Sent after processing an interrupt request. Confirms that current video generation and playback have been stopped.
Error Event
Sent when an error occurs during processing. Contains detailed error information.
- error - Human-readable error description
Common Error Messages
| Error Message | Meaning | Solution |
|---|---|---|
| First message must be session.init | Session init was not sent first | Send session.init immediately after connecting |
| Missing session id | The session_id field is missing in session.init | Include session_id in the session.init message |
| Invalid session id - server not assigned or session id mismatch | Session was routed to the wrong server | Use the server_ws_uri provided when creating/starting the session |
| Failed to fetch video path | Session config could not be fetched from API | Ensure the session exists and the session_token is correct |
| Session not found | Session ID not recognized | Create a new session and re-establish connection |
| Invalid JSON format | Malformed JSON received | Validate JSON structure before sending |
| No audio data provided | Empty audio field | Include base64-encoded audio in agent.speak message |
| Failed to process audio: ... | Invalid audio format | Verify 16-bit mono PCM and set the correct sample_rate |
| Unknown message type: ... | Unsupported message | Check message type is supported (session.init, agent.speak, agent.interrupt) |
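Since two of the messages in the table end with variable detail, matching on the start of the error text is more robust than comparing whole strings. The sketch below groups the documented messages by prefix. The category names (protocol, session, audio) and the grouping are assumptions made for this example, not something the server reports.

```python
# Prefixes copied from the table above. The categories are this sketch's
# own grouping and are not part of the server's error event.
ERROR_CATEGORIES = [
    ("First message must be session.init", "protocol"),
    ("Missing session id", "protocol"),
    ("Invalid JSON format", "protocol"),
    ("Unknown message type:", "protocol"),
    ("Invalid session id", "session"),
    ("Failed to fetch video path", "session"),
    ("Session not found", "session"),
    ("No audio data provided", "audio"),
    ("Failed to process audio:", "audio"),
]


def classify_error(message: str) -> str:
    """Map an error description to a coarse category by prefix."""
    for prefix, category in ERROR_CATEGORIES:
        if message.startswith(prefix):
            return category
    return "unknown"


print(classify_error("Failed to process audio: odd byte count"))  # audio
print(classify_error("Session not found"))                        # session
```

A client might fix its own code on protocol errors, recreate the session on session errors, and re-encode the chunk on audio errors.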
Event Handling Best Practices
1. Always Check Event Type
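One way to do this is to route every message through a single dispatcher keyed on the type field, with a fallback for anything unrecognized. In the sketch below, dispatch and the handler signatures are hypothetical; agent.speak.confirmed is the only event type name taken from this reference, so register the other event types using the names your server actually sends.

```python
import json
from typing import Callable

Handler = Callable[[dict], None]


def dispatch(raw: str, handlers: dict[str, Handler],
             on_unknown: Callable[[object], None]) -> None:
    """Route one decoded event to the handler registered for its type."""
    event = json.loads(raw)
    event_type = event.get("type") if isinstance(event, dict) else None
    if isinstance(event_type, str) and event_type in handlers:
        handlers[event_type](event)
    else:
        # Missing or unrecognized type: never guess, hand it to the fallback.
        on_unknown(event)


seen: list = []
handlers = {"agent.speak.confirmed": lambda e: seen.append(e["audio_samples"])}
fallback = lambda e: seen.append("unknown")

dispatch('{"type": "agent.speak.confirmed", "audio_samples": 3200}', handlers, fallback)
dispatch('{"type": "something.else"}', handlers, fallback)
print(seen)  # [3200, 'unknown']
```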
2. Serialize Audio Sends
agent.speak.confirmed does not include a correlation ID. For the simplest integration, send one agent.speak at a time and wait for the confirmation before sending the next chunk.
3. Handle Errors Gracefully
4. Wait for Confirmations
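Don’t assume operations succeeded. Wait for the matching confirmation event (for example, agent.speak.confirmed after agent.speak) or an error event before treating a command as complete.
See also: Client → Server Messages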