Overview

This reference documents the WebSocket events the avatar server sends to your client. These are primarily confirmations and status updates for your audio commands.

WebSocket Connection

WebSocket URL: wss://wss.agenthuman.com

Connect to this URL after creating a session. The server sends the event messages described on this page in response to your commands.
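As a rough sketch of the connection step, the snippet below opens a socket and sends session.init as soon as the socket is open. The helper names buildSessionInit and openAvatarSocket are hypothetical, and the message body assumes only the type and session_id fields that this reference names; your deployment may require more fields (the error table mentions a session_token, but this page does not show where it is supplied).

```javascript
// Build the first message the server expects after the socket opens.
// Assumption: only `type` and `session_id` are included, because those are
// the only session.init fields this reference names.
function buildSessionInit(sessionId) {
    if (!sessionId) {
        throw new Error('sessionId is required');
    }
    return JSON.stringify({ type: 'session.init', session_id: sessionId });
}

// Open the socket and send session.init once the connection is open.
// `WebSocketImpl` is injectable so the function can be exercised without a network.
function openAvatarSocket(url, sessionId, WebSocketImpl = WebSocket) {
    const ws = new WebSocketImpl(url);
    ws.addEventListener('open', () => {
        ws.send(buildSessionInit(sessionId));
    });
    return ws;
}
```

In a browser you would call openAvatarSocket('wss://wss.agenthuman.com', sessionId) and then attach a message handler to the returned socket.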

What You’ll Receive

The avatar server sends simple JSON events that report:
  • ✅ Session initialization
  • ✅ Audio received and queued
  • ✅ Playback interrupted
  • ❌ Errors

Important: Events are Confirmations Only

These WebSocket events only confirm your commands; they do not carry video. Video appears in your Daily or LiveKit room automatically.

Event Flow:
  1. You send audio via WebSocket → Server confirms receipt
  2. Server generates video → Streams to Daily or LiveKit room
  3. You receive video from Daily or LiveKit (separate from WebSocket)

Event Format

Every event is a JSON object with a type field. The remaining fields depend on the event type:
{
  "type": "event_type",
  "field1": "value1",
  "field2": "value2"
}
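Because every event shares this shape, incoming frames can be validated once before they are dispatched. The following is a minimal sketch; parseServerEvent is a hypothetical helper name, not part of the API.

```javascript
// Parse a raw WebSocket frame and return the event object,
// or null if it is not a JSON object with a string `type` field.
function parseServerEvent(raw) {
    let message;
    try {
        message = JSON.parse(raw);
    } catch (err) {
        return null; // not valid JSON
    }
    if (message === null || typeof message !== 'object' || typeof message.type !== 'string') {
        return null; // valid JSON, but not an event
    }
    return message;
}
```

A handler can then ignore null results instead of letting JSON.parse throw inside ws.onmessage.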

Event Types

Connection Established

Sent after successful session initialization. Confirms that the server has loaded your session configuration and is ready to receive audio commands.
{
  "type": "connection.established",
  "session_id": "your-session-id"
}
Fields:
  • session_id - Your session ID (echoed back)
When Sent: Immediately after the server receives a valid session.init message.
Client Action: You can now send audio commands. The avatar server will join your video room and begin streaming video there.
Example Handler:
ws.onmessage = async (event) => {
    const message = JSON.parse(event.data);
    
    if (message.type === 'connection.established') {
        console.log('Session established:', message.session_id);
        // Avatar server will join your video room
        // Ready to send audio commands
    }
};

Speak Confirmation

Sent after receiving and queuing audio for video generation. Confirms the audio was received and provides the sample count.
{
  "type": "agent.speak.confirmed",
  "audio_samples": 48000
}
Fields:
  • audio_samples - Number of audio samples queued for processing (after any resampling to 16kHz)
When Sent: After successfully receiving and queuing your audio data.
Client Action: Optional. You can use this event to track which audio chunks have been received and queued for processing.
Example Handler:
if (message.type === 'agent.speak.confirmed') {
    console.log(`Audio confirmed: ${message.audio_samples} samples`);
    console.log(`Duration: ${message.audio_samples / 16000} seconds (approx)`);
}
Note: Video generation happens in the background and is streamed to your video room. This event only confirms audio receipt. Watch for the avatar video to appear in your video room.
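Since audio_samples is reported after resampling to 16kHz, it can be compared against what you sent. The sketch below estimates the expected value from a base64 payload, assuming 16-bit mono PCM input (2 bytes per sample) and a simple proportional resample; estimateConfirmedSamples is a hypothetical helper, and the server's actual count may differ slightly because of rounding in its resampler.

```javascript
// Estimate the audio_samples value for a base64-encoded 16-bit mono PCM chunk.
// Assumption: the server resamples proportionally from `sampleRate` to 16000 Hz.
function estimateConfirmedSamples(audioBase64, sampleRate) {
    // Decoded byte length of a padded base64 string.
    const padding = audioBase64.endsWith('==') ? 2 : audioBase64.endsWith('=') ? 1 : 0;
    const byteLength = (audioBase64.length / 4) * 3 - padding;

    // 16-bit mono PCM uses 2 bytes per sample.
    const inputSamples = Math.floor(byteLength / 2);

    // Scale the sample count to the 16 kHz rate the server reports in.
    return Math.round(inputSamples * 16000 / sampleRate);
}
```

For example, one second of 48kHz audio (96,000 bytes) yields an estimate of 16,000 samples, which corresponds to one second at 16kHz.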

Interrupt Confirmation

Sent after processing an interrupt request. Confirms that current video generation and playback have been stopped.
{
  "type": "agent.interrupt.confirmed"
}
Fields: None
When Sent: After stopping current video generation/playback.
Client Action: Optional. You can use this event to confirm that the interrupt succeeded.
Example Handler:
if (message.type === 'agent.interrupt.confirmed') {
    console.log('Playback interrupted successfully');
    // Can now send new audio
}

Error Event

Sent when an error occurs during processing. Contains detailed error information.
{
  "type": "error",
  "error": "Detailed error message"
}
Fields:
  • error - Human-readable error description
When Sent: When any operation fails or invalid data is received.
Client Action: Log the error, notify the user, and take corrective action based on the error message.
Example Handler:
if (message.type === 'error') {
    console.error('Server error:', message.error);
    alert(`Error: ${message.error}`);
}

Common Error Messages

| Error Message | Meaning | Solution |
| --- | --- | --- |
| First message must be session.init | Session init was not sent first | Send session.init immediately after connecting |
| Missing session id | The session_id field is missing in session.init | Include session_id in the session.init message |
| Invalid session id - server not assigned or session id mismatch | Session was routed to the wrong server | Use the server_ws_uri provided when creating/starting the session |
| Failed to fetch video path | Session config could not be fetched from API | Ensure the session exists and the session_token is correct |
| Session not found | Session ID not recognized | Create a new session and re-establish connection |
| Invalid JSON format | Malformed JSON received | Validate JSON structure before sending |
| No audio data provided | Empty audio field | Include base64-encoded audio in agent.speak message |
| Failed to process audio: ... | Invalid audio format | Verify 16-bit mono PCM and set the correct sample_rate |
| Unknown message type: ... | Unsupported message | Check message type is supported (session.init, agent.speak, agent.interrupt) |
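One way to act on these messages in code is to map each known message prefix to a recovery step. This is only a sketch: classifyError and the action labels are hypothetical names, and it assumes the server's error strings begin with the text listed in the table.

```javascript
// Known error prefixes from the table, each paired with a
// hypothetical action label (the labels are not part of the API).
const ERROR_ACTIONS = [
    ['First message must be session.init', 'send-session-init-first'],
    ['Missing session id', 'add-session-id'],
    ['Invalid session id', 'use-server-ws-uri'],
    ['Failed to fetch video path', 'check-session-and-token'],
    ['Session not found', 'create-new-session'],
    ['Invalid JSON format', 'fix-json'],
    ['No audio data provided', 'include-audio'],
    ['Failed to process audio', 'check-audio-format'],
    ['Unknown message type', 'check-message-type'],
];

// Return the action label for an error string, or 'unknown' if no prefix matches.
function classifyError(errorText) {
    const text = String(errorText);
    const entry = ERROR_ACTIONS.find(([prefix]) => text.startsWith(prefix));
    return entry ? entry[1] : 'unknown';
}
```

Prefix matching is used because two of the messages carry a variable suffix after the colon.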

Event Handling Best Practices

1. Always Check Event Type

ws.onmessage = async (event) => {
    const message = JSON.parse(event.data);
    
    switch (message.type) {
        case 'connection.established':
            handleConnectionEstablished(message);
            break;
        case 'agent.speak.confirmed':
            handleSpeakConfirmed(message);
            break;
        case 'agent.interrupt.confirmed':
            handleInterruptConfirmed(message);
            break;
        case 'error':
            handleError(message);
            break;
        default:
            console.warn('Unknown message type:', message.type);
    }
};

2. Serialize Audio Sends

agent.speak.confirmed does not include a correlation ID. For the simplest integration, send one agent.speak at a time and wait for the confirmation before sending the next chunk.
function waitForSpeakConfirmed(ws, timeoutMs = 5000) {
    return new Promise((resolve, reject) => {
        const timeout = setTimeout(() => {
            ws.removeEventListener('message', onMessage);
            reject(new Error('Timeout waiting for agent.speak.confirmed'));
        }, timeoutMs);

        function onMessage(event) {
            const msg = JSON.parse(event.data);
            if (msg.type === 'agent.speak.confirmed') {
                clearTimeout(timeout);
                ws.removeEventListener('message', onMessage);
                resolve(msg.audio_samples);
            }
        }

        ws.addEventListener('message', onMessage);
    });
}
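Building on the one-at-a-time approach, the sketch below sends a list of chunks in order and also stops waiting when the server replies with an error event instead of a confirmation. Both function names (waitForSpeakResult, sendChunksInOrder) are hypothetical, and treating the next error event as the reply to the pending agent.speak is an assumption, since events carry no correlation ID.

```javascript
// Wait for the next agent.speak.confirmed; reject on an error event or timeout.
function waitForSpeakResult(ws, timeoutMs = 5000) {
    return new Promise((resolve, reject) => {
        const timeout = setTimeout(() => {
            ws.removeEventListener('message', onMessage);
            reject(new Error('Timeout waiting for agent.speak.confirmed'));
        }, timeoutMs);

        function onMessage(event) {
            let msg;
            try {
                msg = JSON.parse(event.data);
            } catch {
                return; // ignore frames that are not JSON
            }
            if (!msg || (msg.type !== 'agent.speak.confirmed' && msg.type !== 'error')) {
                return; // ignore unrelated events
            }
            clearTimeout(timeout);
            ws.removeEventListener('message', onMessage);
            if (msg.type === 'error') {
                reject(new Error(msg.error));
            } else {
                resolve(msg.audio_samples);
            }
        }

        ws.addEventListener('message', onMessage);
    });
}

// Send base64 audio chunks one at a time, waiting for each confirmation.
// Resolves with the audio_samples value reported for each chunk.
async function sendChunksInOrder(ws, chunks, sampleRate) {
    const confirmed = [];
    for (const audio of chunks) {
        ws.send(JSON.stringify({ type: 'agent.speak', audio, sample_rate: sampleRate }));
        confirmed.push(await waitForSpeakResult(ws));
    }
    return confirmed;
}
```

The listener is registered synchronously right after ws.send, so a confirmation cannot arrive before the code is listening for it.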

3. Handle Errors Gracefully

if (message.type === 'error') {
    console.error('Server error:', message.error);
    
    // Handle specific errors
    if (message.error.includes('Session not found')) {
        // Re-establish session
        await reconnect();
    } else {
        // Show user-friendly error
        showErrorToUser(message.error);
    }
}

4. Wait for Confirmations

Don’t assume operations succeeded. Wait for confirmation events:
async function sendAudioWithConfirmation(ws, audioBase64, sampleRate = 48000) {
    // Send audio
    ws.send(JSON.stringify({
        type: 'agent.speak',
        audio: audioBase64,
        sample_rate: sampleRate
    }));

    // Wait for the next confirmation
    const audioSamples = await waitForSpeakConfirmed(ws);
    return audioSamples;
}