
Best Practices

Session Management

Always Initialize Session First

Send session.init as the very first message after connecting to the WebSocket.
ws.onopen = () => {
    // First message must be session.init
    ws.send(JSON.stringify({
        type: 'session.init',
        config: {
            session_id: sessionId,
            access_token: accessToken,
            room: {
                platform: 'daily',
                url: session.daily_room.url,
                token: session.daily_room.token,
                display_name: 'AI Avatar (AH)'
            },
            video_width: 1280,
            video_height: 720
        }
    }));
};

Secure Credential Storage

  • Never hardcode credentials in client-side code
  • Use environment variables for server-side applications
  • Fetch credentials from your backend API for browser apps
  • Rotate access tokens regularly
// Good: Fetch from your backend
const credentials = await fetch('/api/get-session-credentials').then(r => r.json());

// Bad: Hardcoded in source
const SESSION_ID = 'ses_abc123'; // Don't do this!
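
For server-side applications, the environment-variable approach from the list above could look roughly like the following Python sketch. The variable name `AVATAR_API_KEY` and the helper `load_api_key` are hypothetical names chosen for illustration; the API does not define them.

```python
import os


def load_api_key(var_name: str = "AVATAR_API_KEY") -> str:
    """Read the API key from the environment and fail fast if it is missing.

    The default variable name is a hypothetical example, not one defined by the API.
    """
    value = os.environ.get(var_name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return value
```

Failing at startup when the variable is missing tends to be easier to diagnose than an authentication error on the first request.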

Session Lifecycle

Use the same session for multiple audio requests to improve performance:
// Good: Reuse session
await client.connectDaily();  // Connect to Daily.co room once
await client.connectWebSocket();  // Connect to WebSocket once
await client.sendAudio('audio1.wav');
await client.sendAudio('audio2.wav');  // Same session
await client.sendAudio('audio3.wav');  // Same session
await client.close();  // Clean shutdown

// Less efficient: New session for each audio
for (const audio of audios) {
    await client.connectDaily();
    await client.connectWebSocket();
    await client.sendAudio(audio);
    await client.close();  // Overhead!
}

Proper Cleanup

There is no WebSocket session.stop message. When you’re done:
  1. Close the WebSocket connection
  2. Leave the Daily.co room
  3. Call POST /v1/sessions/{session_id}/end to release resources and mark the session ended
async function cleanup() {
    try {
        // 1. Close WebSocket
        if (ws) ws.close();
    } finally {
        // 2. Leave Daily.co room
        if (callFrame) await callFrame.leave();
        
        // 3. End the session via REST
        // (Do this from your backend; don’t expose API keys in client-side code.)
        // await fetch(`/api/end-session?session_id=${sessionId}`);
    }
}
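
Step 3 is meant to run on your backend. A minimal Python sketch of how that backend might build the request is shown here. The base URL passed in, the helper name `build_end_session_request`, and the use of an `x-api-key` header for authentication are assumptions made for illustration; only the `POST /v1/sessions/{session_id}/end` path comes from the steps above.

```python
import urllib.request
from urllib.parse import quote


def build_end_session_request(base_url: str, session_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/sessions/{session_id}/end request."""
    url = f"{base_url.rstrip('/')}/v1/sessions/{quote(session_id, safe='')}/end"
    return urllib.request.Request(
        url,
        data=b"",                         # empty body
        method="POST",
        headers={"x-api-key": api_key},   # header name is an assumption
    )


# Sending it from a backend process (never from the browser):
# request = build_end_session_request(BASE_URL, session_id, api_key)
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```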

Audio Quality

Optimal Audio Format

Input audio should be:
  • Sample Rate: 16 kHz by default; to send 48 kHz audio, set sample_rate: 48000 in agent.speak
  • Channels: Mono (1 channel)
  • Bit Depth: 16-bit signed integer
  • Format: Raw PCM (no WAV headers)
  • Encoding: Base64 string
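
The format requirements above can be sketched in Python using only the standard library. This is an illustration, not a reference implementation: the little-endian byte order and the `audio` field name in the payload are assumptions, as are the helper names.

```python
import base64
import struct


def floats_to_pcm16_base64(samples):
    """Convert mono float samples in [-1.0, 1.0] to base64-encoded raw 16-bit PCM.

    Samples are clipped, scaled to signed 16-bit integers, and packed
    little-endian (assumed) with no WAV header.
    """
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    raw = struct.pack(f"<{len(ints)}h", *ints)
    return base64.b64encode(raw).decode("ascii")


def build_speak_message(samples, sample_rate=48000):
    """Assemble an agent.speak payload; the 'audio' field name is an assumption."""
    return {
        "type": "agent.speak",
        "audio": floats_to_pcm16_base64(samples),
        "sample_rate": sample_rate,
    }
```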

Audio Normalization

Normalize audio levels to prevent clipping and ensure consistent volume:
import librosa
import numpy as np

# Load audio
audio, sr = librosa.load("input.wav", sr=48000, mono=True)

# Normalize so the peak sits at 0.95 of full scale (about -0.45 dBFS)
peak = np.abs(audio).max()
if peak > 0:
    audio = audio * (0.95 / peak)

# Convert to 16-bit PCM
audio_int16 = (audio * 32768.0).astype(np.int16)

Noise Reduction

For best results, use clean audio without background noise:
import noisereduce as nr

# Apply noise reduction (optional, requires noisereduce package)
audio_clean = nr.reduce_noise(y=audio, sr=sr)

Chunk Size Recommendations

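| Duration   | Samples (48kHz) | Bytes (16-bit) | Base64 Size | Recommendation      |
|------------|-----------------|----------------|-------------|---------------------|
| 1 second   | 48,000          | 96,000         | ~128 KB     | Minimum             |
| 5 seconds  | 240,000         | 480,000        | ~640 KB     | Optimal             |
| 10 seconds | 480,000         | 960,000        | ~1.3 MB     | Good                |
| 30 seconds | 1,440,000       | 2,880,000      | ~3.8 MB     | Maximum recommended |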
Recommendation: Send 5-10 second chunks for best balance of latency and efficiency.
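
The sizes in the table follow directly from the audio format: two bytes per mono sample, and four base64 characters for every three raw bytes. A small Python sketch of that arithmetic (the function name `chunk_sizes` is illustrative):

```python
def chunk_sizes(duration_s, sample_rate=48000, bytes_per_sample=2):
    """Return (samples, raw_bytes, base64_chars) for a mono PCM chunk."""
    samples = int(duration_s * sample_rate)
    raw_bytes = samples * bytes_per_sample
    # Base64 emits 4 characters per 3 input bytes, rounded up to a full group.
    base64_chars = 4 * ((raw_bytes + 2) // 3)
    return samples, raw_bytes, base64_chars


for seconds in (1, 5, 10, 30):
    print(seconds, chunk_sizes(seconds))
```

For a 5-second chunk this gives 240,000 samples, 480,000 bytes, and 640,000 base64 characters, matching the "Optimal" row.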

Daily.co Connection

Wait for Session Confirmation

Wait for the session to be established before sending audio:
// Good: Wait for confirmation
ws.onmessage = async (event) => {
    const message = JSON.parse(event.data);
    
    if (message.type === 'connection.established') {
        // Now safe to send audio commands
        console.log('Ready to send audio');
    }
};

// Bad: Send immediately
ws.onopen = async () => {
    await sendAudio();  // Don't do this!
};
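
For a Python client, the same "wait, then send" rule might be sketched with an `asyncio.Event`. The class name `ReadyGate` and its methods are hypothetical; the only detail taken from the example above is the `connection.established` message type.

```python
import asyncio
import json


class ReadyGate:
    """Blocks audio sends until a 'connection.established' message has been seen."""

    def __init__(self):
        self._ready = asyncio.Event()

    def handle_message(self, raw: str) -> None:
        """Feed every incoming WebSocket text frame through this method."""
        message = json.loads(raw)
        if message.get("type") == "connection.established":
            self._ready.set()

    async def wait_until_ready(self, timeout: float = 10.0) -> None:
        """Raise asyncio.TimeoutError if no confirmation arrives in time."""
        await asyncio.wait_for(self._ready.wait(), timeout)
```

A sender would call `await gate.wait_until_ready()` once before its first audio message, so a missing confirmation surfaces as a timeout instead of a silently dropped request.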

Monitor Avatar Participation

Track when the avatar joins the Daily.co room:
callFrame.on('participant-joined', (event) => {
    console.log('Participant joined:', event.participant);
    
    // Match the display_name you sent in session.init
    if (event.participant.user_name === 'AI Avatar (AH)') {
        console.log('Avatar is ready in the room');
    }
});

Monitor Connection Health

Track Daily.co connection state and handle failures:
callFrame.on('network-quality-change', (event) => {
    // threshold is 'good', 'low', or 'very-low'
    console.log('Network quality:', event.threshold);
    
    if (event.threshold === 'good') {
        console.log('Good connection quality');
    } else {
        console.warn('Poor connection quality');
    }
});

callFrame.on('error', (error) => {
    console.error('Daily.co error:', error);
    handleReconnection();
});

Implement Reconnection Logic

Handle temporary network issues gracefully:
let reconnectAttempts = 0;
const MAX_RECONNECT_ATTEMPTS = 3;

async function handleReconnection() {
    if (reconnectAttempts >= MAX_RECONNECT_ATTEMPTS) {
        console.error('Max reconnection attempts reached');
        return;
    }
    
    reconnectAttempts++;
    const delay = Math.min(1000 * Math.pow(2, reconnectAttempts), 10000);
    
    console.log(`Reconnecting in ${delay}ms (attempt ${reconnectAttempts}/${MAX_RECONNECT_ATTEMPTS})`);
    
    await new Promise(resolve => setTimeout(resolve, delay));
    await reconnect();
}

async function reconnect() {
    // Close existing connections
    if (callFrame) await callFrame.leave();
    if (ws) ws.close();
    
    // Reconnect
    await connectDaily();
    await connectWebSocket();
    
    // Reset the counter once both connections are back
    reconnectAttempts = 0;
}
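
To make the backoff schedule concrete, the delay formula used above (`1000 * 2^attempt`, capped at 10,000 ms, with the attempt counter starting at 1) can be reproduced in a few lines of Python. The function name is illustrative.

```python
def backoff_delays_ms(max_attempts=3, base_ms=1000, cap_ms=10000):
    """Delay before each retry: base * 2**attempt, capped at cap_ms."""
    return [min(base_ms * 2 ** attempt, cap_ms) for attempt in range(1, max_attempts + 1)]


print(backoff_delays_ms())    # [2000, 4000, 8000]
print(backoff_delays_ms(5))   # [2000, 4000, 8000, 10000, 10000]
```

With the default of three attempts the 10-second cap is never reached; it only matters if you raise the attempt limit.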

Performance Optimization

Connection Pooling

Reuse WebSocket connections for multiple requests:
class AvatarSessionPool {
    constructor(maxSessions = 3) {
        this.sessions = [];
        this.maxSessions = maxSessions;
    }
    
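    async getSession() {
        // Reuse an idle session if one exists, otherwise create a new one
        let session = this.sessions.find(s => s.isIdle);
        
        if (!session && this.sessions.length < this.maxSessions) {
            session = await this.createSession();
            this.sessions.push(session);
        }
        
        if (!session) {
            throw new Error('All sessions are busy');
        }
        
        session.isIdle = false;  // In use until release() is called
        return session;
    }
    
    release(session) {
        // Return the session to the pool when the request is finished
        session.isIdle = true;
    }
    
    async createSession() {
        // sessionId and accessToken come from your backend
        const client = new AvatarClient(sessionId, accessToken);
        await client.connectDaily();
        await client.connectWebSocket();
        return { client, isIdle: true };
    }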
}

Lazy Video Track Handling

Only process video frames when needed:
let videoTrack = null;

pc.ontrack = (event) => {
    if (event.track.kind === 'video') {
        // Only attach to the DOM when the element is actually rendered
        const videoEl = document.getElementById('avatar-video');
        if (videoEl && videoEl.getClientRects().length > 0) {
            videoTrack = event.track;
            displayVideo(event.streams[0]);
        }
    }
};

Audio Buffering Strategy

For real-time applications, buffer incoming samples and emit fixed-length chunks:
import numpy as np

class AudioBuffer:
    def __init__(self, chunk_duration=5.0, sample_rate=48000):
        self.chunk_duration = chunk_duration
        self.sample_rate = sample_rate
        self.chunk_samples = int(chunk_duration * sample_rate)
        self.buffer = []
    
    def add_audio(self, audio_samples):
        """Add audio and return complete chunks"""
        self.buffer.extend(audio_samples)
        chunks = []
        
        while len(self.buffer) >= self.chunk_samples:
            chunk = self.buffer[:self.chunk_samples]
            self.buffer = self.buffer[self.chunk_samples:]
            chunks.append(np.array(chunk))
        
        return chunks
    
    def flush(self):
        """Get remaining audio as final chunk"""
        if len(self.buffer) > 0:
            chunk = np.array(self.buffer)
            self.buffer = []
            return chunk
        return None

Error Handling

Comprehensive Error Handling

Route each error to a dedicated handler based on its message:
class AvatarErrorHandler {
    handleError(error, context) {
        console.error(`Error in ${context}:`, error);
        
        if (error.message.includes('Session not found')) {
            return this.handleSessionError();
        } else if (error.message.includes('Failed to process audio')) {
            return this.handleAudioError();
        } else if (error.message.toLowerCase().includes('network')) {
            return this.handleNetworkError();
        } else if (error.message.includes('Daily')) {
            return this.handleDailyError();
        } else {
            return this.handleUnknownError(error);
        }
    }
    
    async handleSessionError() {
        // Re-establish session
        await this.reconnect();
    }
    
    async handleDailyError() {
        // Reconnect to Daily.co room
        await this.reconnectDaily();
    }
    
    async handleAudioError() {
        // Validate and resend audio
        console.log('Re-encoding audio...');
    }
    
    async handleNetworkError() {
        // Implement exponential backoff
        await this.backoffRetry();
    }
    
    handleUnknownError(error) {
        // Log for debugging
        this.reportError(error);
    }
}
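
The same routing idea can be expressed as a lookup table, which is often easier to extend than a chain of conditionals. The Python sketch below matches on message text taken from the Error Reference; the category names (`session`, `audio`, `client_bug`, and so on) are hypothetical and only illustrate one way to group them.

```python
def classify_error(message: str) -> str:
    """Map a server error message to a coarse handling category."""
    text = message.lower()
    rules = [
        ("session not found", "session"),
        ("failed to process audio", "audio"),
        ("no audio data provided", "audio"),
        ("invalid json format", "client_bug"),
        ("unknown message type", "client_bug"),
        ("network", "network"),
        ("daily", "daily"),
    ]
    for needle, category in rules:
        if needle in text:
            return category
    return "unknown"
```

Matching on lowercase text avoids missing messages that differ only in capitalization, such as "Network timeout".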

Validate Before Sending

Always validate data before sending:
function validateAudioData(base64Audio) {
    // Check if base64 is valid
    try {
        atob(base64Audio.substring(0, 100));
    } catch (e) {
        throw new Error('Invalid base64 encoding');
    }
    
    // Check size
    const sizeBytes = (base64Audio.length * 3) / 4;
    const sizeMB = sizeBytes / (1024 * 1024);
    
    if (sizeMB > 100) {
        throw new Error(`Audio too large: ${sizeMB.toFixed(2)} MB (max 100 MB)`);
    }
    
    return true;
}
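
A server-side counterpart in Python might look like the sketch below. Unlike the browser check above, it decodes the whole payload and also verifies that the byte count is even, since 16-bit samples occupy two bytes each. The function name is illustrative, and the 100 MB default simply mirrors the limit used in the example above.

```python
import base64
import binascii


def validate_audio_base64(audio_b64: str, max_bytes: int = 100 * 1024 * 1024) -> int:
    """Return the decoded byte count, or raise ValueError if the payload is unusable."""
    if not audio_b64:
        raise ValueError("No audio data provided")
    try:
        raw = base64.b64decode(audio_b64, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("Invalid base64 encoding") from exc
    if len(raw) % 2 != 0:
        raise ValueError("16-bit PCM must contain an even number of bytes")
    if len(raw) > max_bytes:
        raise ValueError(f"Audio too large: {len(raw)} bytes (max {max_bytes})")
    return len(raw)
```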

Troubleshooting

Daily.co Connection Issues

Problem: Cannot connect to Daily.co room or avatar video not appearing.
Solutions:
  1. Verify room credentials:
// Check room URL and token are valid
console.log('Room URL:', session.daily_room.url);
console.log('Token:', session.daily_room.token ? 'Present' : 'Missing');

// Join with proper error handling
try {
    await callFrame.join({
        url: session.daily_room.url,
        token: session.daily_room.token
    });
    console.log('Successfully joined Daily.co room');
} catch (error) {
    console.error('Failed to join room:', error);
}
  2. Check network connectivity:
    • Verify internet connection is stable
    • Daily.co handles firewall/NAT traversal automatically
  3. Monitor participant events:
callFrame.on('participant-joined', (event) => {
    console.log('Participant joined:', event.participant.user_name);
});

callFrame.on('participant-left', (event) => {
    console.log('Participant left:', event.participant.user_name);
});
  4. Add connection timeout:
const CONNECTION_TIMEOUT = 10000; // 10 seconds

const timeout = setTimeout(() => {
    console.error('Daily.co connection timeout');
    handleReconnection();
}, CONNECTION_TIMEOUT);

callFrame.on('joined-meeting', () => {
    clearTimeout(timeout);
    console.log('Connected to Daily.co room');
});

Audio/Video Sync Issues

Problem: Audio and video are out of sync.
Solutions:
  1. Check network latency and quality:
    • Monitor Daily.co network events (connection quality, packet loss)
    • Measure the time from sending agent.speak to receiving agent.speak.confirmed (queueing latency)
  2. Ensure consistent audio format:
    • Always use 16-bit, mono PCM
    • If you send 48kHz audio, include sample_rate: 48000
  3. Monitor Daily.co video tracks:
callFrame.on('track-started', (event) => {
    console.log('Track started:', event.track.kind);
    if (event.track.kind === 'video') {
        console.log('Avatar video track started');
    }
});

callFrame.on('track-stopped', (event) => {
    console.log('Track stopped:', event.track.kind);
});
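
To measure the queueing latency mentioned in step 1, a client can timestamp each `agent.speak` it sends and subtract that from the arrival time of the matching `agent.speak.confirmed`. The Python sketch below assumes confirmations arrive in the same order as the requests; that ordering, and the class and method names, are assumptions for illustration.

```python
import time
from collections import deque


class SpeakLatencyTracker:
    """Measures time from sending agent.speak to receiving agent.speak.confirmed.

    Assumes confirmations arrive in the same order the requests were sent.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = deque()
        self.latencies = []

    def mark_sent(self) -> None:
        """Call immediately after sending an agent.speak message."""
        self._pending.append(self._clock())

    def mark_confirmed(self) -> float:
        """Record and return the latency in seconds for the oldest pending request."""
        if not self._pending:
            raise RuntimeError("Confirmation received with no pending agent.speak")
        latency = self._clock() - self._pending.popleft()
        self.latencies.append(latency)
        return latency
```

Using `time.monotonic` keeps the measurement immune to wall-clock adjustments during a long session.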

Performance Issues

Problem: Slow video generation or high latency.
Solutions:
  1. Check audio format and size:
# Verify audio specifications
print(f"Sample rate: {sr} Hz")
print(f"Duration: {len(audio) / sr:.2f} seconds")
print(f"Size: {len(audio_bytes) / 1024:.2f} KB")

# Ensure optimal chunk size (5-10 seconds)
max_samples = 10 * sr  # 10 seconds at the actual sample rate
if len(audio) > max_samples:
    print("Warning: Audio chunk too large, consider splitting")
  2. Monitor network bandwidth:
// Check connection quality (if available)
if (navigator.connection) {
    console.log('Connection type:', navigator.connection.effectiveType);
    console.log('Downlink speed:', navigator.connection.downlink, 'Mbps');
}
  3. Optimize audio encoding:
# Encode the raw PCM bytes in a single call
import base64

# b64encode returns bytes; decode to an ASCII string for the JSON payload
audio_base64 = base64.b64encode(audio_bytes).decode('ascii')

Session Errors

Problem: “Session not found” or “Invalid session configuration” errors.
Solutions:
  1. Verify credentials are current:
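// Refresh credentials before connecting
async function getLatestCredentials() {
    // Call your own backend; it holds the API key and adds it server-side
    const response = await fetch('/api/create-session', { method: 'POST' });
    if (!response.ok) {
        throw new Error(`Failed to create session: ${response.status}`);
    }
    return await response.json();
}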

const { session_id, access_token } = await getLatestCredentials();
  2. Check session hasn’t expired:
// Sessions may have a TTL; recreate if needed
const sessionAge = Date.now() - sessionCreatedAt;
const MAX_SESSION_AGE = 3600000; // 1 hour (example value; use your actual session lifetime)

if (sessionAge > MAX_SESSION_AGE) {
    console.log('Session expired, creating new one');
    await createNewSession();
}
  3. Ensure avatar is configured:
Verify your session has an avatar assigned via the Create Session endpoint before starting the session.

WebSocket Disconnections

Problem: WebSocket connection drops unexpectedly.
Solutions:
  1. Handle reconnection:
ws.onclose = (event) => {
    console.log('WebSocket closed:', event.code, event.reason);
    
    if (event.code !== 1000) {  // Not normal closure
        console.log('Reconnecting in 2 seconds...');
        setTimeout(reconnect, 2000);
    }
};
  2. Check for network changes:
window.addEventListener('online', () => {
    console.log('Network restored, reconnecting...');
    reconnect();
});

window.addEventListener('offline', () => {
    console.log('Network lost');
});

Error Reference

| Error Message | Meaning | Solution |
|---------------|---------|----------|
| First message must be session.init | Session init was not sent first | Send session.init immediately after connecting |
| Invalid room platform | The room config is missing/invalid | Set room.platform to daily |
| Invalid video dimensions | Non-positive or non-numeric dimensions | Use positive integers for video_width and video_height |
| Invalid aspect ratio | Unsupported width/height ratio | Use one of: 18:9, 16:9, 5:3, 16:10, 3:2, 4:3, 1:1 (or portrait equivalents) |
| Invalid session id - server not assigned or session id mismatch | Session was routed to the wrong server | Re-run Start Session and use the returned ws_uri |
| Failed to fetch video path | Session config could not be fetched | Ensure the session is started and the access_token is correct |
| Session not found | Session ID not recognized | Re-establish session with session.init |
| Invalid JSON format | Malformed JSON received | Validate JSON before sending |
| No audio data provided | Empty audio field | Include base64-encoded audio |
| Failed to process audio: ... | Invalid audio format | Verify 16-bit mono PCM and set the correct sample_rate |
| Unknown message type: ... | Unsupported message | Check message type is supported |
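
The two video-dimension errors can be caught before session.init is sent. The Python sketch below checks dimensions against the ratios listed in the table. It assumes the server requires an exact ratio match, which the table does not state, and the function name is illustrative.

```python
from fractions import Fraction

# Landscape ratios listed in the error reference; portrait forms are their inverses.
SUPPORTED_RATIOS = {
    Fraction(18, 9), Fraction(16, 9), Fraction(5, 3), Fraction(16, 10),
    Fraction(3, 2), Fraction(4, 3), Fraction(1, 1),
}


def check_video_dimensions(width, height):
    """Return None if the dimensions look valid, else the matching error message."""
    if not (isinstance(width, int) and isinstance(height, int)) or width <= 0 or height <= 0:
        return "Invalid video dimensions"
    ratio = Fraction(width, height)
    # Exact-match comparison is an assumption; the server may allow some tolerance.
    if ratio in SUPPORTED_RATIOS or 1 / ratio in SUPPORTED_RATIOS:
        return None
    return "Invalid aspect ratio"
```

For example, 1280x720 reduces to 16:9 and passes, while 1000x999 is rejected with "Invalid aspect ratio".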

Production Checklist

Before deploying to production:
  • Implement proper error handling for all message types
  • Add reconnection logic with exponential backoff
  • Validate audio format before sending (16-bit, mono PCM + correct sample_rate, e.g. 48000)
  • Monitor Daily.co connection state and participant events
  • Implement session lifecycle management
  • Add logging and monitoring
  • Test with various network conditions
  • Handle WebSocket disconnections gracefully
  • Secure credential storage (no hardcoding)
  • Add timeout handling for all async operations
  • Test audio/video synchronization in Daily.co room
  • Optimize chunk sizes (5-10 seconds recommended)
  • Don’t send application-level ping messages (unsupported); rely on reconnect logic
  • Add user-friendly error messages
  • Test on target browsers/platforms
  • Verify Daily.co room tokens are valid