Overview

The Pipecat AgentHuman integration provides a plug-and-play transport layer for adding realistic talking avatars to your voice AI pipelines. Configure your avatar and API key, and the plugin handles WebSocket communication, audio formatting, and session management automatically. It is well suited to:
  • Voice AI applications
  • Conversational agents
  • Real-time speech synthesis
  • Interactive AI assistants

Installation

pip install pipecat-agenthuman

Quick Start

from pipecat_agenthuman import AgentHumanTransport
from pipecat.pipeline import Pipeline

# Configure your avatar
transport = AgentHumanTransport(
    api_key="ah_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    avatar_id="avat_01H3Z8G9YR3K2N5M6P7Q8W4T",
    room_url="https://your-domain.daily.co/your-room",
    room_token="your-daily-token"
)

# Build your pipeline
pipeline = Pipeline([
    # Your pipeline processors here
    transport
])

# Run the pipeline
await pipeline.run()

That’s it! The transport handles:
  • ✅ Session creation and management
  • ✅ WebSocket connection and authentication
  • ✅ Audio format conversion (to 16-bit mono PCM)
  • ✅ Room configuration and avatar joining
  • ✅ Error handling and reconnection

Configuration Options

AgentHumanTransport Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| api_key | string | Yes | Your AgentHuman API key |
| avatar_id | string | Yes | Avatar ID to use |
| room_url | string | Yes | Daily or LiveKit room URL |
| room_token | string | Yes | Room authentication token |
| room_platform | string | No | Room platform: 'daily' or 'livekit' (default: 'daily') |
| display_name | string | No | Avatar display name in room (default: 'AI Avatar (AH)') |
| video_width | int | No | Video width in pixels (default: 1280) |
| video_height | int | No | Video height in pixels (default: 720) |
| aspect_ratio | string | No | Video aspect ratio: '16:9', '9:16', '1:1' (default: '16:9') |
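Hard-coding keys, as the short snippets here do for brevity, is easy to leak into version control. One possible arrangement is to read the required values from environment variables and validate the optional ones before building the transport. The sketch below is only an illustration: `AvatarConfig`, `config_from_env`, and the `AGENTHUMAN_*` variable names are hypothetical and not part of the plugin; only the field names and defaults come from the parameter table.

```python
import os
from dataclasses import dataclass

# Allowed values as listed in the parameter table.
ALLOWED_PLATFORMS = {"daily", "livekit"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


@dataclass(frozen=True)
class AvatarConfig:
    """Hypothetical helper mirroring the documented transport parameters."""
    api_key: str
    avatar_id: str
    room_url: str
    room_token: str
    room_platform: str = "daily"
    display_name: str = "AI Avatar (AH)"
    video_width: int = 1280
    video_height: int = 720
    aspect_ratio: str = "16:9"

    def __post_init__(self):
        # Required parameters must be non-empty strings.
        for name in ("api_key", "avatar_id", "room_url", "room_token"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")
        if self.room_platform not in ALLOWED_PLATFORMS:
            raise ValueError(f"unsupported room_platform: {self.room_platform!r}")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"unsupported aspect_ratio: {self.aspect_ratio!r}")
        if self.video_width <= 0 or self.video_height <= 0:
            raise ValueError("video dimensions must be positive")


def config_from_env(env=None):
    """Build a config from environment variables (variable names are assumptions)."""
    env = os.environ if env is None else env
    return AvatarConfig(
        api_key=env.get("AGENTHUMAN_API_KEY", ""),
        avatar_id=env.get("AGENTHUMAN_AVATAR_ID", ""),
        room_url=env.get("AGENTHUMAN_ROOM_URL", ""),
        room_token=env.get("AGENTHUMAN_ROOM_TOKEN", ""),
        room_platform=env.get("AGENTHUMAN_ROOM_PLATFORM", "daily"),
    )
```

If the constructor accepts exactly the parameters listed in the table, the result can be passed along as `AgentHumanTransport(**dataclasses.asdict(config_from_env()))`.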

Complete Example

import asyncio
from pipecat.pipeline import Pipeline
from pipecat.processors.audio import AudioProcessor
from pipecat.processors.speech import SpeechProcessor
from pipecat_agenthuman import AgentHumanTransport

async def main():
    # Configure avatar transport
    avatar_transport = AgentHumanTransport(
        api_key="ah_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
        avatar_id="avat_01H3Z8G9YR3K2N5M6P7Q8W4T",
        room_url="https://your-domain.daily.co/your-room",
        room_token="your-daily-token",
        room_platform="daily",
        display_name="AI Assistant",
        aspect_ratio="16:9"
    )
    
    # Build voice AI pipeline
    pipeline = Pipeline([
        AudioProcessor(),           # Process incoming audio
        SpeechProcessor(),          # Speech recognition/synthesis
        avatar_transport            # Send to avatar
    ])
    
    # Run the pipeline
    try:
        await pipeline.run()
    except KeyboardInterrupt:
        print("Shutting down...")
    finally:
        await avatar_transport.close()

if __name__ == "__main__":
    asyncio.run(main())

Pipeline Integration

Audio Flow

Microphone → AudioProcessor → SpeechProcessor → AgentHumanTransport → Avatar

Once audio reaches the AgentHumanTransport, the following happens automatically:
  1. The transport converts the audio to the required format (16-bit mono PCM)
  2. The transport sends the audio over WebSocket to the avatar server
  3. The avatar generates video and streams it to your room
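To make "16-bit mono PCM" concrete, here is a minimal standard-library sketch of that kind of conversion. It is not the plugin's code, which is not shown in this document. It assumes the input is stereo float samples in the range [-1.0, 1.0] and that the output is little-endian; both are assumptions made for illustration, and `float_stereo_to_pcm16_mono` is a hypothetical name.

```python
import struct


def float_stereo_to_pcm16_mono(frames):
    """Convert (left, right) float frames in [-1.0, 1.0] to 16-bit mono PCM bytes.

    Little-endian byte order is an assumption made for this illustration.
    """
    out = bytearray()
    for left, right in frames:
        mono = (left + right) / 2.0          # downmix stereo to mono
        mono = max(-1.0, min(1.0, mono))     # clip out-of-range samples
        out += struct.pack("<h", int(round(mono * 32767)))  # scale to int16
    return bytes(out)
```

Each input frame becomes exactly two bytes of output, so one second of audio at a sample rate of N Hz yields 2 × N bytes.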

Event Handling

from pipecat_agenthuman import AgentHumanTransport

transport = AgentHumanTransport(...)

# Listen for events
@transport.on("connected")
async def on_connected():
    print("Avatar connected and ready")

@transport.on("audio_sent")
async def on_audio_sent(samples):
    print(f"Sent {samples} audio samples")

@transport.on("error")
async def on_error(error):
    print(f"Error: {error}")
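The `@transport.on(...)` pattern is a decorator-based event registry. The stand-in below is a hypothetical `MiniEmitter`, not the plugin's implementation. It may help when unit-testing handlers without a live avatar session: register the same async handlers on it and fire events by hand.

```python
import asyncio
from collections import defaultdict


class MiniEmitter:
    """Hypothetical stand-in that mimics an `on(event)` decorator API."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, event):
        # Return a decorator that records the handler and leaves it unchanged.
        def register(handler):
            self._handlers[event].append(handler)
            return handler
        return register

    async def emit(self, event, *args):
        # Await each registered handler in registration order.
        for handler in list(self._handlers.get(event, [])):
            await handler(*args)
```

In a test, `asyncio.run(emitter.emit("audio_sent", 480))` would invoke every handler registered for `"audio_sent"` with `480` as its argument.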

Room Setup

Using Daily.co

from daily import Daily
from pipecat_agenthuman import AgentHumanTransport

# Create Daily room
daily = Daily(api_key="your-daily-api-key")
room = daily.rooms.create()

# Use room credentials with transport
transport = AgentHumanTransport(
    api_key="ah_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    avatar_id="avat_01H3Z8G9YR3K2N5M6P7Q8W4T",
    room_url=room['url'],
    room_token=room['token'],
    room_platform="daily"
)

Using LiveKit

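from livekit import api
from pipecat_agenthuman import AgentHumanTransport

# Create LiveKit room
livekit_api = api.LiveKitAPI(
    url="wss://your-livekit-server.com",
    api_key="your-livekit-key",
    api_secret="your-livekit-secret"
)

room = await livekit_api.room.create_room(
    api.CreateRoomRequest(name="my-room")
)

# Mint a join token for the avatar; "avatar" is a placeholder identity
token = (
    api.AccessToken("your-livekit-key", "your-livekit-secret")
    .with_identity("avatar")
    .with_grants(api.VideoGrants(room_join=True, room=room.name))
    .to_jwt()
)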

# Use room credentials with transport
transport = AgentHumanTransport(
    api_key="ah_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    avatar_id="avat_01H3Z8G9YR3K2N5M6P7Q8W4T",
    room_url=f"wss://your-livekit-server.com/{room.name}",
    room_token=token,
    room_platform="livekit"
)

Best Practices

1. Session Management

# Proper cleanup
try:
    await pipeline.run()
finally:
    await transport.close()  # Closes WebSocket and ends session
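The same cleanup guarantee can be packaged as an async context manager, so the `finally` block does not have to be repeated at every call site. This sketch assumes only that the transport exposes an awaitable `close()`, as used above; `closing_transport` is a hypothetical helper name.

```python
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def closing_transport(transport):
    """Yield the transport and always await its close() on exit."""
    try:
        yield transport
    finally:
        # Runs on normal exit, on exceptions, and on task cancellation.
        await transport.close()
```

Usage would then read `async with closing_transport(transport): await pipeline.run()`.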

2. Error Handling

import logging

logger = logging.getLogger(__name__)

@transport.on("error")
async def on_error(error):
    if "Session not found" in str(error):
        # The session is gone, so recreate it
        await transport.reconnect()
    else:
        # Log any other error
        logger.error(f"Avatar error: {error}")
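A single reconnect attempt can itself fail if the outage is still in progress. One common approach is to retry with exponential backoff and jitter. The sketch below is a generic pattern, not something the plugin is documented to provide. `reconnect_with_backoff` and its parameters are hypothetical, and it accepts any zero-argument coroutine function, such as `transport.reconnect`.

```python
import asyncio
import random


async def reconnect_with_backoff(reconnect, max_attempts=5, base_delay=0.5,
                                 max_delay=8.0, sleep=asyncio.sleep):
    """Retry `reconnect()` with exponential backoff; return the attempt that succeeded.

    Re-raises the last exception once `max_attempts` is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            await reconnect()
            return attempt
        except Exception:
            if attempt == max_attempts:
                raise
            # Delay doubles each attempt, capped at max_delay, with jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(delay * random.uniform(0.5, 1.0))
```

Inside the error handler, the direct call could be swapped for `await reconnect_with_backoff(transport.reconnect)`. The `sleep` parameter exists so tests can run without real delays.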

3. Video Quality

# Configure for higher video quality
transport = AgentHumanTransport(
    api_key="...",
    avatar_id="...",
    room_url="...",
    room_token="...",
    video_width=1920,      # Higher resolution
    video_height=1080,
    aspect_ratio="16:9"
)
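Since `video_width`, `video_height`, and `aspect_ratio` are set independently, it is easy to pass a combination that does not agree (for example 1280×720 with '9:16'). This document does not say how the transport treats such a mismatch, so a small pre-flight check may be worthwhile. The helper names below are hypothetical.

```python
from fractions import Fraction


def _parse_ratio(aspect_ratio):
    """Parse a 'W:H' string such as '16:9' into an exact fraction."""
    w, h = (int(part) for part in aspect_ratio.split(":"))
    return Fraction(w, h)


def matches_aspect_ratio(width, height, aspect_ratio):
    """Return True if width/height equals the given 'W:H' ratio exactly."""
    return Fraction(width, height) == _parse_ratio(aspect_ratio)


def width_for(height, aspect_ratio):
    """Compute the width, rounded to the nearest pixel, for a height and ratio."""
    return round(height * _parse_ratio(aspect_ratio))
```

For instance, `width_for(1080, "16:9")` gives 1920, which agrees with the configuration shown above.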

Troubleshooting

Transport Not Connecting

Cause: Invalid API key or avatar ID
Solution: Verify credentials in your AgentHuman dashboard

# Test connection
try:
    await transport.connect()
    print("Connected successfully")
except Exception as e:
    print(f"Connection failed: {e}")
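A connection test that never returns is harder to diagnose than one that fails quickly. As a sketch, the check can be wrapped in a timeout so that a hang (often a network issue) is reported separately from an outright rejection (often credentials). `connect_with_timeout` is a hypothetical helper that takes any zero-argument coroutine function, such as `transport.connect`.

```python
import asyncio


async def connect_with_timeout(connect, timeout=10.0):
    """Run `connect()` with a deadline; return (ok, reason)."""
    try:
        await asyncio.wait_for(connect(), timeout=timeout)
        return True, None
    except asyncio.TimeoutError:
        # No response within the deadline.
        return False, f"timed out after {timeout}s"
    except Exception as exc:
        # The connection was attempted and actively failed.
        return False, f"{type(exc).__name__}: {exc}"
```

Calling `ok, reason = await connect_with_timeout(transport.connect)` then yields a reason string that can be logged or shown to the user.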

No Video in Room

Cause: Room credentials incorrect or avatar hasn’t joined yet
Solution:
  1. Verify room URL and token
  2. Wait 2-3 seconds after connection
  3. Check room participant list
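Rather than sleeping for a fixed 2-3 seconds, steps 2 and 3 can be combined by polling the participant list until the avatar's display name appears or a deadline passes. In this sketch, `list_participants` is a hypothetical coroutine function that you supply. It should return the current display names from your room platform's SDK and is not an AgentHuman API.

```python
import asyncio
import time


async def wait_for_participant(list_participants, display_name,
                               timeout=5.0, interval=0.25):
    """Poll until `display_name` is in the room or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        names = await list_participants()
        if display_name in names:
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)
```

With the default display name from the parameter table, the call would be `await wait_for_participant(list_participants, "AI Avatar (AH)")`.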

Audio Quality Issues

Cause: Pipeline audio format mismatch
Solution: Ensure audio is properly formatted before reaching transport

# Add audio format processor before transport
pipeline = Pipeline([
    AudioFormatProcessor(sample_rate=48000, channels=1),
    avatar_transport
])

Support

Need help with Pipecat integration?