Overview

AgentHumanVideoService is a Pipecat AIService that slots into your pipeline right after your TTS service. It receives TTS audio frames, sends them to the AgentHuman avatar via a LiveKit data stream, and injects the avatar’s video and audio frames back into the pipeline for your output transport to publish. Perfect for:
  • Voice AI applications with a visual presenter
  • Conversational agents with animated avatars
  • Real-time speech-driven video generation
  • Interactive AI assistants
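The placement described above can be sketched with tiny stand-in processors. The classes below are illustrative only, not the real Pipecat or AgentHumanVideoService API; they just show that the avatar service sits between TTS and the output transport:

```python
# Minimal stand-in sketch of the frame flow described above.
# These classes are illustrative; they are NOT the real Pipecat API.

class Processor:
    """A pipeline stage that stamps its name onto a frame and passes it on."""
    def __init__(self, name):
        self.name = name

    def process(self, frame):
        frame.append(self.name)
        return frame

def run_pipeline(stages, frame):
    for stage in stages:
        frame = stage.process(frame)
    return frame

# The avatar service sits immediately after TTS, so it sees the
# synthesized audio before the output transport does.
pipeline = [
    Processor("transport_input"),
    Processor("stt"),
    Processor("llm"),
    Processor("tts"),
    Processor("agenthuman_video"),   # sends TTS audio to the avatar
    Processor("transport_output"),   # publishes avatar video/audio
]

trace = run_pipeline(pipeline, [])
print(trace)
# ['transport_input', 'stt', 'llm', 'tts', 'agenthuman_video', 'transport_output']
```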

How It Works

User mic → Transport input → STT → LLM → TTS → AgentHumanVideoService → Transport output
                                                       ↑                        ↑
                                             Sends audio to avatar    Receives avatar video
AgentHumanVideoService handles everything internally:
  1. Creates an AgentHuman session via the REST API on startup
  2. Connects to the returned LiveKit room
  3. Resamples TTS audio to 16 kHz mono and streams it to the avatar
  4. Forwards the avatar’s video and audio frames downstream to your transport
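Step 3 above can be illustrated with a small pure-Python sketch: downmix interleaved stereo int16 samples to mono, then resample to 16 kHz. This naive nearest-neighbor resampler is for illustration only; it is not the resampler the service actually uses.

```python
# Illustrative sketch of step 3: downmix interleaved stereo audio to
# mono and resample it to 16 kHz. Nearest-neighbor resampling is used
# here for brevity; a production resampler would interpolate/filter.

def downmix_to_mono(samples, channels):
    """Average interleaved channels into a single mono channel."""
    if channels == 1:
        return list(samples)
    return [
        sum(samples[i:i + channels]) // channels
        for i in range(0, len(samples), channels)
    ]

def resample_nearest(samples, src_rate, dst_rate):
    """Nearest-neighbor resampling from src_rate to dst_rate."""
    out_len = len(samples) * dst_rate // src_rate
    return [samples[i * src_rate // dst_rate] for i in range(out_len)]

# Example: 24 kHz stereo TTS audio -> 16 kHz mono avatar input.
stereo_24k = [100, 200, 300, 400, 500, 600]   # 3 interleaved stereo frames
mono_24k = downmix_to_mono(stereo_24k, channels=2)
mono_16k = resample_nearest(mono_24k, 24000, 16000)
print(mono_24k)  # [150, 350, 550]
print(mono_16k)  # [150, 350]
```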

Installation

pip install "pipecat-ai[agenthuman]"
The agenthuman extra installs the LiveKit SDK (livekit, livekit-agents), which is required for the avatar data stream transport. Quoting the extra keeps shells like zsh from treating the square brackets as a glob pattern.

Environment Variables

Variable             Description
AGENTHUMAN_API_KEY   Your AgentHuman API key
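A minimal sketch of reading the key at startup and failing fast if it is unset. The AGENTHUMAN_API_KEY name comes from the table above; the helper function itself is illustrative, not part of the service's API:

```python
import os

def get_agenthuman_api_key(env=os.environ):
    """Fetch the AgentHuman API key, failing fast if it is missing."""
    key = env.get("AGENTHUMAN_API_KEY")
    if not key:
        raise RuntimeError(
            "AGENTHUMAN_API_KEY is not set; export it before starting the bot."
        )
    return key

# Usage (illustrative value, not a real key):
key = get_agenthuman_api_key({"AGENTHUMAN_API_KEY": "ah_test_123"})
print(key)  # ah_test_123
```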

Next Steps

Quick Start

Add AgentHumanVideoService to your pipeline in minutes

Configuration

All parameters and transport requirements

Examples

Complete working bot examples

API Reference

Underlying session API