AgentHumanVideoService
Constructor Parameters
| Parameter | Type | Description |
|---|---|---|
| api_key | str | Required. Your AgentHuman API key |
| session_request | NewSessionRequest | Avatar and aspect ratio configuration. Defaults to a built-in avatar with aspect_ratio="auto" |
| transport | BaseTransport | Required. Your Pipecat output transport. Must have video_out_enabled=True |
NewSessionRequest
Parameters
| Parameter | Type | Description |
|---|---|---|
| avatar | str | Required. Avatar ID from your AgentHuman dashboard |
| aspect_ratio | str | Video aspect ratio: "4:3", "3:4", "1:1", or "auto". Defaults to "auto" |
Aspect Ratios
| Value | Typical resolution | Best for |
|---|---|---|
| "4:3" | 1200×900 | Landscape / desktop |
| "3:4" | 900×1200 | Portrait / mobile |
| "1:1" | 1200×1200 | Square layouts |
| "auto" | Derived from transport | Automatic (recommended) |
When aspect_ratio is "auto" (the default), the service reads video_out_width and video_out_height from your transport and selects the closest supported ratio. A warning is logged if the dimensions don't closely match any standard ratio.
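
The "auto" selection described above can be pictured with a small, self-contained sketch. This is not the service's actual implementation: the function name `closest_aspect_ratio`, the log-space distance, and the `tolerance` threshold are all assumptions made for illustration. Only the three supported ratios and the warning behavior come from this page.

```python
import logging
import math

logger = logging.getLogger(__name__)

# Supported ratios from the Aspect Ratios table, expressed as width / height.
SUPPORTED_RATIOS = {"4:3": 4 / 3, "3:4": 3 / 4, "1:1": 1.0}


def closest_aspect_ratio(width: int, height: int, tolerance: float = 0.05) -> str:
    """Return the supported ratio label closest to width:height.

    `tolerance` is an assumed threshold for the warning, not a documented value.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    actual = width / height

    def distance(label: str) -> float:
        # Compare in log space so landscape and portrait are treated symmetrically.
        return abs(math.log(actual / SUPPORTED_RATIOS[label]))

    label = min(SUPPORTED_RATIOS, key=distance)
    if distance(label) > tolerance:
        logger.warning(
            "Transport size %dx%d does not closely match a standard ratio; using %s",
            width, height, label,
        )
    return label
```

Under these assumptions, a 1024×768 transport maps to "4:3" without a warning, while a 1280×720 transport also maps to "4:3" but logs a warning because 16:9 is not a close match.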
Transport Requirements
AgentHumanVideoService requires a transport with video output enabled. Passing a transport without video_out_enabled=True raises a ValueError immediately.
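
The fail-fast check can be illustrated with stand-in types. `StubTransportParams` and `require_video_output` below are hypothetical names, not part of Pipecat or AgentHuman, and where the real service reads the flag from is not specified here. The sketch only shows the documented behavior: a missing video_out_enabled=True raises a ValueError before any session is created.

```python
from dataclasses import dataclass


@dataclass
class StubTransportParams:
    """Hypothetical stand-in holding only the transport fields named on this page."""
    video_out_enabled: bool = False
    video_out_width: int = 1280   # placeholder value
    video_out_height: int = 720   # placeholder value


def require_video_output(params: StubTransportParams) -> None:
    """Raise immediately if the transport cannot carry the avatar's video."""
    if not params.video_out_enabled:
        raise ValueError(
            "AgentHumanVideoService requires a transport with video_out_enabled=True"
        )


# Passes silently when video output is enabled.
require_video_output(StubTransportParams(video_out_enabled=True))
```

Validating at construction time means a misconfigured pipeline fails before any avatar session is requested.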
Session Lifecycle
| Phase | What happens |
|---|---|
| Setup | POST /sessions API call — AgentHuman avatar session created, internal LiveKit credentials returned |
| Start | LiveKit room connection established; audio chunk size calculated from transport dimensions |
| Running | TTS frames resampled to 16 kHz mono → sent to avatar via DataStreamAudioOutput; avatar video/audio frames forwarded downstream |
| Stop / Cancel | LiveKit room disconnected; POST /sessions/{id}/end called to terminate the session |
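
The ordering of the phases above can be sketched with a context manager whose cleanup always runs. Everything here is a stand-in: `RecordingBackend`, `avatar_session`, and the placeholder session ID are hypothetical, and no network calls are made. The sketch only illustrates that Stop / Cancel disconnects the room and then ends the session, whichever way the running phase exits.

```python
import asyncio
from contextlib import asynccontextmanager


class RecordingBackend:
    """Hypothetical stand-in that records calls instead of doing network I/O."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def create_session(self) -> str:
        self.calls.append("POST /sessions")
        return "session-123"  # placeholder ID, not a real session

    async def connect_room(self) -> None:
        self.calls.append("livekit connect")

    async def disconnect_room(self) -> None:
        self.calls.append("livekit disconnect")

    async def end_session(self, session_id: str) -> None:
        self.calls.append(f"POST /sessions/{session_id}/end")


@asynccontextmanager
async def avatar_session(backend: RecordingBackend):
    session_id = await backend.create_session()  # Setup
    try:
        await backend.connect_room()              # Start
        yield session_id                          # Running
    finally:
        # Stop / Cancel: the finally block runs on normal exit and on errors.
        await backend.disconnect_room()
        await backend.end_session(session_id)


async def demo() -> list[str]:
    backend = RecordingBackend()
    async with avatar_session(backend):
        pass  # frames would flow here while the pipeline runs
    return backend.calls


calls = asyncio.run(demo())
```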
Audio Processing
The service automatically resamples all incoming TTS audio to 16 kHz mono PCM before forwarding to the avatar. No manual audio format configuration is needed regardless of which TTS service you use.
Audio is buffered and sent in chunks. A new chunk is dispatched when:
- The buffer reaches the target chunk size and silence is detected, or
- A TTSStoppedFrame is received (end of utterance)
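
The two dispatch conditions can be sketched as a small buffer class. This is a minimal illustration under stated assumptions, not the service's code: `ChunkBuffer`, the amplitude-based silence test, and the `silence_threshold` value are hypothetical, and silence is checked only on the most recently pushed frame. Input is assumed to be 16-bit mono PCM already resampled to 16 kHz.

```python
import array
from typing import Optional


class ChunkBuffer:
    """Hypothetical buffer that mirrors the two dispatch rules described above."""

    def __init__(self, target_bytes: int, silence_threshold: int = 500) -> None:
        self.target_bytes = target_bytes
        self.silence_threshold = silence_threshold  # assumed amplitude cutoff
        self._buf = bytearray()

    def _is_silent(self, pcm: bytes) -> bool:
        # Interpret the bytes as signed 16-bit samples in native byte order.
        samples = array.array("h")
        samples.frombytes(pcm)
        return all(abs(s) <= self.silence_threshold for s in samples)

    def _drain(self) -> bytes:
        chunk = bytes(self._buf)
        self._buf.clear()
        return chunk

    def push(self, pcm: bytes) -> Optional[bytes]:
        """Buffer audio; dispatch only when the size target is met AND the frame is silent."""
        self._buf.extend(pcm)
        if len(self._buf) >= self.target_bytes and self._is_silent(pcm):
            return self._drain()
        return None

    def end_of_utterance(self) -> Optional[bytes]:
        """Flush whatever remains, as happens when a TTSStoppedFrame arrives."""
        return self._drain() if self._buf else None
```

Waiting for silence before cutting a chunk avoids splitting audio in the middle of a word, while the end-of-utterance flush ensures no trailing audio is left in the buffer.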