Agent Handoffs: When and How to Transfer Control

Table of content

One agent hits its limit. Another takes over. That’s a handoff.

The pattern is simple: an LLM decides it can’t complete a task and transfers control to a specialized agent. Customer asks about a refund? Triage agent hands off to refund agent. Technical question? Route to tech support agent.

OpenAI’s Swarm framework popularized this pattern. Anthropic’s research confirms it works. But most teams overcomplicate it.

The Core Mechanism

A handoff is just a function that returns a new agent:

from swarm import Swarm, Agent

def transfer_to_refund_agent():
    """Transfer when customer asks about refunds."""
    return refund_agent

triage_agent = Agent(
    name="Triage",
    instructions="Route customers to the right agent.",
    functions=[transfer_to_refund_agent]
)

refund_agent = Agent(
    name="Refund Agent",
    instructions="Handle refund requests. Be helpful."
)

The LLM sees transfer_to_refund_agent as a tool. It calls the function. The framework switches current_agent and continues the conversation.

That’s it. No message buses. No orchestration layer. Just a function call.

When to Use Handoffs

Use handoffs when:

Don’t use handoffs when:

Anthropic’s recommendation: “Start with simple prompts, optimize them with comprehensive evaluation, and add multi-agent systems only when simpler solutions fall short.”

Handoff Patterns

Pattern 1: Triage Router

The entry agent classifies and routes:

triage_agent = Agent(
    name="Triage",
    instructions="""Classify the customer request:
    - Billing questions → transfer_to_billing
    - Technical issues → transfer_to_tech
    - General questions → answer directly""",
    functions=[transfer_to_billing, transfer_to_tech]
)

This works for customer service, help desks, anything with clear categories.

Pattern 2: Escalation Chain

Agent escalates when it can’t handle something:

from pydantic import BaseModel

class EscalationData(BaseModel):
    reason: str

async def on_escalate(ctx, data: EscalationData):
    log(f"Escalated: {data.reason}")

handoff_obj = handoff(
    agent=human_review_agent,
    on_handoff=on_escalate,
    input_type=EscalationData
)

The LLM provides a reason when it escalates. You get visibility into why handoffs happen.

Pattern 3: Specialist Network

Multiple specialists, any can hand off to any other:

researcher = Agent(
    name="Researcher",
    handoffs=[writer, analyst]
)

writer = Agent(
    name="Writer",
    handoffs=[researcher, editor]
)

editor = Agent(
    name="Editor",
    handoffs=[writer]
)

Research finds info, hands to writer. Writer drafts, hands to editor. Editor may loop back. Use for complex workflows where sequence isn’t fixed.

Input Filters

By default, the new agent sees the entire conversation history. Sometimes you want to filter it:

from agents.extensions import handoff_filters

handoff_obj = handoff(
    agent=specialist_agent,
    input_filter=handoff_filters.remove_all_tools
)

Common filters:

The Real Tradeoff

Handoffs trade simplicity for specialization.

Single agent approach:

Multi-agent with handoffs:

Most real systems land somewhere in between. A triage agent that routes to 2-3 specialists. Not a swarm of dozens.

What You Can Steal

The minimum viable handoff:

def transfer_to_specialist():
    return specialist_agent

main_agent = Agent(
    instructions="Transfer if you can't handle it.",
    functions=[transfer_to_specialist]
)

Add routing hints:

specialist = Agent(
    name="Refund Specialist",
    handoff_description="Use when customer mentions refund, return, or money back"
)

Log handoffs for debugging:

async def on_handoff(ctx, input_data):
    print(f"Handoff to {ctx.agent.name}: {input_data}")

Start without handoffs. Add them when a single agent’s instructions become unmanageable. You’ll know when.

Next: Subagent Patterns