Any agent becomes a manager by listing other agents in its subagents field. At runtime the manager breaks the task into pieces, hands each piece to a subagent, and writes the final answer once it has gathered enough. You launch a manager exactly like any other agent: one session, one answer. The fan-out happens behind it.

Each subagent is a full agent with its own environment, model, skills, and instructions. It runs as its own session, isolated from its siblings. The manager runs subagents in parallel (up to a concurrency cap set by its model profile), so work that a single agent would do sequentially happens at once.

Multi-agent setups pay off most on parallelizable tasks: researching a question across many sources at once, pairing a fast text-mode searcher with a visual subagent for pages that need real clicks, or having one subagent verify what another found.

Define a manager

List subagents inline (or by catalog name). A manager that only delegates can omit environments entirely; give it one only if it should also act on a surface itself, as the manager below does. Here a research manager delegates to a fast text-mode searcher and a visual verifier:
curl -X POST https://agp.eu.hcompany.ai/api/v2/agents \
  -H "Authorization: Bearer $HAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "research-orchestrator",
    "description": "Researches a question across sources and synthesizes a sourced answer.",
    "environments": ["h/browser"],
    "instructions": "Split the question into independent sub-questions, delegate each, then reconcile the findings into one sourced answer.",
    "subagents": [
      {
        "name": "fast-searcher",
        "description": "Searches the web quickly in text mode. Use for broad lookups and gathering candidate sources.",
        "model": "holo3-1-35b-a3b",
        "environments": [
          {"id": "search-browser", "kind": "web", "mode": "text", "width": 1280, "height": 720, "start_url": "https://www.bing.com"}
        ]
      },
      {
        "name": "visual-verifier",
        "description": "Visually inspects a specific page to confirm a fact or read content behind interactions.",
        "environments": ["h/browser"]
      }
    ]
  }'
Then run a session against research-orchestrator like any other agent.
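
If you build the request body in code rather than by hand, the rule about omitting environments is easy to encode. The following is a minimal Python sketch, not an official client: the helper manager_payload is hypothetical, and only the field names (name, description, instructions, subagents, environments) come from the request above.

```python
import json


def manager_payload(name, description, instructions, subagents, environments=None):
    """Build the JSON body for a manager agent.

    Field names mirror the curl request above; this helper itself is
    hypothetical and not part of any SDK.
    """
    body = {
        "name": name,
        "description": description,
        "instructions": instructions,
        "subagents": subagents,
    }
    if environments:
        # Only managers that also act on a surface themselves need this key.
        body["environments"] = environments
    return body


# A delegate-only manager: no "environments" key is emitted.
payload = manager_payload(
    name="research-orchestrator",
    description="Researches a question across sources and synthesizes a sourced answer.",
    instructions="Split the question into independent sub-questions, delegate each, "
                 "then reconcile the findings into one sourced answer.",
    subagents=[
        {
            "name": "visual-verifier",
            "description": "Visually inspects a specific page to confirm a fact "
                           "or read content behind interactions.",
            "environments": ["h/browser"],
        }
    ],
)
print(json.dumps(payload, indent=2))
```

Passing environments=["h/browser"] reproduces the shape of the manager in the curl example, which also acts on a browser itself.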

How a run unfolds

  1. The manager reads the task and decides how to split it.
  2. It spawns subagents, each as a child session with its own task. They run in parallel.
  3. It waits for them to finish, then reads their answers.
  4. It may spawn follow-ups to fill gaps or verify findings.
  5. It synthesizes a single final answer and returns it. That answer is what your session receives.
The manager picks which subagent to spawn from each subagent’s description, so write descriptions as capability statements (“Use for…”), the same way you would for a skill.
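
The five steps above can be pictured as a bounded parallel fan-out followed by a synthesis step. The sketch below is a toy local model in Python using asyncio, not the platform's implementation: run_subagent, run_manager, and the cap parameter are hypothetical stand-ins for child sessions, the manager, and the concurrency cap.

```python
import asyncio


async def run_subagent(name, task):
    """Hypothetical stand-in for a child session; returns only a final answer."""
    await asyncio.sleep(0.01)  # pretend to do some work
    return f"{name}: answer to {task!r}"


async def run_manager(question, subtasks, cap=2):
    """Fan subtasks out in parallel (at most `cap` at once), then synthesize."""
    slots = asyncio.Semaphore(cap)  # models the concurrency cap

    async def spawn(name, task):
        async with slots:
            return await run_subagent(name, task)

    # Wait for every child to finish; gather keeps the answers in spawn order.
    answers = await asyncio.gather(*(spawn(n, t) for n, t in subtasks))
    # "Synthesis" here is plain concatenation; a real manager would reconcile.
    return f"{question}\n" + "\n".join(answers)
```

Calling asyncio.run(run_manager("Q", [("fast-searcher", "a"), ("visual-verifier", "b")])) returns one string, which mirrors the point that the caller only ever receives the manager's single final answer.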

What a subagent sees

A subagent works in isolation and is instructed to finish its task on its own:
  • It has no access to the end user. It cannot ask questions or send messages to you; only the manager surfaces anything. Give it a self-contained task.
  • The manager receives only the subagent’s final answer, not its scrollback or intermediate observations. A good subagent answer carries its own data, source URLs, and caveats.
  • It can delegate further. A subagent that lists its own subagents becomes a manager for them, nested up to 16 levels deep (a deeper chain, or a cycle, is rejected with 422 when the agent is resolved). Stay well below that limit: deep nesting multiplies sessions and cost, and the manager you launch owns the top-level decomposition.
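
A local pre-flight check can catch an over-deep or cyclic tree before the service rejects it. The Python sketch below rests on stated assumptions: catalog is a hypothetical local mapping from agent name to the names of its subagents, and depth is counted with the launched manager at level 0 and its direct subagents at level 1. The service may count levels differently, so treat the boundary as approximate.

```python
MAX_DEPTH = 16  # nesting limit quoted in the text above


def nesting_depth(catalog, root):
    """Return the deepest subagent level under `root` (root itself is level 0).

    `catalog` is a hypothetical mapping: agent name -> list of subagent names.
    Raises ValueError on a cycle or when a chain exceeds MAX_DEPTH.
    """
    def visit(name, depth, path):
        if name in path:
            # The agent already appears among its own ancestors.
            cycle = " -> ".join(path + (name,))
            raise ValueError(f"cycle: {cycle}")
        if depth > MAX_DEPTH:
            raise ValueError(f"nesting deeper than {MAX_DEPTH} levels at {name!r}")
        children = catalog.get(name, [])
        return max(
            (visit(child, depth + 1, path + (name,)) for child in children),
            default=depth,  # a leaf contributes its own depth
        )

    return visit(root, 0, ())
```

For the research-orchestrator example, with two leaf subagents, this returns 1. A mapping such as {"a": ["b"], "b": ["a"]} raises ValueError for the cycle.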

Observe and control the tree

Each subagent is a real session, so the whole tree is inspectable and steerable:
  • The manager’s status lists its children in subagent_session_ids. Retrieve or watch any of them like a normal session.
  • Filter children by their parent with GET /sessions?parent_session_id=..., or tag a whole run with group_id and list it with GET /sessions?group_id=....
  • If you force an answer or cancel the manager, the signal cascades down the tree: in-flight subagents get a short grace window (about 30s) to wrap up, partial results fold into the manager’s answer, and anything still unfinished is cancelled.
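
The two list filters above can be wrapped in a small helper. This is a Python sketch using only the standard library, under assumptions: the /sessions path is taken to live under the same base URL as the /agents endpoint shown earlier, the same HAI_API_KEY bearer token is assumed to apply, and the response is assumed to be JSON of unspecified shape. The function names are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request

# Assumption: sessions share the base path of the /agents endpoint shown earlier.
BASE_URL = "https://agp.eu.hcompany.ai/api/v2"


def sessions_url(parent_session_id=None, group_id=None):
    """Build the GET /sessions URL with the filters named in the text."""
    params = {}
    if parent_session_id is not None:
        params["parent_session_id"] = parent_session_id
    if group_id is not None:
        params["group_id"] = group_id
    query = urllib.parse.urlencode(params)
    return f"{BASE_URL}/sessions" + (f"?{query}" if query else "")


def list_sessions(**filters):
    """Fetch matching sessions; the response shape is not specified here."""
    request = urllib.request.Request(
        sessions_url(**filters),
        headers={"Authorization": f"Bearer {os.environ['HAI_API_KEY']}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, list_sessions(parent_session_id=manager_id) would request the direct children of one manager, while list_sessions(group_id=run_tag) would request every session tagged with that group; manager_id and run_tag are placeholders for values you already hold.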