Bindu - A2A Protocol Compliant AI Agent Framework

Stop the joke agent (Ctrl+C in its terminal). We’ll start both it and four more using a helper script:

./examples/gateway_test_fleet/start_fleet.sh

Expected output:

  [joke_agent]      started, pid=64945
  [math_agent]      started, pid=64958
  [poet_agent]      started, pid=64969
  [research_agent]  started, pid=64980
  [faq_agent]       started, pid=64993

Five agents now, each on its own port:

Agent	Port	Does
`joke_agent`	3773	Tells jokes
`math_agent`	3775	Solves math problems step-by-step
`poet_agent`	3776	Writes short poems
`research_agent`	3777	Web search + summarize a factual question
`faq_agent`	3778	Answers from a canned FAQ

Each is under ~100 lines of Python. Open any one - say joke_agent.py - and you’ll see a small configuration that wires a language model (openai/gpt-4o-mini, dispatched through OpenRouter) to a few lines of instructions (“tell jokes, refuse other requests”). Narrow scope on purpose so mistakes are visible.

Gateway is already running from the previous chapter; don’t restart it.

A three-agent question

Paste this into your curl terminal. It asks something that genuinely needs three agents to answer:

curl -N http://localhost:3774/plan \
  -H "Authorization: Bearer ${GATEWAY_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "First research the current approximate population of Tokyo. Then compute what exactly 0.5% of that population is. Finally write a 4-line poem celebrating that number of people.",
    "agents": [
      {
        "name": "research", "endpoint": "http://localhost:3777",
        "auth": { "type": "none" },
        "skills": [{ "id": "web_research", "description": "Web search and summarize a factual question" }]
      },
      {
        "name": "math", "endpoint": "http://localhost:3775",
        "auth": { "type": "none" },
        "skills": [{ "id": "solve", "description": "Solve math problems step-by-step" }]
      },
      {
        "name": "poet", "endpoint": "http://localhost:3776",
        "auth": { "type": "none" },
        "skills": [{ "id": "write_poem", "description": "Write a short poem" }]
      }
    ]
  }'

This takes around 15 seconds and produces three task.started events, in order - research first, then math, then poet. Abbreviated output from a real run:

task.started  → research called with "What is the current population of Tokyo?"
task.artifact → "Tokyo's metropolitan area has approximately 36.95 million people..."
task.finished → completed

task.started  → math called with "Compute 0.5% of 36,950,000"
task.artifact → "0.005 × 36,950,000 = 184,750"
task.finished → completed

task.started  → poet called with "Write a 4-line poem about 184,750 people"
task.artifact → "In Tokyo's heart, where dreams align, / 184,750 souls brightly shine, / ..."
task.finished → completed

text.delta    → "Step 1 - Population: 36.95 million..."
...
final
done

The gateway chose the order, extracted the right number from each reply, and passed it to the next agent - all without you writing a single line of glue code. That’s the whole point.

How it chose

The planner saw three tools available (one per agent-skill combination):

Tool name	Description
`call_research_web_research`	Web search and summarize a factual question
`call_math_solve`	Solve math problems step-by-step
`call_poet_write_poem`	Write a short poem

Where do those tool names come from? The gateway builds them automatically from the name and skills[].id fields in your request: call_<agent-name>_<skill-id>.

Then the planner read the question: “First research… Then compute… Finally write a 4-line poem…” The word “First” strongly suggests research is step 1, and the LLM picked call_research_web_research. It waited for the reply, re-read the question with the new context, decided the next step was math, picked call_math_solve, and so on. All of this happens inside one HTTP request. The SSE stream is the gateway narrating what the planner decided.

What if you added a fourth agent it doesn’t need?

Try it. Add the joke agent to the catalog above and re-run:

{
  "name": "joke", "endpoint": "http://localhost:3773",
  "auth": { "type": "none" },
  "skills": [{ "id": "tell_joke", "description": "Tell a joke" }]
}

The SSE output is the same - three task.started events for research, math, poet. The joke tool sat there unused.

The planner only calls what it needs. This matters in production: you can hand the gateway a catalog of 50 agents, and only the 2 or 3 relevant to a given question will actually be invoked.

What is the planner, actually?

Inside the gateway, there’s a single agent configuration file called gateway/agents/planner.md. It’s a markdown file with YAML frontmatter — this is the real shape from the repo:

---
name: planner
description: Planning gateway for multi-agent Bindu collaboration
mode: primary
model: openrouter/anthropic/claude-sonnet-4.6
temperature: 0.3
steps: 10
permission:
  agent_call: ask
---

# System prompt body - the planner's own instructions.

The body is the system prompt. On each /plan request, the gateway does this:

Read the planner's system prompt from the cached agent registry.

The registry is built once at boot by scanning gateway/agents/*.md. There’s no per-request reload — see gateway/src/agent/index.ts.

Add the user's question as a new user message.

Plus any prior turns the client sent in history (the gateway is stateless, so history travels on the request).

Build the tool list from your agents[] catalog.

One tool per agent.skill pair, named call_<agent>_<skill>.

Hand all of that to OpenRouter with streamText().

Claude (or whatever model you picked) drives the loop.

Stream the output back to you as SSE.

Text deltas + tool calls + tool results.

Inside OpenRouter, Claude runs its agentic loop - text → tool call → tool result → more text → another tool call → final text. The gateway’s job is just to execute the tool calls against your real agents and plumb the results back.

Edits to gateway/agents/planner.md require a gateway restart. The agent registry is populated once at boot and cached for the lifetime of the process. If you tweak the system prompt and want it to take effect, Ctrl+C and npm run dev again.

Next up: teach the planner reusable patterns without editing its system prompt. Recipes →

Gateway doesn’t run agents randomly - it decides the order and connects their outputs.

Documentation Index

​A three-agent question

​How it chose

​What if you added a fourth agent it doesn’t need?

​What is the planner, actually?

A three-agent question

How it chose

What if you added a fourth agent it doesn’t need?

What is the planner, actually?