
Why One AI Agent Is Not Enough

March 14, 2026 · 5 min read · 1,023 words
AI · hierarchical agents · AI agent architecture · context dilution · IBM Technology
Image: Martin Keen at a whiteboard explaining hierarchical AI agent architecture with three tiers. Screenshot from YouTube.

Key insights

  • Separation of concerns is borrowed from software engineering, not invented for AI. The pattern works, but it is not a breakthrough.
  • Hierarchical agents trade context dilution for orchestration complexity. Picking the wrong tradeoff can make the system worse.
  • Model flexibility lets teams put expensive models only where they matter, cutting inference costs without sacrificing planning quality.
Source: YouTube
Published March 12, 2026
IBM Technology
Host: Martin Keen

This is an AI-generated summary. The source video includes demos, visuals and context not covered here. Watch the video → · How our articles are made →

In Brief

Martin Keen, Master Inventor at IBM, argues that single AI agents break down on complex tasks because they lose focus, pick the wrong tools, and ignore information buried in the middle of long prompts. His proposed fix is hierarchical agents: layered systems where a high-level agent plans, mid-level agents coordinate, and low-level agents execute narrow tasks. The approach solves real problems but introduces new ones, including orchestration overhead and what Keen calls the "telephone game effect," where instructions degrade as they pass through layers.


The single-agent problem

Before explaining hierarchies, Keen lays out why a single agent struggles with complex, multi-step work. He identifies three failure modes.

The first is context dilution, where the original goal gets buried under intermediate steps. As a task grows, the agent's prompt fills up with history, and the signal of what it was actually trying to do fades into noise.

The second is tool saturation. The more tools an agent has access to, the harder it becomes to pick the right one. More choices mean more chances to call the wrong tool or pass invalid arguments.

The third is the "lost in the middle" phenomenon. Even when the right instruction is in the prompt, large language models tend to underweight content buried in the middle of a long context window. Information at the beginning and end gets more attention than information in the middle.

These three problems share a root cause: a single agent is doing too many jobs at once. It plans, executes, and checks quality, all while juggling an ever-growing context window.


Hierarchy as a fix

Keen's solution borrows directly from separation of concerns, a principle from software engineering where each component handles one specific responsibility. Applied to agents, it means splitting the work across two or three tiers.

The high-level agent handles strategy. It forms the plan, breaks goals into subtasks, and decides which agents should handle what. There is typically just one.

Mid-level agents receive directives from above and coordinate teams of specialists. They further decompose tasks and manage workflows.

Low-level agents are the doers. Each one is specialized for a narrow task, trained on specific data, or given access to particular tools. Keen compares this to the principle of least privilege in IT security: each agent gets only the tools it needs. A security agent gets the vulnerability scanner, not the deployment pipeline.
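The three tiers and the least-privilege tool assignment can be sketched in a few lines. This is an illustrative skeleton, not code from the video: the agent names, tool names, and task types are all hypothetical, and a real system would invoke a language model at each step rather than just routing strings.

```python
# A minimal sketch of the three-tier hierarchy. Each low-level agent
# gets only the tools it needs (principle of least privilege).
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    tools: set[str] = field(default_factory=set)

    def can_use(self, tool: str) -> bool:
        return tool in self.tools


# Low-level doers, each scoped to a narrow toolset.
security_agent = Agent("security", {"vulnerability_scanner"})
deploy_agent = Agent("deploy", {"deployment_pipeline"})


@dataclass
class Coordinator:
    """Mid-level agent: routes each task type to the right specialist."""
    name: str
    specialists: dict[str, Agent]

    def route(self, task_type: str) -> Agent:
        return self.specialists[task_type]


ops = Coordinator("ops", {"scan": security_agent, "deploy": deploy_agent})

# The high-level agent's output: a goal decomposed into typed subtasks.
plan = [("scan", "check image for CVEs"), ("deploy", "roll out v2")]
for task_type, payload in plan:
    agent = ops.route(task_type)
    print(f"{agent.name} handles: {payload}")
```

Note how the security agent holds the scanner but not the deployment pipeline; a routing mistake at the mid level fails loudly instead of letting the wrong agent act.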

This structure addresses the original problems in concrete ways. Instead of dumping the entire conversation history into every prompt, the high-level agent sends "contextual packets," pruned slices of context that contain only what a specific agent needs. If an agent's job is to format a JSON file, it does not need the initial 4,000-word strategy document. That keeps the signal-to-noise ratio high and avoids the lost-in-the-middle problem.
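The pruning behind contextual packets can be shown with a small sketch. The field names and task names below are assumptions for illustration; the point is only that each task declares the context keys it needs and everything else is dropped before handoff.

```python
# A sketch of "contextual packets": instead of forwarding the full
# history, the planner sends each agent only the fields its task needs.
full_context = {
    "strategy_doc": "4,000-word strategy document ...",
    "schema": {"type": "object", "required": ["name"]},
    "raw_record": {"name": "widget", "price": 9.99},
    "conversation_history": ["step 1 ...", "step 2 ..."],
}

# Each low-level task declares which context keys it actually needs.
TASK_NEEDS = {
    "format_json": {"schema", "raw_record"},
    "write_summary": {"strategy_doc"},
}


def contextual_packet(task: str, context: dict) -> dict:
    """Prune the shared context down to the keys this task requires."""
    return {k: v for k, v in context.items() if k in TASK_NEEDS[task]}


packet = contextual_packet("format_json", full_context)
# The JSON formatter never sees the strategy document or the history,
# so its prompt stays short and the relevant signal stays prominent.
```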

Model flexibility is another practical benefit. A single-agent system usually needs the most powerful (and expensive) model available, because some tasks demand it. In a hierarchy, the heavyweight model handles planning at the top, while lighter, cheaper models run the simpler tasks at the bottom. This can significantly reduce inference costs, the computational expense of running an AI model to generate responses.
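The cost argument can be made concrete with a toy routing function. The model names and per-token prices below are made up for illustration, not real pricing: the takeaway is only that one expensive planning call plus many cheap worker calls can undercut running everything on the heavyweight model.

```python
# Tier-based model routing: the planner gets the heavyweight model,
# narrow execution tasks get a cheaper one. Prices are illustrative.
MODELS = {
    "heavy": {"name": "big-planner-model", "cost_per_1k_tokens": 0.0150},
    "light": {"name": "small-worker-model", "cost_per_1k_tokens": 0.0005},
}


def pick_model(tier: str) -> dict:
    # Only the high-level planner pays for the expensive model.
    return MODELS["heavy"] if tier == "high" else MODELS["light"]


def estimate_cost(tier: str, tokens: int) -> float:
    model = pick_model(tier)
    return tokens / 1000 * model["cost_per_1k_tokens"]


# One planning call plus ten worker calls, 2,000 tokens each.
hierarchical = estimate_cost("high", 2000) + 10 * estimate_cost("low", 2000)
# Versus running all eleven calls on the heavyweight model.
single_model = 11 * estimate_cost("high", 2000)
```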


Opposing perspectives

Keen does not present hierarchical agents as a silver bullet. He dedicates the final third of the video to honest limitations, which is worth noting because the framing could easily have been pure advocacy.

Task decomposition is genuinely hard

The entire system depends on the high-level agent's ability to break a complex goal into the right subtasks. If it over-decomposes simple tasks into unnecessary steps, or sequences things in the wrong order, everything downstream inherits the mistake. Current large language models are inconsistent at planning. They miss dependencies, underestimate complexity, and sometimes create more steps than needed.

Orchestration overhead

A single agent just needs a prompt. A hierarchy requires designing state management, defining handoff logic between agents, and building retry loops for failures. If the coordination logic is brittle, the system can fall into a recursive loop where agents pass errors back and forth until they hit their token limit.
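One concrete piece of that overhead is guarding against the recursive-loop failure the video mentions: retries must be bounded and the token spend capped. The sketch below assumes a hypothetical `run_agent` callable standing in for an LLM invocation; nothing here is from a real framework.

```python
# A bounded retry loop with a token budget, so a failed handoff
# cannot ping-pong between agents until the token limit is hit.
def run_with_retries(run_agent, task, max_attempts=3, token_budget=10_000):
    tokens_used = 0
    for attempt in range(1, max_attempts + 1):
        result, tokens = run_agent(task)
        tokens_used += tokens
        if tokens_used > token_budget:
            raise RuntimeError("token budget exhausted")
        if result is not None:  # success: hand the result back upward
            return result, attempt
    raise RuntimeError(f"task failed after {max_attempts} attempts")


# Usage: a fake agent that fails twice, then succeeds on the third try.
calls = {"n": 0}


def flaky_agent(task):
    calls["n"] += 1
    return ("done" if calls["n"] >= 3 else None), 500


result, attempts = run_with_retries(flaky_agent, "format report")
```

Even this toy version forces decisions a single-agent setup never raises: what counts as failure, who pays for retries, and where partial results go.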

The telephone game effect

Keen draws a direct parallel to workplace communication. A manager issues an instruction that passes through two colleagues before reaching the person doing the work. By then, the message has shifted. The same thing happens when task decomposition is slightly off or the wrong bit of context gets pruned along the way. The specialized agent can end up perfectly executing the wrong task.


How to interpret these claims

Keen's presentation is clear and balanced, but several points deserve additional context.

An established pattern, not a new idea

Separation of concerns is decades old in software engineering. Applying it to AI agents is logical, but it is not a conceptual breakthrough. The real challenge, and the part Keen acknowledges, is that the orchestration layer is hard to build well. The pattern is proven; the implementation in AI systems is still maturing.

Self-reported benefits

The video does not cite benchmarks, academic papers, or production case studies. The claimed benefits (better focus, lower costs, parallel execution) are plausible and well-reasoned, but they remain architectural arguments rather than measured outcomes. Stronger evidence would include side-by-side comparisons of single-agent versus hierarchical systems on the same tasks.

The tradeoff equation

Every benefit maps to a cost. Contextual packets reduce dilution but require someone to design the pruning logic correctly. Tool specialization reduces selection errors but means more agents to maintain. Model flexibility saves money but adds complexity in model selection and routing. The question is not whether hierarchies help, but whether the orchestration overhead is worth the gains for a given use case.


Practical implications

For teams building AI agents

Start simple. If a single agent handles the task well enough, adding hierarchy creates complexity without clear benefit. Hierarchy becomes valuable when tasks are long-running, involve multiple tools, or require different levels of reasoning.

For decision-makers evaluating AI tools

Ask vendors about failure modes, not just capabilities. A hierarchical agent system that cannot handle decomposition errors gracefully will produce confident-sounding wrong answers at scale.


Glossary

  • Hierarchical agents: An AI agent system where agents are organized in layers, each with different responsibilities.
  • Context dilution: When an AI loses focus on its original goal as intermediate steps pile up in the prompt.
  • Tool saturation: When an agent has too many tools available, making it harder to pick the right one.
  • Lost in the middle: The tendency of large language models to underweight information buried in the middle of a long prompt.
  • Separation of concerns: A software engineering principle where each component handles one specific responsibility.
  • Principle of least privilege: A security principle: give each component only the access it needs, nothing more.
  • Contextual packets: Pruned, relevant slices of context sent to lower-level agents instead of the full history.
  • Task decomposition: Breaking a complex goal into smaller subtasks that can be handled independently.
  • Orchestration overhead: The extra engineering work required to coordinate multiple agents in a system.
  • Inference costs: The computational expense of running an AI model to generate a response.
