This video features Shaw Talebi explaining Claude Code's subagents and agent teams in depth, sharing experimental results comparing the two approaches on real tasks. Starting from Claude Code's core concepts, Shaw walks through the context management challenges AI agents face and clarifies how subagents and agent teams each address those limitations. He illustrates the pros and cons of each approach with real examples and data, helping viewers understand both features and know when to use them.
1. Claude Code and the Importance of Context Management ✨
Shaw begins by explaining what Claude Code is. At its core, Claude Code is a combination of Claude — the underlying language model — and a range of tools and software built around it. These tools include local file access, web search, terminal command execution, conversation compaction, and various mode switches such as "think mode" and "chat mode." Claude Code can also create to-do lists for complex tasks and ask the user clarifying questions to gather additional context. Together, these capabilities make Claude Code a powerful AI agent capable of handling diverse tasks, including complex software engineering work.
Even the most capable LLM, however, faces several challenges when it comes to handling context during complex tasks. 😥
- Technical limits: LLMs have a technical constraint on how much text they can process at once, called the context window. Claude Sonnet, for example, has a 200,000-token context window, roughly equivalent to one long textbook. Everything must fit inside: system messages, tool access information, user messages, the model's reasoning, tool call results, and agent responses. If the volume of information exceeds this limit, the agent simply cannot access that information.
- Context Rot: A bigger problem is that even when all information fits within the context window, model performance degrades as it fills up, a phenomenon called context rot. Shaw illustrates this with a graph showing that once the context window reaches 50%, 60%, or 70% capacity, model performance drops at an observable rate.
These problems make context management critically important: keeping only the necessary information in the context window at the right time is the key.
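To make the budget concrete, here is a minimal sketch that adds up the kinds of content listed above and checks them against the 200,000-token figure. The per-component token counts are hypothetical, chosen only for illustration, and `context_usage` is not part of Claude Code.

```python
CONTEXT_WINDOW = 200_000  # tokens, the figure quoted above


def context_usage(components: dict[str, int], window: int = CONTEXT_WINDOW) -> tuple[int, float, bool]:
    """Return (total tokens, fraction of the window used, whether it fits)."""
    total = sum(components.values())
    return total, total / window, total <= window


# Hypothetical token counts for one long session (illustrative only).
session = {
    "system_messages": 12_000,
    "tool_definitions": 8_000,
    "user_messages": 5_000,
    "model_reasoning": 30_000,
    "tool_call_results": 70_000,
    "agent_responses": 15_000,
}

total, fraction, fits = context_usage(session)
print(f"{total:,} tokens, {fraction:.0%} of the window, fits={fits}")
```

In this made-up session everything still fits, but the window is 70% full, which is inside the range where the graph Shaw shows already indicates degraded performance. Tool call results are the largest item, which is the kind of content that delegation can keep out of the main context.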
2. Subagents: Managing Context Through Division of Labor 🛠️
Shaw introduces subagents as one of the most powerful approaches to context management. The core idea is straightforward:
"The idea of delegating tasks to a new, specialized Claude Code instance."
When a user sends a request to Claude Code, the main agent — the Claude instance interacting with the user — can delegate some or all of the work to a subagent instead of handling it directly. For example, if the main agent is asked to "research all coding agent tools available today," it can spin up a subagent dedicated to that research rather than doing it itself.
How subagents work:
- The main agent creates a subagent with a specific purpose.
- The subagent performs the task and produces results.
- The subagent reports those results back to the main agent.
- The main agent synthesizes all results and delivers them to the user.
Claude isn't limited to one subagent — it can spawn as many as needed, which is especially useful for research tasks. Multiple subagents can simultaneously investigate different search terms or domains and report their findings to the main agent, which then consolidates everything.
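The delegate, work, report, synthesize flow can be sketched as a simple fan-out and fan-in. This is a toy model under stated assumptions: `run_subagent` and `main_agent` are hypothetical stand-ins, no model is called, and threads merely imitate subagents running at the same time.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(topic: str) -> str:
    """Stand-in for a subagent: the bulky intermediate work stays local,
    and only a short report is returned to the caller."""
    scratch = [f"note {i} on {topic}" for i in range(1000)]  # never leaves this function
    return f"{topic}: reviewed {len(scratch)} notes"


def main_agent(topics: list[str]) -> str:
    """Fan out one subagent per topic, then combine their short reports."""
    with ThreadPoolExecutor(max_workers=len(topics)) as pool:
        reports = list(pool.map(run_subagent, topics))  # results keep input order
    return "\n".join(reports)


print(main_agent(["CLI coding agents", "IDE assistants", "cloud agents"]))
```

The point of the sketch is the shape of the data flow: each worker handles a thousand notes, yet the main agent only ever sees one line per worker.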
The three core components of a subagent are its model, tools, and purpose. Claude Code ships with several built-in subagents:
- Explore Agent: Uses the smallest model, Haiku, with read-only tools for file and code search — designed for understanding a codebase.
- Planning Subagent: Uses the same model as the main agent with read-only tools, useful for studying a codebase and drafting an implementation plan.
- General Purpose Agent: A copy of the main agent with the same model and full tool access — useful for complex research or multi-step tasks.
Users can also create their own custom subagents — for example, one that performs code reviews in a specific style or enforces certain coding standards. Subagents are defined as text files containing metadata (name, description, accessible tools, model) along with instructions for the subagent to follow. The main agent uses a subagent's name and description to automatically determine which one fits the current task and invoke it accordingly. Users can also explicitly direct the main agent to use a particular subagent.
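As a rough illustration of such a definition file, the sketch below renders the four metadata fields named above plus the instructions into one text document. The layout (a YAML-style header followed by a Markdown body), the example tool names, the model alias, and the file location mentioned afterwards are assumptions for illustration; check the Claude Code documentation for the exact format.

```python
def render_subagent(name: str, description: str, tools: list[str], model: str, instructions: str) -> str:
    """Render a subagent definition: metadata header first, instructions after.

    Assumes simple values with no colons or line breaks in the metadata.
    """
    header = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        f"tools: {', '.join(tools)}",
        f"model: {model}",
        "---",
    ])
    return f"{header}\n\n{instructions.strip()}\n"


# Hypothetical code-review subagent with read-only tools.
definition = render_subagent(
    name="code-reviewer",
    description="Reviews changed files for style and correctness issues",
    tools=["Read", "Grep", "Glob"],
    model="sonnet",
    instructions="You are a strict reviewer. Report issues as a bulleted list, most severe first.",
)
print(definition)
```

Saving the output as something like `.claude/agents/code-reviewer.md` would be the assumed next step. The description field matters most for automatic selection, since it is what the main agent reads when deciding whether this subagent fits the task.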
3. The Limits of Subagents and the Rise of Agent Teams 🚀
Subagents still have limitations. Shaw describes these as a bottleneck:
"These subagents can do a lot of work, but only the information reported back to the main agent is preserved."
The key limitations are:
- No direct communication between subagents: Subagents can only communicate with the main agent — they cannot talk to each other directly. If subagent A is building the frontend and subagent B is building the backend and they are interdependent, it would be far more efficient for them to communicate directly. Under the subagent model, each reports to the main agent, which must integrate the outputs and relay instructions back when problems arise.
- Main agent context limits: Because all information flows through the main agent, it still faces context window limits and context rot. If the main agent has to process too much information, its performance can degrade.
Agent teams emerged to address these limitations. Agent teams essentially allow subagents to talk to each other. 🗣️
The flow starts the same way — the user sends a request to the main agent — but instead of simply spawning subagents, the main agent now creates a shared task list and assembles a full team of agents, each with distinct instructions.
- Direct interaction with the shared task list: Subagents can interact with the shared task list directly, without going through the main agent. They can claim tasks, mark them complete, and update the shared context themselves.
- Direct communication between subagents: Most importantly, subagents can talk to each other directly. A frontend agent, a backend agent, and a validation agent can freely exchange feedback and coordinate their work.
- Main agent as supervisor: Each agent works independently until its tasks are done. The main agent oversees everything, monitors the shared task list, and once all tasks are complete, performs a final review before delivering results to the user.
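The shared task list can be pictured as a small piece of shared state that teammates update directly while the lead only checks for completion. The sketch below is a toy model, not how Claude Code implements it: `Task`, `SharedTaskList`, and `teammate` are hypothetical names, and threads stand in for agents.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    title: str
    owner: Optional[str] = None
    done: bool = False


@dataclass
class SharedTaskList:
    tasks: list
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def claim(self, agent: str) -> Optional[Task]:
        """Atomically hand the next unowned task to the calling agent."""
        with self._lock:
            for task in self.tasks:
                if task.owner is None:
                    task.owner = agent
                    return task
        return None

    def complete(self, task: Task) -> None:
        with self._lock:
            task.done = True

    def all_done(self) -> bool:
        with self._lock:
            return all(t.done for t in self.tasks)


def teammate(name: str, board: SharedTaskList) -> None:
    """Keep claiming and finishing tasks until none are left."""
    while (task := board.claim(name)) is not None:
        board.complete(task)


board = SharedTaskList([Task("frontend"), Task("backend"), Task("validation")])
workers = [threading.Thread(target=teammate, args=(n, board)) for n in ("agent-a", "agent-b")]
for w in workers:
    w.start()
for w in workers:
    w.join()

# The lead agent only needs this check before its final review.
print(board.all_done())
```

The lock is what lets teammates claim work without routing through the lead: no task is ever handed to two agents, and the lead's role shrinks to monitoring and the final review.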
As of 2026, agent teams remain an experimental feature that must be manually enabled. To activate it, create a hidden .claude folder inside your project directory, add a settings.json file, and set the CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS environment variable to 1 in that file. Once enabled, you can explicitly instruct Claude to "create an agent team to perform tasks X, Y, Z" and specify the roles each agent should take (e.g., frontend, backend, validation).
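The setup steps can be scripted. The sketch below is written under assumptions: the variable is spelled `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` and lives under an `env` key in settings.json. Verify both against the current Claude Code documentation, since the feature is experimental and may change. `enable_agent_teams` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path

# Assumed spelling and placement of the flag; confirm before relying on it.
ENV_VAR = "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"


def enable_agent_teams(project_dir: Path) -> Path:
    """Create or update <project>/.claude/settings.json with the flag set to "1"."""
    settings_path = project_dir / ".claude" / "settings.json"
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    # Preserve any settings that are already present.
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings.setdefault("env", {})[ENV_VAR] = "1"
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings_path


# Demonstrated on a throwaway directory so nothing in a real project is touched.
with tempfile.TemporaryDirectory() as demo_project:
    path = enable_agent_teams(Path(demo_project))
    print(path.read_text())
```

For a real project, pass the project root instead of the temporary directory. The helper merges into an existing file rather than overwriting it, so other settings survive.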
4. Subagents vs. Agent Teams: A Comparative Analysis ⚖️
Shaw summarizes the similarities and key differences between subagents and agent teams.
✅ Similarities:
- Both allow Claude to delegate tasks to new Claude Code instances.
- Both are effective ways to manage the main agent's context window. This matters most for token-heavy research tasks, where delegating the work filters out unnecessary information so the main context doesn't accumulate noise and performance doesn't degrade.
❌ Key differences:
- Subagents: Subagents cannot communicate with each other.
- Agent teams: Subagents can communicate with each other.
Shaw connects this distinction to multi-agent system architectures:
- Subagents: Correspond to a centralized architecture. The main agent acts as the orchestrator, and subagents act as worker agents that interact only with the main agent, forming a clear hierarchy.
  - Advantages: The architecture is simpler and more fault tolerant. If one subagent makes a mistake, that error doesn't propagate to all other agents, because the main agent has the opportunity to review and correct it.
  - Best use case: Well suited for sequential tasks (for example, completing task A, then B, then C) where maintaining a single thread of reasoning is valuable.
- Agent teams: Correspond to a hybrid architecture. There is a clear hierarchy between the lead agent (main agent) and the subagents, but because subagents can communicate with each other, it also takes on characteristics of a decentralized architecture, making it a blend of centralized and decentralized.
  - Advantages: The architecture is more complex, but direct communication between subagents makes it well suited to parallelizable tasks. For example, when building a complex codebase, separate subagents can simultaneously work on the frontend, backend, authentication, and payment integration, coordinating with each other as they go. This allows Claude Code to build larger, more sophisticated applications while avoiding context management pitfalls.
  - Disadvantages: Susceptibility to error cascades. A critical mistake by one subagent can propagate to others, causing widespread issues and performance degradation.
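One way to see the trade-off between the two architectures is to count the communication channels each one allows. The sketch below is only a model of the topology described above. `channels` is a hypothetical helper, and the four roles are taken from the codebase example.

```python
from itertools import combinations


def channels(workers: list[str], peer_to_peer: bool) -> set[tuple[str, str]]:
    """Return the set of allowed communication links.

    Centralized: every worker talks only to the main agent.
    Hybrid: the same links, plus one link between every pair of workers.
    """
    links = {("main", w) for w in workers}
    if peer_to_peer:
        links |= set(combinations(workers, 2))
    return links


team = ["frontend", "backend", "auth", "payments"]
print(len(channels(team, peer_to_peer=False)))  # 4 links, all through the main agent
print(len(channels(team, peer_to_peer=True)))   # 4 + 6 peer links = 10
```

With four workers the hybrid layout has ten links instead of four. The extra six are what make direct coordination possible, and they are also the extra paths along which one agent's mistake can reach another without the main agent reviewing it first.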
5. Mini Experiment: Subagents vs. Agent Teams Head-to-Head 🧪
Shaw ran a mini experiment applying subagents and agent teams to real tasks, evaluating how well each approach handled three challenges.
Experiment tasks:
- Lead list generation: Create a CSV of 50 potential customer contacts from mid-sized tech companies in the Dallas–Fort Worth area.
- YouTube course generator app: Build a web app that takes a YouTube playlist as input and converts it into a distraction-free online course.
- Landing page generation: Design a landing page and write compelling copy.
Shaw's initial hypotheses:
- Agent teams would outperform on parallelizable tasks like lead list generation.
- Subagents would outperform on sequential tasks like building the YouTube course generator app.
- Landing page generation mixed sequential and parallelizable work, making the outcome hard to predict.
Experiment setup:
- Both approaches received the same prompt (with one added line for agent teams explicitly instructing them to use an agent team).
- Each task was run once to control costs (multi-agent systems are token-intensive).
- Evaluation criteria: total execution time, token usage, and output quality.
5.1. Task 1: Lead List Generation (Parallelizable Task) 📊
- Goal: Produce a CSV of 50 contacts from mid-sized tech companies in the Dallas–Fort Worth area.
- Expectation: Agent teams would be faster and more effective.
| Metric | Subagents | Agent Teams |
|---|---|---|
| Execution time | 27 min | 19 min (8 min faster) |
| Token usage | 165,000 | 195,000 (30,000 more) |
| Contacts generated | 50 | 50 |
| Emails provided | 50 | 8 |
Results:
- Execution time: As expected, agent teams were significantly faster, thanks to multiple agents running in parallel.
- Token usage: Agent teams used more tokens, but not by the 2–3× margin Shaw anticipated.
- Output quality: Surprisingly, subagents delivered better quality. Both systems generated 50 contacts, but the agent team provided only 8 emails, while the subagent approach provided an email for every contact. Shaw attributed this to agent teams being less persistent about completing the task thoroughly.
5.2. Task 2: YouTube Course Generator App (Sequential Task) 🖥️
- Goal: Build a web app that converts a YouTube playlist into a distraction-free online course.
- Expectation: Subagents would perform better.
| Metric | Subagents | Agent Teams |
|---|---|---|
| Execution time | 47.5 min | 45 min (2.5 min faster) |
| Token usage | 99,000 | 111,000 (12,000 more) |
| Output quality | Better design, includes progress bar | Worse design, no progress bar |
Results:
- Execution time: Agent teams were slightly faster, but the margin was small. Shaw explained that because the task offered limited parallelization opportunities, the agent team's work ended up proceeding sequentially in practice despite spawning agents.
- Token usage: Subagents used fewer tokens.
- Output quality: Subagents clearly won. The subagent-built app had a better design and included a progress bar, while the agent team's app had a worse design and no progress bar. Shaw noted that for a cohesive task like a single-page web app, it helps to have one agent (the main agent in the subagent approach) hold the full context from start to finish.
5.3. Task 3: Landing Page Generation (Mixed Task) 🌐
- Goal: Design a landing page and write compelling copy.
- Expectation: Hard to predict (mix of sequential and parallelizable work).
| Metric | Subagents | Agent Teams |
|---|---|---|
| Execution time | 42 min | 52 min (10 min slower) |
| Token usage | 102,000 | 164,000 (62,000 more) |
| Output quality | Better design, includes FAQ | Better copy, clearly defines who it is/isn't for |
Results:
- Execution time: Surprisingly, subagents were faster. Shaw explained that the agent team spent a lot of time on extensive research into his background, copywriting best practices, landing page examples, and so on.
- Token usage: Subagents used far fewer tokens.
- Output quality: No clear winner. The subagent landing page had a better design and an FAQ section, while the agent team landing page offered better copy — including a clear "who this is for" and "who this is not for" breakdown — and more persuasive content overall. Both implementations successfully integrated the waitlist sign-up field with a ConvertKit account.
6. Key Takeaways and Conclusions 💡
Shaw's mini experiment produced results that deviated somewhat from expectations, but offered several important insights.
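- Speed: Agent teams finished faster on two of the three tasks (by 8 minutes on the lead list and 2.5 minutes on the course app) thanks to parallel processing, but took 10 minutes longer on the landing page.
- Token usage: Agent teams used more tokens on every task, roughly 1.1× to 1.6× as many as subagents, well short of the overwhelming margin Shaw expected.
- Output quality: Subagents delivered better output on the lead list and the course app; the landing page had no clear winner.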
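The speed and token comparisons can be recomputed directly from the three result tables in section 5. The sketch below uses only the numbers reported there; the dictionary layout and the helper names are arbitrary choices for illustration.

```python
# (minutes, tokens) per approach, copied from the result tables.
results = {
    "lead list":    {"sub": (27.0, 165_000), "team": (19.0, 195_000)},
    "course app":   {"sub": (47.5, 99_000),  "team": (45.0, 111_000)},
    "landing page": {"sub": (42.0, 102_000), "team": (52.0, 164_000)},
}


def time_delta(task: str) -> float:
    """Minutes the agent team took relative to subagents (negative = faster)."""
    return results[task]["team"][0] - results[task]["sub"][0]


def token_ratio(task: str) -> float:
    """Agent-team tokens divided by subagent tokens."""
    return results[task]["team"][1] / results[task]["sub"][1]


for task in results:
    print(f"{task}: team {time_delta(task):+.1f} min, {token_ratio(task):.2f}x tokens")

total_sub = sum(r["sub"][1] for r in results.values())
total_team = sum(r["team"][1] for r in results.values())
print(f"overall: {total_team / total_sub:.2f}x tokens")
```

The per-task ratios come out at about 1.18×, 1.12×, and 1.61×, and about 1.28× across all three tasks combined. Since each task was run only once, these figures describe this one experiment rather than a general trend.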
Shaw speculates that these results may reflect the fact that agent teams are still in an experimental stage and that the early scaffolding is immature — meaning the issue may be one of implementation rather than a fundamental limitation of the agent team architecture itself. He expects Anthropic to improve the feature as more feedback comes in.
"But at this point, I would probably stick with subagents for doing actual work."
Shaw concludes that subagents are currently more reliable for real tasks, while agent teams are worth exploring further through additional experimentation. He invites viewers who have experience with agent teams or specialized subagents to share their findings in the comments, emphasizing that such information would be valuable to him and other viewers alike.
