In this talk, Cloudflare's Matt Carey explains how AI agents will use APIs and why the current tool-bundling approach does not scale. He covers the context-window problem, alternatives such as CLIs and tool search, Cloudflare's code mode approach, and the importance of safe execution environments for untrusted code.


1. Agents meet APIs

Carey works on MCP and agents at Cloudflare, with a focus on how agents interact with the outside world. The original way to give agents "hands" was tool calling or function calling: the model chooses a function, and the system executes it.
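The tool-calling loop can be sketched in a few lines: the model emits a tool name plus JSON arguments, and the host system dispatches to the matching function. The tool name and handler below are hypothetical stand-ins, not any real agent framework's API.

```typescript
// Minimal sketch of tool calling: the model chooses a function by name,
// the host executes it and returns the result to the model.
type ToolCall = { name: string; arguments: Record<string, unknown> };

const tools: Record<string, (args: Record<string, unknown>) => string> = {
  // A toy "weather" tool standing in for any real capability.
  get_weather: (args) => `Sunny in ${args.city}`,
};

function dispatch(call: ToolCall): string {
  const handler = tools[call.name];
  if (!handler) throw new Error(`Unknown tool: ${call.name}`);
  return handler(call.arguments);
}

// A model response might look like this:
const result = dispatch({ name: "get_weather", arguments: { city: "London" } });
console.log(result); // "Sunny in London"
```

The key property is that the model only ever produces structured data; the host decides what actually runs.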

Early agents bundled their own tools, which created duplication. Each agent or client had to implement similar access to the same APIs. Remote MCP helped standardize tool access so service providers could expose common tools once.

But this introduced a new problem: context windows could explode.


2. The context-window explosion

Cloudflare wanted agents to access the full API surface. But turning every endpoint into a tool would require too much context: Cloudflare's OpenAPI spec alone runs to millions of tokens, and even the derived tool definitions would far exceed what most models can handle.

Splitting the API into product-specific MCP servers reduced the context burden, but it forced users to choose the right server and often failed to expose the whole API surface.

The real requirement is progressive discovery: agents should be able to find the right capability when needed without loading everything at once.


3. Three ways to handle the context problem

The first option is the CLI. In a sandbox, an agent can run commands and use --help to discover parameters. This is powerful, but it assumes shell access.

The second option is tool search. A client loads only the tools that appear relevant to the user's prompt. This reduces context, but irrelevant tools can still slip through, and the agent is limited by the quality of the search system.
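Tool search can be sketched as a scoring pass over tool descriptions. The naive keyword-overlap scoring below is an illustrative assumption; production systems typically use embeddings, but the shape is the same: rank, then load only the top few.

```typescript
// Illustrative tool search: score each tool description against the
// user's prompt and keep only the top-k matches, instead of loading
// every tool definition into context.
type ToolDef = { name: string; description: string };

function searchTools(prompt: string, tools: ToolDef[], k: number): ToolDef[] {
  const words = new Set(prompt.toLowerCase().split(/\W+/));
  return tools
    .map((t) => ({
      tool: t,
      // Naive relevance: count of description words appearing in the prompt.
      score: t.description.toLowerCase().split(/\W+/).filter((w) => words.has(w)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((s) => s.tool);
}

const catalog: ToolDef[] = [
  { name: "dns_create", description: "create a dns record" },
  { name: "worker_deploy", description: "deploy a worker script" },
  { name: "r2_list", description: "list r2 buckets" },
];
const picked = searchTools("deploy my worker", catalog, 1);
console.log(picked[0].name); // "worker_deploy"
```

The limitation the talk notes is visible here: the agent only ever sees what the ranking surfaces.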

The third option, and Cloudflare's preferred direction, is code mode. Instead of exposing every operation as a static tool, let the model write code against typed APIs. TypeScript types compactly describe inputs and outputs, and the OpenAPI spec remains the source of truth.

As models improve, this approach improves with them. The agent writes code to list workers, deploy workers, configure access, or call other APIs rather than selecting from thousands of preloaded tool descriptions.
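The idea can be sketched as follows: the model is shown compact TypeScript types rather than thousands of tool definitions, then writes ordinary code against them. The interface and mock below are hypothetical stand-ins, not Cloudflare's actual SDK.

```typescript
// Sketch of "code mode": typed surfaces are the contract, and the
// model-generated program is the plan.
interface Worker { name: string; createdOn: string }

interface WorkersApi {
  listWorkers(): Promise<Worker[]>;
  deployWorker(name: string, script: string): Promise<Worker>;
}

// A local mock standing in for a real, network-backed implementation.
function mockApi(): WorkersApi {
  const workers: Worker[] = [];
  return {
    async listWorkers() { return workers; },
    async deployWorker(name, script) {
      const w = { name, createdOn: new Date().toISOString() };
      workers.push(w);
      return w;
    },
  };
}

// Code a model might generate: deploy a worker, then confirm it exists.
async function agentPlan(api: WorkersApi): Promise<string[]> {
  await api.deployWorker("hello-world", "export default { fetch: () => new Response('hi') }");
  const all = await api.listWorkers();
  return all.map((w) => w.name);
}
```

Because the types are far more compact than per-endpoint tool schemas, the whole API surface can fit in context while the OpenAPI spec remains the source of truth behind them.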


4. The challenge: running untrusted code safely

Code mode is powerful, but it means executing code generated by an untrusted model. That creates obvious security risks: secret access, filesystem access, infinite loops, excessive network calls, or abuse of compute.

Cloudflare's answer is Workers. V8 isolates provide a strong sandbox where the environment can be controlled programmatically. Carey demonstrates blocking access to secrets, disabling Node compatibility when needed, and limiting external network access.
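The guardrails Carey demonstrates can be illustrated conceptually: expose only a filtered environment to the generated code, and intercept outbound requests with an allowlist. This is a sketch of the idea under assumed names, not the actual Workers isolate API.

```typescript
// Conceptual guardrails for untrusted, model-generated code.
type Env = Record<string, string>;

// Hide secrets: only explicitly allowed keys are visible to the code.
function filterEnv(env: Env, allowedKeys: string[]): Env {
  return Object.fromEntries(
    Object.entries(env).filter(([k]) => allowedKeys.includes(k)),
  );
}

// Limit external network access: refuse requests to hosts outside an allowlist.
function guardedFetch(allowedHosts: string[]) {
  return (url: string): string => {
    const host = new URL(url).hostname;
    if (!allowedHosts.includes(host)) {
      throw new Error(`Blocked outbound request to ${host}`);
    }
    return `fetched ${url}`; // stand-in for a real network call
  };
}

const env = filterEnv({ PUBLIC_URL: "https://example.com", API_SECRET: "xyz" }, ["PUBLIC_URL"]);
const fetcher = guardedFetch(["api.example.com"]);
```

In a real isolate these controls sit below the language runtime, so the generated code cannot simply route around them.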

These programmable guardrails make it possible to expose broad read-only API access while still keeping execution safe.


5. The future of agents and APIs

Carey expects more isolated environments across the web. Code is a compact plan, and models are increasingly good at writing it. Technologies like Deno, Pydantic-style execution environments, and Cloudflare's workerd runtime point in that direction.

API providers also need to prepare for agents as API users. More generated code means more requests, more parallel loops, and more need for strong rate limiting and abuse controls.
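One standard building block for those abuse controls is a token bucket, which absorbs the bursty, parallel traffic that generated code tends to produce. A minimal sketch (the class and parameters are illustrative, not a specific provider's implementation):

```typescript
// Token bucket: each request consumes one token; tokens refill at a
// steady rate, so short bursts are allowed but sustained floods are not.
class TokenBucket {
  private tokens: number;

  constructor(private capacity: number, private refillPerSec: number) {
    this.tokens = capacity;
  }

  // Add tokens back for elapsed time, capped at capacity.
  refill(elapsedSec: number): void {
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
  }

  // Returns true if the request is allowed.
  tryConsume(): boolean {
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(2, 1); // burst of 2, refills 1 token/sec
```

Per-agent or per-key buckets like this let a provider expose broad access while bounding what a runaway loop can do.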

On the client side, agents will increasingly create scripts that can be saved and reused for scheduled work, scraping, and repeated tasks. Agents may even repair those scripts when they fail.

Finally, MCP itself may become middleware. A future framework might expose an API to agents with a simple flag such as MCP=true, making agent tooling a native part of full-stack development.


Closing

Carey's core point is that large APIs should not try to squeeze every endpoint into a model's context as a static tool. Agents need compact discovery, typed surfaces, and safe code execution. For API providers, the future user is not only a human developer but also code written and run by an agent.
