12-Factor Agents: Patterns for Reliable LLM Applications

The Journey and Reality of Agent Development

Dex Horthy opens his talk by asking the audience whether they've tried building an agent. He notes that many developers take a shot at it, but most stall out somewhere around the 70–80% mark.

"You can get to 70–80% pretty quickly. The CEO is excited, the team is growing. But then comes the hard part."

At that point, developers find themselves digging through deep call stacks, reverse-engineering how prompts are assembled and how tools are passed along. The conclusion many reach:

"Like me, you end up throwing it all away and starting over from scratch."

He also stresses that not every problem needs an agent. He recounts setting out to build a DevOps agent, only to realize:

"I could have just written a bash script in 90 seconds."

Patterns Found in the Field — and the Birth of 12-Factor Agents

After conversations with more than 100 founders, developers, and engineers, Dex identified common patterns shared by LLM-based agents that actually work in production.

"Most production agents aren't really that 'agentic.' They're closer to just software."

Rather than rewriting existing codebases from scratch, these patterns involved applying small, modular concepts on top of existing code. Dex distilled them into 12-Factor Agents, published the framework on GitHub, and found it resonated widely.

"This isn't a critique of frameworks. It's more of a wish list — what capabilities a framework should provide so that great builders can move fast while still achieving high reliability."

The Core Principles of 12-Factor Agents

1. The Magic of LLMs Is Turning Sentences into JSON

The most powerful thing an LLM can do isn't running complex code or wielding sophisticated tools — it's this:

"Taking a sentence like this and turning it into JSON like this."

What you do with that JSON is handled by the patterns that follow.
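
As a rough illustration of the sentence-to-JSON step, the sketch below validates a model response and turns it into a typed object. No model is called: the response is hard-coded, and the `create_issue` intent and its fields are hypothetical names chosen for the example, not something from the talk.

```python
import json
from dataclasses import dataclass


@dataclass
class CreateIssue:
    """Hypothetical structured step the model is asked to emit."""
    intent: str
    title: str
    assignee: str


def parse_step(raw: str) -> CreateIssue:
    """Validate the model's raw text output and turn it into a typed object."""
    data = json.loads(raw)
    if data.get("intent") != "create_issue":
        raise ValueError(f"unexpected intent: {data.get('intent')!r}")
    return CreateIssue(
        intent=data["intent"],
        title=data["title"],
        assignee=data["assignee"],
    )


# Sentence in: "Open a ticket for Sam about the broken login page."
# An assumed model response (hard-coded here, no model is called):
raw_output = '{"intent": "create_issue", "title": "Broken login page", "assignee": "sam"}'
step = parse_step(raw_output)
```

Everything after `parse_step` is ordinary code working on an ordinary object, which is the point of the principle.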

2. Own Your Prompts and Context Window Directly

How well you handle prompts determines the quality of your agent.

"At some point you'll need to write every token by hand." Since an LLM is a pure function that only responds to the tokens it receives, "Putting in good tokens — that is the core of reliability."

The same applies to the context window:

"You need to look at every single token and optimize how clearly information is being communicated."
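
One way to read "owning" the prompt and context window is to assemble the final string yourself instead of handing a message list to a library. The sketch below is an assumption about what that could look like; the tag-style layout, the `Event` shape, and the instruction text are all illustrative.

```python
from typing import TypedDict


class Event(TypedDict):
    type: str
    data: str


def render_context(events: list[Event]) -> str:
    """Render the event history as compact tagged blocks, one per event."""
    blocks = [f"<{e['type']}>\n{e['data'].strip()}\n</{e['type']}>" for e in events]
    return "\n\n".join(blocks)


def build_prompt(events: list[Event]) -> str:
    """Assemble the full prompt by hand so every token is visible and editable."""
    instructions = "Decide the next step. Reply with a single JSON object."
    return f"{instructions}\n\n{render_context(events)}\n\nWhat is the next step?"
```

Because the prompt is a plain string built by plain functions, it can be printed, diffed, and unit-tested like any other output.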

3. Tool Use Is Not Magic

Many people think tool use is the heart of an agent, but Dex goes as far as saying:

"Tool use is harmful."

In practice, the LLM outputs JSON, and deterministic code receives and acts on it. That's all there is to it.
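
A minimal sketch of that idea, assuming the model emits an object with `tool` and `args` keys (an illustrative shape, not a specific vendor's format): the "tool call" reduces to a dictionary lookup and a function call.

```python
import json


def add(a: float, b: float) -> float:
    return a + b


def multiply(a: float, b: float) -> float:
    return a * b


# Ordinary functions keyed by name; nothing here is specific to any LLM vendor.
HANDLERS = {"add": add, "multiply": multiply}


def execute(model_output: str) -> float:
    """Parse the model's JSON and run the matching deterministic function."""
    call = json.loads(model_output)
    handler = HANDLERS.get(call["tool"])
    if handler is None:
        raise ValueError(f"unknown tool: {call['tool']}")
    return handler(**call["args"])
```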

4. Own Your Control Flow

Managing the execution flow yourself is what yields high reliability.

"Code is a graph. If you've written an if-statement, you've already written a DAG."

The basic structure of an agent looks like this:

Prompt: instructs how to choose the next step
Switch statement: receives the JSON the model outputs and processes it
Context window: accumulates information
Loop: determines when and how to terminate

"When you own the control flow, you have the flexibility to break, switch, summarize, or let the LLM make a judgment call."
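
The four parts listed above can be sketched in a few lines. This is a sketch under stated assumptions: `choose_next_step` stands in for a real model call, and the `fetch_status` and `done` intents are made up for the example.

```python
import json
from typing import Callable


def run_agent(
    task: str,
    choose_next_step: Callable[[list[dict]], str],
    max_steps: int = 10,
) -> list[dict]:
    """Loop until the model says it is done or the step budget runs out."""
    context: list[dict] = [{"type": "task", "data": task}]
    for _ in range(max_steps):
        step = json.loads(choose_next_step(context))  # prompt in, JSON out
        context.append({"type": "step", "data": step})
        # The "switch statement": plain branching on the model's chosen intent.
        if step["intent"] == "done":
            break
        elif step["intent"] == "fetch_status":
            context.append({"type": "result", "data": "all checks green"})
        else:
            context.append({"type": "error", "data": f"unknown intent {step['intent']}"})
    return context


def scripted_model(context: list[dict]) -> str:
    """Stand-in for an LLM call: fetch status once, then finish."""
    seen_result = any(e["type"] == "result" for e in context)
    return json.dumps({"intent": "done" if seen_result else "fetch_status"})
```

Since the loop is your own code, adding a pause for approval, a summarization pass, or an early exit is a matter of adding another branch.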

5. State Management and API Exposure

Execution state and business state should be managed separately.

Execution state: current step, next step, retry count, etc.
Business state: user messages, display data, pending approvals, etc.

This state should sit behind a REST API or MCP server, and the context window must be serializable to and restorable from a database.

"An agent is just software. To build it well you need flexibility, and the heart of that is owning your inner loop."
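
To make the split and the serialization requirement concrete, here is a small sketch that keeps the two kinds of state in separate dataclasses and round-trips them through a database. The field names follow the examples above; the `threads` table, the `Thread` wrapper, and the choice of in-memory SQLite are assumptions for illustration.

```python
import json
import sqlite3
from dataclasses import dataclass, field, asdict


@dataclass
class ExecutionState:
    current_step: str = "start"
    retry_count: int = 0


@dataclass
class BusinessState:
    messages: list[str] = field(default_factory=list)
    pending_approvals: list[str] = field(default_factory=list)


@dataclass
class Thread:
    execution: ExecutionState
    business: BusinessState


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a single table of serialized threads."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE threads (id TEXT PRIMARY KEY, body TEXT)")
    return db


def save(db: sqlite3.Connection, thread_id: str, thread: Thread) -> None:
    """Serialize the whole thread to JSON and store it under its id."""
    db.execute(
        "INSERT OR REPLACE INTO threads (id, body) VALUES (?, ?)",
        (thread_id, json.dumps(asdict(thread))),
    )


def load(db: sqlite3.Connection, thread_id: str) -> Thread:
    """Restore a thread exactly as it was saved."""
    row = db.execute("SELECT body FROM threads WHERE id = ?", (thread_id,)).fetchone()
    if row is None:
        raise KeyError(thread_id)
    body = json.loads(row[0])
    return Thread(ExecutionState(**body["execution"]), BusinessState(**body["business"]))
```

With state stored this way, an agent run can be paused, resumed by a different process, or inspected from an API handler.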

6. Error Handling and Context Optimization

When the model makes a bad API call or an error occurs:

"Don't just pile errors into the context window — summarize them and keep only what's necessary." This way the model retains context and produces better results.
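
A simple way to approximate this is to keep only the most recent error and cap its length before the next model call. The sketch below uses truncation as a crude stand-in for real summarization; the event shape and the default limits are assumptions.

```python
def compact_errors(
    context: list[dict],
    keep_last: int = 1,
    max_chars: int = 200,
) -> list[dict]:
    """Drop all but the most recent error events and truncate what remains."""
    error_positions = [i for i, e in enumerate(context) if e["type"] == "error"]
    # Guard against keep_last == 0, where a [-0:] slice would keep everything.
    keep = set(error_positions[-keep_last:]) if keep_last > 0 else set()
    compacted = []
    for i, event in enumerate(context):
        if event["type"] != "error":
            compacted.append(event)
        elif i in keep:
            compacted.append({"type": "error", "data": event["data"][:max_chars]})
    return compacted
```

The function returns a new list and leaves the original history untouched, so the full error log is still available for debugging.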

7. Human-in-the-Loop Interaction

When the choice between calling a tool and contacting a human is left to the model and expressed in its own output:

"The model can naturally express intentions like 'done,' 'need more clarification,' or 'need to escalate to a manager.'"

This approach enables agent interaction across many channels — email, Slack, Discord, SMS, and more.
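
If contacting a human is just another intent in the model's JSON, routing it becomes ordinary branching. The sketch below is illustrative only: the intent names, the field names, and the mapping of intents to Slack or email are assumptions, and no message is actually sent.

```python
import json


def route(model_output: str) -> tuple[str, str]:
    """Map the model's stated intent to a (channel, message) pair."""
    step = json.loads(model_output)
    intent = step["intent"]
    if intent == "done":
        return ("none", step.get("summary", ""))
    if intent == "request_clarification":
        return ("slack", step["question"])
    if intent == "escalate_to_manager":
        return ("email", step["reason"])
    # Anything else is treated as a tool call handled by ordinary code.
    return ("tool", intent)
```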

8. Micro-Agent Architecture

Agents that work effectively are almost always small and focused.

"Most of the pipeline is handled by deterministic code; the LLM is only used in small loops of 3–10 steps."

As an example, HumanLayer's deployment bot runs mostly on CI/CD code, and only when a GitHub PR is merged and tests pass does it tell the model to "deploy." The model might say "I'll start with the frontend," and a human can correct it in natural language: "No, do the backend first." That natural-language input gets turned into JSON to determine the next step in the workflow.
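
The propose-and-correct exchange in that example can be sketched as a short, bounded loop. This is not HumanLayer's implementation; `propose` and `review` are hypothetical stand-ins for the model call and the human reply, and the step names are made up.

```python
from typing import Callable, Optional


def plan_deploy(
    propose: Callable[[list[str]], str],
    review: Callable[[str], Optional[str]],
    max_rounds: int = 5,
) -> str:
    """Small loop: the model proposes, a human approves or corrects in plain language."""
    notes: list[str] = ["PR merged, tests passed, deploy."]
    for _ in range(max_rounds):
        proposal = propose(notes)
        feedback = review(proposal)  # None means approved
        if feedback is None:
            return proposal
        notes.append(f"human: {feedback}")
    raise RuntimeError("no approved plan within the round budget")


def scripted_propose(notes: list[str]) -> str:
    """Stand-in for the model: switches order once a human asks for the backend first."""
    wants_backend = any("backend first" in n for n in notes)
    return "deploy_backend" if wants_backend else "deploy_frontend"


def scripted_review(proposal: str) -> Optional[str]:
    """Stand-in for the human reviewer."""
    return None if proposal == "deploy_backend" else "No, do the backend first."
```

Everything before and after this loop (running tests, the deployment itself) stays in deterministic code; the model only decides the next step inside it.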

9. Keep the Agent Stateless

The agent itself should not own state — state should be managed externally.

"An agent isn't a reducer, it's a transducer. It goes through multiple steps."
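
One reading of this principle is that each step is a pure function from the stored state and a new event to the next state, so the agent can be rebuilt at any time from its event log. The sketch below assumes a simple dictionary state and a `done` event type; both are illustrative.

```python
from functools import reduce


def step(state: dict, event: dict) -> dict:
    """Pure function: returns a new state and never mutates its input."""
    history = state["history"] + [event]
    status = "done" if event["type"] == "done" else "running"
    return {"history": history, "status": status}


def replay(events: list[dict]) -> dict:
    """Rebuild state from nothing but the stored event list."""
    initial = {"history": [], "status": "running"}
    return reduce(step, events, initial)
```

Because `step` holds nothing between calls, the same events always produce the same state, whichever process runs them.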

Framework vs. Library — and the Path Forward

Dex argues:

"What agents need isn't bootstrapping. It's scaffolding, like shadcn, that sets up the structure and then lets you own everything yourself."

Summary and Closing

Agents are, in the end, software.
Because an LLM is a stateless function, you must own the context, state, and control flow yourself to get the best results.
Human collaboration should be deliberately designed into the system.
Engaging directly with the hard parts is actually the path to building better agents.

"You can all build software. If you've used a switch statement and a while loop, you can do this."

"Agents get better when humans are in the loop. Think carefully about how to design that collaboration."

"There are hard parts, but right now it's worth doing them yourself. Most frameworks try to hide the hard parts of AI; what we need are tools that help you focus on those hard parts instead."

Dex closes by emphasizing the importance of open source and collaboration:

"Let's build better agents together. See you in the hallway!"

Key Concept Glossary

Agent
LLM (Large Language Model)
Prompt Engineering
Context Window
Control Flow
Tool Use
State Management
Micro Agents
Human in the Loop
Open Source, Collaboration

💡 Conclusion: Agent development may look complex, but it ultimately comes down to solid, fundamentals-driven software engineering. If you focus on direct control, direct experimentation, and designing for human collaboration, anyone can build a trustworthy LLM agent. 🚀
