1. Introduction: A Paradigm Shift in AI Agents
- OpenAI has unveiled two groundbreaking models, declaring they will bring permanent change to how AI agents are built and understood.
- "This release is so significant that OpenAI itself no longer calls them 'models.' They call them 'AI systems.'"
- While YouTube discourse focuses on benchmark comparisons, "no benchmark we have today can capture the full potential of this release."
- Goals of this video:
- Why we can no longer call them "AI models"
- What an "AI system" actually is
- How to prepare to build the next generation of AI agents
2. What Is a System?
- Defining a system:
- Drawing on Donella Meadows' Thinking In Systems
- "A system is a set of interconnected things that, over time, produces its own distinct behavior."
- "The whole is greater than the sum of its parts."
- Examples of systems:
- A house, a phone, a business, a human being
- Three components of any system:
- Elements
- Interconnections
- Purpose
- Example:
- A home thermostat
- Purpose: maintain a stable temperature
- Elements: temperature sensors, air conditioning, windows, etc.
- "All these elements work together through a feedback loop to keep the temperature stable."
- "Feedback loops are absolutely critical in every system. They're what make a system behave in a particular way."
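The thermostat example can be sketched as a tiny simulation. This is a minimal illustration, not something shown in the video: the function names, the 0.5-degree tolerance, and the 1-degree change per step are all assumptions chosen to make the sense, decide, act cycle visible.

```python
# Minimal thermostat feedback loop (illustrative; all numbers are assumptions).

def thermostat_step(room_temp: float, target: float, tolerance: float = 0.5) -> str:
    """Compare the sensed temperature with the purpose (target) and pick an action."""
    if room_temp > target + tolerance:
        return "cool"
    if room_temp < target - tolerance:
        return "heat"
    return "idle"


def simulate(start_temp: float, target: float, steps: int = 20) -> list[float]:
    """Run the loop: sense -> decide -> act -> sense again."""
    temp = start_temp
    history = [temp]
    for _ in range(steps):
        action = thermostat_step(temp, target)  # decision driven by the latest reading
        if action == "cool":
            temp -= 1.0  # air conditioning lowers the temperature
        elif action == "heat":
            temp += 1.0  # heating raises the temperature
        history.append(temp)  # the new reading feeds the next decision
    return history


if __name__ == "__main__":
    print(simulate(start_temp=27.0, target=22.0))
```

The elements (sensor reading, cooling, heating) only produce stable behavior because each action changes the reading that drives the next decision; remove that link and the temperature simply drifts.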
3. The Limitations of Old AI Models and the Innovation of New AI Systems
- Limitations of previous large language models (LLMs):
- "Earlier models took an input, produced an output, and could call tools in between — but that was it."
- "The model couldn't reflect on its own actions, so agents would often stall before completing a task."
- "There was no ability to reason about tool calls."
- How was this handled before?
- "You had to add more agents — having a higher-level agent supervise and reflect on lower-level agents."
- "All of this stemmed from the absence of a decisive feedback loop."
- The innovation of O4-mini and O3:
- "At last, there is a feedback loop!"
- Quoting the OpenAI blog:
- "These models were trained with reinforcement learning for tool use, learning not just how to use tools, but when to use them."
- "They can now execute up to 600 sequential tool calls. Six hundred!"
- "This is the single biggest breakthrough of this release."
4. How the New AI Systems Work: The Power of the Feedback Loop
- Old agents:
- Plan → sequential tool execution → stop
- "If something went wrong in the middle, there was no way to correct course."
- O3/O4-mini-based agents:
- "After each action, the model thinks and chooses the appropriate next path."
- "Now it can automate far more complex workflows without being locked into its initial plan."
- "If you can make 600 sequential tool calls, you can automate virtually any workflow in the world."
- Structural difference:
- Old GPT agents: model, knowledge, tools
- O3/O4-mini agents: reasoning (feedback loop) is added
- "This small feedback loop is what allows agents to loop on their own."
- "Now agents can keep going without human input or feedback."
- "They receive tool output, think about it, plan the next action, and iterate until the goal is reached."
- "This is the real breakthrough OpenAI has achieved."
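The contrast between the two agent styles can be sketched in a few lines. This is a toy model under stated assumptions, not OpenAI's implementation: `run_fixed_plan`, `run_feedback_loop`, `toy_decide`, and the `search` tool are hypothetical names, and the `decide` callback merely stands in for the model's reasoning step. The 600-call cap mirrors the figure quoted in the video and is enforced here by the loop itself.

```python
from typing import Callable, Optional

MAX_TOOL_CALLS = 600  # figure quoted in the video; used here only as a loop budget

Tool = Callable[[str], str]
Step = Optional[tuple[str, str]]  # (tool name, argument), or None when done
Decide = Callable[[str, list[str]], Step]


def run_fixed_plan(plan: list[tuple[str, str]], tools: dict[str, Tool]) -> list[str]:
    """Old style: the plan is fixed up front; a failure just ends the run."""
    outputs = []
    for tool_name, arg in plan:
        result = tools[tool_name](arg)
        outputs.append(result)
        if result.startswith("ERROR"):
            break  # nothing looks at the error, so the agent stalls
    return outputs


def run_feedback_loop(goal: str, decide: Decide, tools: dict[str, Tool],
                      max_calls: int = MAX_TOOL_CALLS) -> list[str]:
    """New style: every tool output is fed back before the next step is chosen."""
    observations: list[str] = []
    for _ in range(max_calls):
        step = decide(goal, observations)  # stand-in for the reasoning step
        if step is None:  # the decider judges the goal reached
            break
        tool_name, arg = step
        observations.append(tools[tool_name](arg))
    return observations


# --- Hypothetical tool and decision policy, for demonstration only ---
def search(query: str) -> str:
    return "ERROR: no results" if query == "typo" else f"found: {query}"


def toy_decide(goal: str, observations: list[str]) -> Step:
    if not observations:
        return ("search", "typo")  # the first attempt is deliberately wrong
    if observations[-1].startswith("ERROR"):
        return ("search", goal)  # react to the failure and correct course
    return None


TOOLS: dict[str, Tool] = {"search": search}

if __name__ == "__main__":
    print(run_fixed_plan([("search", "typo"), ("search", "report")], TOOLS))
    print(run_feedback_loop("report", toy_decide, TOOLS))
```

The fixed plan stops at the first error, while the loop version reads the error and retries with a corrected query. The only structural difference is that `decide` sees the accumulated observations on every iteration.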
5. Synergy of the AI System: Exceeding the Sum of Its Parts
- AI systems now leverage all three:
- Reasoning
- Tool calls
- Modalities (image/text input)
- "All of this happens within a continuous reinforcement feedback loop."
- "This AI system is far greater than the sum of its parts."
- What the benchmarks show:
- AIME 2025 (mathematics benchmark)
- "Simply adding tools to O4-mini closed the gap almost entirely — reaching 99.5% accuracy."
- "No new model was built. Just tools were added!"
- "From O3-mini to O4-mini: 46% improvement. O4-mini with tools: 93% improvement."
- "Just adding tools doubled benchmark performance."
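The "doubled" claim can be checked against the two figures quoted above. Note the assumption: the comparison is between the two improvement percentages as stated, not between raw accuracy scores, and the baseline those percentages are measured against is not given in these notes.

```python
# Figures as quoted in the notes above; their baseline is not stated there.
improvement_without_tools = 46.0  # percent, O3-mini -> O4-mini
improvement_with_tools = 93.0     # percent, O4-mini with tools

ratio = improvement_with_tools / improvement_without_tools
print(f"Improvement with tools is {ratio:.2f}x the improvement without tools")
```

So "doubled" holds for the improvement figure (93 / 46 is roughly 2), which is a narrower statement than doubling the accuracy itself.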
6. The Future of AI Agents: The Beginning of Explosive Change
- "From here on out, it's exponential."
- "OpenAI has essentially solved math. And what is built on math? Large language models."
- "We can now build new AI systems faster than ever."
- Greater agent reliability:
- "Agents now reason after every tool call, and are better at understanding and correcting mistakes."
- Agents directing themselves:
- "You no longer need to tell an agent what to do at every step. It looks at the data, uses its tools, and finds the optimal path on its own."
- "Instead of long prompts, just give it your internal systems, data, and goal. The agent handles the rest."
- "No human can track more than 600 tool calls."
- The end of workflow automation platforms:
- "Agents can now design complex workflows on their own — tasks that would take days on Zapier or Make, they handle alone."
- The evolution of agents:
- "Thanks to this reasoning and self-reflection, agents can adjust their own instructions and tools to build more dynamic workflows."
- "If they can't do something, they restructure themselves."
- Long-running agents:
- "Agents can now run continuously for days without user input."
- "They can complete highly complex, time-consuming tasks entirely on their own."
7. Limitations and How to Prepare
Limitations
- "They still make mistakes a human would not."
- O4-mini is smaller, so it hallucinates more than O3
- "At the time of recording, tool calling is not available via API. Support is expected within two weeks."
- "For now, these can only be used as AI models, not AI systems."
How to Prepare
- "The most important advice: take initiative."
- "Soon, AI systems will handle implementation entirely."
- "Even today, you can build remarkable apps with a single prompt."
- "With such powerful systems at your fingertips, too many people aren't even trying."
- "In one or two years, anyone will be able to build anything with a single prompt. Those who take initiative today will achieve the most in the future."
- "Build an agent right now."
- "Don't wait for tool-calling API support — build agents for your personal tasks, daily life, business, and company workflows."
- "Even today, most businesses haven't properly tapped the potential of the previous models."
- "In a few years, every business will use more AI agents than human workers."
- "If you're not sure where to start, our community and platform offer free support."
- "Exclusive sessions and Q&As with real agent developers are also available for free."
- "Know which model to use and when."
- "O4-mini is 5x cheaper than O3 and 2–3x cheaper than Gemini, with nearly comparable performance."
- "But don't just look at benchmarks."
- "O4-mini is the top recommendation for agents."
- "You don't need reasoning models for every agent. Like a real organization, some agents just need to execute simple tasks."
- "Use GPT-4.1 for clearly defined tasks — it's fast and cheap."
- "Reserve O3 for mission-critical tasks where the error tolerance is extremely low and hallucinations must be prevented."
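The selection advice above can be written down as a simple routing rule. This is a sketch only: the `Task` fields and the `pick_model` function are hypothetical, the model names are plain string labels taken from the recommendations rather than verified API identifiers, and checking `mission_critical` first is an assumed priority order.

```python
from dataclasses import dataclass


@dataclass
class Task:
    description: str
    well_defined: bool       # clear inputs and outputs, no open-ended reasoning
    mission_critical: bool   # extremely low tolerance for errors or hallucinations


def pick_model(task: Task) -> str:
    """Map a task to a model label following the advice in the notes."""
    if task.mission_critical:
        return "o3"        # reserved for low-error-tolerance work
    if task.well_defined:
        return "gpt-4.1"   # fast and cheap for clearly defined tasks
    return "o4-mini"       # default recommendation for agents


if __name__ == "__main__":
    tasks = [
        Task("Reformat CSV exports", well_defined=True, mission_critical=False),
        Task("Research a new market", well_defined=False, mission_critical=False),
        Task("Approve wire transfers", well_defined=True, mission_critical=True),
    ]
    for t in tasks:
        print(f"{t.description}: {pick_model(t)}")
```

Keeping the rule in one function makes it easy to revisit as prices and model quality change, which matches the advice not to rely on benchmarks alone.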
8. Is This AGI?
- "Is this AGI or not?"
- "On Twitter, some say it is; others say not at all."
- "My view: out of the box, these AI systems are not AGI, but you can build them into AGI today."
- "If AGI is defined as a system that performs most tasks better than humans, you first need to give it all the tools, knowledge, and instructions required."
- "They can make 600 sequential tool calls, but they can't handle 600 different tools at once. O4-mini tops out at around 20 tools; O3 at 30–40."
- "That's not enough to cover the full range of human tasks."
- "So we need multi-agent systems — each agent specialized for a specific task, with its own tools and instructions, like a real organization."
- "Apply that to each industry and each business, and you get true AGI."
- "With the capabilities of today's AI systems alone, we can build real AGI together."
- "Let's start building agents in our own industries right now."
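The multi-agent argument above comes down to keeping each agent under a per-agent tool limit. A minimal sketch of that partitioning follows, assuming a hypothetical catalog that maps each tool name to a business domain; `build_team` and the naming scheme are illustrative, and the cap of 20 is the rough O4-mini figure quoted in the video.

```python
from collections import defaultdict

MAX_TOOLS_PER_AGENT = 20  # rough O4-mini figure quoted in the video


def build_team(tool_catalog: dict[str, str],
               max_tools: int = MAX_TOOLS_PER_AGENT) -> dict[str, list[str]]:
    """Group tools by domain, then split any group that exceeds the cap."""
    by_domain: dict[str, list[str]] = defaultdict(list)
    for tool, domain in tool_catalog.items():
        by_domain[domain].append(tool)

    team: dict[str, list[str]] = {}
    for domain, tools in sorted(by_domain.items()):
        for start in range(0, len(tools), max_tools):
            agent_name = f"{domain}-agent-{start // max_tools + 1}"
            team[agent_name] = tools[start:start + max_tools]  # one specialist
    return team


if __name__ == "__main__":
    # Hypothetical catalog: 45 sales tools and 5 support tools.
    catalog = {f"sales_tool_{i}": "sales" for i in range(45)}
    catalog.update({f"support_tool_{i}": "support" for i in range(5)})
    for name, tools in build_team(catalog).items():
        print(name, len(tools))
```

Each resulting agent would still need its own instructions and a way to hand work to the others; the sketch covers only the tool split.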
9. Closing and Next Video Recommendation
- "If you want to go deeper — for example, how Chase scaled an AI sales agent team to over $50,000 a month — check out the next video."
- "Thank you, and goodbye!" 👋
Key Concepts Summary
- AI system vs. AI model
- Feedback loop
- Tool calling
- Reasoning
- Self-reflection
- Workflow automation
- Agent reliability and persistence
- AGI (artificial general intelligence)
- Initiative
- Model selection strategy
Key Quotes
- "AI systems are now far greater than the sum of their parts."
- "This small feedback loop is what allows agents to loop on their own."
- "Those who take initiative today will achieve the most in the future."
- "Even today, most businesses haven't properly tapped the potential of the previous models."
- "Agents can now run continuously for days without human input."
- "With the capabilities of today's AI systems alone, we can build real AGI together."
This video carries a powerful message about the future of AI agents and what we need to do right now. Start today! 🚀
