
This video defines 2025 as the year of 'Claude Code' and agents based on Andrej Karpathy's retrospective, and discusses the expansion of Reinforcement Learning with Verifiable Rewards (RLVR) beyond coding. It emphasizes that AI advancement is transforming engineers from mere coders into 'directors' who orchestrate AI. It also predicts that in 2026, the deepening gap between frontier models and open-source, along with the rise of AI-native organizations, will be key themes.
1. 2025 Retrospective: Claude Code and the Transformation of Knowledge Acquisition
The video begins by looking back at the past year from the perspective of December 29, 2025. Park Jinhyung defines 2025 in one phrase as 'the year of Claude Code.' This is because we experienced explosive productivity gains when we moved beyond using AI solely through web UI chat interfaces to giving AI tools in the terminal environment and letting it autonomously modify code.
Furthermore, tools like NotebookLM and 'Nano Banana' became mainstream, fundamentally changing how we acquire knowledge. People now prefer listening to AI-summarized podcasts or grasping only the key points rather than reading full texts.
2025 was the year of Claude Code. Beyond just using the web UI interface, when we gave agents tools inside the terminal and said "figure it out yourself," the results were explosive.
Knowledge is abundant and intelligence has become free, yet the effort of actually executing things still remains. Going forward, a person's greatest competitive advantage will be execution ability, authenticity, and persistence (focus) — extremely scarce resources.
2. Verifiable Rewards (RLVR) and the Evolution of Agents
The first narrative that defined 2025 was RLVR (Reinforcement Learning with Verifiable Rewards). In domains where correct answers are clear or can be verified immediately by a compiler, such as math problems or coding, AI intelligence has grown dramatically. In domains where correct answers are ambiguous (writing, ethical judgment, etc.), however, limitations remain.
Going forward, it's predicted that not just coding but also science and mathematics will see breakthrough advances through simulation environments (World Models) and techniques like 'AlphaEvolve' that generate their own verification signals.
It's clear that intelligence has surged in domains where correct answers are well-defined. But what people want is to find answers to problems where the answers aren't given.
For coding problems, interpreters or compilers exist to provide clearly verifiable reward functions, but for science and mathematics, those don't exist. So attempts to create virtual verification environments through 'World Models' or simulations will be a key focus in 2026.
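The "compiler as verifier" idea above can be made concrete with a minimal sketch. This is an illustrative toy, not any specific lab's RLVR pipeline: the `solve`-function convention and the binary 1.0/0.0 reward are assumptions chosen for brevity.

```python
# Minimal sketch of a verifiable reward for code generation: the interpreter
# itself acts as the verifier, so the reward signal needs no human judgment.
# (Assumption: candidates define a function named `solve`; reward is binary.)

def verifiable_reward(candidate_src: str, tests: list[tuple[tuple, object]]) -> float:
    """Return 1.0 only if the candidate compiles and passes every test case."""
    namespace: dict = {}
    try:
        exec(candidate_src, namespace)      # "compile" step: syntax errors fail here
        solve = namespace["solve"]
        for args, expected in tests:
            if solve(*args) != expected:    # execution step: wrong output fails
                return 0.0
        return 1.0
    except Exception:
        return 0.0                          # any crash counts as unverified

# A correct and a buggy candidate for "add two numbers":
tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
good = "def solve(a, b):\n    return a + b"
bad = "def solve(a, b):\n    return a - b"
```

The point of the sketch is the asymmetry the speakers describe: for code, this check is cheap and exact, whereas for open-ended science or writing no such automatic verifier exists.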
3. The Era of Vibe Coding and the Changing Role of Engineers
'Vibe Coding,' selected as Collins Dictionary's word of the year for 2025, was also discussed as an important topic. Code has now become nearly free, a disposable commodity you write once and throw away. But precisely because AI makes coding so easy, people tend not to explain context in detail or think deeply; they just click through and hand the work off.
Both speakers emphasized that in an era like this, the engineer's role must become that of a 'Director.' Since AI implements well but doesn't understand the big picture, humans need to have a solid theoretical foundation and provide precise feedback and direction to AI.
Code has essentially become free, disposable like something you write once and throw away. Because AI is fast, people tend to give it input casually, and because casual results come back, they undervalue their own work.
Every developer must become a director. Presenting the vision, giving instructions, providing feedback, and taking responsibility for the output — all of this is what a director must do. Deep, theory-centered knowledge will become even more important for giving critical feedback to AI.
4. Sandboxing and the Limits of Benchmarks
As AI agents began directly modifying code and executing commands in local environments (personal computers), the importance of security and sandboxing (isolated environments) grew. Agents could accidentally delete important drivers or break systems. In enterprise environments, this uncertainty leads to a preference for agents following prescribed workflows rather than operating freely.
Skepticism about existing benchmark methods was also raised. Simply solving specific test sets well isn't true intelligence — the ability to learn and adapt to new patterns across diverse environments needs to be evaluated.
The more authority you give AI, the higher productivity goes. But if the AI malfunctions or gets attacked, the damage is massive. That's why many enterprises only allow execution within defined workflow processes.
A test set is merely one snapshot of an environment. More general than tests is the 'environment,' and if the environment is well-structured, we can overcome overfitting problems and get closer to measuring true AGI.
5. 2026 Outlook: Model Competition and Widening Gaps
For open-source and model competition in 2026, the prediction was 'deepening gaps.' While 2025 saw notable advances from Chinese models like DeepSeek, the gap between the top frontier labs (OpenAI, Google, Anthropic, xAI, etc.) and everyone else is widening.
In particular, only a handful of companies with the capital to build data centers on NVIDIA's latest chips (Blackwell) can operate massive MoE (Mixture of Experts) models, while everyone else will use dense models distilled from them.
The gap between frontier labs and others is so large that it will likely keep growing. What we can do is wait for good dense models to be released, adopt them, and fine-tune them for our domains — that's the realistic strategy.
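The distillation step this strategy relies on can be sketched with its standard loss. The video only says dense models are "distilled" from large teachers, so the soft-label cross-entropy below is an assumed, textbook formulation, shown on toy logits rather than a real model.

```python
import math

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    """Temperature-scaled softmax; higher temperature softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0) -> float:
    """Cross-entropy of the student against the teacher's softened outputs.

    Minimizing this pushes a small dense student to imitate the output
    distribution of a large teacher -- the 'distilled dense model' idea.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

# A student that matches the teacher incurs a lower loss than one that disagrees.
teacher = [4.0, 1.0, 0.5]
close = distill_loss(teacher, [3.9, 1.1, 0.4])
far = distill_loss(teacher, [0.5, 1.0, 4.0])
```

In practice this loss is averaged over a large corpus and often mixed with the ordinary hard-label objective; the sketch shows only the imitation term.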
Ultimately, from an enterprise perspective, what matters is whether AI makes money, which is a question of how well it's applied to actual industries. In Korea, this is also called AX (AI Transformation).
6. Conclusion: AI-Native Organizations and Our Attitude
Finally, the discussion turned to insights about AI adoption and organizational structure. Large enterprises with legacy systems and conservative cultures will struggle to see the benefits of AI adoption, while startups and organizations designed as AI-native from the start are likely to show overwhelming productivity advantages over them.
Trying to perfectly predict the future and make three-year plans has become meaningless. Instead, we need to believe in technology while maintaining a flexible attitude of constantly exchanging feedback.
Because human-created structures tend to resist AI advancement, newly founded companies with AI-native structures from the start have a high chance of beating large enterprises. This could represent an enormous structural change in society.
Having no plan means having no view. Even if the plan gets revised a week later after countless pieces of feedback from the world, making a plan itself is still important. That's the only way to work proactively with AI.
Closing
2025 was the inaugural year when AI went beyond being a simple chat partner to establishing itself as a practical work tool and agent. Technology will continue to change rapidly in 2026, but we must not forget that what ultimately matters is the 'directing ability' and 'health' of the people wielding that technology.