1. Introduction: GPT's Remarkable Evolution and the IMO Gold Medal News

On the evening of Sunday, July 20, 2025, hosts Seungjun Choi and Chester Lo began recording just one day after their previous session. The reason: the monumental news that OpenAI had achieved gold medal-level performance at the International Mathematical Olympiad (IMO).

"When GPT-3.5 came out, it could solve three-digit addition step by step with Chain of Thought. That was late 2022. Now we have a model solving IMO problems."

The hosts emphasize that this achievement matters beyond the problems themselves. Despite recent setbacks such as talent departures from OpenAI and the failed Windsurf acquisition, the result shows that OpenAI remains at the frontier.


2. What Is IMO? And OpenAI's Achievement

The IMO (International Mathematical Olympiad) is the world's most difficult math competition for high school students: contestants must be under 20 and must not have begun university education. Over two days, contestants solve 6 problems, each worth 7 points, for a total of 42 points.

"OpenAI solved 5 problems this time. That's gold medal level."

Problems 3 and 6 are known to be the hardest, and problem 6 was the one left unsolved. IMO medalists within OpenAI verified the solutions to the other five, and the result also gained external recognition.
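The scoring arithmetic behind "5 problems" can be spelled out in a few lines. This is only an illustrative sketch: the function names are hypothetical, and it assumes every solved problem earns the full 7 points, whereas actual IMO grading also awards partial credit.

```python
# IMO format as described above: 6 problems, each worth 7 points.
PROBLEMS = 6
POINTS_PER_PROBLEM = 7


def max_score() -> int:
    """Highest possible total across all problems."""
    return PROBLEMS * POINTS_PER_PROBLEM


def score_for_full_solutions(solved: int) -> int:
    """Total score, assuming each solved problem earns full marks (an assumption)."""
    if not 0 <= solved <= PROBLEMS:
        raise ValueError(f"solved must be between 0 and {PROBLEMS}")
    return solved * POINTS_PER_PROBLEM


print(max_score())                   # 42
print(score_for_full_solutions(5))   # 35
```

Under that assumption, five complete solutions correspond to 35 of the 42 available points.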


3. Technical Significance: Pure Reasoning Without a "Harness"

The model used was a reasoning LLM in the lineage of OpenAI's o-series. The crucial point is that a single model produced every solution through pure reasoning alone.

"This time, no MCTS-like systems were used. Just next token prediction and reinforcement learning to solve math problems. That's what's really amazing."

Just as human contestants spend 4.5 hours on 3 problems each day, the model worked for hours without stopping. Previous approaches combined multiple systems or proof assistants (like Lean), but this time the work was done by the LLM alone, without any external tools, which the hosts regard as a revolutionary achievement.


4. Math Competitions and AI: The Difference Between Humans and AI

Although IMO problems are solved by high schoolers, they require tremendous creativity and deep thinking. Prominent AI skeptic Gary Marcus had mocked that "AI can't solve IMO problems," but that barrier has now been broken.

"These are problems that only the genius kids at school used to solve. They look simple on the surface, but you'd need a whole notebook to actually solve them."

The problems and solutions are all publicly available. OpenAI's model read the official problem statements as-is and wrote its proofs directly in natural language rather than translating them into Lean or a similar formal system, which is a key differentiator.


5. Reinforcement Learning, Test-Time Compute, and Generalization

This achievement came not from a math-specialized model but from a general-purpose reasoning LLM. This suggests it can expand beyond mathematics to various other fields.

"This model isn't a math specialist -- it's a next-generation general-purpose model. The fact that a general reasoning model like the o-series can solve IMO-level math problems is hugely significant."

Additionally, increasing test-time compute (computational resources devoted during problem solving) consistently improves performance.


6. Comparing Humans and AI, and a Cautious Perspective

External experts, particularly math legend Terence Tao, take a cautious stance.

"If you changed the competition format -- made problems easier, provided unlimited tools and internet access, allowed collaboration -- success rates would change dramatically. So we should be careful when directly comparing AI and humans."

The point is that we need discussion about whether AI solved problems under the same conditions as humans, or whether there were format differences.


7. AI Development Status and the Talent War

The AI industry is experiencing an intense talent war. Meta, Google, OpenAI, and others are attracting talent with astronomical salaries and stock options.

"There's this popular photo going around -- Ronaldo next to a developer. Developers now get superstar treatment."

Bidding wars and acquisition attempts involving Windsurf, Cursor, Devin, and various other startups continue. We've entered an era where value is assessed at the individual level, not the company level.

"Traditional company-level exits and IPO strategies are crumbling. Now the IP inside individuals' heads is being directly valued."


8. OpenAI's Organizational Culture and Future Outlook

OpenAI maintains a bottom-up system of small teams, where individuals set their own missions and experiment, with successful projects rising to the top. After the Sam Altman crisis in November 2023, headcount grew from 700 to 3,000, but it still operates like a startup.

"When Google reached godlike status in 2005-2008, it was the same internally. This seems like a repeating cycle."

The key to AI experimentation and model development is having enough compute (computational resources) to keep running experiments. Meta, xAI, Google, and OpenAI are all building massive data centers.


9. Technical Limitations and Challenges Ahead

Currently, pre-trained model performance has somewhat plateaued, but there's still plenty of room to push performance through reinforcement learning, test-time compute, and synthetic data generation.

"Now the reasoning tokens generated during reinforcement learning far exceed pre-training data, and verification machines run alongside continuously. This creates synthetic data and models keep growing."


10. The Meaning of an IMO Gold Medal and Coming Changes

This IMO gold medal goes beyond simple math problem solving -- it's received as a signal that AI is increasingly resembling human creative thinking and reasoning.

"Now the model thinks for hours, not minutes, to solve IMO problems. What could it solve if it thought for a day? A month?"

The hosts honestly confess feeling overwhelmed by the pace of change.

"Today my brain just stopped. I can't find an answer."


11. Conclusion: Talent, Capital, and the Future of AI

The AI industry is rapidly reorganizing around talent, capital, and technology. Meta, Google, and Amazon have the investment capacity backed by stable revenue models, while OpenAI and Anthropic still lack clear revenue models.

"The Windsurf price is 6 times higher than the DeepMind acquisition. The scale of money moving through Silicon Valley is beyond imagination."

Amid these changes, the hosts emphasize that traditional corporate structures and compensation systems are collapsing, and an era where individuals' capabilities and IP are directly valued has arrived.

"It's no longer about companies -- the IP inside individuals' heads is being directly valued."

Finally, the hosts can't hide their astonishment at today's news and close by pondering what we should do going forward.

"What should we do in this situation? Today, there's really no answer."


12. Epilogue: In the Middle of Change

As the show wraps up, the two hosts honestly reveal their sense of awe and slight fear at the speed of AI's advancement and the resulting social and industrial changes.

"Being able to check firsthand every morning where the frontier of intelligence has reached is truly an enviable privilege."

They conclude with a message that we all must continue following this wave of change and consider how to adapt to a new era.


Key Concepts:

  • OpenAI, IMO gold medal, LLM, Chain of Thought, reinforcement learning, test-time compute, synthetic data, talent war, organizational culture, AI generalization, human-machine comparison, future outlook

"An era has arrived where a single model wins an IMO gold medal through pure reasoning alone, without a harness (complex system combination)." "We are witnessing the moment where artificial intelligence increasingly resembles human creative thought." "The feeling of being overwhelmed by the speed of change -- that's today's conclusion."
