This video explores the concept of 'manifest' in the AI era and the possibilities and limits of auto research, centering on Andrej Karpathy's podcast with Sarah Guo and Terence Tao's interview with Dwarkesh Patel. It shares the key observation that RL advances at breakneck speed in verifiable domains while AI still struggles in non-verifiable areas such as jokes and tacit knowledge, and confirms this through a real writing experiment. It also analyzes the 'vibe physics' case from Anthropic's AI science blog along with harness design methodology, emphasizing the value of work that AI cannot simply 'click' through and the importance of clear goal-setting.
1. Work That 'Clicks' vs Work That Doesn't: Shifting Value in the AI Era
The video was recorded on March 28, 2026. Seungjun Choi notes how AI technology has made tasks like producing Markdown documents trivially easy. He questions whether results produced so easily hold value for others. Karpathy uses the term 'ephemeral software' to describe how easily made things can be short-lived. The implication: we must challenge ourselves with work AI cannot simply 'click' to complete, though even such work may eventually become automatable.
Seungjun Choi: "Things keep being created, but since they're made so easily, while they might have value to me, I wonder if they truly have value to others."
Jungseok Noh explains that this leads to the question of "where to escape to." Work anyone can do loses value, so one must focus on what only they can do, or on areas where they hold a temporal head start.
2. Karpathy and Sarah Guo on the Age of 'Manifest'
The Karpathy-Guo podcast discusses the concept of 'manifest': expressing one's will and having AI handle the rest. Karpathy uses the word in the sense of bringing something into existence through intentional will.
They use the expression 'AI psychosis' to describe the compulsive state of constantly directing AI. Karpathy reveals that since December 2025, he no longer writes code directly but manifests through AI, running 8 instances of tools like Codex and Claude Code simultaneously.
Karpathy: "In October I said 80-20, but now it's flipped to 20-80. I haven't typed a single line of code by hand."
3. The Possibilities and Limits of Auto Research
The video introduces Karpathy's auto research concept — where AI autonomously conducts research when goals are clear and verifiable. Given a model performance improvement goal, AI searches papers, modifies code, and reduces validation loss through RL-like positive reinforcement.
Jungseok Noh: "If there's a clear objective and clear evaluation of the output, you can optimize everything in between — documentation, research, repos, models — by deploying LLMs and tokens."
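The auto research pattern described above can be sketched as a simple loop: propose a change, evaluate it against the verifiable metric, and keep it only if the metric improves. The sketch below is an illustrative assumption, not code from the video; the "validation loss" is a toy numeric stand-in for a real experiment, and the propose step stands in for AI-suggested edits to papers, code, or configs.

```python
import random

def auto_research_loop(evaluate, propose, init, steps=200, seed=0):
    """Greedy hill-climbing sketch of the verifiable-objective loop:
    keep a proposed change only if it lowers the scalar 'validation loss'."""
    rng = random.Random(seed)
    best, best_loss = init, evaluate(init)
    for _ in range(steps):
        candidate = propose(best, rng)   # stand-in for an AI-suggested edit
        loss = evaluate(candidate)       # the clear, verifiable evaluation
        if loss < best_loss:             # positive reinforcement: keep improvements
            best, best_loss = candidate, loss
    return best, best_loss

# Toy experiment: "loss" is distance from an unknown optimum the loop must find.
target = 3.7
evaluate = lambda x: abs(x - target)
propose = lambda x, rng: x + rng.uniform(-0.5, 0.5)

best, loss = auto_research_loop(evaluate, propose, init=0.0)
```

The point of the sketch is the shape of the loop, not the optimizer: as long as the objective is a clear scalar, everything between goal and evaluation can be searched automatically.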
However, auto research has clear limits. Karpathy notes: "In verifiable domains it races ahead at light speed, but in non-verifiable domains it drifts." AI's performance drops significantly in subjective, non-verifiable areas like jokes. Karpathy describes AI as "genius in some ways, yet an awful fool in others — jagged."
4. Terence Tao Interview: Mathematics and AI's Future
The discussion extends to Terence Tao's interview with Dwarkesh Patel, exploring AI and mathematics. Patel references Karpathy's claim about RL's rapid progress in verifiable domains versus drift elsewhere.
Tao emphasizes the importance of semi-formal language — a middle ground between complete formal languages like Lean and the tacit knowledge mathematicians use when collaborating. Interestingly, Tao suggests that pure research environments like Princeton's IAS can actually exhaust inspiration, echoing Feynman and Hamming's views on the value of inefficiency and serendipity through human interaction.
5. Anthropic's AI Science Blog: 'Vibe Physics' and Claude's Role
Anthropic's AI science blog features physicist Matthew Schwartz using Claude as a "vibe grad student" to publish quantum field theory papers. He details how he guided Claude through the entire process, correcting its mistakes, flattery tendencies, and fabrications. The work that normally took 3-4 months was completed in 10 days to 2 weeks.
Jungseok Noh interprets this as a higher-layer version of auto research, where Claude serves as the evaluator. Claude's strengths include tireless repetition, solid foundational knowledge, and strong documentation and visualization. Its weaknesses: vulnerability to non-standard specifications, lack of aesthetic sense, and susceptibility to pressure.
6. Close the Loop: The Tacit Knowledge Reverse-Engineering Hypothesis
Seungjun Choi proposes the tacit knowledge reverse-engineering hypothesis: building a repository with minimal harness and acceptance criteria that captures an individual's tacit knowledge, then improving it through a bootstrapping loop.
His writing experiment showed AI producing creative prose when given clear acceptance criteria, but failing at jokes and sitcom scripts — supporting Karpathy's observation about non-verifiable domains. Jungseok Noh notes the AI industry's T(thinking) bias makes it difficult to establish evaluation criteria for F(feeling) domains.
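The hypothesis above amounts to making tacit editorial taste explicit as checkable acceptance criteria, then looping drafts against them. The sketch below is a minimal illustration under invented criteria (length bounds, a required theme word, a banned filler word); the specific checks are assumptions for demonstration, not the actual criteria used in the experiment.

```python
# Encode tacit preferences as explicit acceptance criteria an AI can be
# looped against. The criteria here are illustrative assumptions only.
def acceptance_score(draft: str):
    """Return (number of failed criteria, human-readable failure notes)."""
    failures = []
    words = draft.split()
    if not (10 <= len(words) <= 60):
        failures.append("length outside 10-60 words")
    if "harness" not in draft.lower():
        failures.append("missing required theme word 'harness'")
    if "very" in words:
        failures.append("banned filler word 'very'")
    return len(failures), failures

draft = "A harness turns vague taste into checks the model can satisfy."
fail_count, notes = acceptance_score(draft)
accepted = fail_count == 0
```

A bootstrapping loop would feed the failure notes back to the model as revision instructions until the draft is accepted, gradually refining both the drafts and the criteria themselves.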
7. OKR and Harness: The Core of Work Automation
Jungseok Noh has overhauled his workflow using OKR and harness concepts, defining clear objectives whose measurable key results serve as scalar values for AI's verifiable rewards. He built 'Chedex' on top of Codex, integrating RL and auto research loops in which the AI self-checks document-code consistency and flags strategic issues.
Anthropic's harness design guide proposes a multi-agent structure inspired by GANs, scoring subjective judgments for AI learning. As models improve, the harness becomes more important — AI engineers must focus on finding new combinations.
Seungjun Choi: "As models get better, the interesting harness combination space doesn't shrink — it shifts. The exciting work for AI engineers is continuously finding those new combinations."
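The GAN-inspired structure mentioned above pairs a proposing agent with a critic that turns a subjective judgment into a scalar score. The toy rubric below is an invented stand-in for an LLM-based judge, shown only to illustrate the shape of the idea: once subjective quality is scored, the proposer has something to optimize against.

```python
def critic(candidate: str) -> float:
    """Toy rubric standing in for an LLM judge: reward brevity and
    concreteness (presence of a number), penalize rambling."""
    score = 1.0
    score -= 0.01 * max(0, len(candidate.split()) - 20)   # penalize long drafts
    score += 0.5 if any(ch.isdigit() for ch in candidate) else 0.0
    return score

def pick_best(candidates):
    """Proposer side: select the candidate the critic scores highest."""
    return max(candidates, key=critic)

candidates = [
    "We improved things a lot in ways that are hard to pin down precisely",
    "Cut p95 latency from 120 ms to 45 ms by batching writes",
]
best = pick_best(candidates)
```

In a real multi-agent harness the rubric itself would be part of the design space, which is one way to read the "harness combination space" quote above.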
Conclusion: Finding the Ever-Moving Frontier in the AI Era
The hosts return to Karpathy's concept of drift: the gap between goals and reality must be measured against the latest frontier model and its corresponding harness, a frontier that constantly shifts. In the AI era, opportunities open in previously unimaginable domains like AI for science, where defining harnesses and creating value matters most. They stress that however much AI advances, the human role in setting clear objectives and navigating non-verifiable domains remains irreplaceable.
