Centering on Andrej Karpathy's podcast with Sarah Guo and Terence Tao's interview with Dwarkesh Patel, this video explores the concept of 'manifest' in the AI era and the possibilities and limits of auto research. Its key observation: RL advances at breakneck speed in verifiable domains, while AI still struggles in non-verifiable areas such as jokes and tacit knowledge, a pattern the hosts confirm through their own writing experiment. The video also analyzes the 'vibe physics' case from Anthropic's AI science blog and its harness design methodology, emphasizing the value of work that AI cannot simply 'click' to handle and the importance of clear goal-setting.


1. Work That 'Clicks' vs Work That Doesn't: Shifting Value in the AI Era

The video was recorded on March 28, 2026. Seungjun Choi notes how AI technology has made tasks like producing Markdown documents trivially easy, and questions whether results produced so easily hold value for others. Karpathy uses the term 'ephemeral software' to describe how easily made things can be short-lived. The implication: we must challenge ourselves with work AI cannot simply 'click' to complete, though even such work may eventually become automatable.

Seungjun Choi: "Things keep being created, but since they're made so easily, while they might have value to me, I wonder if they truly have value to others."

Jungseok Noh explains this leads to the question of "where to escape to." Work anyone can do loses value, so one must focus on what only they can do or maintain a temporal advantage in.


2. Karpathy and Sarah Guo on the Age of 'Manifest'

The Karpathy-Guo podcast discusses the concept of 'manifest': expressing one's will and having AI handle the rest, bringing something into existence through intentional will.

They use the expression 'AI psychosis' to describe the compulsive state of constantly directing AI. Karpathy reveals that since December 2025 he no longer writes code directly but manifests through AI, running eight instances of tools like Codex and Claude Code simultaneously.

Karpathy: "In October I said 80-20, but now it's flipped to 20-80. I haven't typed a single line of code by hand."


3. The Possibilities and Limits of Auto Research

The video introduces Karpathy's auto research concept — where AI autonomously conducts research when goals are clear and verifiable. Given a model performance improvement goal, AI searches papers, modifies code, and reduces validation loss through RL-like positive reinforcement.
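The loop described above can be sketched in a few lines. This is a minimal illustration, not Karpathy's actual system: `propose_change` and `validation_loss` are hypothetical stand-ins for an LLM agent proposing modifications and for a real train-and-evaluate step, but the structure (propose, measure a verifiable scalar, keep what improves it) is the point.

```python
# Minimal sketch of an "auto research" loop. All names are hypothetical;
# a real loop would drive an LLM agent and real experiments, not stubs.

def propose_change(history):
    """Stand-in for an LLM proposing a code or hyperparameter change."""
    return {"lr": 0.1 / (len(history) + 1)}  # toy proposal: shrink the learning rate

def validation_loss(change):
    """Stand-in for training + evaluation; returns a verifiable scalar."""
    return abs(change["lr"] - 0.02)  # toy objective with a known optimum

def auto_research(steps=5):
    history, best = [], (None, float("inf"))
    for _ in range(steps):
        change = propose_change(history)
        loss = validation_loss(change)
        history.append((change, loss))   # feedback signal, RL-style
        if loss < best[1]:
            best = (change, loss)        # keep only verified improvements
    return best

best_change, best_loss = auto_research()
print(best_change, round(best_loss, 4))
```

Because the reward is a scalar the harness can compute without human judgment, the loop can run unattended; that is exactly what breaks down in the non-verifiable domains discussed next.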

Jungseok Noh: "If there's a clear objective and clear evaluation of the output, you can optimize everything in between — documentation, research, repos, models — by deploying LLMs and tokens."

However, auto research has clear limits. Karpathy notes: "In verifiable domains it races ahead at light speed, but in non-verifiable domains it drifts." AI's performance drops significantly in subjective, non-verifiable areas like jokes. Karpathy describes AI as "genius in some ways, yet an awful fool in others — jagged."


4. Terence Tao Interview: Mathematics and AI's Future

The discussion extends to Terence Tao's interview with Dwarkesh Patel, exploring AI and mathematics. Patel references Karpathy's claim about RL's rapid progress in verifiable domains versus drift elsewhere.

Tao emphasizes the importance of semi-formal language — a middle ground between complete formal languages like Lean and the tacit knowledge mathematicians use when collaborating. Interestingly, Tao suggests that pure research environments like Princeton's IAS can actually exhaust inspiration, echoing Feynman and Hamming's views on the value of inefficiency and serendipity through human interaction.


5. Anthropic's AI Science Blog: 'Vibe Physics' and Claude's Role

Anthropic's AI science blog features physicist Matthew Schwartz using Claude as a "vibe grad student" to publish quantum field theory papers. He details how he guided Claude through the entire process, correcting its mistakes, sycophantic flattery, and fabrications. Work that would normally take 3-4 months was completed in 10 days to 2 weeks.

Jungseok Noh interprets this as a higher-layer version of auto research, where Claude serves as the evaluator. Claude's strengths include tireless repetition, solid foundational knowledge, and strong documentation and visualization. Its weaknesses: vulnerability to non-standard specifications, lack of aesthetic sense, and susceptibility to pressure.


6. Close the Loop: The Tacit Knowledge Reverse-Engineering Hypothesis

Seungjun Choi proposes the tacit knowledge reverse-engineering hypothesis: building a repository with minimal harness and acceptance criteria that captures an individual's tacit knowledge, then improving it through a bootstrapping loop.

His writing experiment showed AI producing creative prose when given clear acceptance criteria, but failing at jokes and sitcom scripts, supporting Karpathy's observation about non-verifiable domains. Jungseok Noh notes that the AI industry's bias toward thinking (T) makes it difficult to establish evaluation criteria in feeling (F) domains.
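The bootstrapping loop of the hypothesis can be sketched as follows. This is purely illustrative, under the assumption that some tacit preferences can be encoded as machine-checkable rules; the specific criteria and the `revise` stub are invented, not the experiment's actual harness.

```python
# Hypothetical sketch: encode tacit preferences as acceptance criteria,
# then loop draft -> check -> revise until every criterion passes.

def acceptance_criteria(draft: str) -> list[str]:
    """Return the list of failed criteria (empty means accepted)."""
    failures = []
    if len(draft.split()) < 5:
        failures.append("too short")
    if "very" in draft.lower():
        failures.append("banned filler word: 'very'")
    if not draft.endswith("."):
        failures.append("must end with a period")
    return failures

def revise(draft: str, failures: list[str]) -> str:
    # Stand-in for an LLM revision call guided by the failure list.
    draft = draft.replace("very ", "").replace("Very ", "")
    if not draft.endswith("."):
        draft += "."
    if len(draft.split()) < 5:
        draft += " More concrete detail added here."
    return draft

def bootstrap(draft: str, max_rounds: int = 3) -> str:
    for _ in range(max_rounds):
        failures = acceptance_criteria(draft)
        if not failures:
            break
        draft = revise(draft, failures)
    return draft

print(bootstrap("Very short"))
```

The hypothesis stands or falls on `acceptance_criteria`: for prose style such rules are writable, while for humor (an F domain) no comparable checkable rule set exists, which is precisely the limit the experiment hit.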


7. OKR and Harness: The Core of Work Automation

Jungseok Noh restructures his workflow around the OKR and harness concepts, defining clear objectives with measurable key results expressed as scalar values that serve as verifiable rewards for AI. He built 'Chedex' on top of Codex, integrating RL loops and auto research loops in which AI self-checks document-code consistency and strategic issues.
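The idea of key results as scalar rewards might look like the sketch below. The key results, targets, and measurements are made up for illustration; the point is only that each one reduces to a number a harness can score without human judgment.

```python
# Hedged sketch of OKR-as-reward: each key result is a scalar measurement
# a harness can compute automatically. All names and targets are invented.
from dataclasses import dataclass
from typing import Callable

@dataclass
class KeyResult:
    name: str
    target: float
    measure: Callable[[], float]  # returns the current scalar value

def okr_score(key_results: list[KeyResult]) -> float:
    """Average progress toward targets, capped at 1.0 per key result."""
    return sum(min(kr.measure() / kr.target, 1.0) for kr in key_results) / len(key_results)

krs = [
    KeyResult("doc-code consistency checks passing", target=20, measure=lambda: 18),
    KeyResult("auto-research experiments per week", target=10, measure=lambda: 12),
]
print(okr_score(krs))  # a single scalar the RL loop can optimize
```

Collapsing the objective to one scalar is what makes the loop "verifiable" in Karpathy's sense; the design work is choosing key results whose numbers actually track the objective.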

Anthropic's harness design guide proposes a multi-agent structure inspired by GANs, scoring subjective judgments for AI learning. As models improve, the harness becomes more important — AI engineers must focus on finding new combinations.
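A GAN-inspired generator/critic structure of this kind could be sketched as below. This is not Anthropic's actual design: the generator and the rubric-based critics are stand-in functions, but they show how several judges' scores can be averaged into one scalar reward for a subjective quality.

```python
# Illustrative sketch (not Anthropic's actual harness): one generator
# agent proposes, several critic agents score, and the mean score acts
# as a scalar reward for an otherwise subjective judgment.

def generator(prompt: str) -> str:
    return f"Draft answering: {prompt}"  # stand-in for an LLM call

def critics(draft: str) -> list[float]:
    # Stand-ins for rubric-based judge agents, each returning a 0-1 score.
    rubric = [
        lambda d: 1.0 if d.startswith("Draft") else 0.0,  # format check
        lambda d: min(len(d) / 40, 1.0),                  # enough substance
    ]
    return [judge(draft) for judge in rubric]

def scored_generation(prompt: str) -> tuple[str, float]:
    draft = generator(prompt)
    scores = critics(draft)
    return draft, sum(scores) / len(scores)  # scalar reward for the loop

draft, reward = scored_generation("explain harness design")
print(round(reward, 2))
```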

Seungjun Choi: "As models get better, the interesting harness combination space doesn't shrink — it shifts. The exciting work for AI engineers is continuously finding those new combinations."


Conclusion: Finding the Ever-Moving Frontier in the AI Era

The hosts emphasize Karpathy's concept of drift — the gap between goals and reality must be measured against the latest frontier model and its corresponding harness, which constantly shifts. In the AI era, opportunities open in previously unimaginable domains like AI for science, where defining harnesses and creating value matters most. They stress that however much AI advances, the human role in setting clear objectives and navigating non-verifiable domains remains irreplaceable.
