Joon Sung Park, CEO of Simile, founded an AI lab that simulates human behavior and society. He emphasizes that while today's AI models excel at objective problem-solving, they have limits when it comes to behaving like real humans. He explains that Simile aims to build a "GPU of intelligence" that reflects diverse human values, preferences, and tastes, which can help model complex social phenomena such as bank runs, climate cooperation, and the warning signs of democratic collapse. This video takes a deep dive into the simulation technology Simile has developed and its potential.
1. The Possibility of Simulation, Inspired by Science Fiction ✨
Joon Sung Park says he draws deep inspiration from science fiction. His explanation is that science fiction depicting technologically mature societies always features two core elements: some form of artificial general intelligence (AGI) and social simulation. He is convinced that today the opportunity to realize this simulation has arrived.
"I'm someone who draws a lot of inspiration from science fiction. When you read science fiction that depicts a society that is technologically mature enough, you can always see two pillars. There's some form of AGI, and some form of simulation that helps guide society. Today, I see an opportunity to take the first step toward building that simulation. I wouldn't have said this even five years ago, but it's a conviction I've built up over years of digging deep into this research."
He added that even five years ago he didn't hold this conviction, but discovered the possibility as he deepened his research.
2. Stanford's 'Smallville' Project and Generative Agents 🏠
Joon Sung Park describes the Smallville project he ran at Stanford in April 2023. The project began with the observation that large language models (LLMs) now encode much human behavior through their training data from the web and social media. The core idea was that, if you approach these models from the right angle, you can elicit subtle human behaviors from them.
"Smallville was a project I ran at Stanford, and it started from the observation that large language models can now encode a lot of human behavior embedded in their training data, like the web and social media. If you probe them from the right angle, you can actually get a lot of micro-behaviors out of these models. For example, given a very specific description of a situation, what would a person named X do? It actually generates very interesting behavior."
The research team concluded that this was exactly what was needed to create complex agent behavior, and decided to see what kind of society would form if they pushed the idea as far as possible. That is how Smallville was born. Smallville paired generative AI models with generative agents equipped with memory, planning, and reflection capabilities to simulate life in a small town. Twenty-five agents, each with its own persona, woke up in the morning, went to work, formed relationships, threw parties, and produced emergent phenomena much as real people do.
3. A Surprising Result in Smallville: The Valentine's Day Party 🎈
One of the most surprising things in the Smallville experiment was a simulation set on the eve of Valentine's Day. One of the agents in the simulation, a café owner named 'Isabella,' makes a plan to throw a Valentine's Day party, gathers party supplies, and invites guests to come. Then on Valentine's Day itself, a spontaneous party phenomenon actually emerged, with all the agents gathering at the café to enjoy the party.
"One of the things I was most surprised by — the simulation itself is set on the eve of Valentine's Day. And these agents — one agent is a café owner named Isabella who runs a café — she thinks, 'It would be nice to throw a Valentine's Day party and invite my friends and customers.' So on the eve of Valentine's Day you can see her going around gathering party supplies and telling guests, 'We're having a party, please come.' And on Valentine's Day itself, you can actually see an immersive party where all the agents come to the café and the party takes place."
Some agents were never invited or forgot to come, but unexpected interactions also occurred: an agent named 'Klaus,' for instance, brought his crush to the party. These phenomena showed that the simulation doesn't merely follow predetermined rules, but can generate complex, unpredictable interactions similar to those of real society.
4. The Background of Smallville's Development: A Technology-Driven Discovery 💡
Joon Sung Park explains that the impetus for developing Smallville did not begin with research into human psychology or social behavior, but stemmed from a technical discovery. His team had long paid attention to the possibilities of simulation, and ahead of the 2020 release of GPT-3, they noted the remarkable generalization ability of large language models.
"Our team was excited about simulation, and we saw the vision for simulation early on. In 2020, GPT-3 was about to be released, and we got to see the first demos. In my first year, together with many Stanford researchers, I wrote a paper called 'On the Opportunities and Risks of Foundation Models.' What I was really focused on at the time was: 'This is a new kind of model we've never seen before. This kind of model can generalize in ways that weren't possible in the past.'"
He says that while his fellow researchers at the time were amazed that the model could perform classification or simple generation tasks, what was more interesting to him was the possibility that these models could encode human behavior.
5. Social Simulacra: An Innovation in Testing Social Platforms 👥
Joon Sung Park notes that his research tradition is rooted in the field of social computing, which focuses on building technology platforms that enable social interaction and collaboration. When building a social platform, the hardest challenge is not UI/UX testing but predicting how countless people coming together will create positive or negative emergent phenomena, and how to design for that scale. Previously, the only option was field testing (launching a prototype and watching the results), which carried enormous costs and potential negative impacts.
"One of the hardest challenges isn't testing the system's UI/UX, but how tens, millions, and eventually billions of people come together to create good and bad emergent phenomena, and how you can design for scale. Until now there's been no tool to test this. The only way we test today is basically to do field tests. You launch a prototype and watch what happens. And sometimes there are real costs. Of course there are high costs in terms of human resources and time, but at the same time, if you have a bad design, imagine a feed that's more likely to spread negative emotions on social media. Obviously we'd want to avoid that, but right now this gets tested in the field."
In 2022, Joon Sung Park's team published a pioneering paper called Social Simulacra. The research used LLMs to simulate a subreddit and predict how people would behave in it. Once the team defined the subreddit's goals and operating strategy and populated it with thousands of personas (which they didn't call 'agents' at the time), those personas showed spontaneous cooperative behavior, such as discussing Pittsburgh's tourist attractions and planning trips together. The result gave the team confidence in the potential of simulation.
6. The Evolution of Models and Simile's Differentiator: A 'GPU of Intelligence' That Pursues Human-likeness 🧠
The early GPT-3-based models were "very janky" and didn't follow instructions well, so they required prompting tricks. But within them, the potential for human behavior to be encoded was clear. As later models added instruction tuning capabilities, it became possible to build more complex agents that could reason about memory, and today's models have reached a level where you can build these kinds of applications.
Joon Sung Park points out that today many large language model companies aim to build a "superintelligent machine" — a rational model that excels at solving technical problems with an objective correct answer. He likens these models to the "CPU of intelligence."
"When you look at many large language model companies today — OpenAI, Anthropic, and many newly forming AI labs — I think the models they build have a goal similar to 'let's build a superintelligent machine.' These machines should be rational, and they should be really good at solving technical problems that have objective answers."
But he says these models have limits when it comes to simulating "true human society." That's because humans are irrational and have subjective values, preferences, and tastes. He diagnoses that with the current model paradigm, the ability to predict and simulate human behavior has plateaued.
This is where Simile's role comes to the fore. Simile aims to develop a next generation of models better suited to modeling the diversity of people — a "GPU of intelligence." Simile's goal is not a superhuman model, but a model that is "as human as possible." This model must be able to represent the real perspectives of diverse population groups.
"Simile's model is much closer to developing a GPU of intelligence. The idea here is that we don't need a superhuman model. In fact, we want a model that is as human as possible, but we want these models to be able to represent, at the individual sub-unit level, the real perspectives of diverse population groups."
This gap is precisely why Simile develops its own models. At the same time, Simile also leverages the benefits of frontier models — the latest models — as a means of steering its research and formulating research plans.
7. From Smallville to a Company Called Simile: A Journey Toward Solving Real Problems 🚀
Joon Sung Park says that after the Smallville project, he realized that the roles of research and of a company are very different. Research is great at exploring a variety of hypotheses, but it has limits when it comes to applying the results to the real world. A company, on the other hand, is a "machine for depth-first search": it concentrates resources and people around a conviction about a specific field and advances toward a single vision.
"Research and companies have very different functions. Research is an amazing tool when you want to do broad research. In a lab surrounded by smart people, each researcher takes on a small part of a paper to explore, and some of that leads to amazing research outcomes, but we're not usually known for finishing the job. We're usually not the ones who bring the impact of that research into the real world."
"A company is a machine for depth-first search. When you have conviction about a specific field and find a hill you want to climb, this is a tool that lets you gather resources and amazing people without hesitation and pursue a single vision."
About six months after publishing the generative agents paper, the Simile team gained the conviction to found a company. At first, there were many inquiries from social scientists wanting to use it as an experimental platform, and later, executives from Fortune 500 companies visited Stanford, saw the Smallville demo, and began asking whether they could solve market research questions through simulation. These inquiries showed a clear opportunity for the research to make an impact on the real world, and this was precisely the impetus for founding the company.
After that, Simile worked to validate the accuracy of its simulations. The team generated a simulation of 1,000 people from the U.S. population and demonstrated that Simile's architecture and models could predict people's behavior with 85% accuracy, about as well as people predict their own behavior. Encouraged by this result, Simile decided to offer users a simulation platform for important decision-making.
8. Simile's Customer Engagement Model: Focusing on the CVS Case 🛒
Simile's customers are mainly accustomed to working with polling and panel companies. The initial stage with Simile looks similar — it begins with a customer requesting to better understand a specific population group.
When Simile receives such a request, it partners with panel companies like Gallup to reach real people and collect data. Here Simile focuses on which "magic questions" it can ask within a 15-minute window to obtain sufficient, generalizable data about a particular person. This data is later used to generate agents.
"The initial stage looks very similar. Our customers come to us and say they want to better understand the XYZ population. Then, through partnerships with vendors (for example, we currently have a strategic partnership with Gallup, a polling and panel company), Simile works with vendors to reach out to real people. So these simulations are grounded in real data."
The collected data is used to create simulation agents, which are designed to be able to answer countless questions beyond the scope of the original questions. These agents are loaded into Simile's platform, a SaaS (Software as a Service) product, allowing customers to ask questions about the specific group they want.
CVS is one of Simile's partners and uses Simile's simulation technology for concept testing. A senior VP at CVS read Simile's research papers and decided to work with Simile, having concluded that the limits of field testing and the physical constraints of human society left many questions unanswerable. He hoped that Simile's simulations would let him simulate an entire market and grasp even the second-order impacts of decisions.
9. Parallels with Self-Driving Car Simulation: Generalization Grounded in Real Data 🚗
Joon Sung Park explains that Simile's simulation concept is similar to self-driving car simulation. Self-driving cars create models grounded in real physical laws, but must be able to generalize to diverse locations and weather conditions beyond their training data. Likewise, Simile collects essential data from real people and, by encoding this data into its models, enables generalizable predictions.
10. The Importance of Collecting Real Data: The 'Say-Do Gap' 🗣️
The interviewer raises the question of whether it's necessary to collect real data, given that large language models can represent everything in the world well. The question was something like, "If you tell Claude you're a 34-year-old woman living in a coastal metropolis, can't you get a faithful answer?"
But Joon Sung Park emphasizes that a "Say-Do Gap" exists. Because large language models are trained mainly on data of what people have said online, there can be a difference between that and what real people "actually do." Simile aims to close this gap.
"One of the important questions here is the question of the 'say-do gap.' There's a gap between what people say and what they actually do, and that gap is real. And many large language models are trained on edited data. Basically, the things people have said online make up a large portion of the training data. So one of the things Simile's simulation platform does is close that gap."
Simile mainly collects behavioral data, using open-ended prompts such as "Tell me your life story." Answers to such prompts supply an individual's "long-tail information": details such as upbringing and difficult decisions that act as a translational layer between attitudes and behavior. Surveys that gauge people's views on specific topics are also used as an efficient method.
11. Evaluating Simulations: Convergence, Divergence, and Confidence 📈
When asked how the predictive power of simulation models is evaluated, Joon Sung Park acknowledges that theoretical limits exist. Because there is an inherent randomness in human behavior, a person may not answer the same question identically every time.
Simile measures the distribution of responses at the level of individual population groups. Especially for quantitative questions, it uses Total Variation Distance (TVD) to measure how close the real distribution and the simulated distribution are. Simile considers a TVD below 0.15 to be reliable enough for decision-making. This metric also applies to core use cases like RCTs (randomized controlled trials).
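To make the metric concrete: for two distributions over the same answer options, TVD is half the sum of the absolute differences between their probabilities, so 0 means identical distributions and 1 means no overlap. The following is a minimal sketch, not Simile's implementation. The function name and the survey answers are invented for illustration; only the 0.15 threshold comes from the text above.

```python
from collections import Counter


def total_variation_distance(real, simulated):
    """TVD between two samples of categorical answers.

    Computed as half the L1 distance between the two empirical distributions.
    """
    if not real or not simulated:
        raise ValueError("both samples must be non-empty")
    real_counts = Counter(real)
    sim_counts = Counter(simulated)
    categories = set(real_counts) | set(sim_counts)
    return 0.5 * sum(
        abs(real_counts[c] / len(real) - sim_counts[c] / len(simulated))
        for c in categories
    )


# Hypothetical answers to one survey question (illustrative data only).
real_answers = ["yes"] * 60 + ["no"] * 30 + ["unsure"] * 10
simulated_answers = ["yes"] * 50 + ["no"] * 35 + ["unsure"] * 15

RELIABILITY_THRESHOLD = 0.15  # the threshold quoted in the text

tvd = total_variation_distance(real_answers, simulated_answers)
print(f"TVD = {tvd:.2f}, below threshold: {tvd < RELIABILITY_THRESHOLD}")
# prints "TVD = 0.10, below threshold: True"
```

In this made-up example the simulated group is off by 10, 5, and 5 percentage points across the three options, which yields a TVD of 0.10 and would pass the stated threshold.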
However, when evaluating multi-agent simulations or downstream implications, the problem becomes more complex. That's because errors can accumulate between agents. Joon Sung Park broadly divides simulations into two categories: converging simulations and diverging simulations.
- Converging simulations: Cases where, even with slight errors, the results converge in a particular direction. For example, when you simulate a network of people, a scale-free network phenomenon always appears in which hubs form. This is also the core observation behind Google's PageRank algorithm. As long as human behavior is replicated with a certain accuracy, this convergence always occurs.
- Diverging simulations: Cases where even a slight change in initial conditions leads to greatly different results. A representative question is something like "Was World War I inevitable?" In the case of election simulations, every decision affects what comes downstream, so you may not get the same result every time.
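The hub-formation example given for converging simulations can be illustrated with a toy network-growth model. This sketch assumes preferential attachment (newcomers are more likely to link to already well-connected nodes), a standard textbook mechanism that the interview does not name and that is not claimed to be Simile's method. The function name, network size, and seeds are all hypothetical.

```python
import random


def grow_network(num_nodes, rng, preferential=True):
    """Grow a network one node at a time; each newcomer links to one existing node.

    Returns a list of node degrees indexed by node id.
    """
    degrees = [1, 1]    # start with two nodes joined by one edge
    endpoints = [0, 1]  # every node appears once per edge it touches
    for new_node in range(2, num_nodes):
        if preferential:
            # A random edge endpoint is a node chosen with probability
            # proportional to its current degree.
            target = rng.choice(endpoints)
        else:
            # Baseline: every existing node is equally likely.
            target = rng.randrange(new_node)
        degrees.append(1)
        degrees[target] += 1
        endpoints.extend([new_node, target])
    return degrees


for seed in range(3):
    pref = grow_network(5000, random.Random(seed), preferential=True)
    unif = grow_network(5000, random.Random(seed), preferential=False)
    print(f"seed {seed}: largest degree {max(pref)} (preferential) "
          f"vs {max(unif)} (uniform)")
```

The individual links differ from seed to seed, but under preferential attachment a few highly connected hubs should appear in every run, which is the sense in which such a simulation converges despite small differences along the way.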
For diverging simulations, Simile evaluates results in terms of "confidence." For example, it runs a simulation 100 times and counts how often the result comes out as X to estimate confidence, similar to a bootstrap. Joon Sung Park emphasizes that the true strength of simulation lies in showing a diversity of possible outcomes when results diverge, helping people understand the causes and mechanisms behind them and prepare for the future.
"Much of the power of simulation lies in showing the diversity of possible outcomes when it diverges, so that people can understand the causes or mechanisms that led to those outcomes and prepare for the future."
He added that simulation is still in its early stages, and that building a mathematically robust framework for determining whether a simulation converges or diverges is an important challenge in current research. Just as a p-value threshold of 0.05 became the standard of scientific evidence in statistics, establishing corresponding criteria and standards for simulation is one of Simile's goals.
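The repeated-run confidence estimate described for diverging simulations can be sketched as follows. The path-dependent process here is a Polya urn, a stand-in chosen only because early random draws shape its final outcome. It is an assumption for illustration, not a model from the interview, and all function names and parameters are hypothetical.

```python
import math
import random


def run_diverging_simulation(rng, steps=51):
    """Toy path-dependent process (a Polya urn) standing in for a real simulation.

    Each draw reinforces the side that was picked, so early randomness
    compounds. Returns "X" if side X ends up with the majority.
    """
    x, y = 1, 1
    for _ in range(steps):
        if rng.random() < x / (x + y):
            x += 1
        else:
            y += 1
    # With 51 steps the total is odd, so there are no ties.
    return "X" if x > y else "not X"


def estimate_confidence(num_runs=100, seed=0):
    """Fraction of independent runs that end in outcome X."""
    rng = random.Random(seed)
    outcomes = [run_diverging_simulation(rng) for _ in range(num_runs)]
    return outcomes.count("X") / num_runs


confidence = estimate_confidence()
# Binomial standard error of the estimated fraction over 100 runs.
std_error = math.sqrt(confidence * (1 - confidence) / 100)
print(f"Outcome X in {confidence:.0%} of 100 runs (std. error {std_error:.2f})")
```

Reporting the fraction together with its standard error makes clear that 100 runs give only a rough estimate; more runs tighten it.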
12. The Future of Simulation: Macroeconomics, Solving Social Problems, and a 'CERN for Human Society' 🌐
Currently Simile's customers are mainly Fortune 500 companies, but Joon Sung Park believes simulation has the potential to solve enormous social problems beyond the corporate world. As an example, he cites the agent-based modeling research of Thomas Schelling, a Nobel laureate in economics, who used a simple model to reveal the causal mechanisms behind macro-level human behavior such as segregation. Simile argues that it can now push this kind of simulation further with agents that replicate the rich characteristics of real individuals.
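For readers unfamiliar with Schelling's work, a simple Schelling-style model can be sketched in a few dozen lines. This is a common classroom variant, not Schelling's original specification and not anything Simile is stated to run. The grid size, tolerance, share of empty cells, and move rule are all assumptions chosen for illustration.

```python
import random


def like_fraction(grid, size, r, c):
    """Share of an agent's occupied neighboring cells that hold its own type."""
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if (dr, dc) == (0, 0):
                continue
            # The grid wraps around at the edges.
            occupant = grid[(r + dr) % size][(c + dc) % size]
            if occupant is not None:
                total += 1
                same += occupant == grid[r][c]
    return same / total if total else 1.0


def mean_similarity(grid, size):
    """Average like-neighbor share across all agents."""
    scores = [like_fraction(grid, size, r, c)
              for r in range(size) for c in range(size)
              if grid[r][c] is not None]
    return sum(scores) / len(scores)


def run_schelling(size=20, empty_share=0.1, tolerance=0.3, sweeps=30, seed=0):
    """Return mean like-neighbor share before and after agents relocate."""
    rng = random.Random(seed)
    num_cells = size * size
    num_empty = int(num_cells * empty_share)
    num_a = (num_cells - num_empty) // 2
    cells = (["A"] * num_a
             + ["B"] * (num_cells - num_empty - num_a)
             + [None] * num_empty)
    rng.shuffle(cells)
    grid = [cells[i * size:(i + 1) * size] for i in range(size)]
    before = mean_similarity(grid, size)
    for _ in range(sweeps):
        moved = False
        positions = [(r, c) for r in range(size) for c in range(size)]
        rng.shuffle(positions)
        for r, c in positions:
            if grid[r][c] is None or like_fraction(grid, size, r, c) >= tolerance:
                continue  # empty cell, or agent is content where it is
            # An unhappy agent moves to a randomly chosen empty cell.
            empties = [(er, ec) for er in range(size) for ec in range(size)
                       if grid[er][ec] is None]
            er, ec = rng.choice(empties)
            grid[er][ec], grid[r][c] = grid[r][c], None
            moved = True
        if not moved:
            break  # everyone is content, so the layout is stable
    return before, mean_similarity(grid, size)


before, after = run_schelling()
print(f"mean like-neighbor share: {before:.2f} before, {after:.2f} after")
```

In this sketch each agent only asks that at least 30% of its neighbors share its type, yet the average like-neighbor share ends up higher than the random starting layout. That gap between mild individual preferences and a segregated aggregate pattern is the kind of macro-level mechanism the paragraph above refers to.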
Joon Sung Park lists non-commercial problems that simulation could solve.
- Macroeconomics: When does a bank run occur?
- Climate change: Can we simulate the collective action problem across multiple countries?
- Democracy: What are the early warning signs of democratic collapse?
- Monetary systems: What is the origin of monetary systems?
"In the context of macroeconomics, a question I actually received from economists was, 'When does a bank run happen?' Or questions like climate change. One of the key obstacles to solving the climate problem is the collective action problem across multiple countries. Can we actually simulate this? Or what are the signs of a democracy on the verge of collapse? Can we understand the origin story of monetary systems? These are the kinds of simulations I think should be the north star of this field."
He envisions simulations that could cost on the order of $100 million and take several months to run, but that, once run, could answer fundamental questions about our society.
"Today it's obviously not the case, but imagine a simulation that costs $100 million and takes months to run once, but once run, solves one of the fundamental questions about our society. I think that's a really exciting possibility for this field."
Joon Sung Park emphasizes that if policymakers were able to simulate the impact of a given policy change and predict even its long-term outcomes, the very way politics is done would also change.
13. Hopes for the Future: A CERN for Human Society 🌍🔬
Joon Sung Park once again emphasizes his belief, inspired by science fiction, that simulation will serve as a guide for human society, expressing his anticipation for the future. He says that while they currently provide users with clear use cases, more innovation will happen going forward, and a simulator like a 'CERN for human society' will be built.
"I'm someone who draws a lot of inspiration from science fiction. When you read science fiction that depicts a society that is technologically mature enough, you can always see two pillars. There's some form of AGI, and some form of simulation that helps guide society."
"Today, I see an opportunity to take the first step toward building that simulation. I wouldn't have said this even five years ago, but it's a conviction I've built up over years of digging deep into this research. And what's exciting is that there's a clear use case we can serve users with today. But I think there's still a lot of innovation left before we can actually build a simulator like a CERN for human society."
Quoting his co-founder Percy, who said that "the greatest scientific innovations often start with an amazing measurement," he believes that, just as the Hubble telescope changed the trajectory of our understanding of the universe, simulation can play that role for human society.
Unlike research concentrated on the natural sciences, Joon Sung Park places great hope in how simulation can expand our understanding of humanity and the social sciences and ultimately make society a better place. The interviewer also agrees that simulation has the potential to "solve" not just economics but every social science field related to human behavior, and the conversation comes to a close.
Conclusion
Through a conversation between Simile CEO Joon Sung Park and Sequoia Capital's Sonya Huang, this video deeply explores how simulation technology can go beyond mere research to help solve the complex problems of real society. Starting with Stanford's 'Smallville' project, it broadly covers Simile's vision of AI that models 'human-likeness,' along with its potential for solving corporate and social challenges. Joon Sung Park's conviction that simulation can become a powerful tool for revealing the hidden mechanisms of human society and predicting the future points to the exciting possibilities we may encounter in the AI era.
