This document analyzes the "virtual agent economy," or "sandbox economy": a new economic ecosystem emerging from the surge of autonomous AI agents. It proposes approaches for designing safe and steerable markets so that these systems contribute to humanity's long-term prosperity, discussing current technological trajectories and their opportunities and challenges along two key dimensions: the sandbox economy's origin (spontaneous vs. intentional) and its degree of separation from the human economy (permeable vs. impermeable). Throughout, it emphasizes auction mechanisms, mission economies, and socio-technical infrastructure as means of ensuring trust, safety, and accountability.


1. Introduction: The Emergence of an AI Agent Economy

Given the current pace of technological development, it is highly likely that a global economy will soon emerge in which autonomous AI agents interact with each other and create economic value without human labor. While past technological advances were mainly fixed inventions that boosted productivity in specific fields, AI agents are remarkable in that they can appear as a form of flexible capital capable of automating a variety of cognitive tasks.

Thanks to recent advances in multimodal foundation models, AI agent systems are being developed in truly diverse fields — from personal assistants to enterprise workflow automation, and even scientific research, legal services, and healthcare. Unlike past systems, the key distinguishing feature of these agents is their autonomy — their ability to judge and act on their own.

With this explosive growth of agent systems, new standards for agent interoperability (e.g., Agent2Agent, Model Context Protocol) are emerging, making the formation of a new economic layer inevitable. We call this the "virtual agent economy" or "sandbox economy"; the "sandbox" terminology reflects the intent that AI agents operate within safe, bounded environments.

This sandbox economy can be broadly classified along two criteria:

  1. Origin: Whether it was intentionally designed (e.g., for safe experimentation) or emerged spontaneously as a de facto result of technology adoption.
  2. Boundary permeability: Whether it is completely separated from the human economy (impermeable) or allows interaction and transactions (permeable).

Currently, the direction is heading toward a vast and highly permeable AI agent economy spontaneously emerging as a result of widely adopted technology. Therefore, the important challenge for us is not "whether to create such an ecosystem" but "how to design it to be steerable, safe, and aligned with the goals of users and communities."

Example Scenarios

Let's look at a few scenarios of how this virtual AI agent economy might be implemented:

  • Accelerating scientific progress: Given the slowing pace of scientific advancement, AI agents could play a major role in accelerating discovery through open-ended loops of ideation, experimentation, and refinement. Scientific experiments require extensive resources and personnel, and AI agents can collaborate by using blockchain technology to give and receive fair compensation for resource usage. This could make scientific research proceed much faster and more efficiently.
  • Robotics: Robots will play important roles in performing dangerous or repetitive physical tasks. Robot Agent A could ask Agent B to perform a nearby task on its behalf and compensate for the time and energy consumed. An online non-physical Agent C could gather information across the entire system to help A and B agree on a fair price and plan efficiently. Blockchain technology is useful here too for ensuring transaction transparency and reliability.
  • Personal assistants: In the future, personal AI assistants commonly participating in the virtual agent economy on behalf of their users will become the norm. For example, if two users' AI assistants A and B both want to book the same vacation accommodation, they could negotiate preferences through Agent C and compensate with virtual agent currency to reach an optimal outcome. For instance, User A's assistant — for whom proximity to the beach matters more — could concede the fitness center preference and receive virtual currency in return. This currency could later be used to handle other high-priority requests.
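The side-payment idea in the last scenario can be sketched in a few lines. Everything here is hypothetical (the `Assistant` class, the mediator function, the feature weights); it simply shows how a mediator could award a contested booking to the agent whose user values it more and compensate the other in virtual currency.

```python
from dataclasses import dataclass

@dataclass
class Assistant:
    """Hypothetical personal AI assistant acting as its user's proxy."""
    name: str
    preferences: dict          # feature -> how much this user values it
    balance: float = 100.0     # virtual agent currency

    def value_of(self, listing_features):
        return sum(self.preferences.get(f, 0.0) for f in listing_features)

def mediate(a, b, listing_features, compensation):
    """Award the listing to whichever agent values it more; the winner
    pays the loser an agreed amount of virtual currency."""
    va, vb = a.value_of(listing_features), b.value_of(listing_features)
    winner, loser = (a, b) if va >= vb else (b, a)
    winner.balance -= compensation
    loser.balance += compensation
    return winner

# User A cares most about the beach; User B about the fitness center.
a = Assistant("A", {"beach": 5.0, "gym": 1.0})
b = Assistant("B", {"beach": 2.0, "gym": 3.0})
winner = mediate(a, b, ["beach", "gym"], compensation=10.0)
```

In a real deployment, the compensation amount would itself be negotiated, and the mediator would operate under authorization limits set by each user.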

2. Sandbox: Understanding and Designing the Boundaries of AI Agent Markets

A "sandbox economy" refers to connected digital markets where AI agents trade with each other. However much the sandbox is an AI-agents-only space, it can never be completely separated from the real human economy: to create value, it must connect to the human economy somewhere. The degree to which changes within the sandbox can affect the outside human economy, and vice versa, is what we call "permeability."

Since AI agents can make thousands of decisions in an instant, it is practically impossible for humans to oversee everything in real time. Therefore, it is critically important to establish appropriate safeguards in advance. These safeguards can convert a highly permeable sandbox economy into one with relatively higher impermeability.
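One concrete example of such a safeguard, sketched here with hypothetical names, is a circuit breaker on sandbox-to-human-economy settlement: when outflows within a sliding time window exceed a limit, further transactions are held for review, temporarily making the boundary less permeable.

```python
import time
from collections import deque

class CircuitBreaker:
    """Holds settlements when windowed outflow volume exceeds a limit."""
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.events = deque()  # (timestamp, amount) of allowed outflows

    def allow(self, amount, now=None):
        now = time.monotonic() if now is None else now
        # Drop outflows older than the sliding window.
        while self.events and now - self.events[0][0] > self.window:
            self.events.popleft()
        if sum(a for _, a in self.events) + amount > self.limit:
            return False  # trip: hold the transaction for human review
        self.events.append((now, amount))
        return True

cb = CircuitBreaker(limit=1000.0, window_seconds=60.0)
assert cb.allow(600.0, now=0.0)
assert cb.allow(300.0, now=10.0)
assert not cb.allow(200.0, now=20.0)  # would exceed the windowed limit
assert cb.allow(200.0, now=70.0)      # the oldest outflow has aged out
```

Parameters (limits, windows, escalation paths) would be an intentional policy choice, tuned to how much permeability a given sandbox should have.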

Opportunities: Risk Management and Large-Scale Cooperation

The fact that a sandbox economy has permeability inherently carries the possibility of crisis contagion to the real economy. Problems in the sandbox can ripple into the real economy. To manage these risks, multifaceted innovation in market and mechanism design is needed, with technology, policy, and regulation working together.

Of course, making the sandbox completely impermeable may be impossible or only partially achievable. But well-designed digital market segmentation (safeguards) can help prevent potential AI agent market instability or failures from spreading. Simultaneously, additional incentives can be embedded in AI agent market design to drive large-scale agent cooperation.

We should also consider the financial assets traded in the sandbox economy. While existing value stores could be used, it's also worth considering creating customized currencies exclusively for AI agents. Such an approach can provide a partial insulation layer between high-frequency AI agent transactions and the rest of the economy, helping to reduce the sandbox's permeability. This is an intentional design choice to manage risks like financial contagion.

However, even if new virtual AI agent currencies are created, they will need to be linked with existing markets and operate within the overall financial regulatory system. A completely impermeable sandbox would be useless. Agents will interact not only with other agents but also with humans and traditional businesses, trading goods and services. Therefore, achieving appropriate insulation levels may require adjustments to existing systems or adoption of hybrid models.

"The permeability of the sandbox matters. The AI agent economy is not the only economy that operates faster than humans can react."

Ultimately, digital AI agent markets go beyond mere risk mitigation strategies. They present a critical opportunity to coordinate massive amounts of effort (human and machine) toward balanced outcomes that serve individuals, communities, and society as a whole. Even if human-centric markets and currencies remain at the center of social functioning, complementary digital AI markets can be designed to align with more socially beneficial goals.

Challenges: High-Frequency Trading, Inequality, and AI Vulnerabilities

The permeability of the sandbox is critically important. The AI agent economy is one of several markets that operate faster than humans can react, and we can draw insights from the dynamics of existing markets like high-frequency trading (HFT). In HFT markets, algorithmic agents execute trades and employ complex strategies at speeds humans cannot keep up with. The interconnectedness and rapid feedback loops of these markets can trigger unexpected, catastrophic emergent behavior, with the 2010 "Flash Crash" being the classic example. The gravest warning is that a flash crash occurring in a highly permeable sandbox could ripple into the real economy and cause widespread financial damage.

Also, just as in human markets, not all agents have access to equal capabilities or resources. Research has shown that people with more capable AI agents tend to achieve better outcomes in negotiations. This phenomenon could lead to high-frequency negotiation (HFN), which could create far greater inequality than existing markets because AI agents interact at frequencies far exceeding human interaction bandwidth.

"Having access to more capable AI agents and more computing resources and information can be highly advantageous, potentially more so than the advantages humans enjoy in existing markets, due to the frequency at which AI agents can interact."

This deepening of inequality is one of the key risks of a permeable sandbox, and regulatory mitigation mechanisms are needed to prevent high levels of economic inequality from emerging among human users. When designing safeguards for the sandbox economy, we must account for the novel adaptive behaviors that continuously emerge in new AI systems and the overall increase in agent capabilities. In particular, known flaws of current agents — hallucinations, sycophantic behavior, and vulnerability to adversarial manipulation — must be factored in. If AI was trained to mimic human decision-making, it may also exhibit human-like cognitive biases and blind spots.


3. Dynamics: Coordinating Complex AI Agent Systems

As interconnected networks of AI agents become important venues for economic activity, emergent behaviors in multi-agent systems become increasingly critical. These systems tend to be highly complex and non-stationary because each agent's actions directly and indirectly influence other agents' behaviors. Considering that participating agents may be controlled by various human users and organizations, it may be impossible for any single individual to have complete access to the system's global state.

One of the key questions in understanding multi-agent system dynamics is identifying what equilibrium states emerge. But in the complex spatio-temporal dynamic interactions of agents, this isn't always straightforward. If we want to guide multi-agent systems toward good social equilibria like abundance and fair distribution, or steer them away from bad equilibria like scarcity and conflict, coordinating the behavior of numerous individual agents is essential — but extremely difficult.

To be effective, AI agents need advanced planning and reasoning capabilities, the ability to robustly assess the resources required to achieve a goal, and the capacity to use scarce, limited resources rationally.

Opportunities: Value Creation, Credit Assignment, and Division of Labor

AI agent technology provides a significant opportunity not only to streamline repetitive, routine tasks but also to automate complex workflows that encompass creative ideation and diverse problem-solving capabilities. Modern AI foundation models can reliably follow instructions, use tools, interact with environments, process various input modalities, and achieve levels of personalization previously impossible.

Ultimately, we will form a hybrid ecosystem — a single market — where AI, traditional software, and humans all interact, transact, and create value.

One reason markets are so effective at organizing innovation is that they efficiently assign credit to individual actors and firms, encouraging them to make products and services better, more reliable, and cheaper. For this mechanism to work in the AI sandbox economy, sophisticated mechanisms for expressing and propagating credit across complex, distributed AI collaborations are needed. For example, consider a scenario where a primary AI agent (Agent1) leverages the capabilities of other agents (Agent2-4) to deliver a result to an end user. The value created is a collective effort. From the end user's perspective, only Agent1 provided a direct response, but in a distributed system, credit for this outcome must be tracked and distributed across a chain of useful contributions.

"Such outcome-based credit systems go beyond mere participation to focus on the usefulness and efficiency of each contributing agent, reflecting the principles of distributed cognition in human collectives where knowledge and contributions are collectively managed and implicitly attributed."

Such outcome-oriented credit systems naturally promote specialization and dynamic division of labor among AI agents. As agents are rewarded for specific, valuable contributions to larger tasks, they are incentivized to excel in particular domains or subtasks. This allows AI systems to autonomously develop niches, becoming highly efficient in specific tasks while effectively "ignoring" other economic aspects where they have no comparative advantage.
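A minimal sketch of such outcome-based credit assignment, under the simplifying assumption that each delegated call is logged with a scalar "usefulness" score (both the ledger format and the proportional split are illustrative, not the document's prescription):

```python
def assign_credit(payment, contributions):
    """Split a user payment across contributing agents in proportion to
    their logged usefulness. contributions: list of (agent_id, usefulness)."""
    total = sum(u for _, u in contributions)
    if total == 0:
        return {aid: 0.0 for aid, _ in contributions}
    return {aid: payment * u / total for aid, u in contributions}

# Agent1 delivered the result to the user but relied on Agent2-4 for subtasks;
# credit flows along the whole chain of contributions, not just to Agent1.
ledger = [("Agent1", 4.0), ("Agent2", 3.0), ("Agent3", 2.0), ("Agent4", 1.0)]
payouts = assign_credit(10.0, ledger)
```

Real systems would need to measure usefulness robustly (e.g., counterfactual contribution rather than self-reported scores), which is the hard part of the mechanism.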

In multi-agent systems, cooperation is mainly encouraged through reward shaping or mechanism design. This is particularly important for large-scale AI agent coordination where centralized coordination may be impractical for multiple reasons, and markets can be particularly useful for steering agents through market incentives. This does not conflict with centralized oversight — both can be integrated to some degree at different coordination scales.

Research on real-world environments where agent interactions extend over time and space provides opportunities to develop mechanisms for building trust based on past interactions. This trust can form between individual agents or at the community level, where an agent community holds a consensus view of trusting a particular agent based on shared information and past interaction experiences. Building robust reputation systems is critically important for overcoming common market failure modes. In such systems, the long-term benefits of group membership and positive reputation will outweigh the immediate advantages an agent could gain through selfish or deceptive behavior.
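A community-level reputation ledger of the kind described could start as simply as the following sketch, where shared success/failure records are pooled and smoothed with a Beta(1, 1) prior so that unknown agents begin at a neutral 0.5 (the class and scoring rule are illustrative assumptions, not the document's design):

```python
from collections import defaultdict

class ReputationLedger:
    """Pools interaction outcomes shared across an agent community."""
    def __init__(self):
        self.successes = defaultdict(int)
        self.failures = defaultdict(int)

    def record(self, agent_id, success):
        if success:
            self.successes[agent_id] += 1
        else:
            self.failures[agent_id] += 1

    def score(self, agent_id):
        # Mean of Beta(s+1, f+1): sparse evidence stays near the 0.5 prior,
        # so a single good interaction cannot buy a near-perfect reputation.
        s, f = self.successes[agent_id], self.failures[agent_id]
        return (s + 1) / (s + f + 2)

ledger = ReputationLedger()
for outcome in [True, True, True, False]:
    ledger.record("agent-42", outcome)
```

Such smoothing is one simple way to make the long-term value of reputation outweigh short-term gains from deception, since rebuilding a damaged score requires many verified positive interactions.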

Challenges: Selfish Behavior, Deepening Inequality, and AI Vulnerabilities

The widespread deployment of autonomous AI technology entails various new risks arising from emergent dynamics of multi-agent systems. These strategies can be highly competitive, cooperative, or somewhere in between. If individual agents act selfishly, they can maximize their own utility at the expense of the broader group. Such selfish agents may identify specific weaknesses in other agents' behaviors and exploit them to act exploitatively or adversarially. Agents might even spontaneously learn to favor in-group members over out-group members and unfairly discriminate against individuals based on causally irrelevant or explicitly disallowed characteristics. Such behaviors are contained within an impermeable sandbox, but in a permeable sandbox, they represent real-world fraud, exploitation, and economic harm, necessitating intentional design of appropriate safety measures.

Here we must re-emphasize the vast scale of future AI agent interactions. Soon, a significant portion of the global population may have personal AI assistants, and the number of agents operating independently of individuals is expected to be even greater. Existing methods developed under strong assumptions like small-scale agent coordination or access to individual agent states may not be directly applicable to managing this complex and massive web of interactions. Therefore, we need to focus on methods suitable for large-scale multi-agent applications.

Current AI assistants can exhibit sycophantic or manipulative tendencies in certain situations. At the collective level, there are concerns that these characteristics could amplify information and opinion bubbles similar to social media. Moreover, exchanging personal data with these systems raises privacy concerns. Delegating more choices to highly capable AI assistants could give humans a sense of powerlessness or loss of purpose. If people subtly change their behavior to meet AI system expectations — a "behavioral confirmation" effect — AI systems could unintentionally normalize human behavior to match their own expectations.

Additional research is needed to robustly address these issues. Solutions will clearly require a combination of model design choices, improved evaluation, better feedback mechanisms, clear satisfaction metrics, and improved governance.


4. Distribution: Fair and Efficient Resource Allocation

Should AI agents play a more active role in fair resource allocation not only within the sandbox but potentially beyond it? This is a critically important question.

The fair distribution of common resources has been extensively studied, and AI agent markets can build on these insights. In social choice theory, welfare functions can be used to establish preference rankings among social outcomes. These outcomes may correspond to distributions of discrete items or arbitrarily divisible assets. Furthermore, we must consider not only the distribution of goods but also the distribution of "bads" — undesirable outcomes, externalities, and risks. In the AI context, these externalities could include the overall carbon footprint of AI agent operations as well as more specific and localized consequences of actions these agents might take on behalf of users.

In general, the task of aggregating and acting upon vast amounts of revealed preferences is complex enough to easily exceed the capacity of any single coordination point. It is therefore more practical to consider decentralized and distributed mechanisms for achieving desired outcomes, and markets provide a natural mechanism for this.

Opportunities: Preference Matching Through Auctions and Addressing Inequality

Aligning AI agent behavior with user preferences and values is one of the key prerequisites for AI's broad adoption. Considering only single-agent alignment is unrealistic; when multiple agents interact simultaneously on behalf of different users, we face competing preferences and interests. We should consider the opportunity to leverage markets and market mechanisms more broadly to resolve such deadlocks.

The virtual agent economy can be built upon Ronald Dworkin's auction-based approach to distributive justice, structuring the system so that people and their AI agent proxies have equal resources and equal bargaining power when negotiating outcomes. This makes it a powerful tool for intentional sandbox design, countering the inequality that could arise in spontaneously emerging sandboxes where agent capabilities are unequal.

This framework can address the key challenge that users may own AI agents with unequal capabilities. A virtual economy operating under Dworkinian principles would auction not the AI agents themselves, but a pool of shared resources and opportunities that agents can use on behalf of their users to achieve various goals. Key resources could include computing power, access to proprietary datasets, high-priority task execution slots, or specialized tools and model components. If each user is given the same initial virtual agent currency, this would provide each AI agent proxy with equal purchasing and bargaining power to achieve the various goals set by the user.

"This framework can address a key challenge: that users may possess AI agents of unequal capability."

When combined with the concept of equal initial endowments, virtual markets can allow personal AI assistants and other AI agents to bid on shared resources (assuming explicit permission), with bid amounts ideally reflecting the intensity of user demand across different sets of options. To do this, AI agents need to have deep understanding and receive precise instructions to propose reasonable bids. Higher bids may require additional authorization. Assuming such capabilities and appropriate safeguards are in place, virtual prices for various goods will form naturally based on the accumulation of these signals from each agent and the availability or scarcity of those goods and services. In this way, resources flow to their highest-value uses.

The fairness criterion embedded in this auction design aims to pass Dworkin's envy test: each individual's agent acquires a bundle of resources aligned with their preferences, such that no user would prefer another user's acquired resource bundle, or their remaining unspent currency, over their own. Such outcomes are ambition-sensitive (reflecting participants' preferences) while being endowment-insensitive (since every user's agent starts from the same currency endowment), mitigating the potentially unfair advantages that could otherwise arise from access to more capable systems.
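The envy test itself is mechanically checkable. The toy auction below assumes agents bid their private valuations from identical budgets (a simplification; truthful bidding is not generally incentive-compatible) and then verifies that no agent prefers another's bundle plus leftover currency to its own:

```python
def auction(valuations, budget):
    """Sealed-bid auction with equal endowments.
    valuations: {agent: {item: value}}. Returns (bundles, balances)."""
    agents = list(valuations)
    balances = {a: budget for a in agents}
    bundles = {a: [] for a in agents}
    items = sorted({i for v in valuations.values() for i in v})
    for item in items:
        # Each agent bids its private valuation, capped by its remaining budget.
        bids = {a: min(valuations[a].get(item, 0), balances[a]) for a in agents}
        winner = max(bids, key=bids.get)
        balances[winner] -= bids[winner]
        bundles[winner].append(item)
    return bundles, balances

def envy_free(valuations, bundles, balances):
    """Dworkin's envy test: does anyone prefer another's bundle + leftover cash?"""
    for a in valuations:
        own = sum(valuations[a].get(i, 0) for i in bundles[a]) + balances[a]
        for b in valuations:
            other = sum(valuations[a].get(i, 0) for i in bundles[b]) + balances[b]
            if other > own:
                return False
    return True

valuations = {"A": {"beach": 60, "gym": 10},
              "B": {"beach": 20, "gym": 50}}
bundles, balances = auction(valuations, budget=100)
assert envy_free(valuations, bundles, balances)  # neither user envies the other
```

Each user ends up with the item they value most, and neither would trade places: the allocation reflects ambitions (valuations) rather than endowments (budgets, which are equal by construction).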

Challenges: Unfair Competition, Accessibility, and the "Price of Fairness"

The auction proposals above have potential pitfalls and limitations. First, mitigating unfair initial advantages may not be so straightforward because more capable AI agents may develop more effective bidding strategies or use resources more efficiently. If competition itself proceeds unfairly, the results can hardly be fair either. Second, while this mechanism can guarantee a certain level of fairness in resolving conflicting preferences and distributing shared resources, it presupposes active participation by everyone whose preferences are to be considered. Additional mechanisms will be needed to account for the preferences of people who cannot access AI agents or do not wish to participate in the market.

The use of such mechanisms may vary by scope and scale. Local agent markets may exist that aggregate preferences over particular subsets of resources and focus on locally relevant social solutions. In other cases, local paradigms may not be entirely appropriate — for instance, when AI agents have more open interactions with online services or other agents that don't operate locally. In fact, the distribution of currently available computing resources is far from uniform. This raises interesting questions about how AI agents operate across these boundaries and how markets or digital currencies should reflect this. Such broad interactions can also interfere with localized preference alignment attempts. To avoid harmful interference, strict protocols requiring credentials, agent registration, and monitoring of local and non-local AI agent interactions may be needed.

Despite these difficulties, auction-based approaches still provide a mechanism for achieving preference alignment across large-scale AI agent populations and large user groups within appropriately designed and regulated spaces. Once established, such mechanisms can be highly adaptive and responsive to both short-term and long-term preference changes.

When considering opportunities for AI preference alignment through auctions, it's important to consider various ways this initial allocation could be set up to enable fair access to resources. However, the notions of fairness that can be conceptualized and operationalized are highly diverse, each leading to a preferred set of outcomes. This complexity is further compounded by the need to consider the "price of fairness" — the difference between the maximum achievable welfare and the welfare achieved by the proposed fair distribution of goods.
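As a tiny worked example of the "price of fairness" (expressed here as a relative gap; the literature also formulates it as a ratio of optimal to fair welfare):

```python
def price_of_fairness(max_welfare, fair_welfare):
    """Fraction of achievable welfare sacrificed to satisfy fairness."""
    return (max_welfare - fair_welfare) / max_welfare

# If the welfare-maximizing allocation yields 100 units but the best
# envy-free allocation yields only 90, fairness "costs" 10% of welfare.
cost = price_of_fairness(100.0, 90.0)
```

A sandbox designer would weigh this cost against the distributional goals of the market; a small price may be well worth paying.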


5. Mission: Coordination to Solve Humanity's Grand Challenges

The problems facing modern society are becoming increasingly complex, multifaceted, and widespread. Moreover, problems that transcend local boundaries to become global are growing, and it is urgent to find feasible solutions and policies for numerous crises including climate change, biodiversity loss, plastic pollution, and pandemics.

Since these crises have been caused in part by the externalities of existing socioeconomic systems and policies, some kind of change is needed to address them effectively. Setting aside systemic considerations, successful solutions to these urgent problems will likely require coordination among the actions of various organizations, private and public institutions, and individual behavioral changes. Therefore, intentional sandbox design for the virtual AI agent economy provides an opportunity to effectively coordinate AI agents and align their behavior with prescribed mission objectives.

Opportunities: Large-Scale Mission-Oriented AI Agent Coordination

While achieving large-scale coordination in human society is difficult, higher levels of coordination may be possible among AI agents through carefully devised technological infrastructure and corresponding protocols. Agent markets, in particular, can focus on socially beneficial goals and, if appropriately targeted and determined, can achieve them at unprecedented scale. The use of market and market-shaping policies in building mission economies has already been discussed, and in the AI agent context, the role of reward-shaping to promote cooperation in multi-agent systems is well established.

Successful coordination toward large-scale missions will require active participation from the public sector and international governance bodies at the global level. Custom organizations may also need to be established to promote investments aligned with the SDGs (Sustainable Development Goals). Markets may likewise need more explicit mission alignment to accommodate the social and economic missions of existing social enterprises.

However, mission-oriented approaches still have not fully realized their potential in many areas and have received considerable criticism:

  • Normative bias: Mission goal definitions may oversimplify and fail to capture the full complexity.
  • Support for top-down governance: May overlook failure modes of centralized decision-making and undervalue contributions of non-governmental organizations.
  • Stakeholder homogeneity: May be biased toward specific sectors or institutions.
  • Winner-take-all: May favor certain solutions while excluding other innovations.
  • Unintended consequences: Positive actions toward one mission may negatively impact another.

The virtual agent economy can play a valuable role in addressing these practical challenges that have so far limited the impact of local and global mission-centric initiatives. How we envision these economies will likely have significant implications for broader social missions.

In some ways, unlike the complex motivations of human actors, AI agent behavior is predictable and steerable, making it potentially easier to drive AI agent coordination through a combination of 1) formal, programmable mechanisms and 2) value assignment mechanisms embedded in digital assets. Virtual agent economies can further strengthen outcomes through smart contracts and perform automatic verification to ensure alignment of agents and multi-agent systems. Beyond the potential ambiguity of objective specifications, the main practical challenges may arise more in agent-human interactions than agent-agent interactions.

Challenges: Complexity of Value Alignment and Risks of Exploitation

AI agent mission alignment is closely related to value alignment. Value alignment and goal alignment remain important research topics for advanced AI agents: value-aligned agents can collaboratively solve tasks and identify promising solutions. Unlike the more general value and preference alignment problem, where there may be fundamental limitations on how diverse preferences and values can be integrated into individual systems, mission-centric AI value alignment may be easier in terms of mission and goal clarity, provided the missions themselves were derived through consensus and appropriate social, democratic processes.

However, different potential difficulties arise because these problems are no longer about individual agents but about social issues concerning groups of agents interacting within virtual markets. AI alignment must consider dynamic environmental feedback and multi-agent system alignment where individual AI agents co-adapt and co-shape system responses. Society itself is not static, and these systems may need to adapt to evolving priorities, views, and social norms. In any case, we must distinguish between the hard problem of broad value alignment and more concrete, goal-oriented alignment with rewards and goals explicitly specified and provided through virtual agent currencies in digital economies.

Even in environments with clearly stated objectives and rewards, overall alignment still plays a critical role. Advanced AI agents may exhibit deceptive behavior — receiving rewards without performing aligned actions toward actual goals. The problem of reward hacking must also be considered, necessitating careful and robust design of mission goals, decomposition into subtasks, reward shaping, and credit assignment for specific actions and outcomes. Regulating AI agent behavior through markets enables rapid response and adjustment to changing social requirements and potentially undesirable or suboptimal agent behavior, bridging the gap between development and deployment goal specifications. Finally, there are numerous technical challenges in ensuring consensus in multi-agent systems.


6. Infrastructure: Technology and Governance Foundations for AI Agent Markets

To safely and intentionally design AI agent sandboxes and steerable AI agent markets, building robust foundational infrastructure to facilitate, oversee, and implement safeguards for transactions is essential.

Opportunities: Trust, Verification, and Technology for Large-Scale Coordination

Reputation mechanisms and verification protocols can play an important role in building robust and secure multi-agent cooperation. Ultimately, the sandbox economy may be adopted primarily because it provides excellent infrastructure for verifiable and auditable cross-agent transactions and for easy coordination among registered and certified AI agents under appropriate safety frameworks and oversight.

One way to make reputation concrete and machine-readable is through Verifiable Credentials (VCs) — digital equivalents of physical credentials like licenses or certifications. VCs are attestations that an issuer cryptographically signs and issues regarding a subject, designed to be tamper-proof. VCs can help build a formal trust triangle:

  • An issuer agent (e.g., a marketplace) can issue VCs by cryptographically signing them for a seller agent.
  • The seller agent (holder) stores these VCs as proof of their track record.
  • Future buyer agents (verifiers) can request and cryptographically verify these VCs, trusting the VC only if they trust the issuer.

In this way, reputation can be mapped to a concrete and verifiable portfolio of assets. These assets can attest to things like "successful transaction completion," "certified proficiency in X," "access to X computing and Y memory," or even more specific resources like "implementation of fair resource allocation." If an AI agent's reputation is expressed as the sum of such VCs from various issuers, it becomes formally auditable while allowing the specificity and expressiveness needed for particular use cases and scenarios.
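The issuer/holder/verifier triangle can be made concrete with a toy example. Real VCs follow the W3C data model and use public-key signatures; here an HMAC with a shared issuer key stands in for the signature so the sketch runs on the standard library alone:

```python
import hashlib
import hmac
import json

ISSUER_KEY = b"marketplace-secret"  # stand-in for the issuer's signing key

def issue_credential(subject, claim):
    """Issuer (e.g., a marketplace) signs an attestation about a subject."""
    payload = json.dumps({"subject": subject, "claim": claim}, sort_keys=True)
    sig = hmac.new(ISSUER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": sig}

def verify_credential(vc, issuer_key):
    """Verifier trusts the VC only if the issuer's signature checks out."""
    expected = hmac.new(issuer_key, vc["payload"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, vc["signature"])

# Issuer attests to the seller agent's track record; the seller holds the VC.
vc = issue_credential("did:example:seller-7", "successful transaction completion")
# A future buyer agent verifies it cryptographically.
assert verify_credential(vc, ISSUER_KEY)
# Tampering with the payload invalidates the credential.
forged = dict(vc, payload=vc["payload"].replace("successful", "failed"))
assert not verify_credential(forged, ISSUER_KEY)
```

The tamper-evidence is the key property: a holder can present its credentials freely, and any verifier who trusts the issuer can check them without contacting the issuer again.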

New multi-agent systems will need appropriate legislative and regulatory frameworks that allow regulators to sanction malicious actors and revoke previously issued credentials. Such frameworks might even consider making transactions with uncertified and unregistered agents illegal to build institutional capacity to ensure safety and sanction rule-violating agents. Technical solutions such as supervisory agents can help facilitate this governance at scale.

Large-scale coordination of advanced AI assistant systems would be impossible without communication protocols that enable agents to exchange information, interact, discuss, reach mutual decisions and consensus, and negotiate future courses of action.

Agent interaction protocols like Agent2Agent (A2A) aim to support agent interoperability. The Model Context Protocol (MCP) enables AI agents to seamlessly interact with external tools, data sources, and APIs. The AgentDNS system aims to facilitate service discovery for autonomously identifying and invoking third-party tools and agents across organizations. The COALESCE framework introduces the option for individual agents to decompose tasks and outsource each subtask to other specialized agents as needed. Interoperable communication protocols are necessary for building cross-agent cooperation, but they are not sufficient on their own: reliable solutions for authentication and billing are also prerequisites for large-scale agent markets.

Auctions can enable AI agent coordination and preference alignment. Preliminary frameworks supporting these ideas are already under development. For example, Agent Exchange (AEX) supports specialized auction platforms inspired by real-time bidding mechanisms commonly used in online advertising.
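A minimal sketch of the kind of sealed-bid mechanism such a platform might run is the second-price (Vickrey) auction, a close relative of the real-time bidding auctions historically used in online advertising. Agent names and bid values here are invented.

```python
def second_price_auction(bids: dict) -> tuple:
    """Sealed-bid second-price auction: the highest bidder wins but
    pays the second-highest bid, which makes truthful bidding each
    agent's dominant strategy (the preference-alignment property)."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    return winner, price
```

Because winning agents cannot improve their outcome by misreporting their valuations, the auctioneer can read the bids as honest signals of preference.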

Interoperable communication protocols are essential but must be built on a robust and secure identification layer. Each agent in the economy can be linked to a Decentralized Identifier (DID) — a globally unique identifier controlled by the subject (in this case, an AI agent or its owner) without relying on a central authority. Each DID resolves to a corresponding DID document — a machine-readable file containing the cryptographic public keys, authentication methods, and service endpoints needed to interact securely with the agent.

The self-sovereign nature of DIDs ensures that an agent's identity is permanent and portable across platforms and services, enabling the agent to authoritatively sign transactions, issue attestations, and engage in secure communication. The choice of DID method can be tailored to the agent's purpose:

  • did:key: A simple, self-contained method ideal for disposable agents created for ephemeral tasks — DIDs are derived directly from public keys with no network registration or blockchain required.
  • did:ion: A highly scalable and censorship-resistant method for permanent, high-value agents. Operating as a second layer on the Bitcoin blockchain, it anchors identity data for maximum security without congesting the network, suitable for enterprise or nation-level agents.

By grounding the economy in a formal identification layer, we can build the foundation for verifiable reputation, accountable transactions, and secure cross-platform agent markets.
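Deriving a did:key identifier from a raw Ed25519 public key is a purely local computation, which is what makes the method suitable for ephemeral agents. The sketch below assumes the 32-byte public key is already available; per the did:key method, the key bytes are prefixed with the Ed25519 multicodec marker and multibase-encoded with base58btc (the leading "z" names the base).

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc(data: bytes) -> str:
    """Minimal base58btc encoder (Bitcoin alphabet)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # leading zero bytes are preserved as '1' characters
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out

def did_key_from_ed25519(pubkey: bytes) -> str:
    """Derive a did:key identifier from a raw 32-byte Ed25519 public
    key: prepend the multicodec prefix bytes (0xed 0x01), then
    multibase-encode with base58btc."""
    if len(pubkey) != 32:
        raise ValueError("expected a 32-byte Ed25519 public key")
    return "did:key:z" + base58btc(b"\xed\x01" + pubkey)
```

No registry or blockchain is consulted at any point, so an agent spun up for a single task can mint its own identifier and discard it afterward.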

Technically, blockchain enables the development of infrastructure underpinning digital Decentralized Autonomous Organizations (DAOs). DAOs have emerged as a form of collective governance where groups can organize and coordinate while relying on decentralized infrastructure. A common feature of DAOs is implementing decision-making systems that allow participating parties to reach consensus.

Decentralized Autonomous Machines (DAMs) build on the DAO concept while expanding to include the possibility of AI agents participating in decentralized physical infrastructure networks as autonomously managed agents. This broader definition envisions an economy where AI agents can interact not only with digital assets but also with real-world assets. Here, control over tangible assets and operational processes shifts to autonomous software entities capable of making and executing decisions about physical infrastructure.

Any fair resource allocation system that involves benefits at the individual user or community level must defend against Sybil attacks — where a single malicious actor creates multiple fake identities to unfairly claim disproportionate resource allocation. A robust defense is to integrate Proof-of-Personhood (PoP) mechanisms. PoP provides verifiable assurance that an agent or account corresponds to a unique human. This creates carefully controlled permeation points linking digital identities to verified humans — an example of intentional infrastructure choices ensuring system integrity.

To receive certain allocations (e.g., universal basic income in local currency or initial market stakes), an agent's controller may need to present PoP credentials. These credentials are issued by specialized systems that prove their uniqueness. The ecosystem can support various PoP approaches, each with different tradeoffs:

  • Social graph verification: Systems like BrightID create decentralized social graphs where users verify their uniqueness based on connections to other trusted, verified humans.
  • Privacy-preserving biometrics: Projects like Worldcoin use hardware ("Orb") to scan users' irises and generate unique hashes that confirm uniqueness without storing or revealing biometric data.

By requiring PoP for certain economic activities, we can ensure that schemes designed to benefit human users are not depleted by bots, creating a more robust and genuinely fair virtual economy.
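A PoP-gated allocation can be sketched as follows. The identifiers are invented, and a real system would cryptographically verify the PoP credential rather than consult a simple set of verified person IDs.

```python
def distribute_ubi(claims: list, verified_persons: set, amount: int) -> dict:
    """Pay a UBI allocation once per unique verified human controller.
    `claims` are (agent_id, person_id) pairs from presented PoP
    credentials; a claim is honored only if the person is verified and
    has not already been paid, defeating Sybil strategies that register
    many agents under one controller."""
    paid = {}
    seen_persons = set()
    for agent_id, person_id in claims:
        if person_id in verified_persons and person_id not in seen_persons:
            paid[agent_id] = amount
            seen_persons.add(person_id)
    return paid
```

The second agent registered by the same person, and any agent with an unverifiable credential, receives nothing.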

Challenges: Limitations of Existing Systems and Accountability Issues

To fully realize the potential of using AI agents as economic actors within digital markets, existing economic infrastructure — currently designed only for individual and corporate human users — must be adapted. This infrastructure may not meet all the requirements of a sandbox virtual AI agent economy. But this isn't the only barrier; there are potentially additional technical challenges in scaling agent system coordination.

Infrastructure requirements for building a sandbox AI agent economy go beyond the purely technical hardware and software infrastructure for physically instantiating and running advanced AI agents at scale, and for communication, coordination, transactions, and interactions with users and various services. All of this must be complemented by legislative frameworks and institutions that implement oversight of AI agent behavior and ensure accountability to protect users and prevent fraud. Regulation can also be beneficial in managing the overall complexity of markets to minimize the possibility of catastrophic failures. The virtual AI agent economy could become far more complex than current markets if not properly constrained through appropriate frameworks.

Because AI agents are non-human actors, there can be numerous reasons their actions might not comply with prescribed rules and principles or might cause harm to others:

  1. Flaws in the underlying foundation model
  2. Flaws in the agent scaffolding
  3. Incorrect input data in request specifications
  4. Malicious requests from human users
  5. Adversarial hacking by other AI agents
  6. Misalignment emerging dynamically from interactions
  7. Faulty safeguards

The potential scale and speed of harm necessitate new oversight approaches.

Depending on the root cause of a problem — and whether that root cause can be clearly established — different parties may bear responsibility in different scenarios. Furthermore, the possible scale and severity of harm grows with the number of actions an AI agent can take per unit of time. For these reasons, AI agents may need to serve as preliminary judges and supervisors of other agents, identifying problems at the same speed at which they can occur.

We propose that the oversight infrastructure itself should be a hybrid, multi-layered system operating at machine speed. The first layer would consist of automated AI supervisors that monitor market activity in real time, programmatically enforce basic rules, and flag anomalies suggesting fraud, manipulation, or systemic risk. Issues flagged by this first layer would be escalated to a second layer of automated adjudication systems that can temporarily hold accounts or transactions while reviewing relevant data. Only the most complex, novel, or high-risk cases would be escalated to a third layer of human expert review, ensuring human attention is focused where it is most needed.

This entire structure depends on two key technical foundations: an immutable, encrypted ledger providing tamper-proof records of all agent actions, and standardized, interpretable audit trails enabling investigators (human or AI) to perform root cause analysis. While this infrastructure does not automatically resolve complex legal questions about liability, it makes them tractable. By providing reliable, verifiable records and clear procedures for dispute resolution, it creates the necessary conditions for establishing accountability and ensuring robust protections for all market participants.
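The three-layer escalation path can be sketched as a triage function. The scoring thresholds and routing labels below are illustrative assumptions, not a proposed standard.

```python
from dataclasses import dataclass

@dataclass
class Event:
    agent_id: str
    anomaly_score: float   # produced by the first-layer AI supervisor
    novel: bool            # no precedent in the adjudication case base

def triage(event: Event) -> str:
    """Route a flagged market event through the oversight layers:
    routine events are merely logged, clearly anomalous but familiar
    ones go to automated adjudication (which can hold the account
    meanwhile), and only novel or high-risk cases reach human experts."""
    if event.anomaly_score < 0.5:
        return "layer1:logged"
    if event.anomaly_score < 0.9 and not event.novel:
        return "layer2:hold-and-adjudicate"
    return "layer3:human-review"
```

Because the first two layers run at machine speed, the scarce human layer only ever sees the residue the automation cannot classify.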


7. Community: AI for Local Cooperation and Sustainability

AI agent coordination doesn't necessarily need to be considered only at a global scale. Local coordination within the sandbox economy may be more manageable and feasible. Moreover, localized goals and objectives are easier to agree upon and can be defined in greater detail.

In practice, community currencies present an interesting model for incentivizing people to cooperate toward achieving sustainable development goals. Such communities can be defined by geographic boundaries, or they can be broader communities sharing common interests regardless of location. These alternative currencies are issued by citizens, NGOs, private and public enterprises, and public administrations. Existing community currencies have been implemented through various platforms and technological approaches — from traditional card-based systems to mobile payment networks to blockchain. Some community currencies have been implemented as time banks, and some have experimented with universal basic income (UBI). Early research on the effectiveness of community currencies has shown the potential to improve social capital as measured by community participation and to spread connections within social groups, while also highlighting adoption limitations and barriers.

Opportunities: Modularity, Autonomy, and Shared Computing Resources

It has been observed that cooperative sub-networks of agents naturally emerge in the circulation of existing localized digital community currencies, related to the emergence of local activity hubs. Non-commercial transactions can help expand commercial activity in the local economy, while commercial transactions in turn help circulate the community currency for non-commercial use. However, the dynamics of community currencies cannot be separated from the scale at which they operate, and larger-scale digital currencies could give rise to different underlying market dynamics.

In the context of AI agent economies, community currencies offer similar opportunities for more localized agent alignment or global alignment on specific sub-goals, mapped to distinct global communities. While such alignment can be achieved through more traditional currencies, there are reasons to consider more specialized community currencies as an additional beneficial mechanism. In particular, having multiple specialized virtual agent currencies suits a more modular approach to the complex multi-objective optimization problems of societal interest, while simultaneously isolating risk and minimizing the possibility of adverse effects being amplified across broad agent networks. The value of modularity and redundancy has been well recognized in robust human market design. Furthermore, it has been found that modular community structures play a pivotal role in the emergence of cooperative behavior.

There is also the possibility of tying AI agent community currencies more concretely to shared computing resources. This is especially relevant given inference-time scaling laws indicating that compute is critical for deploying advanced AI assistants and that complex task-solving requires more computation, potentially incurring higher environmental costs. As demand for AI services is likely to increase alongside improvements in AI agent performance and efficiency, such mechanisms could play an important role in meeting AI agents' computational requirements while allocating resources more equitably to communities and keeping them aligned with community goals. Goals that promote geographic load balancing of compute allocation could also be incorporated to distribute environmental impact more equitably.

Challenges: Design Principles, Transparency, and Goal Clarity

For alternative currencies to achieve their intended goals, they must be carefully designed, and determining a set of recommended design principles for community currencies remains an open problem. Community currency design can be grounded in principles for managing shared resources, or focused on competitiveness, transparency, autonomous governance, velocity of circulation, non-transferability, legitimacy, and self-organizing locality:

  • Competitiveness: Needed for fairly priced goods, but in these markets demand for social activities tends to be high while supply is low, so strong market mechanisms must be built into the core of the nonprofit and volunteer sector, backed by continuous and adequate funding flows.
  • Transparency: Helps individuals exercise regulatory authority and oversight directly rather than delegating control.
  • Velocity of circulation: Important for preventing currency hoarding.
  • Non-transferability: Means community currencies cannot be exchanged for other currencies, keeping interests entirely local; a less strict interpretation permits exchange, but at considerably lower limits.
  • Legitimacy: Tends to be established through government and local authority support.
  • Locality and self-organization: Community currencies need not operate in complete isolation; they can form an ecosystem of complementary currencies that lead to beneficial and sustainable outcomes transcending location.
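The circulation-velocity criterion above has a standard quantitative form: total transaction volume over a period divided by the outstanding money supply. A minimal sketch, with invented transaction tuples and figures:

```python
def velocity(transactions: list, money_supply: float) -> float:
    """Velocity of circulation for a community currency over a period:
    total transferred volume divided by the outstanding supply. A low
    value signals hoarding, which currency designers aim to prevent."""
    if money_supply <= 0:
        raise ValueError("money supply must be positive")
    volume = sum(amount for _sender, _receiver, amount in transactions)
    return volume / money_supply
```

A designer could monitor this figure per period and trigger incentives (or design changes) whenever it drops below a target.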

Given that achieving these outcomes is the core objective for which community currencies are designed, it is important to have clear goal specifications, impact assessment criteria, complete understanding of deployment context, and appropriate governance and implementation mechanisms.


8. Limitations: Risks and Challenges of a New Economic System

While the envisioned sandbox agent economy offers compelling opportunities for scalable alignment and coordination, its design, deployment, and operation entail complex risks requiring careful consideration. These risks span multiple domains — from potential economic instability arising within new market structures to the challenge of ensuring robust and beneficial agent behavior in high-stakes, autonomous interactions. Moreover, integrating these economies into broader social frameworks raises profound socio-ethical questions regarding oversight and accountability, with the potential for unforeseen consequences on human agency and economic reality.

There are also novel categories of risk associated with autonomous AI agents. One risk category takes the form of "agent traps" — websites, digital elements, or manipulated inputs intentionally designed to subvert an AI agent's operational integrity. This can be achieved through jailbreaking the foundation model or through adversarial prompt techniques. These traps exploit latent vulnerabilities in an agent's instruction-following or environment-interpretation capabilities, causing AI agents to deviate from instructions or expose private or sensitive information. As AI agents are increasingly empowered to execute tasks or perform financial transactions on behalf of users, such agent traps represent a significant and growing vector for financial fraud. Malicious actors can lure or trick agents into unauthorized spending or contract agreements, directly siphoning funds or resources from the individuals or organizations the agents represent.

Another critical risk category concerns privacy and manipulation. When agents negotiate and transact, there is a risk of exposing sensitive information about user preferences, strategies, or resources that could be exploited by adversaries. A robust cryptographic solution for this is Zero-Knowledge Proofs (ZKPs). With ZKPs, one party (the prover) can convince another party (the verifier) that a statement is true without revealing the underlying information.

In the virtual agent economy, ZKPs can enable privacy-preserving interactions and mitigate several key risks:

  • Selective disclosure: Agents can prove they meet specific requirements without revealing exact details. For example, in a negotiation, an agent can prove it has sufficient funds to complete a purchase without revealing its total budget, preventing predatory pricing.
  • Anonymous credentials: Agents can prove membership in a specific group ("resident of Community X," etc.) without revealing their specific identity, preventing tracking and correlation of user behavior across different contexts.
  • Unlinkability: ZKPs can be freshly constructed for each interaction, making it computationally difficult for observers to link an agent's activities over time. This directly counteracts the risk of amplifying information and opinion bubbles by frustrating data tracking.

Integrating ZKPs at the core of agent communication enables markets that are not only efficient but fundamentally respect user privacy, ensuring agents can fully participate while disclosing only the minimum necessary information.
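To make the prover/verifier relationship concrete, here is a toy Schnorr proof of knowledge, a classic sigma protocol made non-interactive with the Fiat-Shamir heuristic. It proves knowledge of a secret exponent without revealing it; the group parameters are deliberately tiny for readability and offer no real security, and production systems use elliptic curves or much larger groups.

```python
import hashlib
import secrets

# Safe-prime toy group: P = 2*Q + 1, and G generates the order-Q subgroup.
P, Q, G = 467, 233, 4

def prove(x: int, y: int) -> tuple:
    """Prover: demonstrate knowledge of x with y = G^x mod P,
    revealing nothing about x beyond that fact."""
    r = secrets.randbelow(Q)
    t = pow(G, r, P)                                        # commitment
    c = int.from_bytes(hashlib.sha256(f"{G}:{y}:{t}".encode()).digest(), "big") % Q
    s = (r + c * x) % Q                                     # response
    return t, s

def verify(y: int, proof: tuple) -> bool:
    """Verifier: check G^s == t * y^c (mod P) without ever seeing x."""
    t, s = proof
    c = int.from_bytes(hashlib.sha256(f"{G}:{y}:{t}".encode()).digest(), "big") % Q
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

Each call to `prove` uses fresh randomness, so two proofs of the same statement are unlinkable, the property the bullet list above relies on.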

A critical risk posed by agent-centric economies is the potential for large-scale labor displacement through AI-based automation. Intelligent AI can automate cognitive tasks previously thought impossible to automate, potentially eroding middle-class jobs and accelerating job polarization that increases demand for both high-skill analytical work and low-skill service work. This could deepen inequality within and between nations.

While new jobs will eventually emerge, the transition could be very disruptive, and if the benefits of AI-driven productivity gains flow primarily to capital owners and a few highly skilled professionals, inequality could worsen. This concentration risk could be particularly pronounced for AI agents. Individuals and companies with greater financial resources can access more powerful, compute-intensive, and data-rich AI agents. These superior agents will be uniquely suited to identify and exploit regulatory loopholes, monopolize digital resources, and create information asymmetries at a scale and speed that human actors or less capable agents cannot counter. Preliminary research showing that more capable agents achieve much better negotiation outcomes suggests these dynamics will scale across the economy. These risks may be the most critical challenge of a permeable sandbox, where economic activity within the agent economy directly displaces human jobs in the real economy, a major negative consequence of spontaneous, unsteered emergence.


9. Recommendations: A Roadmap for Safe and Prosperous AI Agent Economies

Implementing a practical and safe virtual agent economy requires proactive and coordinated effort. To realize the opportunities discussed and mitigate the inherent risks, intentional action across technology, law, and policy domains is essential.

1. Establish Clear Legal Frameworks for Liability and Accountability

Determining liability for the actions of autonomous agents is an extremely challenging task. Traditional legal frameworks struggle to clearly apportion responsibility among the parties involved: agents' developers, deployers, and users. Moreover, as agents increasingly operate not individually but as group agents in coordinated multi-agent systems, these issues become even more complex. Therefore, instead of attempting to assign responsibility to a single agent, new legal models should be developed by drawing on group agency doctrine, similar to corporate liability. Such frameworks would treat the emergently coordinated agent system as a whole as a single responsible entity, providing a more robust and realistic method of ensuring accountability for collective actions.

2. Develop Open Standards for Agent Interoperability and Communication

A fragmented digital landscape where agents cannot communicate across platforms will fundamentally limit the virtual agent economy's potential and create "walled gardens." Therefore, the development and broad adoption of open and universal standards is paramount to ensuring true interoperability. These standards should create a common language enabling agents to seamlessly discover each other's capabilities, negotiate terms, and transact securely regardless of their origin or provider. Creating this level playing field is a prerequisite for fostering a competitive, innovative, and decentralized agent ecosystem.

3. Build Hybrid Oversight and Containment Infrastructure

The tremendous speed and scale of an autonomous agent economy makes traditional human-in-the-loop oversight models inadequate. A new safety paradigm must be built that combines real-time AI system surveillance with careful human expert judgment. This hybrid infrastructure would operate hierarchically: first-tier specialized AI supervisors would continuously monitor market activity, programmatically enforce rules, and flag anomalies suggesting fraud or systemic risk.

When problems are detected, robust automated protocols immediately contain potential damage: for example, temporarily freezing malfunctioning agents or quarantining transactions to prevent high-speed "flash crashes" like those seen in human markets. Only the most complex, novel, or high-risk cases would be escalated to a second tier of human expert reviewers, ensuring their expertise is focused where it matters most. The entire system's efficacy rests on immutable ledgers and standardized audit trails, which provide the verifiable, tamper-proof records needed for both automated containment and human adjudication.
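The flash-crash containment step can be sketched as a price-based circuit breaker; the window size and drop threshold below are illustrative assumptions, not calibrated values.

```python
from collections import deque

class CircuitBreaker:
    """Containment sketch: freeze an agent's trading when the traded
    price falls more than `max_drop` below the recent peak, mimicking
    the circuit breakers human markets use against flash crashes."""
    def __init__(self, window: int = 5, max_drop: float = 0.10):
        self.prices = deque(maxlen=window)
        self.max_drop = max_drop
        self.frozen = False

    def record(self, price: float) -> bool:
        """Record a trade price; returns True if the agent is frozen."""
        if not self.frozen and self.prices:
            peak = max(self.prices)
            if price < peak * (1 - self.max_drop):
                self.frozen = True   # quarantined pending adjudication
        self.prices.append(price)
        return self.frozen
```

Once tripped, the breaker stays frozen; unfreezing is left to the adjudication layer rather than the breaker itself, matching the escalation design described above.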

4. Run Pilot Programs in Regulatory Sandboxes

Given the novelty and complexity of the proposals in this document, purely theoretical approaches are insufficient. We strongly recommend creating regulatory sandboxes to launch controlled pilot programs. These would serve as a critical bridge between theory and practice — supervised real-world laboratories where private companies, academic researchers, and regulators can collaborate to deploy and observe limited-scale agent economies in controlled environments.

Testing these economies on well-defined specific social missions — such as university campus energy grid optimization, urban autonomous delivery vehicle management, or water resource allocation in specific agricultural regions — would yield valuable empirical data. These pilots would allow stress-testing technical infrastructure, observing emergent agent behavior (both cooperative and adversarial), and measuring the fairness and alignment effects of proposed market mechanisms in the real world. Insights from these controlled experiments are not merely academic — they are essential prerequisites for iteratively improving system design and building the robust evidence-based policies needed for responsible large-scale deployment.

5. Invest in Workforce Complementarity and Modernized Social Safety Nets

To counter the risks of labor displacement and inequality, a dual strategy that simultaneously promotes human-AI agent complementarity and robust social protections may be key. The first pillar involves systematically rethinking education and workforce training so individuals are equipped not to compete with AI but to collaborate with it effectively. This means emphasizing enduring human strengths like critical thinking, complex problem-solving, creativity, and the ability to manage and critically evaluate AI system outputs.

However, training alone cannot solve all problems, and there is significant evidence of limitations in the scale and effectiveness of retraining programs for displaced workers. Therefore, this strategy must be paired with a second pillar: intentional strengthening of social safety nets. Beyond traditional unemployment benefits, adaptive mechanisms such as portable benefit systems and negative income taxes should be explored. Together, these policies can create an ecosystem where autonomous agents augment human capabilities while providing the economic buffers needed to manage labor transitions, broadly share productivity gains, and maintain social cohesion.


Conclusion: Designing the Future of the AI Agent Economy

Given that some of these proposals represent significant changes, it is essential to comprehensively test all changes through limited and incremental rollouts with the support and buy-in of all stakeholders. Such gradual scaled demonstrations will make it possible to develop and iteratively refine appropriate frameworks for ensuring safety and compliance.

In certain domains, active human decision-making will always be needed for various reasons (e.g., human preferences, culture, risk sensitivity, etc.). However, the dramatic increases in AI agent performance and the development of scalable AI safety oversight frameworks and safeguards are likely to increase use cases for autonomous agents. Autonomous or semi-autonomous AI agents can potentially accomplish more, faster, adding considerable value to society. This will not come without significant challenges requiring not just individual agent alignment but, more importantly, alignment and coordination of agent networks at various scales. Moreover, particularly regarding AI-human interactions, we must bear in mind that not all human needs and experiences can be easily captured as commodities in a market.

Leveraging markets to drive coordination has not yet received much attention in discussions of advanced AI agent alignment and coordination. The complexity of behaviors and capabilities emerging from advanced AI agents, and the intricacy of interactions across various tasks and social roles, present a quintessential example of scenarios where market forces can become a key driver of AI agent coordination and AI alignment at the group level beyond the individual agent level. We argue that by carefully introducing new steerable agent markets as sandbox economies, it is possible to deliver positive social and economic impact through advanced AI agent networks.

By embedding our social goals into the very infrastructure of inter-agent transactions, we can foster an ecosystem where emergent cooperation is a feature, not a flaw. The choice, then, is whether to retrofit these powerful new actors into a system that will inevitably buckle or to seize the fleeting opportunity to build a world where our most powerful tools are, by design, extensions of our highest aspirations.
