Calvinball: Humanity's Last Intellectual Bastion

May 06, 2025

It's now a foregone conclusion that AI can master to superhuman ability any game with fixed rules. From chess to Go, from poker to Starcraft, the pattern is clear: once the boundaries are defined, machines eventually surpass us. But what about games where the rules themselves are part of the play? This question led me to a surprising exploration of what may be the last bastion of human cognitive superiority, and of its implications for our future.

The Final Frontier of Intelligence

In the spring of 2025, I found myself contemplating a question that has haunted AI researchers and philosophers alike: Is there any game left where top humans can consistently defeat state-of-the-art AI?

The historical progression has been relentless. IBM's Deep Blue shocked the world by defeating chess grandmaster Garry Kasparov in 1997. In 2016, Google DeepMind's AlphaGo overcame Lee Sedol at Go—a game of such complexity that many believed it would remain a human domain for decades. By 2019, Pluribus had conquered six-player no-limit Texas Hold'em poker, demonstrating mastery of games with hidden information and bluffing. The pattern continued with Dota 2, StarCraft II, and virtually every other structured competitive environment.

What these games share is a fundamental characteristic: fixed rules that define the boundaries of play. Once these boundaries are established, reinforcement learning AI can explore the decision space with a thoroughness no human can match. In retrospect, our defeat was inevitable.

But this observation contains within it the seed of a different approach. What if the game included the ability to change the rules themselves?

Enter Calvinball: Gaming Beyond Fixed Rules

Readers of Bill Watterson's iconic "Calvin and Hobbes" comic strip might recall Calvinball—a game where the only permanent rule is that it can never be played the same way twice. Calvin and his stuffed tiger friend Hobbes constantly invented new rules on the fly, creating a chaotic but surprisingly profound exploration of game theory.

I wondered: Could Calvinball-esque games represent a domain where human creativity maintains an edge over artificial intelligence? Not because we play the game better, but because we can reimagine the game itself.

My experiment began with a Calvinball-style tic-tac-toe. Anyone who has played tic-tac-toe knows that the game is solved and cannot be won against optimal play. In the Calvinball variation, either player, on their turn, can either make a standard move or introduce a new rule that transforms the game's dynamics. A few immutable meta-rules preserve fairness, such as forbidding rules that invert the win conditions and allowing a veto of highly unfair rules. Under these conditions, I was able to win 3 times in a row against ChatGPT o3, one of the best reasoning AI models available at the time. For those curious about the exact meta-rules and how the games played out, see the Appendix at the end of this article.

During my multi-hour gameplay with Calvinball, in which the vast majority of my time was spent thinking through downstream consequences of potential new rules, I noticed a very clear pattern: the o3 AI could play brilliantly within the established rules but struggled with the meta-game of rule creation. It tended to make tactical rule changes addressing immediate problems rather than strategic innovations that reshaped the game's fundamental dynamics.

In several matches, I introduced a rule that transformed the endgame conditions in ways that created strategic advantages several moves ahead—advantages that weren't immediately obvious but fundamentally altered the game's trajectory. My AI opponent, despite its computational power and reasoning capabilities, couldn't fully appreciate the impacts of each rule, or match this level of rule innovation.

Rule-Making as a Fundamental Cognitive Frontier

This seemingly trivial experiment with games points to something much deeper: rule-making itself represents a cognitive domain where human creativity still shines. This insight becomes profound when we recognize how central rule-making is to human civilization and progress. Throughout history, humanity's greatest leaps forward have often involved not just working within existing systems, but reimagining the rules themselves. Consider these transformative examples:

The American Revolution: Rewriting Governance Rules

In 1776, the American colonies didn't just rebel against British taxation; they fundamentally reimagined the rules of governance. The founding fathers of America weren't merely players within the existing games of economics and politics. They wrote new rules based on radical ideas about representation, individual rights, and the separation of powers:

Constitutional supremacy: Unlike the British system of parliamentary supremacy, where Parliament could change any law, the American Constitution established itself as the supreme law that even legislators couldn't easily modify. This created a meta-rule system where the rules for changing rules were explicitly codified.
Enumerated powers: Rather than assuming government had unlimited authority except where restricted, the Constitution enumerated specific powers granted to government, with all others reserved to the states or the people—a complete inversion of the presumption of authority.
Built-in amendment process: Perhaps most revolutionary was Article V, which created a formal mechanism for future generations to modify the constitutional framework itself. The authors, very quickly after the ratification of the Constitution, introduced the first 10 amendments, collectively known as the "Bill of Rights." The founders of America recognized that no rule system could be perfect forever and built in a process for evolutionary change—essentially embedding Calvinball principles into the structure itself—and then showed by example the structure of Amendments as well as the approval process.
Separation of powers with checks and balances: While Montesquieu had theorized separation of powers, the American system implemented it with unprecedented thoroughness, creating not just division but active checks between branches. This wasn't merely distributing authority but creating a dynamic tension designed to prevent concentration of power.
Federalism as a dual-sovereignty system: The federal structure created overlapping jurisdictions with divided sovereignty—a radical departure from unitary national systems or loose confederations. This multi-level game board allowed for local experimentation while maintaining national coherence.

Thomas Jefferson, James Madison, Alexander Hamilton, and other founders were essentially engaging in constitutional Calvinball. They recognized that the fundamental challenge wasn't playing better within the existing rules of monarchical or parliamentary governance but creating an entirely new ruleset better aligned with their values of liberty, representation, and limited government.

The Birth of Modern Capitalism: Rewriting Economic Rules

Before the 17th century, economic activity was largely governed by guild systems, royal monopolies, and mercantilist policies under which wealth was viewed as a fixed resource to be accumulated by nations. The idea that wealth could be created led to the emergence of modern capitalism and represents one of history's most profound examples of rule system innovation.

The Dutch East India Company (VOC), established in 1602, introduced revolutionary new rules to the economic game that we now take for granted:

The joint-stock corporation: The VOC pioneered the modern corporate form by creating permanent capital—shareholders couldn't withdraw their investment but could sell their shares to others. This seemingly technical change unleashed unprecedented capital formation capabilities.
Limited liability: By limiting investor risk to their initial investment, the VOC created a rule that fundamentally changed risk assessment and encouraged participation from a wider range of investors. This protection from unlimited downside transformed investment psychology.
Transferable shares on public exchanges: The Amsterdam Stock Exchange emerged to trade VOC shares, creating the world's first modern securities market and establishing rules for price discovery, trading, and liquidity that form the foundation of today's financial systems.
Global operational scale: The VOC established operational rules that enabled coordination across continents, with over 50,000 employees worldwide and a complex management structure—creating an organizational form that could operate at previously impossible scales.

Adam Smith later codified theoretical justifications for why these new economic rules worked in "The Wealth of Nations" (1776). His insight about the "invisible hand" wasn't just an observation about markets—it was a fundamental reconceptualization of economic organization, suggesting that the optimal rules weren't top-down control but systems that aligned self-interest with social benefit.

The modern corporation and capitalist economic organization weren't incremental improvements to guild systems or royal chartered monopolies—they represented rule system innovation that transformed human economic activity and unlocked unprecedented wealth creation.

Rule-Making in Governance and Politics

Politics and governance represent perhaps the purest form of institutionalized rule-making. Legislators, regulators, judges, and executives aren't just operating within fixed systems—their explicit job is to create, modify, and interpret the rules that govern society. However, most political discourse focuses on incremental adjustments within existing frameworks rather than fundamental system redesign. This may explain why many of our institutions struggle to address complex contemporary challenges—we're playing within outdated rulesets rather than reimagining the games themselves.

Still, a higher concentration of Calvinball innovations is observed in politics than in other fields, given the nature of the job and the individuals who choose to enter it. Beyond the American Revolution, other major rule changes, though less sweeping, include:

The Meiji Restoration: Japan's Radical Rules Rewrite

The Meiji Restoration of 1868 represents one of history's most dramatic examples of rule-system innovation. In less than a generation, Japan transformed itself from an isolated feudal society into a modern industrial power through the deliberate reimagining of its entire governance structure.

Following over 200 years of self-imposed isolation, Japan faced existential threats from Western imperial powers in the mid-19th century. Rather than attempting to resist within the existing feudal framework, young political leaders recognized the need for fundamental system change. They overthrew the Tokugawa shogunate and restored the emperor to nominal power, using this traditional symbol as the foundation for radical innovation.

What makes the Meiji Restoration remarkable as a rule-making revolution was its comprehensive nature:

Centralization of authority: The reformers dismantled the feudal domain system and replaced it with prefectures governed by centrally-appointed officials, completely redesigning Japan's governance framework.
Abolition of samurai privileges: The samurai class, which had been Japan's ruling military elite for centuries, was eliminated entirely, and hereditary privilege was replaced with a merit-based system and a conscript army.
Educational transformation: Japan established a national education system modeled on Western examples but adapted to Japanese values, creating near-universal literacy and technical competency in a single generation.
Constitutional governance: By 1889, Japan had adopted a constitution and parliamentary system, moving from absolute rule to a constitutional monarchy with elected representatives.
Global knowledge transfer: The Meiji government sent thousands of students abroad and hired over 3,000 foreign experts to systematically adopt and adapt Western technologies and governance practices.

Rather than simply improving the feudal system incrementally, the Meiji leaders engaged in governance Calvinball—reimagining their society's fundamental operating rules. The result was extraordinary: within 40 years, Japan had industrialized sufficiently to defeat a major European power (Russia) in war and emerge as a recognized world power.

Women's Suffrage: Rewriting Democratic Participation Rules

The global women's suffrage movement represents a profound example of rule innovation in governance that fundamentally redefined the concept of democratic participation. For centuries, the "rules" of democracy explicitly excluded women from political decision-making. Changing this rule required reimagining the very concept of citizenship.

New Zealand became the first self-governing nation to grant women the right to vote in national elections in 1893, following years of organized activism. This breakthrough wasn't merely an adjustment to existing rules; it represented a complete reconceptualization of what it means to be a citizen. The global expansion of women's suffrage, from New Zealand (1893) to Finland (1906), the United Kingdom (1918/1928), the United States (1920), and eventually most nations worldwide, represents one of humanity's most significant exercises in rule innovation. By transforming who could participate in democracy, it fundamentally changed how democratic systems function and evolve. Like other Calvinball rule rewrites that have persisted over time, women's suffrage became universally accepted because it was better than the system it displaced, both in its moral appeal and in enabling a more productive society.

Estonia's Digital Governance: Rewriting the Rules of Citizenship

In the aftermath of Soviet occupation, Estonia faced the challenge of rebuilding its governance systems from near-zero in the early 1990s. Rather than simply adopting traditional bureaucratic structures, Estonian leaders undertook a remarkable rule-making innovation: reimagining the notion of government services and citizen-state interactions through digital transformation:

Digital identity as foundational: In 2002, Estonia introduced a mandatory digital ID system that became the cornerstone of its governance approach, creating a secure digital identity for every citizen.
X-Road data architecture: Rather than building centralized government databases, Estonia created a decentralized data exchange layer called X-Road in 2001, allowing secure information sharing between different systems while maintaining data sovereignty.
E-Residency: In 2014, Estonia became the first country to offer digital residency to non-citizens, extending its digital governance framework beyond traditional borders and redefining the relationship between geography and governance.
Blockchain-secured integrity: Estonia implemented blockchain technology for government registries, creating immutable audit trails and preventing tampering with official records.

The results of this rule innovation have been transformative. Today, 99% of Estonia's government services are available online 24/7, with taxes completed in an average of 3 minutes. What's most remarkable is that Estonia didn't achieve this through vast resources (its population is just 1.3 million), but through changing the rules of governance.

The Learning Paradox for Artificial Intelligence

While in principle AI with sufficient training data of the right type could specifically learn Calvinball tic-tac-toe or politics, current AI systems face a fundamental paradox when it comes to rule innovation: they require extensive training examples to develop competence, yet truly novel rule systems by definition lack precedent.

This creates an interesting dynamic in the Calvinball experiment. The reason I could beat advanced language models at Calvinball tic-tac-toe is largely that the approach is so unusual that there are no training examples of it in the AI's corpus. However, each game played against AI becomes potential training data for future models, gradually eroding the human advantage. I expect that if Calvinball tic-tac-toe (and Calvinball chess, Calvinball Go, Calvinball Settlers of Catan, etc.) gains popularity, AI will soon beat humans at Calvinball versions of these classic games, and maybe even at Calvinball versions of new games the AI hasn't seen before.

However, AI's learned expertise in creating and evaluating Calvinball rules may not carry over to new equivalents of Calvinball in other domains. From a personal perspective, during college and grad school I greatly enjoyed all sorts of games (and probably spent far more time on them than I should have), from board games (Monopoly, Settlers of Catan) to human interaction games (e.g. Mafia, Resistance) to computer games (StarCraft, Civilization). But I essentially stopped playing after I became a professor. Not because I stopped enjoying games or felt that I should grow up, but because I found out that entrepreneurship is a much more fun and complex game.

The Business of Rule-Making: Entrepreneurship as Calvinball

This perspective casts entrepreneurship in a fascinating new light: starting a company is essentially playing Calvinball in the market. The most innovative founders and business leaders aren't just competing effectively within established industry rules—they're changing the rules themselves.

Consider these examples of entrepreneurial rule-making:

Bitcoin: When Satoshi Nakamoto conceived Bitcoin, there was no extensive training set of previous de novo currencies to learn from, nor a clear roadmap for how a completely new currency could gain mass recognition and serve as a store of value. But as of this writing in mid-2025, the United States government is seriously considering creating a strategic national reserve of Bitcoin, much like the gold held at Fort Knox.
Apple's App Store: Steve Jobs didn't just create a better phone; he established an entirely new economic system with rules governing how developers could create, distribute, and monetize applications. The App Store single-handedly created tens of thousands of millionaires, and in this process launched Apple on a trajectory to be one of the top 3 most valuable companies in the world.
Saudi Aramco: Saudi Arabia could have squandered its natural resource largesse like other countries, but an early and intentional decision to invest money from its oil sales into a wide range of high-tech companies has allowed the Saudi royalty to build wealth exceeding that of Apple, Inc., and to build influence and legacy in a manner that arguably exceeds that of the British royalty today.

In each case, these leaders recognized that the greatest opportunities often lie not in playing the existing game better than competitors, but in changing the game itself. This is Calvinball capitalism—and it helps explain why truly transformative innovation may remain predominantly human despite AI's growing capabilities.

The Rarity of Rule Innovation Capability

What makes these examples particularly interesting is that they highlight an uncomfortable truth: the ability to innovate at the level of rules is exceptionally rare, even among humans. Thomas Jefferson, Adam Smith, and the women's suffrage pioneers represent a tiny fraction of the population.

This isn't because most humans are unintelligent—far from it. Rather, this specific cognitive capability requires an unusual combination of:

Abstract systems thinking across multiple levels
Strategic foresight to anticipate cascading consequences
Creative lateral thinking outside established paradigms
Deep understanding of existing rule systems
Cognitive flexibility to maintain consistency while introducing novelty

Most people excel at working within existing systems. They can become chess grandmasters, legal scholars, or expert programmers. But they approach these domains as players, not as rule innovators.

If rule innovation represents such a valuable cognitive frontier, how might we nurture it more effectively in humans, so that people who are ambitious have something meaningful to contribute in a world of AI-enabled abundance? I don't know for sure, but if education is any guide, observation and practice can probably help.

Many of my friends know that I'm relatively open about providing advice and feedback to startups. I usually take no consulting fee or equity in these companies, both because I don't want to burden the finances or cap table of these startups, and because I don't want to become obligated to provide time when I'm overwhelmed with Biostate AI work. This creates the seemingly paradoxical observation of me being aggressive about cutting unproductive meetings and interviews short, while giving many hours to random people for seemingly no benefit.

Upon reflection at the end of writing this blog post, I realize that I do this because advising startups exposes me to all sorts of field-specific rulesets and crazy founder ideas for circumventing them. In essence, I enjoy advising startups because they allow me to watch (and sometimes participate in) Calvinball games. These scrimmage games are rarely if ever recorded as text or video data for training AI, which may be the reason why I'm better than current AI, at least, at Calvinball. Early-stage founders: please feel free to reach out to me personally if you want feedback on your own unique Calvinball game.

Appendix: Calvinball Tic-Tac-Toe Game Summaries

For those interested in the specific implementation of Calvinball tic-tac-toe used in my experiments, here are the core principles, the "Constitution" that cannot be modified:

Core Principles for Calvinball Games:

The new rule cannot force an immediate win for the rule-making player
The new rule cannot qualitatively "flip" the leaders vs. the lagging players, e.g. by changing the winning conditions in Go to be the player with less total territory
The new rule must equally apply to both players
The new rule must be consistent with all rules and plays to date
The new rule will be inherited in all future games between the two players
Players may veto rules (within limits) to prevent abuse

Summary of My Calvinball Tic-Tac-Toe Games with ChatGPT o3:

Game 1: I played first on a standard 3x3 board. As my first move, I created the new rule: "In case of a stalemate of a Tic-Tac-Toe game, the game restarts with a board bigger by 1 in each dimension, but winning condition remains 3 in a row. The player who played first in the stalemated game starts by playing first in the game with a bigger board." Because whoever plays first on a 4x4 board can guarantee a win, I ended up winning the 4x4 game against the o3 AI.
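
The first-mover claim for the 4x4 board can be checked with a small brute-force search. The Python sketch below is only an illustration and was not part of the games described here; the function names (winning_lines, first_player_forces_win) and the ply limits are assumptions chosen for the example. It asks whether the player who moves first can force 3 in a row within a given number of plies, no matter how the opponent replies.

```python
from functools import lru_cache
from itertools import product


def winning_lines(n, k=3):
    """All straight runs of k cells on an n x n board, as tuples of cell indices."""
    lines = []
    for r, c in product(range(n), repeat=2):
        for dr, dc in ((0, 1), (1, 0), (1, 1), (1, -1)):
            cells = [(r + i * dr, c + i * dc) for i in range(k)]
            if all(0 <= rr < n and 0 <= cc < n for rr, cc in cells):
                lines.append(tuple(rr * n + cc for rr, cc in cells))
    return lines


def first_player_forces_win(n, k=3, max_plies=5):
    """True if the first player can force k in a row within max_plies plies."""
    lines = winning_lines(n, k)
    cells = range(n * n)

    def has_line(stones):
        return any(all(cell in stones for cell in line) for line in lines)

    @lru_cache(maxsize=None)
    def forced(mine, theirs, plies):
        # `mine` belongs to the player about to move; both are frozensets.
        if plies <= 0:
            return False
        empty = [c for c in cells if c not in mine and c not in theirs]
        for move in empty:
            after = mine | {move}
            if has_line(after):
                return True  # immediate win
            if plies < 3:
                continue  # no plies left for a reply plus a follow-up
            replies = [c for c in empty if c != move]
            # The move works only if every reply fails to win for the
            # opponent and still leaves the mover with a forced win.
            if replies and all(
                not has_line(theirs | {reply})
                and forced(after, theirs | {reply}, plies - 2)
                for reply in replies
            ):
                return True
        return False

    return forced(frozenset(), frozenset(), max_plies)


if __name__ == "__main__":
    print("4x4, 3 in a row, within 5 plies:", first_player_forces_win(4, 3, 5))
    print("3x3, 3 in a row, within 9 plies:", first_player_forces_win(3, 3, 9))
```

Under these assumptions the search reports a forced win on the 4x4 board and none on the standard 3x3 board. The reason is that an opening move on one of the four central cells threatens an open pair along its row, its column, and a diagonal at once, and a single reply cannot block all three.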

Game 2: I played second on a standard 3x3 board. As the o3 AI started by playing in the middle, I had to respond reflexively each move to force a draw on the 3x3 board, and then we shifted to the 4x4 board. Importantly, o3 did NOT try to create a new rule on the final move on the 3x3 board, even though a draw was guaranteed by that point.

On the 4x4 board, the o3 AI started by playing the first move, and I introduced a new rule: "If a player makes 3-in-a-row and has more than 1 piece on the board than his opponent, then that player is frozen for a number of turns until the other player places enough pieces on the board that they have the same number of pieces. Whichever player has more 3-in-a-row lines after the lagging player finished catching up pieces wins. If both players have the same number of 3-in-a-row lines, then the game starts again on a larger board, but the lagging player gains the opportunity to play first."

I thought that the game would end in a draw on the 4x4 board and that I would then get the first-move advantage on the 5x5 board, but instead o3 misplayed: o3 completed its 3-in-a-row while I was behind by 2 plays, leaving a board position in which I could make two 3-in-a-row lines with those 2 plays.

Game 3: I played first, and even though o3 was starting to make new rules, these new rules tended to provide immediate tactical advantage (e.g. by disrupting my 3-in-a-row line) rather than planning ahead to create long-term advantage. I ended up in a winning position, but o3 had gotten confused about the rules interactions and it was difficult to convince o3 without sounding too pushy, so I ended the game.

(Note: these are game summaries that capture my key observations from the games. A few games and rules with no real impact on the gameplay have been omitted, as well as some games where o3 got confused about turn ordering.)

By David Zhang and Claude 3.7 Sonnet
May 6, 2025

Nano Thoughts