The Model Thinker

What You Need to Know to Make Data Work for You


By Scott E. Page


Work with data like a pro using this guide that breaks down how to organize, apply, and most importantly, understand what you are analyzing in order to become a true data ninja.

From the stock market to genomics laboratories, census figures to marketing email blasts, we are awash with data. But as anyone who has ever opened up a spreadsheet packed with seemingly infinite lines of data knows, numbers aren't enough: we need to know how to make those numbers talk. In The Model Thinker, social scientist Scott E. Page shows us the mathematical, statistical, and computational models—from linear regression to random walks and far beyond—that can turn anyone into a genius. At the core of the book is Page's "many-model paradigm," which shows the reader how to apply multiple models to organize the data, leading to wiser choices, more accurate predictions, and more robust designs. The Model Thinker provides a toolkit for business people, students, scientists, pollsters, and bloggers to make them better, clearer thinkers, able to leverage data and information to their advantage.


1. The Many-Model Thinker

To become wise you’ve got to have models in your head. And you’ve got to array your experience—both vicarious and direct—on this latticework of models.

—Charlie Munger

This is a book about models. It describes dozens of models in straightforward language and explains how to apply them. Models are formal structures represented in mathematics and diagrams that help us to understand the world. Mastery of models improves your ability to reason, explain, design, communicate, act, predict, and explore.

This book promotes a many-model thinking approach: the application of ensembles of models to make sense of complex phenomena. The core idea is that many-model thinking produces wisdom through a diverse ensemble of logical frames. The various models accentuate different causal forces. Their insights and implications overlap and interweave. By engaging many models as frames, we develop nuanced, deep understandings. The book includes formal arguments to make the case for multiple models along with myriad real-world examples.

The book has a pragmatic focus. Many-model thinking has tremendous practical value. Practice it, and you will better understand complex phenomena. You will reason better. You will exhibit fewer gaps in your reasoning and make more robust decisions in your career, community activities, and personal life. You may even become wise.

Twenty-five years ago, a book of models would have been intended for professors and graduate students studying business, policy, and the social sciences along with financial analysts, actuaries, and members of the intelligence community. These were the people who applied models and, not coincidentally, they were also the people most engaged with large data sets. Today, a book of models has a much larger audience: the vast universe of knowledge workers, who, owing to the rise of big data, now find working with models a part of their daily lives.

Organizing and interpreting data with models has become a core competency for business strategists, urban planners, economists, medical professionals, engineers, actuaries, and environmental scientists among others. Anyone who analyzes data, formulates business strategies, allocates resources, designs products and protocols, or makes hiring decisions encounters models. It follows that mastering the material in this book—particularly the models covering innovation, forecasting, data binning, learning, and market entry timing—will be of practical value to many.

Thinking with models will do more than improve your performance at work. It will make you a better citizen and a more thoughtful contributor to civic life. It will make you more adept at evaluating economic and political events. You will be able to identify flaws in your logic and in that of others. You will learn to identify when you are allowing ideology to supplant reason and have richer, more layered insights into the implications of policy initiatives, whether they be proposed greenbelts or mandatory drug tests.

These benefits will accrue from an engagement with a variety of models—not hundreds, but a few dozen. The models in this book offer a good starting collection. They come from multiple disciplines and include the Prisoners’ Dilemma, the Race to the Bottom, and the SIR model of disease transmission. All of these models share a common form: they assume a set of entities—often people or organizations—and describe how they interact.

The models we cover fall into three classes: simplifications of the world, mathematical analogies, and exploratory, artificial constructs. In whatever form, a model must be tractable. It must be simple enough that within it we can apply logic. For example, we cover a model of communicable diseases that consists of infected, susceptible, and recovered people that assumes a rate of contagion. Using the model we can derive a contagion threshold, a tipping point, above which the disease spreads. We can also determine the proportion of people we must vaccinate to stop the disease from spreading.
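The threshold logic in the paragraph above can be sketched in a few lines. This is a minimal illustration that assumes the standard textbook form of the SIR model; the excerpt itself gives no equations. The function names, parameter names (`contact_rate`, `transmission_prob`, `recovery_rate`), and input numbers are all illustrative assumptions, not values from the book.

```python
def basic_reproduction_number(contact_rate, transmission_prob, recovery_rate):
    """R0: expected number of new infections caused by one infected person
    in a fully susceptible population (standard SIR assumption)."""
    return contact_rate * transmission_prob / recovery_rate


def vaccination_threshold(r0):
    """Share of the population that must be immune so that each case
    infects fewer than one other person on average: 1 - 1/R0.
    Returns 0.0 when R0 <= 1, because the disease dies out on its own."""
    if r0 <= 1:
        return 0.0
    return 1.0 - 1.0 / r0


# Hypothetical inputs: 10 contacts per day, a 5% chance of transmission
# per contact, and recovery after about 5 days (rate 0.2 per day).
r0 = basic_reproduction_number(10, 0.05, 0.2)
threshold = vaccination_threshold(r0)
print(f"R0 = {r0:.2f}; vaccinate at least {threshold:.0%} of the population")
```

In this sketch the tipping point is R0 = 1: above it the disease spreads, below it the outbreak fades. The vaccination share follows directly from that threshold.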

As powerful as single models can be, a collection of models accomplishes even more. With many models, we avoid the narrowness inherent in each individual model. A many-models approach illuminates each component model’s blind spots. Policy choices made based on single models may ignore important features of the world such as income disparity, identity diversity, and interdependencies with other systems.1 With many models, we build logical understandings of multiple processes. We see how causal processes overlap and interact. We create the possibility of making sense of the complexity that characterizes our economic, political, and social worlds. And we do so without abandoning rigor: model thinking ensures logical coherence. That logic can then be grounded in evidence by taking models to data to test, refine, and improve them. In sum, when our thinking is informed by diverse, logically consistent, empirically validated frames, we are more likely to make wise choices.

Models in the Age of Data

The appearance of a book on models may seem out of place in the era of big data. Today, data exists in unprecedented dimensionality and granularity. Customer purchase data, which used to arrive in monthly aggregates on printed paper, now streams instantaneously with geospatial, temporal, and consumer tags. Student academic performance data now includes scores on every homework, paper, quiz, and exam, as opposed to semester-end summary grades. In the past, a farmer might mention dry ground at a monthly Grange meeting. Now, tractors transmit instantaneous data on soil conditions and moisture levels in square-foot increments. Investment firms track dozens of ratios and trends for thousands of stocks and use natural-language processing tools to parse documents. Doctors can pull up page upon page of individual patient records that can include relevant genetic markers.

A mere twenty-five years ago, most of us had access to little more than a few bookshelves’ worth of knowledge. Perhaps your place of work had a small reference library, or at home you had a collection of encyclopedias and a few dozen reference books. Academics and government and private-sector researchers had access to large library collections, but even they had to physically visit the material. As late as the turn of the millennium, academics could be found shuttling back and forth between card catalog rooms, microfiche collections, library stacks, and special collections in search of information.

That has all changed. Content that had been paper-bound for centuries now flows in tiny packets through the air. So too does the information about the here and now. News that arrived on our doorsteps on newsprint once a day now flows in a continuous digital stream into our personal devices. Stock prices, sports scores, and news of political events and cultural happenings can all be accessed with a swipe or query.

As impressive as the data may be, it is no panacea. We now know what has happened and is happening, but, owing to the complexity of the modern world, we may be less capable of understanding why it happened. Empirical findings may be misleading. Data on piece-rate work often shows that the more people are paid per unit of output, the less they produce. A model in which pay depends on work conditions can explain those data. If conditions are poor so that producing output is difficult, per unit pay may be high. If conditions are good, per unit pay may be low. Thus, higher pay does not lead to less productivity. Instead, more difficult work conditions require higher per unit pay.2
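The piece-rate argument can be made concrete with a toy data set. The sketch below is an illustration under stated assumptions, not the model from the book: the sites, the `EFFORT` and `TARGET_DAILY_PAY` constants, and the rule that employers equalize daily pay across sites are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical work sites, ordered from easy to hard conditions.
difficulty = [1.0, 1.5, 2.0, 2.5, 3.0]

# Assumption: every worker exerts the same effort, and harder
# conditions simply yield fewer units per day.
EFFORT = 60.0
units_per_day = [EFFORT / d for d in difficulty]

# Assumption: employers set the per-unit rate so that daily pay is the
# same everywhere, which means harder sites pay more per unit.
TARGET_DAILY_PAY = 120.0
pay_per_unit = [TARGET_DAILY_PAY / u for u in units_per_day]

# Across sites, pay per unit and output move in opposite directions,
# even though pay never enters the output formula.
r = pearson(pay_per_unit, units_per_day)
print(f"correlation(pay per unit, output) = {r:.2f}")
```

The correlation comes out strongly negative, yet output here depends only on conditions. An analyst with the data alone would see "higher pay, lower output"; the model supplies the missing cause.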

In addition, most of our social data—that is, data about our economic, social, and political phenomena—documents only moments or intervals in time. It rarely tells us universal truths. Our economic, social, and political worlds are not stationary. Boys may outscore girls on standardized tests in one decade and girls may outscore boys the next. The reasons people vote today may differ from the reasons they vote in coming decades.

We need models to make sense of the fire-hose-like streams of data that cross our computer screens. Thus, it is because we have so much data that this might also be called the age of many models. Look across the academy, government, the business world, and the nonprofit sector, and you struggle to find a domain of inquiry or decision not informed by models. Consulting giants McKinsey and Deloitte build models to formulate business strategies. Financial firms such as BlackRock and JPMorgan Chase apply models to select investments. Actuaries at State Farm and Allstate use models to calibrate risk when pricing insurance policies. The people team at Google builds predictive analytic models to evaluate its more than three million job applicants. College and university admissions officers construct predictive models to select from among tens of thousands of applicants.

The Office of Management and Budget constructs economic models to predict the effects of tax policies. Warner Brothers applies data analytics to create models of audience responses. Amazon develops machine learning models to make product recommendations. Researchers funded by the National Institutes of Health build mathematical models of human genomics to search for and evaluate potential cures for cancer. The Gates Foundation uses epidemiological models to design vaccination strategies. Even sports teams use models to evaluate draft prospects and trade opportunities and to formulate within-game strategies. By relying on models to select players and strategies, the Chicago Cubs won a World Series championship after more than a century of failures.

To people who use models, the rise of model thinking has an even simpler explanation: models make us smarter. Without models, people suffer from a laundry list of cognitive shortcomings: we overweight recent events, we assign probabilities based on reasonableness, and we ignore base rates. Without models, we have limited capacity to include data. With models, we clarify assumptions and think logically. And, we can leverage big data to fit, calibrate, and test causal and correlative claims. With models, we think better. In head-to-head competitions between models and people, models win.3

Why We Need Many Models

In this book we advocate using not just one model in a given situation but many models. The logic behind the many-model approach builds on the age-old idea that we achieve wisdom through a multiplicity of lenses. This idea traces back to Aristotle, who wrote of the value of combining the excellences of many. A diversity of perspectives was also a motivation for the great-books movement, which collected 102 important transferable ideas in The Great Ideas: A Syntopicon of Great Books of the Western World. The approach finds a modern voice in the work of Maxine Hong Kingston, who wrote in The Woman Warrior, “I learned to make my mind large, as the universe is large, so that there is room for paradoxes.” It is also the basis for pragmatic actions in the world of business and policy. Recent books argue that if we want to understand international relations, we should not model the world exclusively as a group of self-interested nations with well-defined objectives, or only as an evolving nexus of multinational corporations and intergovernmental organizations. We should do both.4

As commonsensical as the many-model approach may seem, keep in mind that it runs counter to how we teach models and the practice of modeling. The traditional approach—the one taught in high school—relies on a one-to-one logic: one problem requires one model. For example: now we apply Newton’s first law; now we apply the second; now the third. Or: here we use the replicator equation to show the size of the rabbit population in the next period. In this traditional approach, the objective is to (a) identify the one proper model and (b) apply it correctly. Many-model thinking challenges that approach. It advocates trying many models. Had you used many-model thinking in ninth grade, you might have been held back. Use it now, and you will move forward.

Academic papers, for the most part, follow the one-to-one approach as well, even though they use those single models to explain complex phenomena: Trump voters in the 2016 election were those who had been left behind economically. Or: the quality of a child’s second-grade teacher determines how economically successful that child will be as an adult.5 A stream of best-selling nonfiction titles present cures for our ills based on single-model thinking: Educational success depends on grit. Inequality results from concentrations of capital. Our nation’s poor health is due to sugar consumption. Each of these models may be true, but none is comprehensive. To confront the complexity of these challenges, to create a world of broader educational achievement, will require lattices of models.

By learning the models in this book, you can begin to build your own lattice. The models originate from a broad spectrum of disciplines, addressing phenomena as varied as the causes of income inequality, the distribution of power, the spread of diseases and fads, the conditions that precede social uprisings, the evolution of cooperation, the emergence of order in cities, and the structure of the internet. The models vary in their assumptions and their structure. Some describe small numbers of rational, self-interested actors. Others describe large populations of rule-following altruists. Some describe equilibrium processes. Others produce path dependence and complexity. The models also differ in their uses. Some help predict and explain. Others guide actions, inform designs, or facilitate communication. Still others create artificial worlds for our minds to explore.

The models share three common characteristics: First, they simplify, stripping away unnecessary details, abstracting from reality, or creating anew from whole cloth. Second, they formalize, making precise definitions. Models use mathematics, not words. A model might represent beliefs as probability distributions over states of the world or preferences as rankings of alternatives. By simplifying and making precise, they create tractable spaces within which we can work through logic, generate hypotheses, design solutions, and fit data. Models create structures within which we can think logically. As Wittgenstein wrote in his Tractatus Logico-Philosophicus, “Logic takes care of itself; all we have to do is to look and see how it does it.” The logic will help to explain, predict, communicate, and design. But the logic comes at a cost, which leads to their third characteristic: all models are wrong, as George Box noted.6 That is true of all models; even the sublime creations of Newton that we refer to as laws hold only at certain scales. Models are wrong because they simplify. They omit details. By considering many models, we can overcome the narrowing of rigor by crisscrossing the landscape of the possible.

To rely on a single model is hubris. It invites disaster. To believe that a single equation can explain or predict complex real-world phenomena is to fall prey to the charisma of clean, spare mathematical forms. We should not expect any one model to produce exact numerical predictions of sea levels in 10,000 years or of unemployment rates in 10 months. We need many models to make sense of complex systems. Complex systems like politics, the economy, international relations, or the brain exhibit ever-changing emergent structures and patterns that lie between ordered and random. By definition, complex phenomena are difficult to explain, evolve, or predict.7

Thus, we confront a disconnect. On the one hand, we need models to think coherently. On the other hand, any single model with a few moving parts cannot make sense of high-dimensional, complex phenomena such as patterns in international trade policy, trends in the consumer products industry, or adaptive responses within the brain. No Newton can write a three-variable equation that explains monthly employment, election outcomes, or reductions in crime. If we hope to understand the spread of diseases, variation in educational performance, the variety of flora and fauna, the effect of artificial intelligence on job markets, the impact of humans on the earth’s climate, or the likelihood of social uprisings, we must come at them with machine learning models, systems dynamics models, game theory models, and agent-based models.

The Wisdom Hierarchy

To sketch the argument for many-model thinking, we begin with a query from poet and dramatist T. S. Eliot: “Where is the wisdom we have lost in knowledge? Where is the knowledge we have lost in information?” To that we might add, where is the information we have lost in all this data?

Eliot’s questioning can be formalized as the wisdom hierarchy. At the bottom of the hierarchy lie data: raw, uncoded events, experiences, and phenomena. Births, deaths, market transactions, votes, music downloads, rainfall, soccer matches, and speciation events. Data can be long strings of zeros and ones, time stamps, and linkages between pages. Data lack meaning, organization, or structure.

Information names and partitions data into categories. Examples clarify the distinction between data and information. Rain falling on your head is data. Total rainfall for the month of July in Burlington, Vermont, and Lake Ontario’s water level are information. The bright red peppers and yellow corn on farmers’ stands surrounding the capitol in Madison, Wisconsin, on market Saturdays are data. The farmers’ total sales are information.

Figure 1.1: How Models Transform Data into Wisdom

We live in an age of abundant information. A century and a half ago, knowing information brought great economic and social status. Jane Austen’s Emma asks if Frank Churchill is “a young man of information.” Today she would not care. Churchill, like everyone else, would have a smartphone. The question is whether he could put that information to use. As Fyodor Dostoyevsky writes in Crime and Punishment, “We’ve got facts, they say. But facts aren’t everything; at least half the battle consists in how one makes use of them!”

Plato defined knowledge as justified true belief. More modern definitions refer to it as understandings of correlative, causal, and logical relationships. Knowledge organizes information. Knowledge often takes model form. Economic models of market competition, sociological models of networks, geological models of earthquakes, ecological models of niche formation, and psychological models of learning all embed knowledge. Those models explain and predict. Models of chemical bonds explain why metallic bonds prevent us from putting our hands through steel doors while hydrogen bonds yield to our weight when we dive into a lake.8

Atop the hierarchy lies wisdom, the ability to identify and apply relevant knowledge. Wisdom requires many-model thinking. Sometimes, wisdom consists of selecting the best model, as if drawing from a quiver of arrows. Other times, wisdom can be achieved by averaging models; this is common when making predictions. (We discuss the value of model averaging in the next section.) When taking actions, wise people apply multiple models like a doctor’s set of diagnostic tests. They use models to rule out some actions and privilege others. Wise people and teams construct a dialogue across models, exploring their overlaps and differences.
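One way to see why averaging helps with prediction is a standard algebraic identity for squared error: the error of the average prediction equals the average error of the individual models minus the spread (diversity) of their predictions. The sketch below checks that identity numerically. The three predictions and the true value are hypothetical numbers chosen for illustration.

```python
def squared_error(prediction, truth):
    """Squared distance between a prediction and the true value."""
    return (prediction - truth) ** 2


def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical predictions from three different models, and the
# (hypothetical) value that was later observed.
predictions = [42.0, 50.0, 61.0]
truth = 52.0

ensemble = average(predictions)

# Error of the averaged prediction.
ensemble_error = squared_error(ensemble, truth)

# Average error of the individual models.
mean_model_error = average([squared_error(p, truth) for p in predictions])

# Diversity: how far the models sit from their own average.
diversity = average([squared_error(p, ensemble) for p in predictions])

print(f"ensemble error      = {ensemble_error:.3f}")
print(f"mean model error    = {mean_model_error:.3f}")
print(f"prediction diversity = {diversity:.3f}")
```

Because diversity can never be negative, the averaged prediction is never worse than the typical individual model under squared error, and it improves as the models disagree more.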

Wisdom can consist of selecting the correct knowledge or model; consider the following physics problem: A small stuffed cheetah falls from an airplane’s hold at 20,000 feet. How much damage will it do upon landing? A student might know a gravity model and a terminal velocity model. The two models give different insights. The gravity model predicts that the stuffed animal would tear through a car’s roof. The terminal velocity model predicts that the toy cheetah’s speed tops out at around 10 mph.9 Wisdom consists of knowing to apply the terminal velocity model. A person could stand on the ground and catch the soft cheetah in her hands. To quote the evolutionary biologist J. B. S. Haldane, “You can drop a mouse down a thousand-yard mine shaft; and, on arriving at the bottom, it gets a slight shock and walks away, provided that the ground is fairly soft. A rat is killed, a man is broken, a horse splashes.”
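The two models in the cheetah problem can be compared numerically. This is a rough sketch using standard physics formulas. The toy's mass, frontal area, and drag coefficient are hypothetical values picked so that the result lands near the figure quoted above; the excerpt does not supply them.

```python
import math

G = 9.81             # gravitational acceleration, m/s^2
AIR_DENSITY = 1.225  # kg/m^3, roughly sea-level air
FEET_TO_M = 0.3048
MS_TO_MPH = 2.23694


def impact_speed_no_drag(height_m):
    """Gravity-only model: speed after falling height_m with no air
    resistance, v = sqrt(2 * g * h)."""
    return math.sqrt(2 * G * height_m)


def terminal_velocity(mass_kg, drag_coefficient, frontal_area_m2):
    """Terminal velocity model: speed at which drag balances weight,
    v = sqrt(2 * m * g / (rho * C_d * A))."""
    return math.sqrt(
        2 * mass_kg * G / (AIR_DENSITY * drag_coefficient * frontal_area_m2)
    )


height_m = 20_000 * FEET_TO_M

# Hypothetical toy: 50 g, 0.04 m^2 frontal area, drag coefficient 1.0.
v_gravity_mph = impact_speed_no_drag(height_m) * MS_TO_MPH
v_terminal_mph = terminal_velocity(0.05, 1.0, 0.04) * MS_TO_MPH

print(f"gravity-only model:      {v_gravity_mph:.0f} mph")
print(f"terminal velocity model: {v_terminal_mph:.0f} mph")
```

The gravity-only model gives a speed in the hundreds of miles per hour, while the drag-limited model gives a speed a person could catch by hand. Both formulas are correct; only one fits a light, fluffy object falling through air.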

In the stuffed-cheetah problem, arriving at the correct solution requires information (the weight of the toy), knowledge (the terminal velocity model), and wisdom (selecting the correct model). Business and policy leaders also rely on information and knowledge to make wise choices. On October 9, 2008, the value of Iceland’s currency, the króna, began a free fall. Eric Ball, then treasurer of software giant Oracle, was faced with a decision. A few weeks prior he had dealt with the domestic repercussions of the home mortgage crisis. Iceland’s situation posed an international concern. Oracle held billions of dollars in overseas assets. Ball considered network contagion models of financial collapse. He also thought of economic models of supply and demand in which the magnitude of a price change correlates with the size of the market shock. In 2008, Iceland had a GDP of $12 billion, or less than six months’ revenues for McDonald’s Corporation. Ball recollected thinking, “Iceland is smaller than Fresno. Go back to work.”10 The key to understanding this event, and many-model thinking generally, lies in recognizing that Ball did not search among many models to find one that supported an action that he had already decided to take. He did not use many models to find one that justified his action. Instead, he evaluated two models as possibly useful and then chose the better one. Ball had the right information (Iceland is small), chose the right model (supply and demand), and made a wise choice.

We next show how to create a dialogue among multiple models by reconsidering two historical events: the 2008 global financial market collapse, which reduced total wealth (or what had been thought to be wealth) by trillions of dollars, resulting in a four-year global recession, and the 1962 Cuban missile crisis, which nearly resulted in nuclear war.

The 2008 financial collapse has multiple explanations: too much foreign investment, over-leveraged investment banks, lack of oversight in the mortgage approval process, blissful optimism among home-flipping consumers, the complexity of financial instruments, a misunderstanding of risk, and greedy bankers who knew the bubble existed and expected a bailout. Superficial evidence aligns with each of these accounts: money flowed in from China, loan originators wrote toxic mortgages, investment banks had high leverage ratios, financial instruments were too complex for most to understand, and some banks expected a bailout. With models we can adjudicate between these accounts and check the internal consistency of these accounts: Do they make logical sense? We can also calibrate the models and test the magnitude of the effects.

The economist Andrew Lo, exercising many-model thinking, evaluates twenty-one accounts of the crisis. He finds each to be lacking. First, it does not make sense that investors would contribute to a bubble that they knew would lead to a global crisis. Hence, the extent of the bubble must have been a surprise to many. Financial firms may well have assumed the other firms had done due diligence when in fact they had not. Second, what were, in retrospect, clearly toxic (low-quality) bundles of mortgages found buyers. Had global collapse been a foregone conclusion, the buyers would not have existed. And while leverage ratios had increased since 2002, they were not much higher than they had been in 1998. And as for the notion that the government would bail out the banks, Lehman Brothers collapsed on September 15, 2008; with over $600 billion in holdings, it was the largest bankruptcy in US history. The government did not intervene.

Lo finds that each account contains a logical gap. The data, such as it is, privileges no single explanation. As Lo summarizes: “We should strive at the outset to entertain as many interpretations of the same set of objective facts as we can, and hope that a more nuanced and internally consistent understanding of the crisis emerges in the fullness of time.” He goes on to say, “Only by collecting a diverse and often mutually contradictory set of narratives can we eventually develop a more complete understanding of the crisis.” No single model suffices.11

In Essence of Decision, Graham Allison undertakes a many-model approach to explain the Cuban missile crisis. On April 17, 1961, a CIA-trained paramilitary group landed on the shores of Cuba in a failed attempt to overthrow Fidel Castro’s communist regime, increasing tensions between the United States and the Soviet Union, Cuba’s ally. In response, Soviet premier Nikita Khrushchev moved short-range nuclear missiles to Cuba. President John F. Kennedy responded by blockading Cuba. The Soviet Union backed down, and the crisis ended.

Allison interprets events with three models. He applies a rational-actor model to show that Kennedy had three possible actions: start a nuclear war, invade Cuba, or impose a blockade. He chose the blockade. The rational-actor model assumes that Kennedy draws a game tree with each action followed by the possible responses by the Soviets. Kennedy then thinks through the Soviets’ optimal response. If, for example, Kennedy launched a nuclear attack, the Soviets would strike back, resulting in millions dead. If Kennedy imposed a blockade, he would starve the Cubans. The Soviet Union could either back down or launch missiles. Given that choice, the Soviet Union should back down. The model reveals the central strategic logic at play and provides a rationale for Kennedy’s bold choice to blockade Cuba.
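The game-tree reasoning described above is an instance of backward induction: work out the second mover's best reply to each action, then pick the action whose anticipated reply is best for the first mover. The sketch below is a toy illustration only. Every payoff number is hypothetical, the replies listed under "invade Cuba" are assumptions (the excerpt does not spell them out), and none of it comes from Allison's analysis.

```python
# Payoffs are (US, USSR); larger is better. All numbers are hypothetical
# and chosen only to illustrate the method.
GAME_TREE = {
    "nuclear attack": {"strike back": (-100, -100)},
    "invade Cuba": {"escalate": (-50, -50), "back down": (5, -20)},
    "blockade": {"launch missiles": (-100, -100), "back down": (10, -10)},
}


def soviet_best_response(responses):
    """Second mover picks the reply with the highest USSR payoff."""
    return max(responses, key=lambda reply: responses[reply][1])


def solve_by_backward_induction(tree):
    """Return the first mover's best action and the anticipated reply
    (with payoffs) for every action."""
    anticipated = {}
    for action, responses in tree.items():
        reply = soviet_best_response(responses)
        anticipated[action] = (reply, responses[reply])
    # First mover compares actions by the US payoff after the
    # anticipated reply.
    best_action = max(anticipated, key=lambda a: anticipated[a][1][0])
    return best_action, anticipated


best_action, anticipated = solve_by_backward_induction(GAME_TREE)
for action, (reply, payoffs) in anticipated.items():
    print(f"{action}: USSR replies '{reply}', payoffs {payoffs}")
print(f"Chosen action: {best_action}")
```

With these illustrative payoffs the procedure selects the blockade, because the anticipated Soviet reply is to back down. Changing the payoffs changes the answer, which is why the model's assumptions matter as much as its logic.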

Like all models, though, it is wrong. It ignores relevant details, allowing it to initially appear a better explanation than it really is. The model neglects to add a stage in which the Soviets put the missiles in Cuba. If the Soviets had been rational, they should have drawn the same tree as Kennedy and realized that they would have to remove the missiles. The rational-actor model also fails to explain why the Soviets did not hide the missiles.

Allison applies an organizational process model to explain these inconsistencies. A lack of organizational capacity explains the Soviets’ failure to hide the missiles. The same model can explain Kennedy’s choice to blockade. At the time, the United States Air Force lacked the capacity to wipe out the missiles in a single strike. If even a single missile remained, it could kill millions of Americans. Allison deftly combines the two models. An insight from the organizational model changes the payoffs in the rational-choice model.

Allison adds a governmental process model. The other two models reduce countries to their leaders: Kennedy acts for the United States and Khrushchev for the Soviet Union. The government process model recognizes that Kennedy had to contend with Congress and that Khrushchev needed to maintain a political base of support. Thus, Khrushchev’s placing of the missiles in Cuba signaled strength.

Allison’s book shows the power of models alone and in dialogue. Each model clarifies our thinking. The rational-actor model identifies possible actions once the missiles have arrived and allows us to see the implications of those actions. The organizational model draws our attention to the fact that organizations, not individuals, carry out those actions. The governmental process model highlights the political cost of invasion. By evaluating events through all three lenses, we gain a broader and deeper understanding. All models are wrong; many are useful.

In both examples, the different models explicate distinct causal forces. Multiple models can also focus on different scales. In an oft-repeated tale, a child claims that the Earth rests on the back of a giant elephant. A scientist asks the child what the elephant stands on, to which the child replies, “A giant turtle.” Anticipating what’s about to come next, the child quickly adds, “Don’t even ask. It’s turtles all the way down.”12


  • Choice award for outstanding academic title
  • "Scott Page's The Model Thinker is a deeper dive into the theory of mental models and the math behind them. Page is a professor at the University of Michigan, and his book explores mental models in a wonderful way."—Tomasz Tunguz
  • "A tremendously significant book embracing a creative, innovative approach for thinking about the complex mechanisms of social and natural phenomena"—S-T. Kim, North Carolina A&T State University, Choice
  • "A hands-on reference for the working data scientist, The Model Thinker challenges us to consider that the historical methods we have used for data analysis are no longer adequate given the complexity of today's world....What has given this book a place in my permanent library is its deep dives into dozens of models. Equations and the diagrams are here, but so are applications."—Carol Wells, Inside Big Data: Your Source for Machine Learning
  • "This book offers a remarkably comprehensive and insightful introduction to mathematical models in the social sciences, written by one who is a master of the field and a brilliant teacher."—Roger Myerson, Winner of the Nobel Memorial Prize in Economic Sciences (2007) and Glen A. Lloyd Distinguished Service Professor of Economics at the University of Chicago
  • "An original and thought-provoking book, and a challenging one for a one-model thinker like myself. Brace yourself for an entirely new perspective."—Daron Acemoglu, professor of economics at MIT and co-author of Why Nations Fail
  • "The clarity of Scott's thinking has been awing me since our days together as doctoral students at Kellogg. Beautifully written, this book teaches us how to stay logical, coherent and effective at work and at life more broadly--amidst a world awash in ever more data, distraction and complexity."—Sally Blount, former dean of the Kellogg School of Management at Northwestern University
  • "With the exception of physics, science--and particularly the social sciences--currently resides in a liminal period characterized by hints of universal principles and the tantalizing possibility of robust prediction. In this accessible and pragmatic book, Page persuasively shows us how in these transition periods we can use the wisdom of crowds to improve our decision-making. The twist is that the 'crowd' is not made of individuals but of well-chosen models each of which offers a different window on the world."—Jessica Flack, professor at the Santa Fe Institute and director of the Collective Computation Group
  • "Page explains the value of applying several models to a single problem, and then provides a conceptual toolkit for doing so. His book is a labor of love."—Art Friedman, co-author of Quantum Mechanics

On Sale: Mar 16, 2021
Page Count: 464 pages
Publisher: Basic Books

Scott E. Page

About the Author

Scott E. Page is the Leonid Hurwicz Collegiate Professor of Complex Systems, Political Science, and Economics at the University of Michigan and an external faculty member of the Santa Fe Institute.
