How to Win a Nobel Prize Using Mickey Mouse Numbers

We Need to Talk about Acemoglu, Johnson, and Robinson

May 26, 2026


Many of economic history’s recent problems can be traced to the success of Daron Acemoglu, Simon Johnson, and James Robinson (AJR). They pioneered the use of what D. C. M. Platt called “Mickey Mouse numbers” to support a simplistic narrative in which strong property rights and other proto-liberal-democratic institutions explain the “rise of the West.”

Here, I will show that AJR’s results in two of their most famous papers were dependent on the particular choices they made in the construction of their datasets. Their work provides an illustration of how the “garden of forking paths” applies not only to specification searching in econometrics, but also to data construction, especially when it is poorly documented and research assistants are involved. AJR may have believed that proto-liberal-democratic institutions made the West rich, and therefore expected the historical statistics to demonstrate the same. In this way, they would have been led to make decisions that suited their priors. What follows, then, is not a story of intentional manipulation of data but of confirmation bias run amok.

Even when awarding the Nobel Prize in Economics, the Royal Swedish Academy of Sciences conceded that the key data in AJR’s most widely cited article were “sometimes sketchy.” In their 2001 American Economic Review article “The Colonial Origins of Comparative Development: An Empirical Investigation,” AJR used the mortality rates of European settlers as an instrument for good institutions. Because they could not regress modern wealth on institutions directly (since wealth also affects institutions), they needed a variable that affected wealth only through its historical effect on institutions: an exogenous source of variation to identify causation. Settler mortality was supposed to be that variable. According to the Swedish economists, AJR’s data used “rough estimates on initial settler mortality,” but they still seemed to have successfully demonstrated that favourable disease environments allowed Europeans to establish good institutions with strong protections for property rights. For this reason, AJR argued, those former colonies where settler mortality had been low are wealthy today.
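To make the identification logic concrete, here is a minimal sketch on purely synthetic data. Nothing in it comes from AJR's dataset: the variable names, the coefficients (0.8, 1.0, 2.0), and the use of an unobserved confounder to stand in for the endogeneity problem are all illustrative assumptions. It shows why a naive regression misleads when the regressor is endogenous, and how an instrument that affects the outcome only through the regressor recovers the true effect.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (n - 1)


def simulate(n=5000, seed=1):
    """Synthetic data only. z is the instrument, u an unobserved confounder."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                        # hypothetical instrument
        ui = rng.gauss(0, 1)                        # unobserved confounder
        xi = 0.8 * zi + ui + rng.gauss(0, 1)        # stand-in for "institutions"
        yi = 1.0 * xi + 2.0 * ui + rng.gauss(0, 1)  # stand-in for "income"; true effect is 1.0
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()

# Naive OLS slope: contaminated because u drives both x and y.
ols_slope = cov(x, y) / cov(x, x)

# IV slope: with one instrument and one regressor, 2SLS reduces to this ratio.
iv_slope = cov(z, y) / cov(z, x)

print(f"OLS slope: {ols_slope:.2f}  IV slope: {iv_slope:.2f}  (true effect: 1.0)")
```

The IV estimate is only trustworthy here because the simulation builds in the exclusion restriction: z enters y solely through x. If z also affected y directly, the ratio would pick up that second channel as well.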

Yet the Royal Swedish Academy of Sciences’ economists had not fully absorbed David Y. Albouy’s critique of AJR, which was eventually published as a comment in the American Economic Review in 2012. Three data construction issues that Albouy identified were particularly important. First, there was substantial “pseudoreplication”: AJR had taken 28 more or less genuine datapoints for particular countries and imputed them to neighbouring countries, artificially inflating their sample size to 64. Second, AJR had mixed mortality rates for European soldiers in barracks with rates for soldiers on campaign, as if the two were comparable. Third, some mortality rates were not for European settlers at all, but rather for African laborers who had been transported to work elsewhere.
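The mechanics of pseudoreplication can be sketched with a toy example. The data below are simulated, not AJR's or Albouy's; the 0.3 coefficient, the seed, and the way 28 points are copied into 64 rows (simple cycling) are assumptions made purely for illustration. The point is that copying rows adds no information, yet a regression that treats the copies as independent observations reports a different, typically larger, F-statistic.

```python
import random


def first_stage_f(z, x):
    """F-statistic (squared t) for the slope in an OLS regression of x on z with a constant."""
    n = len(z)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zx = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x))
    slope = s_zx / s_zz
    intercept = x_bar - slope * z_bar
    rss = sum((xi - intercept - slope * zi) ** 2 for zi, xi in zip(z, x))
    sigma2 = rss / (n - 2)  # residual variance, treating every row as independent
    return slope ** 2 * s_zz / sigma2


rng = random.Random(0)
n_genuine = 28

# 28 simulated "genuine" datapoints: z is the instrument, x the variable it should predict.
z = [rng.gauss(0, 1) for _ in range(n_genuine)]
x = [0.3 * zi + rng.gauss(0, 1) for zi in z]

# Hypothetical imputation: reuse the same 28 points to fill 64 rows.
rows = [i % n_genuine for i in range(64)]
z64 = [z[i] for i in rows]
x64 = [x[i] for i in rows]

print(f"F with 28 genuine points:        {first_stage_f(z, x):.2f}")
print(f"F with 64 rows (copies included): {first_stage_f(z64, x64):.2f}")
```

When every point is copied exactly k times, the slope and fit are unchanged and the F-statistic is multiplied by (kn - 2)/(n - 2), so doubling 28 points to 56 rows scales it by 54/26, roughly 2.1, without any new evidence.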

When the regressions are corrected for these issues, the settler mortality instrument ceases to provide useful information. The extent of the debacle can be seen in my replication below. In econometrics, an instrument is considered “weak” if it only weakly predicts the endogenous variable, which here is institutions. A weak instrument produces standard errors so large that the confidence interval covers almost everything, and by the usual rule of thumb a first-stage F-statistic below about 10 means the instrument cannot reliably identify anything. In Table 1, we can actually see this collapse. The baseline regression produces an F-statistic of just 2.57, and the resulting confidence intervals stretch to infinity. Only when all African countries are dropped (reducing the sample to just 13 countries) does their two-stage least squares (2SLS) regression produce a confidence interval in which neither bound is infinite. The apparently useful information the settler mortality instrument had provided was an artifact of its construction.¹
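For readers unfamiliar with how a confidence interval can "stretch to infinity," the following is a rough sketch using simulated data and the Anderson-Rubin test, one standard weak-instrument-robust approach. It is not the replication code behind Table 1, and the coefficients, sample size, and the approximate 5 percent critical value of 3.84 are assumptions. With one instrument and one endogenous regressor, the Anderson-Rubin statistic for a candidate effect beta0 is the F-statistic from regressing y - beta0 * x on the instrument; as beta0 grows without bound it converges to the first-stage F, so when the first-stage F is below the critical value, arbitrarily large effects can never be rejected.

```python
import random


def slope_f_stat(z, w):
    """F-statistic (squared t) for the slope in an OLS regression of w on z with a constant."""
    n = len(z)
    z_bar = sum(z) / n
    w_bar = sum(w) / n
    s_zz = sum((zi - z_bar) ** 2 for zi in z)
    s_zw = sum((zi - z_bar) * (wi - w_bar) for zi, wi in zip(z, w))
    slope = s_zw / s_zz
    intercept = w_bar - slope * z_bar
    rss = sum((wi - intercept - slope * zi) ** 2 for zi, wi in zip(z, w))
    return slope ** 2 * s_zz / (rss / (n - 2))


def simulate(first_stage_coef, n=500, seed=0):
    """Synthetic data only. The true effect of x on y is 1.0; u is an unobserved confounder."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)
        xi = first_stage_coef * zi + ui + rng.gauss(0, 1)
        yi = 1.0 * xi + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def ar_stat(beta0, z, x, y):
    """Anderson-Rubin statistic for the hypothesis that the effect of x on y equals beta0."""
    return slope_f_stat(z, [yi - beta0 * xi for xi, yi in zip(x, y)])


CRIT_5PCT = 3.84  # approximate 5% critical value (chi-squared, 1 degree of freedom)

for label, coef in [("strong instrument", 1.0), ("weak instrument", 0.02)]:
    z, x, y = simulate(coef)
    f_first = slope_f_stat(z, x)
    # In this just-identified case the AR confidence set is unbounded
    # exactly when the first-stage F falls below the critical value.
    unbounded = f_first < CRIT_5PCT
    print(f"{label}: first-stage F = {f_first:.2f}, unbounded confidence set: {unbounded}")
```

The design choice here is to read unboundedness off the first-stage F rather than search a grid of candidate effects, since in the single-instrument case the two are equivalent.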

AJR’s reply was unconvincing. They made, for example, inaccurate statements about Albouy’s “campaign” dummy, a 0/1 indicator variable for whether troops were on campaign. AJR’s claims about this indicator are directly contradicted by their main source of settler mortality data, Philip D. Curtin’s two books, Death by Migration: Europe’s Encounter with the Tropical World in the Nineteenth Century (1989) and Disease and Empire: The Health of European Troops in the Conquest of Africa (1998). According to AJR’s reply to Albouy, “except during times of major wars (which are excluded from the data), there is little difference in practice between what soldiers were engaged in during ‘campaigns’ and at other times. As a result, it does not in general make sense, and in fact it is not possible, to systematically distinguish campaigns and noncampaigns, and Curtin does not do so” (p. 3098). Yet the distinction is clearly made by Curtin in Death by Migration. He describes it as “one of the fundamental facts of military medical experience; troops in barracks are much healthier than troops on campaign, even disregarding losses from combat” (p. 4). Similarly, he states that campaigning “usually brought a sharp increase in deaths from disease as well as battle” (p. 13). AJR’s claim that Curtin “does not offer a systematic noncampaigns versus campaign distinction” (p. 3098n47) is incorrect. Indeed, the distinction is so fundamental to Curtin’s Disease and Empire that it features in the first paragraph of the blurb on the cover jacket:

“BEFORE THE NINETEENTH CENTURY, European soldiers serving in the tropics died from disease at a rate several times higher than that of soldiers serving at home. Then, from about 1815 to 1914, the death rates of European soldiers, both those serving at home and abroad, dropped by nearly 90 percent. But this drop apllied [sic] mainly to soldiers in barracks. Soldiers on campaign, especially in the tropics, continued to die from disease at rates as high as ever, in sharp contrast to the drop in barracks death rates.”

The campaign-barracks distinction is thus a basic framing principle for the entire book. It is obvious to Curtin that campaign rates were higher. “Troops on campaign were, of course, expected to sustain higher disease deaths rates than those in barracks—often three to ten times higher,” is how Curtin puts it in Disease and Empire (p. 43).

Having misread their principal source, AJR then engaged in an extensive specification search to salvage their results. It had three main elements. First, they arbitrarily capped the settler mortality rate at 250 per 1,000. Second, they recoded Albouy’s “campaign” dummy based on questionable criteria and an apparent lack of familiarity with or even interest in their notional sources.² Third, they excluded the Gambia, despite having noted in their 2000 NBER working paper that “doing so would help our hypothesis” (p. 19n15). Twelve years later, they finally pulled the trigger, thereby contributing to the underwhelming results shown below. After such extensive specification searching, their baseline results still only met the criteria for statistical significance by a whisker. When geographical controls were added, however, F-statistics shrank and confidence intervals crossed zero (meaning the results were no longer statistically significant).

For the economists working for the Swedish central bank, however, it was enough. In their report on the scientific background to the 2024 Nobel Prize in Economics, they showed only a cursory understanding of Albouy’s findings, while they missed the nature of AJR’s response. They seemed to believe it was significant that “the core results hold up when the authors exclude African nations,” while ignoring the complete collapse of identification when the “neo-Europes” were excluded from Albouy’s 28-country sample. In reality, AJR’s IV was a proxy for being a country like Canada or the United States, and offered no exogenous identification of institutions. All their regressions show is that Europeans sensibly settled where they were less likely to die, and those countries subsequently became wealthy and developed strong property rights. Their research design can say nothing meaningful about the relationship between the two. The story they tell is ultimately a matter of faith.

In this way, the Nobel committee’s essay on the scientific background to AJR’s award reflected the general misunderstanding of Albouy’s critique. A recent survey by the Institute for Replication (I4R) illustrates how effective the rhetorical strategies deployed by AJR in their reply proved to be. In their reply, AJR pointed toward a working paper version in which they claimed that “Different sources of data for Latin America and different benchmarking procedures lead to very similar and robust results.” In making this statement, AJR were specifically referring to Latin America; all they really did was rescale the same Curtin numbers.³ Nonetheless, the I4R told the economists surveyed that AJR “show that their results are robust to alternative assignments of mortality rates.” In this way, the I4R gave an even stronger version of AJR’s defence and framed it as a “neutral” statement of fact, helping to push the percentage of respondents who viewed the reply favourably up from 50.5 to 54.6 percent. My impression is that few have actually taken the time to understand what AJR did with their data.

Similar issues can, moreover, be found in AJR’s next major contribution to economic history, the article “Reversal of Fortune: Geography and Institutions in the Making of the Modern World Income Distribution,” published in the Quarterly Journal of Economics in 2002. Once again, their results are a function of their data construction.

Continue reading this post for free, courtesy of Joseph Francis.


The Poor Rich World
