A Reminder: Year-to-Year CPI Comparisons for Individual Countries are Meaningless, Misleading, and Should Be Avoided

Today, Transparency International released its new Corruption Perceptions Index (CPI) for 2018. At some point, hopefully soon, I’ll have time to look closely at the new data and accompanying materials, and if I have something to say about it, I’ll post it here. But that will probably take a while, and since the media coverage of the CPI is usually pretty intense in the first few days after the release, and dissipates in a week or two, I wanted to get out at least one post right now, on the day of the release, with a plea to everyone out there–especially journalists, but civil society activists and others as well:

DO NOT COMPARE ANY GIVEN COUNTRY’S CPI SCORE TO LAST YEAR’S SCORE TO MAKE CLAIMS ABOUT WHAT’S HAPPENING IN THE FIGHT AGAINST CORRUPTION.

Just don’t do it. Don’t. I know the temptation can seem overwhelming. Who’s up? Who’s down? Things are getting better! Things are getting worse! Nothing is changing! So many stories can be written based on these changes (or non-changes).

But these sorts of comparisons are virtually all completely useless, and probably counterproductive.

In past years I’ve dwelt at length on the reasons one can’t usually learn anything useful from comparing a country’s CPI score in a given year to its score in previous years. (See, for example, here, here, here, here, here, and here.) In brief, score changes can sometimes be driven by technical changes (such as changes in the underlying sources used to calculate the scores), and there’s so much statistical noise in the CPI estimates that small changes are often not statistically meaningful–and even for those changes that are statistically significant at conventional levels, one would expect that in a sample of 180 countries, some would exhibit “statistically significant” changes even if it were all random noise. More substantively, perceptions of corruption are slow to change, and influenced, at least in the short term, by factors other than the underlying level of corruption, such as whether a major scandal has gotten a lot of press coverage.
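To make the multiple-comparisons point concrete, here is a minimal back-of-the-envelope sketch in Python. It rests on two assumptions of mine that do not come from TI's methodology: that "conventional levels" means a 5% significance threshold, and that each country's test is independent of the others. The function names are purely illustrative.

```python
def expected_false_positives(n_countries: int, alpha: float) -> float:
    """Expected number of 'significant' changes if every true change is zero."""
    return n_countries * alpha


def prob_at_least_one(n_countries: int, alpha: float) -> float:
    """Chance that at least one country looks 'significant' by luck alone.

    Assumes independent tests, which is a simplification.
    """
    return 1.0 - (1.0 - alpha) ** n_countries


if __name__ == "__main__":
    n = 180       # number of countries in the CPI sample, per the post
    alpha = 0.05  # assumed significance threshold (not taken from TI)
    print("Expected spurious 'significant' changes:",
          expected_false_positives(n, alpha))
    print("Probability of at least one:", prob_at_least_one(n, alpha))
```

Under those assumptions, roughly nine countries would be expected to show a "statistically significant" change in a year in which nothing at all had really changed.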

TI itself has gotten a bit better in how it presents and frames CPI changes, and is certainly better than a lot of the journalists and others who focus on small, statistically meaningless changes in scores or rankings as if they told us something useful. But TI continues to be a bit careless about this, especially in the materials intended for media consumption. For example, in this year’s CPI press release, TI emphasizes the fact that Brazil’s score dropped two points, from 37 to 35, even though by TI’s own analysis we don’t know whether this was a real change or just statistical noise, and if it was indeed a worsening it may well be the result not of worsening corruption but of exposure of corruption that took place in previous years, when Brazil’s score was (somewhat) higher.

Moreover, the title of TI’s press release, “Corruption Perceptions Index 2018 Shows Anti-Corruption Efforts Stalled in Most Countries,” is just wrong, and in a way that matters:

  • First, because there’s so much statistical noise, the fact that most countries haven’t exhibited a statistically significant improvement doesn’t mean things haven’t improved. That’s just an inherent problem with noisy data: small movements might not mean anything, because they might just be noise, but at the same time the noise in the data may prevent us from detecting genuine improvements.
  • Second, and perhaps more important, anticorruption efforts take time to be effective, and it would be a mistake to write them off as a failure just because they don’t produce an immediate, detectable improvement in a country’s CPI score. I know TI means well here–they want to emphasize the seriousness of the problem and the urgency of devoting more energy and resources to combating corruption. But as we’ve noted on this blog previously (see, for example, here and here), interpreting a lack of change in the CPI as evidence that anticorruption efforts aren’t working can be counterproductive: such interpretations can be demoralizing and can breed more cynicism and fatalism, attitudes which themselves make corruption harder to control. Now, if the CPI data really did show conclusively that anticorruption efforts aren’t working, then the fact that spreading that message might have demoralizing effects might not be a good argument for suppressing or downplaying this truth. But since a lack of statistically significant changes in the CPI data does not clearly show that anticorruption efforts have “stalled,” putting this spin on the data is a rhetorical choice, and I fear a misguided one.

I’ve said it before and I’ll say it again (and again and again and again, until the message gets through): the CPI is useful for many things, but making year-to-year comparisons in any individual country’s score is not one of them.

5 thoughts on “A Reminder: Year-to-Year CPI Comparisons for Individual Countries are Meaningless, Misleading, and Should Be Avoided”

  1. Great insight here. There’s so much noise in year-to-year country scores that it is silly to compare them. It seems to me that the main uses of the CPI are seeing a country’s overall trend over time (for example, the dramatic ascent of Uruguay over the last two decades) and comparing countries to each other (maybe even using multi-year averages to reduce noise).

  2. Unfortunately, the noncomparability of CPI is a message that bears repeating. While reading a law firm bulletin on Vietnam’s new anticorruption laws (See https://globalanticorruptionblog.com/2019/01/30/vietnam-enlists-the-private-sector-in-the-fight-against-corruption/), I noticed the following passage illustrating the precise problem you discuss here:

    “Although no prosecutions have yet been made under the New Penal Code, the government’s anticorruption crackdown appears to have at least elevated public confidence as measured by the
    Corruption Perceptions Index (CPI). Vietnam’s CPI ranking improved from 113th out of 176
    countries in 2016, to 107th out of 180 in 2017. Although still low, Vietnam was one of the few
    countries that improved its position in 2017.”

    (See https://www.hoganlovells.com/en/publications/vietnam-continues-to-make-strides-on-anti-corruption-efforts).

    • Thanks very much for bringing this index to my attention. I hadn’t seen it before. On a quick skim (I haven’t yet had time to read it carefully), it appears that the problems of inter-temporal comparability would not be as serious with this index, since the underlying measures used to construct it are (mostly) “hard” quantitative measures, like the number of police per capita, the ratio of prison inmates to prison capacity, the percentage of those who have contact with the police who eventually appear before a court, etc.

      That said, based on my quick skim, I can think of a few reasons to be nervous about comparing scores on this index across time: (1) The index does include at least some measures, most notably a human rights indicator, that are more subjective and for that reason might not mean the same thing in different years. (2) The index, as I understand it, aggregates lots of different kinds of data, so even if we focus only on the “hard” quantitative data, a change in the aggregate index score could be driven by lots of different kinds of changes in the underlying data, and my worry is that it won’t end up telling you that much about what’s happening. This also means that the particular aggregation system that’s used may determine whether the score in a given year goes up or down, depending on how the scores are weighted. To give a simple example, suppose there’s an index that’s constructed from two measures, A and B, each of which is on a 1-10 scale. Suppose in year 1, a country gets a 5 on both A and B, while in year 2, the country’s score on A increases to 7 and its score on B drops to 3. If the aggregate index is calculated as a simple average, the score will be the same in both years. If (for substantive reasons) the aggregation method puts more weight on factor A, the overall index will go up. If the aggregation method puts more weight on factor B, the overall index will go down. What’s the right amount of weight to put on factors A and B when calculating the “overall” score? That’s a substantive judgment call, which just looking at the headline index numbers might obscure.

      But again, that’s an off-the-cuff reaction, and may be misguided in this particular case. I’ll need to read the document you forwarded more carefully to form a more considered opinion.
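The two-measure example in the reply above can be checked numerically. This is only a sketch: the 70/30 and 30/70 weightings are hypothetical values chosen for illustration, and `weighted_index` is an invented helper, not the aggregation method of any real index.

```python
def weighted_index(a: float, b: float, weight_a: float) -> float:
    """Combine measures A and B; weight_a is the share of weight given to A."""
    return weight_a * a + (1.0 - weight_a) * b


if __name__ == "__main__":
    year1 = (5, 5)  # scores on A and B in year 1, from the example
    year2 = (7, 3)  # A rises to 7, B falls to 3 in year 2

    # Hypothetical weighting schemes: equal, A-heavy, and B-heavy.
    for label, w in [("equal", 0.5), ("A-heavy", 0.7), ("B-heavy", 0.3)]:
        before = weighted_index(*year1, weight_a=w)
        after = weighted_index(*year2, weight_a=w)
        if abs(after - before) < 1e-9:
            direction = "unchanged"
        elif after > before:
            direction = "up"
        else:
            direction = "down"
        print(f"{label}: {before:.1f} -> {after:.1f} ({direction})")
```

The same underlying data yield an index that is flat, rising, or falling depending only on the weights, which is the judgment call a single headline number can hide.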
