OK, I know (as Rick pointed out in a recent post) that a lot — maybe too much — of the content on this blog has focused on measurement issues, so I apologize for yet another post on that topic, but this has really been bugging me:
Transparency International has been publishing its well-known and widely used Corruption Perceptions Index (CPI) since 1995. The index has its pros and cons, several of which have been discussed on this blog (see here, here, here, here, and here). But putting other debates about the CPI’s validity and utility to the side, one thing should be perfectly clear: At least prior to 2012 (when TI changed its method and scoring system for the CPI), a country’s CPI scores CANNOT be compared across years. The fact that Country X scores, say, a 4.4 in 2002, and scores a 4.9 in 2005, does NOT mean that (perceived) corruption has declined in Country X. Maybe it did, but it might have stayed the same, or gotten worse. At most, the pre-2012 CPI provides information about a country’s ranking relative to other countries, within a single year, with respect to corruption perceptions.
TI itself could not be more explicit about this, stating bluntly “CPI scores before 2012 are not comparable over time.” Yet I keep coming across sources — news articles, presentations by leading international organizations, academic papers — that use year-to-year CPI comparisons to make claims about how corruption in a particular country or region is improving or worsening, or about whether a particular policy intervention is working or not. YOU CAN’T DO THIS! PLEASE STOP!!
I won’t bother going through in detail why year-to-year comparisons on the pre-2012 CPI are not appropriate (a useful critical summary can be found here). But in brief, here are the main reasons:
- The pre-2012 method for scoring countries on a 10-point scale was based entirely on a country’s (average) relative ranking, with respect to other countries, in the data sources TI aggregated. That means Country X’s CPI score could worsen, even if corruption in Country X was perceived as lower than before, if other countries’ perceived corruption dropped even more. It also means that CPI scores might be very stable, or very unstable, depending on a country’s relative distance from the countries ranked just above or below. When there’s a big gap, even significant changes in corruption perceptions won’t show up in the data.
- Also, and closely related to the above point, the pre-2012 CPI was rescaled every year, deliberately (or at least consciously) zeroing out any general trends in perceived corruption across the world.
- On top of that, the addition or removal of data sources from year to year could alter relative rankings, and hence CPI scores, independently of any change in the country’s actual perceived corruption in any of the underlying sources. Imagine that in 2001, Country X gets a 5.3, and in 2002, TI adds a new survey that gives Country X a bad score on corruption; suppose this drops Country X’s score to 4.9. This doesn’t necessarily mean corruption in Country X has gotten worse. Suppose the survey TI added gave Country X an even worse score in 2001 (or would have done so, had the survey existed then), so that if this survey had been included in the 2001 CPI, Country X’s 2001 score would have been 4.7, not 5.3. According to this new source, corruption has, if anything, gone down in Country X. But the fact that this new source happens to be more negative about Country X than other sources in the CPI means that its addition to the CPI in 2002 causes Country X’s 2002 score to worsen, relative to 2001.
- In addition to changes in sources, especially in the CPI’s first decade or so, the aggregation methodology and the country coverage changed year to year, which can also cause changes to a country’s score that are unrelated to actual changes in perceived corruption.
- Maybe most important, the CPI is an aggregation of multiple underlying sources. If those sources themselves are not comparable from year to year (and they may well not be, for all of the reasons above, and perhaps more), then an aggregation of those sources will also not be comparable across time.
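For readers who like to see the arithmetic, here is a toy sketch of the first and third points. To be clear, this is NOT TI's actual procedure: the function names, the four countries, the raw ratings, and the per-source ratings are all made up for illustration, and the only assumptions are that a score depends purely on rank (first part) and that a published score is a plain average of the sources included that year (second part). The second part is chosen to reproduce the 5.3, 4.9, and 4.7 figures from the example above.

```python
def rank_based_scores(raw):
    """Map raw ratings to a 0-10 score using only each country's rank.

    raw: dict of country -> hypothetical raw rating (higher = cleaner).
    The worst-ranked country gets 0, the best gets 10, evenly spaced.
    Assumes at least two countries and no ties.
    """
    ordered = sorted(raw, key=raw.get)  # worst-rated country first
    step = 10 / (len(ordered) - 1)
    return {country: round(i * step, 1) for i, country in enumerate(ordered)}


def mean_score(ratings):
    """Average a list of per-source ratings, rounded to one decimal."""
    return round(sum(ratings) / len(ratings), 1)


# Part 1: every country's raw rating improves, but W overtakes X.
raw_2002 = {"W": 40, "X": 50, "Y": 60, "Z": 80}
raw_2005 = {"W": 58, "X": 55, "Y": 65, "Z": 85}

scores_2002 = rank_based_scores(raw_2002)
scores_2005 = rank_based_scores(raw_2005)

print(scores_2002["X"], scores_2005["X"])  # 3.3 0.0

# Part 2: two existing sources rate Country X identically in both years;
# a new, harsher source C is only included from 2002 onward.
source_a = {2001: 5.5, 2002: 5.5}
source_b = {2001: 5.1, 2002: 5.1}
source_c = {2001: 3.5, 2002: 4.1}  # 2001 value: what C would have said

published_2001 = mean_score([source_a[2001], source_b[2001]])
published_2002 = mean_score([source_a[2002], source_b[2002], source_c[2002]])
counterfactual_2001 = mean_score(
    [source_a[2001], source_b[2001], source_c[2001]]
)

print(published_2001, published_2002, counterfactual_2001)  # 5.3 4.9 4.7
```

In the first part, Country X's raw rating rises from 50 to 55, yet its rank-based score falls from 3.3 to 0.0 because W improved faster. In the second part, the published score falls from 5.3 to 4.9 even though no source thinks Country X got worse, and the one source that changed its mind thinks things improved.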
Now, TI changed its methodology, and its scaling system, as of the 2012 CPI, and TI now claims that from 2012 forward, a country’s scores across years can be compared to see if perceived corruption is getting better or worse. I need to reflect a bit more to decide whether I think TI’s methodological changes have fixed the problem. That will, I hope, be the subject of a future post. But for now, the message from TI is clear, and I’m only trying to amplify it:
DO NOT COMPARE CPI SCORES ACROSS YEARS, AT LEAST FOR THE 1995-2011 INDEXES.
DO NOT BELIEVE ANY SUCH COMPARISONS YOU HAPPEN TO SEE OR READ.