A little while back, I expressed some skepticism about whether Transparency International’s Corruption Perceptions Index (CPI) scores can be compared across time, even after TI changed its methodology in 2012 and claimed that its new scores would now be comparable across years. More recently, I criticized TI’s 2014 CPI for burying the information on the margins of error associated with the CPI values, and for wrongly asserting that changes in the CPI score between 2013 and 2014 for certain countries (most notably China) were substantively meaningful. (In fact, not only does the change in China’s score between 2013 and 2014 seem not to be statistically significant, but the change was due almost entirely to the dropping of a source in which China did abnormally well in 2013, and an abnormally large movement in a single other source.) I decided to follow up on this by taking a closer look at the other ten countries that TI singled out as having experienced significant CPI changes (in either direction) between 2013 and 2014.
Upon closer examination, I’m even more certain that CPI scores cannot be compared over time. I’m also more confident in my judgment that TI has been unforgivably sloppy — and downright misleading — in how it, and its representatives, have portrayed the substantive significance of these CPI changes. It turns out that the problem I found with the China calculations was not unusual. For almost all of the eleven countries TI identified as big movers, the CPI changes were driven by (1) the addition or elimination of sources from year to year for particular countries, and/or (2) abnormally large (indeed, implausibly large) movements in a single source. Until TI fixes its methodology, the safest thing to do is to ignore year-to-year changes in the CPI. And for the sake of preserving its own integrity and credibility, TI should either (A) persuasively explain why I am wrong in my analysis of the data (in which case I will gladly concede error), or (B) issue some sort of retraction or correction to its earlier press releases, and either drop the claim that post-2012 CPI scores can be compared across time or fix its methodology going forward.
Allow me to elaborate on my analysis of the data:
First of all, for seven out of the eleven countries that TI identified as having experienced substantial changes in perceived corruption between 2013 and 2014, much of the difference (and in some cases most of it) was due to the addition or elimination of a source in calculating the CPI. As I noted in my earlier post, most of the drop in China’s 2014 CPI score was due to the fact that a source on which China did unusually well in 2013 was not included in 2014. This is also true for three of the other four countries that TI claimed had experienced a significant worsening in perceived corruption (Angola, Rwanda, and Turkey). In fact, for Rwanda, the worsening of the CPI was due entirely to the fact that the 2014 CPI did not use a source on which Rwanda had scored especially well in 2013. Of the four sources that were used for Rwanda in both years, three exhibited no change and one actually showed a slight improvement. The same phenomenon was at work for three of the seven countries where TI claimed a big improvement in the CPI. For both Egypt and Swaziland, the 2013 CPI was calculated using a source on which those countries got scores notably lower than their scores on other sources, but the 2014 CPI did not include that source. And for Afghanistan, the 2014 CPI incorporated a new source, on which Afghanistan’s score was well above its average from the other sources.
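To make the mechanism concrete, here is a minimal sketch of how dropping one source can move an averaged score even when the remaining sources barely change. The numbers are hypothetical, loosely shaped like the Rwanda pattern described above; they are not TI's actual data. The sketch also assumes that the composite is a simple average of the available sources, and the names `composite` and `decompose` are mine.

```python
from statistics import mean

# HYPOTHETICAL source scores on a 0-100 scale; not TI's actual figures.
# Source "E" is an unusually favorable source that is absent in 2014.
scores_2013 = {"A": 50, "B": 48, "C": 52, "D": 50, "E": 70}
scores_2014 = {"A": 50, "B": 48, "C": 52, "D": 52}


def composite(scores):
    """Assumed aggregation rule: simple average of the available sources."""
    return mean(scores.values())


def decompose(before, after):
    """Split the headline change into a like-for-like part (sources present
    in both years) and a composition part (sources added or dropped)."""
    shared = sorted(set(before) & set(after))
    headline = composite(after) - composite(before)
    like_for_like = mean(after[s] for s in shared) - mean(before[s] for s in shared)
    return headline, like_for_like, headline - like_for_like


headline, like_for_like, composition = decompose(scores_2013, scores_2014)
print(f"headline change:      {headline:+.1f}")
print(f"like-for-like change: {like_for_like:+.1f}")
print(f"composition effect:   {composition:+.1f}")
```

With these made-up numbers, the headline score falls even though the sources used in both years show a slight improvement on average. The whole decline comes from the change in which sources were included.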
Aside from changes caused by addition or subtraction of sources, most of the rest of the changes in the CPI scores were driven not by a consistent movement picked up in a number of sources, but rather by an abnormally large (often an implausibly large) change in a single source. For example, consider Mali, which TI identifies as another country in which perceived corruption significantly decreased. In both 2013 and 2014, Mali’s CPI score was based on six sources. In four of those sources, Mali’s 2013 and 2014 scores were identical, and in one there was a modest improvement (+6 points on a 100-point scale). On the sixth source (Global Insight’s Country Risk Ratings), Mali’s improvement was enormous: a full 20 points (from 22 to 42). Now, it’s not impossible that there was in fact a big change that only this source picked up, and there’s a case to be made, I suppose, that averaging the sources and calculating the confidence interval should address any concerns. But my instinctive view is that if five out of six sources find no more than a small change, and one source detects a massive movement (covering 20% of the scale), it’s more likely that something screwy is going on with that one source.
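The Mali arithmetic can be checked in a few lines. This is a rough sketch that uses only the per-source changes described above and assumes the overall score behaves like a simple average of the six sources, so that the average of the changes equals the change in the average. The helper name `leave_one_out_means` is mine.

```python
from statistics import mean, median

# Per-source changes for Mali, 2013 to 2014, as described in the text:
# four sources unchanged, one at +6, and Global Insight at +20 (22 -> 42).
mali_changes = [0, 0, 0, 0, 6, 20]


def leave_one_out_means(changes):
    """Average change recomputed with each source removed in turn."""
    return [mean(changes[:i] + changes[i + 1:]) for i in range(len(changes))]


avg_all = mean(mali_changes)
# The smallest leave-one-out average is the one that drops the +20 source.
avg_without_largest = min(leave_one_out_means(mali_changes))
typical_change = median(mali_changes)

print(f"average change, all six sources:    {avg_all:+.2f}")
print(f"average change, without the +20:    {avg_without_largest:+.2f}")
print(f"median change across sources:       {typical_change:+.2f}")
```

Under that averaging assumption, the six sources together imply an improvement of a bit over 4 points, but removing the single 20-point jump leaves an average improvement of about 1 point, and the median source shows no change at all.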
And Mali isn’t the only country where the statistically significant change that TI reports appears to be driven by an idiosyncratically large change in a single source. For instance, St. Vincent and the Grenadines, which like Mali was characterized by TI as a big improver, showed no change on two sources, but a very large (17-point) improvement on one (the Economist Intelligence Unit (EIU) index). In an admittedly closer case, Malawi, listed by TI as a country where perceived corruption got much worse, did indeed show significant worsening in two of the eight sources used to calculate its score (a 12-point drop on the World Bank Institutional Assessment, and a 17-point drop on the EIU). But of the other six sources, four showed no change, one had a very small (1-point) worsening, and one actually showed a modest (4-point) improvement.
This is not to say that every significant within-country 2013-2014 CPI change was the result of adding or dropping sources, or of big anomalous changes in only one or two sources. In Jordan, for example, although four of the seven sources used to calculate Jordan’s 2013 and 2014 scores showed no change, three other sources picked up improvements of roughly similar size (+8 on the IMD World Competitiveness Yearbook, +8 on the World Economic Forum (WEF) Executive Opinion Survey, and +10 on Global Insight). To me, that suggests a genuine change in perceptions. Côte d’Ivoire is a bit more ambiguous: much of the improvement from 2013 to 2014 appears due to a facially implausible 27-point jump on one source (the WEF survey), and four of the other seven sources show no change whatsoever. Still, the three remaining sources do all show improvements (of +1, +3, and +12), so there does seem to be evidence that some improvement in perceptions is occurring and getting picked up by multiple sources.
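One rough way to put the Mali, Jordan, and Côte d’Ivoire cases side by side is to ask how many sources moved at all, and how much of the total movement came from the single largest mover. The sketch below uses only the per-source changes quoted in this post; the `summarize` helper and the "share of total" measure are my own illustration, not anything TI computes.

```python
# Per-source changes (2014 minus 2013) as quoted in this post.
changes = {
    "Mali": [0, 0, 0, 0, 6, 20],
    "Jordan": [0, 0, 0, 0, 8, 8, 10],
    "Cote d'Ivoire": [0, 0, 0, 0, 1, 3, 12, 27],
}


def summarize(deltas):
    """Count the sources that moved and measure how concentrated the total
    movement is in the single largest change (an illustrative measure)."""
    total = sum(deltas)
    largest = max(deltas, key=abs)
    return {
        "sources": len(deltas),
        "moved": sum(1 for d in deltas if d != 0),
        "largest": largest,
        "share_of_total": largest / total if total else float("nan"),
    }


for country, deltas in changes.items():
    s = summarize(deltas)
    print(
        f"{country}: {s['moved']} of {s['sources']} sources moved; "
        f"largest change {s['largest']:+d} "
        f"({s['share_of_total']:.0%} of the total movement)"
    )
```

On this measure, Jordan's improvement is spread fairly evenly across the three sources that moved, while in Mali and Côte d’Ivoire a single source accounts for well over half of the total movement.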
But for the most part, the changes in CPI scores that TI emphasizes in its press releases and other public statements do not seem to be based on reliable evidence of true changes in corruption perceptions. Most often, the changes are driven by the addition or subtraction of sources from year to year, and/or by implausibly and idiosyncratically large changes in a single source. To me, this seems like a very good reason not to pay any heed to year-to-year changes in CPI scores, even given TI’s welcome changes to its methodology in 2012. (By the way, the numbers I’m using for all of the above discussion come from TI’s data for the 2013 and 2014 CPI, available here and here (click on the “download info package” link below the results table in both cases). I hope others will double-check my calculations. It is entirely possible that I have made some errors, and if so I will happily and promptly correct them.)
Now, maybe there are further adjustments that TI could make in its approach that would allow it to make meaningful cross-year comparisons. In a future post, I might try to be a bit more constructive by offering some thoughts along those lines. But for now I’ll just double down on my critique. A careful examination of the underlying data reveals that CPI scores should not, as a general matter, be compared across time. TI’s public statements about which countries improved or worsened significantly are not well grounded in TI’s evidence, and TI ought to retract or qualify them as soon as possible if it wants to preserve its credibility.