As regular readers of this blog know, I’ve been quite critical of the idea that one can measure changes in corruption (or even the perception of corruption) using within-country year-to-year variation in the Transparency International Corruption Perceptions Index (CPI). To be clear, I’m not one of those people who like to trash the CPI across the board – I actually think it can be quite useful. But given the way the index is calculated, there are big problems with looking at an individual country’s CPI score this year, comparing it to previous years, and drawing conclusions as to whether (perceived) corruption is getting worse or better. Among the many problems with making these sorts of year-to-year comparisons are the fact that the sources used to calculate any individual country’s CPI score may change from year to year, and the fact that a big, idiosyncratic movement in an individual source can have an outsized influence on the change in the composite score. (For more discussion of these points, see here, here, and here.) Also, while TI does provide 90% confidence intervals for its yearly estimates, the fact that confidence intervals overlap does not necessarily mean that there’s no statistically significant difference between the scores (an important point I’ll confess to sometimes neglecting in my own prior discussions of these issues).
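To make the point about overlapping confidence intervals concrete, here is a minimal sketch in Python. The scores and standard errors are made-up numbers, not TI data, and the sketch assumes a normal approximation and two independent estimates (which is itself a simplification, since a country's scores in different years draw on many of the same sources).

```python
import math
from statistics import NormalDist

# Two-tailed 90% critical value of the standard normal (about 1.645).
z90 = NormalDist().inv_cdf(0.95)

# Hypothetical scores and standard errors for two years (not real CPI data).
score_a, se_a = 40.0, 1.5
score_b, se_b = 36.0, 1.5

# 90% confidence interval for each score taken separately.
ci_a = (score_a - z90 * se_a, score_a + z90 * se_a)
ci_b = (score_b - z90 * se_b, score_b + z90 * se_b)
intervals_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

# Test of the difference: under the independence assumption, the standard
# error of the difference is sqrt(se_a^2 + se_b^2), which is smaller than
# se_a + se_b. That is why overlap alone is not decisive.
se_diff = math.hypot(se_a, se_b)
z_diff = (score_a - score_b) / se_diff
difference_significant = abs(z_diff) > z90

print(f"CI A: {ci_a[0]:.2f} to {ci_a[1]:.2f}")
print(f"CI B: {ci_b[0]:.2f} to {ci_b[1]:.2f}")
print("Intervals overlap:", intervals_overlap)
print(f"z for the difference: {z_diff:.3f}")
print("Difference significant at 90%:", difference_significant)
```

With these illustrative numbers the two intervals overlap, yet the difference between the scores still clears the 90% threshold.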
Although there are lots of other problems with the CPI, and in particular with making over-time CPI comparisons, I think there’s a fairly simple procedure that TI (or anybody working with the TI data) could implement to address the problems just discussed. Since TI will be releasing the 2015 CPI within the next month, I thought this might be a good time to lay out what I think one ought to do to evaluate whether there have been statistically significant within-country changes in the CPI from one year to another. (I should say up front that I’m not an expert in statistical analysis, so it’s entirely possible I’ve screwed this up in some way. But I think I’ve got the important parts basically right.) Here goes:
- Step 1: Pick the comparison years. The most natural thing to do would be to compare the current year with the preceding year (say, 2015 compared to 2014), but given that both corruption and corruption perceptions tend to change slowly, I’d prefer a longer lag, maybe three years (say, 2015 to 2012). (This is a bit arbitrary, but it has the added bonus that 2012 was the first year of the CPI’s revised methodology, and so the earliest year with which the 2015 data can be compared.)
- Step 2: For each country, determine which of the sources used to construct the CPI are available for both years to be compared. If a country has a score from a particular data source in only one of the two comparison years (for example, if a particular index includes the country in 2012 but not in 2015), then drop that source. If dropping sources reduces the total sources available for any given country below four, then drop that country from the dataset. (Here I’m following TI’s rule of thumb of including only those countries for which four separate data sources are available. If one wanted to get really fancy, I suppose one could use multiple-imputation techniques to estimate missing data, but to me that seems like overkill, so I won’t go down that road.)
- Step 3: For each country in the data, calculate the within-source change from the first year to the second year. So, for example, if there are five sources available for a given country (in both years), this procedure would generate five (positive or negative) values. Call these values the “deltas” for each source.
- Step 4: Calculate the mean (the simple average) of each country’s source deltas. The mean is the point estimate for the change in the country’s perceived corruption from the first year to the second year.
- Step 5: Calculate the variance associated with the estimated change in the CPI score. (The variance is the sum of the squares of the differences between each individual delta and the mean, divided by (n-1), where n is the number of delta values–that is, sources.) The standard deviation (call it sigma) is the square root of the variance.
- Step 6: Calculate the t-statistic by multiplying the (absolute value of) the mean delta by the square root of n, and dividing this product by sigma.
- Step 7: Calculate a threshold value for statistical significance. The threshold is the critical value of Student’s t distribution with n-1 degrees of freedom, so it depends on the number of sources used and the level of statistical confidence desired. Following TI’s practice of focusing on a 90% confidence level (two-tailed), the threshold values would be as follows: if there are four sources, the threshold is 2.353; five sources, 2.132; six sources, 2.015; seven sources, 1.943; eight sources, 1.895; nine sources, 1.860; ten sources, 1.833.
- Step 8: Compare the (absolute value of) the t-statistic calculated in Step 6 to the threshold selected at Step 7. If the t-statistic exceeds the threshold, then the estimated positive or negative change in the CPI score (that is, the average delta) is statistically significant at the 90% level. (This means that if there were in fact no difference, then if we were to do the evaluations over and over again an arbitrarily large number of times, we would get a mean delta of at least the magnitude that we actually observed less than 10% of the time.)
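For anyone who would rather script this than work through it by hand, the steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `cpi_change_test`, the dictionary-based input format, and the source labels in the usage example are all hypothetical, and nothing here reflects how TI's own files are laid out. The threshold table is copied from Step 7.

```python
import math
import statistics

# Two-tailed 90% critical values of Student's t, keyed by the number of
# sources n (degrees of freedom = n - 1), as listed in Step 7.
T_CRITICAL_90 = {4: 2.353, 5: 2.132, 6: 2.015, 7: 1.943,
                 8: 1.895, 9: 1.860, 10: 1.833}


def cpi_change_test(scores_first, scores_second):
    """Test one country's change between two years.

    Each argument is a dict mapping a source name to that source's score
    (hypothetical input format). Returns None if fewer than four sources
    are available in both years.
    """
    # Step 2: keep only sources present in both years.
    shared = sorted(set(scores_first) & set(scores_second))
    n = len(shared)
    if n < 4:
        return None
    if n not in T_CRITICAL_90:
        raise ValueError(f"no threshold tabulated for {n} sources")

    # Step 3: within-source change (second year minus first year).
    deltas = [scores_second[s] - scores_first[s] for s in shared]

    # Step 4: point estimate of the change.
    mean_delta = statistics.mean(deltas)

    # Step 5: sample standard deviation (divides by n - 1).
    sigma = statistics.stdev(deltas)

    # Step 6: t-statistic; guard against identical deltas (sigma == 0).
    if sigma == 0:
        t_stat = math.inf if mean_delta != 0 else 0.0
    else:
        t_stat = abs(mean_delta) * math.sqrt(n) / sigma

    # Steps 7 and 8: compare against the tabulated threshold.
    threshold = T_CRITICAL_90[n]
    return {"n": n, "mean_delta": mean_delta, "sigma": sigma,
            "t_stat": t_stat, "threshold": threshold,
            "significant": t_stat > threshold}


# Usage with made-up scores: source "E" and source "F" each appear in only
# one year, so both are dropped and four shared sources remain.
first = {"A": 50, "B": 40, "C": 45, "D": 60, "E": 30}
second = {"A": 52, "B": 43, "C": 47, "D": 63, "F": 10}
result = cpi_change_test(first, second)
print(result)
```

Hard-coding the thresholds keeps the sketch dependency-free; with a statistics library available, one could compute the critical value for any n instead.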
Just to illustrate with a concrete example, let’s take China, which TI singled out as a “big changer” (for the worse) between 2013 and 2014:
- Step 1: Even though I suggested above that comparisons be made over a longer period of time, for this example I’ll go with 2013 and 2014, just to follow TI’s own comparisons as closely as possible.
- Step 2: In 2013, China’s score was based on nine sources: (1) the Bertelsmann Transformation Index, (2) the IMD World Competitiveness Yearbook, (3) the Political Risk Services International Country Risk Guide, (4) the World Economic Forum Executive Opinion Survey, (5) the World Justice Project Rule of Law Index, (6) the Economist Intelligence Unit Country Risk Ratings, (7) the Global Insight Country Risk Ratings, (8) the Political and Economic Risk Consultancy Asian Intelligence scores, and (9) the 2011 Transparency International Bribe Payers Survey. However, only the first eight were also used in 2014, so those are the only eight sources to be considered in making the over-time comparison.
- Step 3: The deltas for the eight sources (following the order in which they are listed above, and rounding to the nearest integer) are: 0, +2, 0, -2, -4, 0, -10, +2. (Again, this is the difference in value between the 2014 and 2013 scores for each source; positive values indicate improvement, negative values indicate worsening.)
- Step 4: The mean of the deltas (and therefore the point estimate of the change in China’s CPI) is -1.5. (Actually, when one doesn’t round to integers, the mean is -1.622.)
- Step 5: The variance is the sum of the squares of the differences between each individual delta and the mean, divided by n-1 (here, 7). I won’t try to write out the full calculation here; unless I screwed up the math (quite possible, I admit), the variance is 16.262, making the standard deviation (sigma) equal to approximately 4.033.
- Step 6: The absolute value of the mean is 1.622, the number of sources is 8 (the square root of which is approximately 2.828), and sigma is approximately 4.033. That gives us a t-statistic of approximately 1.137.
- Step 7: Because there are eight sources, the 90% statistical significance threshold is 1.895.
- Step 8: Because 1.137 is smaller than 1.895, we would conclude that there is no evidence of a statistically significant change in China’s level of perceived corruption between 2013 and 2014 (given our chosen level of statistical confidence). That is, we cannot confidently reject the null hypothesis that there was no change, even though the point estimate is negative (seeming to indicate a worsening of perceived corruption).
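As a rough check on the China example, the calculation can be rerun in a few lines of Python using the rounded integer deltas from Step 3. Because the inputs are rounded, the intermediate figures come out slightly different from the unrounded ones quoted above (a mean of -1.5 rather than -1.622, and a somewhat smaller variance and t-statistic), but the conclusion is the same. The variable names are my own.

```python
import math
import statistics

# Rounded 2013-to-2014 deltas for China's eight shared sources (Step 3).
deltas = [0, 2, 0, -2, -4, 0, -10, 2]
n = len(deltas)

mean_delta = statistics.mean(deltas)    # Step 4: point estimate
variance = statistics.variance(deltas)  # Step 5: divides by n - 1
sigma = math.sqrt(variance)
t_stat = abs(mean_delta) * math.sqrt(n) / sigma  # Step 6

threshold = 1.895  # Step 7: eight sources, 90% two-tailed
significant = t_stat > threshold  # Step 8

print(f"mean delta = {mean_delta:.3f}")
print(f"variance   = {variance:.3f}")
print(f"sigma      = {sigma:.3f}")
print(f"t          = {t_stat:.3f}")
print("significant at 90%:", significant)
```

Either way, the t-statistic falls well short of 1.895, and most of the negative point estimate is driven by the single -10 movement in one source.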
I haven’t had a chance to go through and apply this procedure to every country in the sample. I’m holding out hope that someone in the TI research department will save me the trouble by incorporating this procedure when the 2015 CPI is released, and reporting, for each country, both the estimated CPI change and whether this change is statistically significant. If that doesn’t happen, I may try to do it myself using the Excel datasets that TI provides, though it may be a bit time-consuming.
To be clear, this procedure won’t address all the concerns I have about within-country year-to-year CPI comparisons. Everything in the above procedure, and TI’s own approach, is premised on the idea that the underlying sources are themselves comparable from year to year, but that may not be true. And there are all the standard concerns about the nature of corruption perceptions and what influences them, as well as concerns that Rick has emphasized about the adverse political effects of paying too much attention to these fluctuations. But still, at least some of the biggest problems with TI’s current approach to making year-to-year CPI comparisons could be fixed quite easily with the above approach.
This is an excellent post. It should be particularly reassuring to those who think statistical issues are beyond them. What it shows is that when explained clearly statistics is readily understandable. It is not that some people just “don’t get” statistics or mathematics, it is that they have had the misfortune to have had a bad teacher.