Last week, I used Professor Michael Johnston’s recent post on the methodological and conceptual problems with national-level perceived corruption indicators as an opportunity to respond to some common criticisms of research that relies on these indicators. In particular, I have frequently heard (and interpreted Professor Johnston as advancing) two related criticisms: (1) composite indicators of “corruption” are inherently flawed because “corruption” is a multifaceted phenomenon, comprising a range of diverse activities that cannot be compared on the same scale, let alone aggregated into a single metric; and (2) corruption is sufficiently diverse within a single country that it is inappropriate to offer a national-level summary statistic for corruption. (These points are related but separate: One could believe that corruption is a sufficiently coherent concept that one can sensibly talk about the level of “corruption,” but still object to attempting to represent an entire country’s corruption level with a single number; one could also endorse the idea that national-level summary statistics can be useful and appropriate, even when there’s a lot of intra-country variation, but still object to the idea that “corruption” is a sufficiently coherent phenomenon that one can capture different sorts of corruption on the same scale.) For the reasons I laid out in my original post, while I share some of the concerns about over-reliance on national-level perceived corruption indicators, I think these critiques—if understood as fundamental conceptual objections—are misguided. Most of the measures and proxies we use in studying social phenomena aggregate distinct phenomena, and in this regard (perceived) corruption is no different from war, wealth, cancer, or any number of other objects of study.
Professor Johnston has written a nuanced, thoughtful reply (with a terrific title, “1.39 Cheers for Quantitative Analysis”). It is clear that he and I basically agree on many of the most fundamental points. Still, I think there are a few places where I might respectfully disagree with his position. I realize that this back-and-forth might start to seem a little arcane, but since so much corruption research uses aggregate measures like the Corruption Perceptions Index (CPI), and since criticisms of these measures are likewise so common, I thought that perhaps one more round on this might not be a bad idea.
Let me address the two main lines of criticism noted above, and then make some more general observations.
First, with respect to the fact that, as Professor Johnston puts it, “corruption indices … flatten out critical variations among and within societies,” I agree. But, as I emphasized in my original response, all indicators that we use do that to some degree. I take it Professor Johnston would agree with that statement. The question we need to ask is whether “corruption” is a sufficiently coherent phenomenon that it makes sense ever to try to capture it in a single indicator. The reason I emphasize “ever” is that I don’t disagree—and no reasonable person would disagree—with the claim that reducing corruption to a single metric sometimes obscures critical distinctions between different types of corruption. If Professor Johnston and I have a disagreement here (and, again, I’m not sure we do) it concerns whether “corruption” is so internally heterogeneous that there’s never any point in talking about it as a single category. I don’t think that strong claim is tenable.
To re-purpose an example from my earlier post, corruption is in this regard like “cancer”—a category that is hugely internally diverse, but where it sometimes makes sense to talk about it, and measure it, as if it were one big thing with different manifestations. Or, to make a similar point in a slightly different way, in his most recent post Professor Johnston offers, as an alternative to the single corruption metric, four distinct “syndromes of corruption,” which he has identified and investigated in his important scholarly work. But of course, as Professor Johnston would likely be the first to acknowledge, his four “syndromes” are also ideal types, and within each syndrome—say, “power chasing wealth” or “wealth chasing power”—there are also “critical variations among and within [the] societies” that he sorts into each of those four categories. Someone else could come along and subdivide each of Professor Johnston’s four “syndrome” ideal types into several sub-types, and sub-sub-types, and so on and so forth until each individual corrupt act that has ever taken place gets its own unique category (since no two are ever exactly alike). But that can’t be the right way to proceed.
Indeed, I’m fairly certain that what Professor Johnston is really worried about is not some from-first-principles claim that “corruption” can’t be measured on a single scale because it’s an internally diverse category, but rather the fact that, in his view, too much research focuses on the aggregate “corruption” measure and ignores interesting and important variation in the type of corruption. If that’s what he’s saying, then we have no fundamental conceptual disagreement. That said, I might still quibble a bit with his suggestion that there is a “longstanding inability to come to a working consensus definition on how to define corruption,” and with the related, more implicit suggestion that the various components used to construct the aggregate corruption indicators are so unrelated to one another that these indexes are basically tallying up scores on measures that have little to do with one another.
As to the definition, we do have a “working consensus definition” of corruption: The abuse of entrusted power for private gain. Now, that definition has all sorts of well-known problems, most notably the question of the normative baseline one uses to define “abuse of power.” But in practice there’s quite a bit of overlapping consensus on what this means. There’s the “black” heart of corruption (mainly bribery, embezzlement, and their variants), and a “grey” area at the margins (stuff like conflict of interest, influence peddling, etc.). The borders of the concept are a bit fuzzy, but again, that’s true for lots of important social science concepts, and we seem to be able to get along.
Now, as for the question whether the various components that are used to construct these composite indexes are all measuring different manifestations of more or less the same underlying phenomenon, this is something we can test empirically. And despite the fact that, as Professor Johnston correctly points out, the aggregated perceptions are gathered “from different groups of respondents (some within a society, some not) who are asked at different times to make contrasting kinds of judgments, and whose relationships with that being judged (as international experts, small business owners, extortion victims) can differ starkly,” the underlying components used to construct these indexes are extremely highly correlated with one another. Indeed, statistical analysis strongly suggests that a single underlying component (or “latent factor”) explains most of the variance in these various measures. While not definitive, to me this suggests that in fact there is an underlying phenomenon we can fairly call “corruption” that is being picked up by these different measures. (That’s not the only possible explanation. It could be that some other factor—say, wealth—has such a strong relationship with each of them that we see a correlation among the various indicators even though they are all measuring quite different things. Maybe. But the prima facie evidence is that in fact “corruption” appears to be a coherent concept that manifests in many different ways, and that observers with quite different backgrounds perceive in very similar ways.)
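To make the empirical test described here concrete, the following is a toy sketch (using entirely synthetic, invented data, not any actual index components) of the kind of check a researcher might run: simulate several noisy indicators that are all driven by a single latent factor, then ask how much of their joint variance the first principal component of their correlation matrix accounts for. The country and indicator counts, noise level, and variable names are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: one unobserved latent factor ("perceived
# corruption") drives six observed indicators -- e.g., expert
# ratings, business surveys -- each contaminated by its own
# measurement noise. All parameters here are invented.
n_countries, n_indicators = 150, 6
latent = rng.normal(size=n_countries)                  # unobserved factor
noise = rng.normal(scale=0.5, size=(n_countries, n_indicators))
indicators = latent[:, None] + noise                   # observed proxies

# Principal components via the eigenvalues of the correlation
# matrix: if one latent factor dominates, the first (largest)
# eigenvalue carries most of the total variance.
corr = np.corrcoef(indicators, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(corr))[::-1]      # descending order
share_first = eigvals[0] / eigvals.sum()
print(f"Variance explained by first component: {share_first:.0%}")
```

With the noise level chosen above, the first component captures the large majority of the variance, which is the signature pattern the post describes; if the indicators were instead measuring unrelated things, the eigenvalues would be spread much more evenly. This sketch cannot, of course, distinguish a genuine common factor from a confounder (like wealth) that drives all the indicators at once, which is exactly the caveat raised in the parenthetical.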
As for the second issue, about whether perceived corruption can be expressed at the national level with a summary variable, I can be briefer, because Professor Johnston makes clear that he “[i]n no way … object[s] to aggregating evidence at the national level” (and I apologize for having misinterpreted his first post regarding this point). The arguments Professor Johnston raises here about the problems of national-level aggregation are in fact the same basic problems discussed above—the concern that corruption, unlike other variables we might want to express with national-level statistics, cannot be “assessed and added up on a common underlying dimension.” To restate my earlier response, I agree that composite perceived corruption indexes aggregate data from many different sources that are measuring different aspects of the phenomenon, but I’m not as troubled by this, first because, as a conceptual matter, a composite index can still be valid if the different measures are (imperfect) proxies for the same underlying phenomenon, and second because, as an empirical matter, it seems that these various sources are indeed proxies for a common latent variable, which we could fairly label “perceived corruption.”
Now, one other slight quibble, which I think may go to our larger difference in perspective. Professor Johnston concludes his post by stating, “I would be the last to argue that such indices should be abandoned, but I am not as ready as some analysts seem to be to treat them as literal truth.” The first clause indicates that Professor Johnston and I agree far more than I originally thought. The second clause, though, strikes me as a bit unfair and inaccurate. I don’t know a single competent analyst (at least in the academic world) who treats indexes like the CPI as “literal truth.” It’s well understood that these measures are, at best, highly imperfect proxies. So I would suggest, gently and respectfully, that in this last line Professor Johnston may perhaps have gone a bit too far. But if the point he means to make is the more moderate one that, although these aggregate measures have their uses, there’s a tendency to rely on them too much, at the expense of other approaches and indicators, then I think I largely agree. For me, this isn’t so much because of any underappreciated inherent flaws with the indicators (the flaws are well known, and not always fatal), but rather because we’ve now had 20-odd years of cross-country research using these indicators, assessing correlations with just about every conceivable national-level variable of interest (and many not of interest), and at this point we’re not likely to be able to squeeze much more blood from that particular turnip.