The Level-of-Aggregation Question in Corruption Measurement

Posted on September 27, 2016 by Matthew Stephenson

Recently I learned that CDA Collaborative (a nonprofit organization that works on a variety of development and conflict-resolution projects) has launched a new blog on corruption. Though it’s a new platform, they already have a few interesting posts up, and it’s worth a look.

While I’m always happy to advertise new platforms in the anticorruption blogosphere, in this post I mostly want to focus on the first entry in the CDA’s new blog, a post by Professor Michael Johnston entitled “Breaking Out of the Methodological Cage.” It’s basically a critique of the anticorruption research literature’s alleged (over-)reliance on quantitative methods, in particular cross-national regression analyses using country-level corruption indices (such as the Corruption Perceptions Index (CPI) or the Worldwide Governance Indicators (WGI) graft index). There are some things in Professor Johnston’s post that I agree with, and much that I disagree with. I want to focus on one issue in particular: the question of the right unit of analysis, or level of aggregation, to use when attempting to measure corruption.

Professor Johnston has two related complaints (or maybe two variants on the same underlying complaint) regarding these national-level perceived corruption measures. First, he complains that these “[o]ne dimensional indices tell us … that corruption is the same thing everywhere, varying only in amount[.]” In other words, corruption indices lump a whole bunch of disparate phenomena together under the same umbrella term “corruption,” ignoring the internal diversity of that category. Second, he contends that “relying … on country-level data is to assume that corruption is a national attribute, like GDP per capita” when in fact “corruption arises in highly specific processes, structural niches, and relationships.” Corruption, he explains, is not an attribute of countries, but of more specific contexts, involving “real people … in complex situations[.]”

Respectfully, I think that these points are either wrong or irrelevant, depending on how they are framed.

I want to focus on this issue because it seems to come up a lot. In a way I’m picking on Professor Johnston’s post not because his points are unusual, but rather because his post is but one particularly articulate example of a line of criticism that one hears over and over and over again—that one cannot measure corruption at the country level, because “corruption” is not a useful category given the great variety of “corrupt” activities, and that one cannot usefully measure any of these kinds of corruption at the country level, given the internal heterogeneity within countries (say, across regions or government departments) with respect to the prevalence and type of corruption.

The statements (1) that the term “corruption” is a broad general category, and (2) that there’s a lot of internal diversity within countries, are both true. It does not follow, however, that one can’t or shouldn’t ever use a broad measure of corruption, assessed at the country level. Sometimes you shouldn’t, but sometimes you should—it depends on the research question. Virtually all categories we use, particularly those we use to describe characteristics of countries, have the two features noted above; corruption is not unique in this regard, so pointing out these facts about corruption is not, by itself, all that useful.

I think Professor Johnston inadvertently illustrates something like the point I’m getting at in the passage I quoted earlier, where he says that corruption is not “a national attribute, like GDP per capita[.]” But why, exactly, does Professor Johnston think that GDP per capita is a “national attribute”? GDP per capita is a shorthand way of capturing the annual average value of goods and services produced by an individual in the country in question. It is a summary measure that flattens out the great diversity of human activity that may create economic value. One could say of “GDP per capita” almost exactly what Professor Johnston says about corruption: That although macro-level factors may affect opportunities and constraints for economically productive activities, “real people decide to take advantage of them … or not – in complex situations, defined by perceived alternatives and consequences”—and national-level indicators like GDP per capita “tell us little about those complexities.” Does this mean we shouldn’t use GDP per capita when researching causes and consequences of economic development? Well, maybe it does; there are lots of criticisms of the various ways in which GDP per capita is used as a summary statistic, for example its omission of other aspects of human welfare. But the point is that there’s nothing that makes GDP per capita, but not (perceived) corruption, a “national attribute.” What matters is whether that level of aggregation (and inevitable simplification) is useful for the research question at hand.

Perhaps another example can further illustrate the point. There are statistics out there on national-level cancer rates. These aggregated statistics mask variation in the type of cancer. For some purposes, aggregating all cancers into a single summary statistic is fine. In other cases, disaggregating by type of cancer (pancreas, liver, etc.) is vital. In other cases, further disaggregation is required—sometimes each patient must be treated as unique. And sometimes the question we’re asking really means that per capita cancer rates should be integrated into a higher-level category, such as overall disease rates, rather than treated as a separate category. So too with corruption. For some research purposes, it’s sensible and appropriate to use a national-level aggregate evaluation. For other purposes, it’s essential to disaggregate by type of corruption (bribery, embezzlement, nepotism, etc.), or by magnitude (grand vs. petty, for example), or by region or department or something else. Sometimes each individual instance must be analyzed as sui generis. And sometimes “corruption” is actually too narrow a category, and we should instead try to measure, for example, the overall level of economic crime, or failures of governance, or what have you.
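The aggregation point can be made concrete with a small sketch. Everything in it is an assumption for illustration: the country names, sector names, and percentages are invented (they are not CPI, WGI, or survey figures), and the unweighted average is just one simple way to collapse sector figures into a single national number.

```python
# Hypothetical percentage of respondents reporting a bribe, by sector.
# All numbers are invented for illustration only.
reported_bribery = {
    "Country A": {"police": 40, "courts": 10, "customs": 10},
    "Country B": {"police": 20, "courts": 20, "customs": 20},
    "Country C": {"police": 5, "courts": 5, "customs": 2},
}


def national_score(sectors):
    """Collapse sector-level figures into one unweighted national average."""
    return sum(sectors.values()) / len(sectors)


def worst_sector(sectors):
    """Return the sector with the highest reported rate (first one on ties)."""
    return max(sectors, key=sectors.get)


# One number per country: the national-level summary statistic.
national = {country: national_score(s) for country, s in reported_bribery.items()}

for country, score in national.items():
    print(country, "national average:", score,
          "| highest sector:", worst_sector(reported_bribery[country]))
```

In this toy data the national averages are enough to say that Country C reports less bribery than the other two, which is the kind of question an aggregate can answer. The same averages cannot distinguish Country A from Country B, even though A's problem is concentrated in one sector and B's is spread evenly; that question needs the disaggregated figures.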

Now, I’m not sure that Professor Johnston would necessarily disagree with that. It might be possible to read his post not as an argument that there’s some inherent reason why perceived corruption—unlike GDP or cancer rates—simply cannot be expressed in terms of a single national-level summary statistic. Maybe what he means is that for the research questions we do or should care about regarding the causes and consequences of corruption, identifying correlates of these summary measures is not useful. If that’s what he’s saying, then he and I have no conceptual or methodological disagreement, though I think he may be underestimating the usefulness of some cross-country correlational studies. But again, my beef isn’t with Professor Johnston specifically—his post just provided a useful opportunity for me to lay out my criticisms of a point that I’ve heard plenty of other smart people make.

The bottom line here is that simply pointing out that a national-level summary statistic, such as a country’s score on the CPI, masks diversity both within the category and within the country, is not by itself a deep or devastating critique of research that uses those measures. Those points are true but obvious, and common to just about every national statistical measure we might use for anything. The question is, or should be, whether the research using measures like the CPI and WGI can shed light on important questions.

21 thoughts on “The Level-of-Aggregation Question in Corruption Measurement”

Jacob Eisler on September 27, 2016 at 7:46 pm said:

Hi Matt- I think you raise some interesting methodological questions, but you perhaps don’t give enough weight to the normative thrust of Michael’s argument. To restate his point, trans-national or cultural quantification of corruption may be difficult (and perhaps ultimately less informative) because it is ultimately wholly dependent upon normative context (perhaps unsurprisingly, a similar theme underlies Michael’s book Syndromes of Corruption). For example, we can imagine a society in which ‘gifts’ to public officials (which may be done in the open and tightly controlled by informal practice) that look a lot like ‘bribes’ to us comprise a lot of the public official’s salary, and are classified as bribes by outside observers, but within the closed system of that society have no more perverse impact than income.

Conversely, GDP (which can ultimately be converted into currencies, through the meat hammer of exchange rates) and cancer (you can go completely post-modern on this and say there is no CLEAR line where a cell growth becomes metastatic, but I’m willing to defer to medical expertise on this one, [as long as Big Pharma doesn’t play too much of a role in the underlying research, of course]) seem more tractable to ‘objective’ identification.

Perhaps an interesting way to bridge this gap is to ask where perceptions of corruption diverge between outside observers and those within a society. It would be an interesting (though very costly) piece of field research; if the differentiation is universally slight, it supports your claim; if it’s more extensive, it supports Michael’s claim of corruption being harder to universally quantify through a single metric (unless, of course, you accept there is a single Platonic concept of good governance).

- Matthew Stephenson on September 27, 2016 at 10:23 pm said:
  
  Your comment is very helpful in that it highlights two conceptually distinct grounds for criticizing the use of national-level corruption indicators in empirical research.
  
  The first criticism is that there’s so much internal heterogeneity in the types and manifestations of corrupt activity within countries that one cannot sensibly reduce all of this heterogeneous activity to a single number that measures “corruption.” (One could advance this criticism even if one stipulates, for the sake of argument, that the understanding of what counts as “corruption” is constant across countries and cultures.)
  
  The second criticism is that the meaning of “corruption” differs across countries, and that many activities that may be perceived or classified as “corrupt” by those inside the society are considered acceptable/legitimate outside the society, or vice versa. (One could advance this criticism even if one stipulates, for the sake of argument, that behaviors and practices are sufficiently consistent within a country that national-level generalizations are appropriate.)
  
  These criticisms are not mutually exclusive, but they are different.
  
  I understood Professor Johnston’s post to have been making the first point, while you read him as making the second point. You may well be right; my reading of Professor Johnston’s post may well have been inaccurate (though I hope not uncharitable). But this doesn’t really matter — smart people have advanced both critiques, but I respectfully think that both are (mostly) wrong.
  
  My grounds for rejecting the first critique are mainly conceptual, and laid out in my original post. The right level of aggregation (and simplification) depends on the research question, not on some idea of which variables are naturally or inherently national-level variables.
  
  My grounds for rejecting the second critique are mainly empirical. I agree that if different countries had radically different understandings of corruption, such that national-level perception measures simply misclassified countries (labeling some as “corrupt” for practices that are considered entirely legitimate within the society, for example), this could be a big problem. The situation might be like if we had a global etiquette index that purported to measure how polite people are in different countries, but applied a parochial standard of politeness that was insensitive to norms within the society. (One might imagine an international etiquette index giving Asian societies a low score because people bow instead of shaking hands, or slurp their noodles.) I totally get that as a hypothetical problem that corruption indexes might have. But in the case of corruption (at least the “core” forms of corruption like bribery and embezzlement) it just doesn’t seem to be empirically true. A fair amount of research along the lines of what you suggest in your last paragraph has in fact already been done, and it turns out that corruption perceptions on international indexes show very high correlations with corruption perceptions from household surveys taken within individual countries. Research also shows that, although there are of course some variations at the margins, people around the world seem to have very similar attitudes towards the sorts of informal payments we’d typically classify as “bribes.” Understandings of corruption may not be quite as consistent across countries as are definitions of cancer, but they’re much closer to that than to understandings of proper etiquette. For this sort of consistency, you don’t need a single Platonic concept of good government any more than you need a single Platonic concept of human health in order to reach agreement on the definition of certain diseases.
  
Michael Callan on September 28, 2016 at 1:04 am said:

Matthew, while I agree with your views expressed in the critique of Professor Johnston’s blog (note I didn’t say criticism of that blog – an important distinction) I think the main message is being lost in the debate. Statistics at a national or sub-national level are a great thing, if that thing has a purpose. If I use this crude example: Cancer rates in Country A are higher than Country B, therefore Country A has a cancer problem. That is a nonsense argument because there is no granularity in the statistics. Nationally aggregated data does that by reducing the argument to a single figure that is essentially meaningless. Australia (My home) has a CPI score of 79, while Indonesia has a score of 36. Indonesia is considered a country with high corruption, while Australia is considered a low corruption country. However, this is misleading and does not reflect the real situation. Australia has a high level of political corruption, with recent events playing out some pretty poor behaviours amongst our politicians. Indonesia has prosecuted 80 high ranking judges, lawyers and politicians in the last year for corruption while Australia has prosecuted none in the last ten years. I understand the argument but there needs to be a deeper analysis of the issues and causes based on each country. Comparing scores does not prove anything. Each individual country needs to evaluate its situation based on such things as quality of government and the vulnerability of society. These require a multi-faceted review of the individual situation not a single data view. I spent last night at a presentation by one of Indonesia’s Anti Corruption Commissioners who said they want to see their CPI score raised to 50 and he thinks this is doable. In the same breath he also stated that he wanted more measures to determine if he is actually winning the war against corruption. A single score will not help him to do that. I think this is the real debate we should be having in the Anti-Corruption field.

- Matthew Stephenson on September 28, 2016 at 7:53 am said:
  
  I think I basically agree with most of what you say. National-level aggregate statistics are useful for some purposes, but not for others, and they are often misused. National-level aggregate scores are particularly ill-suited to providing guidance as to what an individual country should do to address its own corruption problems, and to tracking changes over time–a subject I’ve written about before. (Also, though this is a separate point, I’m extremely sympathetic to the view that we’ve already learned most of what we’re going to learn by running cross-country regressions on the existing national-level datasets, such that the marginal value of devoting time and energy to that sort of research is now quite low, relative to other sorts of research.)
  
  If I were to quibble (very slightly and very gently) with some of your points, I think perhaps you might overstate the uselessness of national-level statistics just a bit. Sometimes people offer recommendations based on general hypotheses about corruption’s causes and consequences, and we often need to rely on those sorts of generalizations (and often do so implicitly), simply because it’s impossible to know everything about a particular situation in minute detail. Quantitative statistical analysis allows us to test some of these broad generalizations in ways that can be helpful, as long as we are careful not to abuse or over-interpret the results. For example, to build on a couple of my recent posts, a lot of smart people have argued that the key to cutting corruption is to cut “big government.” But an analysis of the statistical data shows that, in fact, government size has a negative correlation with national-level corruption indicators. Now, this doesn’t mean that in _some_ countries cuts to government budgets might not help fight corruption, and it would be a mistake to treat the above finding as some sort of iron law. But it does provide occasion for critical reflection on the reasons for this correlation in the aggregate data, which might cause us to re-examine unstated assumptions or implicit hypotheses when doing more country-specific analysis. I’ll use the GDP analogy again: Knowing that Australia has a higher per capita GDP than Indonesia doesn’t tell either country what to do to address the specific economic problems that each one faces. But boy, GDP per capita can be a useful shorthand metric, both for figuring out which countries are richer overall, and for testing theories about possible correlates of national-level income.
  
  But again, I don’t think we disagree about anything fundamental. The main point, which I think you’d endorse, is that the choice of corruption measure depends critically on what one is using the measure _for_. For some purposes, national-level aggregate indexes, for all their simplifications, are helpful summary statistics. For other purposes, these national-level summary measures are useless, or worse than useless.
  
  - Kiely on October 5, 2016 at 9:14 am said:
    
    Matthew and Michael – I rather like the common thread coming out of these most recent discussions, which is one centered on usefulness. To focus this conversation on learning and growing the field of corruption study, usefulness must be at the forefront – particularly for practitioners looking to develop innovative approaches to anti-corruption work. Further, to those thinking of corruption as a system of complex interactions (as we do: http://www.blog.cdacollaborative.org/identifying-leverage-points-in-systemic-analysis-and-planning-for-anti-corruption-action/), national-level statistics are indeed crucial but all-too-insufficient when researching these complexities. While larger trends have the potential to guide implementers to a new watering hole when innovation is needed (i.e., providing a new ‘broad generalization’ to dig into), they rarely offer much more. Yet where would innovation be if those pursuing it failed to take a step back to ask these questions? Thank you both for making sure to emphasize this point in your discussion, and for continuing the conversation Michael Johnston first started a few weeks ago.
    