Guest Post–Assessing Corruption with Big Data

Today’s guest post is from Enestor Dos Santos, principal economist at BBVA Research.

Ascertaining the actual level of corruption is not easy, given that it is usually a clandestine activity, and much of the available data is not comparable across countries or across time. Survey data on corruption experience can be helpful, but it is often limited to very specific kinds of corruption (such as petty bribery). Researchers and analysts have therefore, quite reasonably, tended to rely on subjective corruption perception data, such as Transparency International’s well-known Corruption Perceptions Index (CPI). (The CPI aggregates corruption perception data from a variety of other sources, mostly expert assessments.) But conventional corruption perception measures (including those used to construct the CPI) have well-known problems, including limited coverage (with respect to both years and countries) and relatively low frequency (usually annual). And they rely on the perceptions of a handful of experts, which may not be representative. These limitations mean that while traditional perception measures like the CPI may be useful for some purposes, they are not as helpful for others, such as measuring the impact of individual events or news reports on corruption perceptions, or how changes in corruption perceptions affect government approval ratings.

To address these concerns, a recent study by BBVA Research, entitled Assessing Corruption with Big Data, offered an alternative, complementary type of corruption perceptions measure, based on Google web searches about corruption. To construct this index, we examined all web searches classified by Google Trends in the “Law and Government” category for individual countries, and calculated the proportion of those searches that contain the word “corruption” (in any language and including its misspellings and synonyms). Our index, which begins in 2004, covers more than 190 countries and, unlike traditional corruption indicators, is available in real time and at high frequency (monthly). Moreover, it can be reproduced very easily and at very low cost.
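To make the definition concrete, here is a minimal Python sketch of the index as a simple proportion: corruption-related searches divided by all searches in the “Law and Government” category, computed month by month for one country. The function name, the month labels, and the counts are all hypothetical and chosen only for illustration. Google Trends reports normalized values rather than raw search counts, so this shows the arithmetic behind the index and is not the study's actual data pipeline.

```python
def corruption_search_share(corruption_searches, category_searches):
    """Return the share of category searches that mention corruption.

    Both arguments are hypothetical raw counts for one country and month.
    """
    if category_searches <= 0:
        raise ValueError("category_searches must be positive")
    if not 0 <= corruption_searches <= category_searches:
        raise ValueError("corruption_searches must lie between 0 and category_searches")
    return corruption_searches / category_searches


# Hypothetical monthly counts for a single country (NOT real Google Trends data):
# month -> (searches mentioning corruption, all "Law and Government" searches)
monthly_counts = {
    "2016-01": (1200, 40000),
    "2016-02": (1500, 42000),
    "2016-03": (2100, 42000),
}

# One index value per month: the proportion of category searches about corruption.
index = {
    month: corruption_search_share(corruption, total)
    for month, (corruption, total) in monthly_counts.items()
}

for month, share in index.items():
    print(f"{month}: {share:.2%}")
```

Because the index is a share of category searches rather than a raw count, it is not driven by overall growth in search volume, which is what makes comparisons across countries and over time meaningful.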

Here are some of our main findings:

  • First, looking at the global data, it appears that the proportion of searches that concern corruption has risen since about 2009-2010, suggesting an increasing worldwide concern about the issue. However, the results vary greatly from one country to another. In most Latin American countries, as well as in countries such as China, France, and Slovakia, citizen interest in corruption (at least as measured by Google searches) increased significantly (by at least 30%) between 2012 and 2017. In other countries, such as Canada, Germany, Finland, India, Egypt, Indonesia, and Poland, the proportion of searches that concerned corruption declined, perhaps suggesting that the problem (or at least attention to the problem) eased in the period. In other countries, such as the US, Sweden, and South Africa, we don’t see much of a trend in either direction.
  • Second, the comparison between how countries rank on our Google Trends index and the CPI may be illuminating. Perhaps unsurprisingly, these rankings are positively correlated—in those countries where corruption is perceived as a bigger problem (according to the experts whose views are used to construct the CPI), searches for the term “corruption” (or equivalent) comprise a higher proportion of all searches related to “Law and Government” topics. Relatedly, our Google Trends index, like the CPI, indicates that interest in corruption is generally higher in less developed countries. At the same time, though, many countries rank very differently on the Google Trends index as compared to the CPI. For example, Google searches for the term corruption are much more common in certain wealthy countries, such as the US, UK, Denmark, Austria, New Zealand, and Canada, than their favourable CPI rankings might predict. It seems that corruption in these countries is perceived as a major issue (or at least one that people are strongly interested in), even though international experts don’t seem to regard corruption in those countries as a major structural problem.
  • Third, one advantage of using Google Trends search data is that, given its greater frequency and potential variability, it is better suited to gauge the effects of (perceived) corruption on government approval ratings, consumer and business confidence, election results, and so forth. As a preliminary foray into putting the information to this use, we examine data from Brazil, where recent scandals have made the misuse of public resources a major political issue. We propose a simple statistical model where the proportion of Google Trends searches concerning corruption (in Brazil) is one of the principal explanatory variables, and the outcome variable is either the government’s approval rating, a measure of consumer confidence, or a measure of business confidence. (In addition to the Google Trends corruption index, we also include a number of other explanatory variables as controls, including unemployment, inflation, and terms of trade.) We use monthly data for the period beginning in January 2004 and ending in December 2017. We find that our measure of corruption perception has a statistically significant effect on the government’s approval rating as well as on both consumer and business confidence. Most notably, at the beginning of 2016 there is a sharp increase in the proportion of Google searches concerning corruption in Brazil (most likely due to the “Car Wash” operation), and this increase is associated with a 50% drop in the government’s approval rating, a 14% drop in consumer confidence, and a 7% drop in business confidence.
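
For readers who want to see what a model like the one in the third finding might look like, here is a rough Python sketch. The post does not say which estimator was used, so this assumes a plain ordinary least squares (OLS) regression of approval on the corruption index plus the three controls named above. The data are synthetic, generated from made-up coefficients, and every variable name is hypothetical; nothing here reproduces the study's results.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows (each a list of regressors), y a list of outcomes.
    """
    n, k = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        if abs(A[col][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# SYNTHETIC monthly data: 168 months, matching January 2004 to December 2017.
# All ranges and coefficients below are invented for illustration only.
random.seed(0)
rows, approval = [], []
for _ in range(168):
    corruption_index = random.uniform(0.01, 0.10)  # share of searches
    unemployment = random.uniform(5.0, 13.0)       # percent
    inflation = random.uniform(3.0, 10.0)          # percent
    terms_of_trade = random.uniform(90.0, 130.0)   # index level
    noise = random.gauss(0.0, 1.0)
    rows.append([1.0, corruption_index, unemployment, inflation, terms_of_trade])
    approval.append(
        60.0
        - 200.0 * corruption_index
        - 1.5 * unemployment
        - 0.8 * inflation
        + 0.05 * terms_of_trade
        + noise
    )

names = ["intercept", "corruption_index", "unemployment", "inflation", "terms_of_trade"]
coefs = ols(rows, approval)
for name, value in zip(names, coefs):
    print(f"{name:>17}: {value: .3f}")
```

The same setup could be rerun with consumer or business confidence as the outcome. A real analysis would also need standard errors to judge statistical significance, and would have to deal with the serial correlation that is typical of monthly time series; both are left out here.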

2 thoughts on “Guest Post–Assessing Corruption with Big Data”

  1. Pingback: Guest Post–Assessing Corruption with Big Data – Matthews' Blog

  2. “… international experts don’t seem to regard corruption in [wealthy] countries as a major structural problem.”

    That may have been true a decade ago, but attitudes started to change around the time OECD called into question perception-based surveys, such as those of Transparency International.

    Today, you have major outlets such as The Guardian quoting development experts as saying that wealthy countries rake in trillions from poor countries.

    At the link to their site, a PDF further claims: “When looking at worldwide searches on ‘corruption’, Google Trends also provides data on the relative frequency of searches by country which allows us to compare the perception of corruption for 191 countries.”

    They go on to claim: “Results are unsurprising: in general, the perception of corruption is higher in less developed countries.”

    Unsurprising? Again, this perceptions-based approach seems to ignore well-documented benefits to developed countries, such as those reported by the Tax Justice Network. There is also TI’s own corruption barometer, which reports actual corruption experiences, not perceptions, although the authors dismiss this, with no evidence, as being confined to so-called “petty corruption”.

    Most troublesome is that this approach copies from TI in limiting focus to misuse of public resources by selecting the category “Law & Government” i.e. only the public sector (government), not the private sector (business).

    This approach risks missing corrupt practice in business that may be legal, but that impacts negatively on “public resources”, such as the estimated billions in cleanup costs for toxic sites in America. The fact that toxic dumps were enabled by political and legal systems heavily weighted towards business interests does not lessen their impact. What may have been a cash-in-envelope bribe in “less developed” countries may instead be seen in “developed” countries as legalised corruption, such as lobbying, revolving doors, and speaker “fees”.

    How then do we capture private sector corruption?

    I’m no “expert”, just an old hack. But I can’t help wondering about examples set by the global environmental movement, culminating today in global development circles referring to “sustainable development”, using such measures as “triple bottom line” accounting. I’d suggest anti-corruption communities need to apply a similar approach to negative business impacts e.g. corruption of public resources.
