Kevin Davis, the Beller Family Professor of Business Law at New York University School of Law, contributes today’s guest post, based on his recent working paper.
Academics and policymakers enthusiastically endorse “evidence-based” policymaking, for obvious reasons. (After all, what is the alternative? Faith? Popularity contests?) But while evidence—including quantitative evidence—is often helpful, we must be mindful of the limits on what empirical analysis can tell us about important topics. Take the regulation of transnational bribery. Scholars and policymakers would like to know if the current regime—laws like the U.S. Foreign Corrupt Practices Act (FCPA) and U.K. Bribery Act, and international instruments like the OECD Anti-Bribery Convention—has “worked.” That is, have these instruments reduced bribery by the firms that they cover? And did those laws have additional, possibly undesirable collateral consequences, for example reducing investment in countries perceived to be corrupt?
The most sophisticated efforts to answer these questions (see, for example, here and here and here) essentially rely on what social scientists call “natural experiments.” First, one must identify the intervention (the law or policy change) of interest, which (borrowing from medical terminology) researchers call the “treatment.” Next, one must identify the population of interest (say, firms or countries) and an outcome of interest (such as the frequency of bribery or the level of investment). Then, the researcher identifies the subset of those entities that are affected by the intervention (for example, the firms that fall under the jurisdiction of the new anti-bribery law); this is the “treatment group.” The researcher also identifies another subset of entities, the “control group,” that appears otherwise similar to the treatment group but did not receive the treatment (for example, a group of firms that are outside the jurisdiction of the new law). The big difference between a “controlled experiment” and a “natural experiment” is that in a controlled experiment the researcher can randomly choose which members of the population receive the treatment (for example, by randomly selecting some patients to get a new drug and giving the other patients a placebo), but in a natural experiment the assignment of the treatment is done not by the researcher but by some “natural” process in the world. In trying to figure out the effect of an anti-corruption law, it generally is not feasible to conduct a controlled experiment: researchers can’t decide that these firms but not those firms, selected at random, will fall under the jurisdiction of an anti-bribery law. So the best that researchers can do is to rely on natural experiments and try to account as best they can for possible differences between the control group and the treatment group by including additional control variables in a multivariate regression.
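To make the mechanics concrete, here is a minimal sketch in Python of a natural experiment analyzed with a regression that includes a control variable. Everything in it is an assumption made for illustration: the data are simulated, the confounder `size` (think of firm size), the assumed `TRUE_EFFECT`, and all coefficients are hypothetical, and none of it is drawn from the studies linked above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical firm-level confounder (e.g. firm size). It influences both
# whether a firm ends up covered by the law and the outcome itself.
size = rng.normal(size=n)

# "Natural" (non-random) assignment: larger firms are more likely to be covered.
covered = (size + rng.normal(size=n) > 0).astype(float)

# Assumed true effect of coverage on the outcome (purely illustrative).
TRUE_EFFECT = -1.0
outcome = 2.0 + TRUE_EFFECT * covered + 1.5 * size + rng.normal(size=n)

# Naive comparison: mean outcome of the treatment group minus the control group.
naive = outcome[covered == 1].mean() - outcome[covered == 0].mean()

# Multivariate regression: intercept, treatment indicator, and the control variable.
X = np.column_stack([np.ones(n), covered, size])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
adjusted = coef[1]  # coefficient on the treatment indicator

print(f"naive difference in means: {naive:.2f}")
print(f"regression-adjusted estimate: {adjusted:.2f}")
print(f"assumed true effect: {TRUE_EFFECT:.2f}")
```

In this simulated setting the naive comparison is badly off because covered firms differ systematically from uncovered ones, while the adjusted estimate lands close to the assumed effect. That only works because the confounder is observed and included in the regression.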
Unfortunately, when it comes to studying the effects of transnational anti-bribery laws, these sorts of studies face several fundamental challenges, which are all too often overlooked or understated.
- First, as with just about all studies that rely on natural experiments, there is a deep problem with attributing differences in outcomes to the treatment, as opposed to other differences between the treatment and control group. It’s impossible to control for everything, especially since some of the variables that might affect both selection into the treatment group and the outcome of interest may not be the sorts of things researchers can observe.
- Second, in this context many of the outcomes of interest are difficult to measure—particularly illicit activity like bribery. Data from surveys of experiences are available for many countries and there have been some interesting methodological advances in recent years (for discussions see here and here). However, those surveys rarely provide enough information about respondents to determine which foreign laws apply to them. And even some outcomes of interest that might seem easier to measure sometimes prove challenging. For example, a common question in this area is whether anti-bribery laws have discouraged investment in countries perceived as corrupt. These studies are hampered by the limited availability of data on aggregate bilateral investment flows—the popular UNCTAD dataset only provides figures up to 2012. Meanwhile, country-specific firm-level data on foreign direct investment generally are only available for publicly traded firms, and the reported flows often are aggregated across multiple countries.
- Third, it is often challenging to figure out which subjects belong in the treatment group. Take the example of comparing firms that are subject to anti-bribery laws like the FCPA to otherwise similar firms that are not subject to those laws. The FCPA’s anti-bribery provisions clearly apply to any firm incorporated or headquartered in the U.S., and they also apply, with qualifications, to firms listed in the U.S. However, as many European firms have learned the hard way, the FCPA’s anti-bribery provisions also apply to firms that participate in bribery schemes that are partly conceived or implemented in U.S. territory or through the U.S. financial system. There is no straightforward way to identify firms affected in this way. Moreover, the mere fact that a particular firm is subject to the FCPA does not mean the statute applies to misconduct committed by that firm’s parent or subsidiary or affiliates.
- A variant on that problem arises when the main comparison is over time—that is, when the treatment and control groups are not different entities, but the same entities before and after the intervention. If the treatment is something like the adoption of a new law, the timing of the treatment isn’t so hard to determine. But what if the treatment is something like increased enforcement, which is presumably associated with perceptions of increases in either the probability or the magnitude of sanctions? For example, there was a well-documented increase in FCPA enforcement in the years after the OECD Convention came into force. Suppose we want to evaluate the effect of this increased enforcement, by doing a before-and-after comparison. When did the level of enforcement change, and when was that change perceived by firms? Was the relevant date 2005, when we saw an uptick in the number of settlements? Or a few years earlier, when the investigations that led to those settlements were initiated? Or earlier still, when other countries joined the OECD Convention and firms anticipated that this would make it easier for US enforcement agencies to secure cooperation in transnational investigations? Or perhaps as late as 2008, when corporate directors around the world read newspaper articles about the mega-settlement with Siemens?
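The timing problem in the last point can be illustrated with a small Python sketch. The yearly series below is invented: it assumes an outcome that is flat through 2002, falls gradually through 2008, and is flat afterwards. The candidate cutoff years and the three-year comparison window are likewise assumptions chosen for illustration, not estimates from any dataset.

```python
import numpy as np

# Hypothetical yearly outcome index for 1998-2012: flat at 10 through 2002,
# falling by 1 per year from 2003 to 2008, then flat at 4.
years = np.arange(1998, 2013)
outcome = 10 - np.clip(years - 2002, 0, 6)


def before_after(cutoff: int, window: int = 3) -> float:
    """Mean outcome in the `window` years from `cutoff` onward,
    minus the mean in the `window` years just before `cutoff`."""
    before = outcome[(years >= cutoff - window) & (years < cutoff)]
    after = outcome[(years >= cutoff) & (years < cutoff + window)]
    return float(after.mean() - before.mean())


# Three candidate dates for when the "treatment" began (all assumptions).
estimates = {cutoff: before_after(cutoff) for cutoff in (2002, 2005, 2008)}
for cutoff, estimate in estimates.items():
    print(cutoff, estimate)
```

Under these assumptions the same series yields a different before-and-after estimate for each assumed start date, and none of them recovers the full change of 6 points that unfolds over the whole period.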
Given these obstacles, what is to be done? The standard academic response is to advocate for the collection of better data. In the meantime, for policymakers in the real world, the only possible response is to muddle along with what we’ve got: less than perfect evidence and theories that draw upon as broad a range of perspectives as possible, combined with a commitment to reconsideration and revision in the face of new evidence or insights. The kind of evidence favored by proponents of evidence-based analysis is generally difficult to come by in connection with illicit transnational activity. Consequently, we must of necessity explore alternatives to evidence-based policymaking.
Professor Davis, I completely agree that it is crucial to acknowledge these constraints of empirical work, specifically natural experiments, but I find it hard to reconcile them with your conclusion that “we must of necessity explore alternatives to evidence-based policymaking” and the title “The Infeasibility of Evidence-Based Evaluation”. Understanding the constraints of statistical analysis shouldn’t result in throwing it away, but rather in taking it for what it is and not overstating it, as is often done. What is the alternative to evidence-based policymaking that you would propose? It’s clear that theory can and often does make great contributions to policy making (with no theory, empiricists would be out of work), but a theory should not be implemented if its predictions are proven to be incorrect. Obviously, a “proof” (one way or another) never exists with complete certainty, which is the point of the methodological constraints that you present, but it doesn’t mean that we should forsake making progress on understanding how likely it is that a theory actually generates its hypothesized effects, keeping in mind what any analysis does and does not tell us.
Technical note: as to your first point, at least in the context of natural experiments I believe you will agree that it isn’t required to control for all variables that are likely to affect the outcome, but only (not “especially”) those that are also correlated with selection into the treatment/control group.
Your comment on the headline is fair. The paper upon which the post is based has the more modest title, “The Limits of Evidence-Based Regulation…”
Is there an alternative to the evidence-based approach? It depends on what you mean by “evidence-based.” In the literature, that term often is used narrowly to refer to analyses based on evidence generated through systematic research. If we adopt that narrow definition then the alternative is evidence collected unsystematically, which I refer to as “judgment.” In the paper I propose that more attention be paid to ways to incorporate judgment into regulatory decision-making.
I agree with your technical note. In the context of a natural experiment it may not be essential to control for everything but it is important to control for variables that affect the outcome and are correlated with selection into the treatment/control group. My point is simply that those variables are often difficult to observe.
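One way to make the point in this exchange precise is the textbook omitted-variable-bias formula. The sketch below assumes a simple linear model. The symbols are introduced here for illustration and do not come from the paper: $Y$ is the outcome, $D$ the treatment indicator, $U$ a variable left out of the regression, and $\varepsilon$ an error term uncorrelated with $D$ and $U$.

```latex
Y = \alpha + \tau D + \gamma U + \varepsilon,
\qquad
\hat{\tau}_{\text{omitting } U} \;\xrightarrow{\;p\;}\; \tau + \gamma \, \frac{\operatorname{Cov}(D, U)}{\operatorname{Var}(D)}
```

Under these assumptions the bias term vanishes if $U$ does not affect the outcome ($\gamma = 0$) or is uncorrelated with selection into treatment ($\operatorname{Cov}(D, U) = 0$). The difficulty is the case where neither holds and $U$ cannot be observed.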
Thank you for this post Professor Davis. I certainly take your point that there are limits to what we can expect to learn from natural experiments, particularly given the state of data on a topic like corruption. And calling for “better data” is, as you point out, often an unachievable and facile answer to a very complex problem.
With that said, I share some of Haggai’s concerns. If we are too quick to dismiss empirically-based approaches, it seems that we lose a valuable opportunity to gather at least some information on the state of our hypotheses. Might not even an imperfect natural experiment, if well-designed, at least advance our understanding of how corruption (and other social phenomena) impacts the political world, and give us new avenues for research? Perhaps our goal should simply be to be more circumspect about our findings and more transparent about the limitations of what we can assert. I’m certainly fairly skeptical of any “grand theories” that come out of natural experiment studies. But I would also be hesitant to marginalize the approach too much, given the real methodological advantages it can offer. Perhaps one way of dealing with unobserved variables is simply to promote a stronger scholarly culture of replication of studies–if more people are incentivized to examine natural experiments, more missing variables should be identified, and the reliability of these studies should thereby be improved.
With that said, I’m very interested by your response to Haggai, particularly your advocacy for the use of a judgment-based approach. What would this look like in practice? How would you translate it into policy-making? And would this be designed to replace or supplement a more traditionally empirical approach?
Thank you for a thought-provoking discussion.