Guest Post: Using Open Data To Combat Corruption—Moving Beyond the Hype

Robert Palmer, the Director of Partnerships and Communication at the Open Data Charter, contributes today’s guest post:

In order to tackle corruption effectively, one first needs to understand the networks that link government officials, businesses, and professional intermediaries, and then work to either dismantle these networks or at least ensure that these webs of connections are not exploited to enrich individuals and undermine good government. Fortunately, these clandestine networks often leave traces in government-held databases, such as company registers, land title deeds, asset disclosures, and other official records. That’s where open data can be helpful. When the government provides easily accessible public information, it makes it easier for government officials, journalists, and citizens to follow financial flows, understand who’s providing government services, and spot suspect behavior. And that’s why there has been so much enthusiasm about open data in the anticorruption community. In 2015, for example, the G20 anticorruption working group announced a common approach saying that “Open Data can help prevent, detect, investigate and reduce corruption.”

Yet what’s happening on the ground isn’t living up to this hype. Part of the reason is that, as the Web Foundation and Transparency International found in a recent study of five G20 countries, many countries have made only limited progress toward meeting international commitments on open data. But even where open data is available, relatively few organizations are actually using it to expose and combat corruption. There are, of course, exceptions, including Global Witness, the data journalists at the Organized Crime and Corruption Reporting Project, and accountability groups such as BudgIT. Yet the potential for open data to help fight corruption remains largely unrealized.

To help address this shortcoming, the Open Data Charter has spent the last year pulling together a guide for how to use open data to combat corruption. The guide lists 30 types of datasets that could help expose and combat corruption if they are released in the right way, as well as key data standards to ensure consistency and quality between different countries. Of course, the underlying assumptions here are that the types of data listed in the guide can be collected and released by governments in the ways the guide advises, and that there are anticorruption actors who can process this data in ways that are helpful in exposing or preventing corruption. In order to probe these assumptions, the Open Data Charter has teamed up with the Government of Mexico to “road-test” the guide. This will include working out which of the 30 datasets in our guide the government already publishes, which further ones can be released, and how to engage potential users. We’re interested in understanding how, if data is released in the right way, users such as journalists, law enforcement, and civil society can process the data and then use it to have an impact on corruption.

Our approach to this piece of work is guided by a real desire to learn what works: what’s helpful to the government and what’s helpful to external stakeholders who want to tackle corruption. We hope to be able to report on our initial findings over August. If you’re interested in learning more, please get in touch with me: robert [at] In the spirit of transparency and collaboration, the guide itself is open to comment here.

Guest Post: Turning Big Data Into a Useful Anticorruption Tool in Africa

GAB is delighted to welcome back Dr. Elizabeth Dávid-Barrett of the University of Sussex, who contributes today’s guest post:

Many anticorruption advocates are excited about the prospects that “big data” will help detect and deter graft and other forms of malfeasance. As part of a project in this vein, titled Curbing Corruption in Development Aid-Funded Procurement, Mihály Fazekas, Olli Hellmann, and I have collected contract-level data on how aid money from three major donors is spent through national procurement systems; our dataset comprises more than half a million contracts and stretches back almost 20 years. But good data alone isn’t enough. For the data to be useful, there must be a group of interested and informed users, who have both the tools and the skills to analyse the data to uncover misconduct, and then lobby governments and donors to listen to and act on the findings. The analysis of big datasets to find evidence of corruption (for example, the method developed by Mihály Fazekas to identify “red flags” of corruption risks in procurement contract data) requires statistical skills and software, both of which are in short supply in many parts of the developing world, such as sub-Saharan Africa.

Yet some ambitious recent initiatives are trying to address this problem. Lately I’ve had the privilege to be involved in one such initiative, led by Oxford mathematician Balázs Szendrői, that helps empower a group of young African mathematicians to analyse “big data” on public corruption.
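
As a rough illustration of the kind of calculation a “red flag” analysis of procurement contract data can involve, the short Python sketch below computes, for each buyer, the share of contracts that attracted only a single bidder. The choice of indicator, the field names (`buyer`, `bidders`), and the sample records are all assumptions made for illustration; this is not the method used in the project described above.

```python
from collections import defaultdict

# Hypothetical contract records; field names and values are illustrative only.
contracts = [
    {"buyer": "Ministry A", "bidders": 1},
    {"buyer": "Ministry A", "bidders": 1},
    {"buyer": "Ministry A", "bidders": 4},
    {"buyer": "Agency B", "bidders": 3},
    {"buyer": "Agency B", "bidders": 5},
]


def single_bidder_share(records):
    """Return, per buyer, the fraction of contracts that attracted one bid."""
    totals = defaultdict(int)
    singles = defaultdict(int)
    for rec in records:
        totals[rec["buyer"]] += 1
        if rec["bidders"] == 1:
            singles[rec["buyer"]] += 1
    return {buyer: singles[buyer] / totals[buyer] for buyer in totals}


shares = single_bidder_share(contracts)
for buyer, share in sorted(shares.items()):
    print(f"{buyer}: {share:.0%} single-bidder contracts")
```

A high share for a given buyer would be a prompt for closer review rather than evidence of wrongdoing in itself, which is why informed users are needed to interpret such figures.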

Guest Post: Using Big Data to Detect Collusive Bidding in Public Procurement

Bence Tóth and Mihály Fazekas of the Corruption Research Center Budapest (CRCB) contribute the following guest post:

As several earlier posts on this blog have discussed (see, for example, here, here, and here), collusion and corruption in public procurement is a significant problem, one that is extremely difficult to detect and combat. The nature of public procurement markets makes collusion easier to sustain, as pay-offs are higher (demand is often inelastic due to the auction mechanisms used), administrative costs increase entry barriers, and the transparency of procurement contract awards, often intended as an anticorruption device, can actually make it easier for cartel members to monitor one another and punish cheating. Law enforcement agencies have tried various techniques for breaking these cartels, for example by offering leniency to the first company that “defects” on the other cartel members by exposing the collusive arrangement. However, although leniency policies have sometimes proven to be an effective tool to fight coordinated company behavior, the efficacy of this approach is limited given the relative unlikelihood that the government will ever acquire convincing evidence of collusion absent such a defection by an insider. Hence, there is a great need for alternative methods to identify collusive rings and guide traditional investigations.

In many markets, using quantitative indicators to detect collusion has not been feasible, as gathering meaningful tender-level data (or even market-level data) is too costly, or simply impossible. However, in the case of public procurement markets, there is a huge amount of publicly available data, which makes the use of “Big Data” techniques to pinpoint collusion-related irregularities more feasible. Indeed, in collaboration with our colleagues at CRCB, we have developed a simple yet novel approach for detecting collusive behavior.

Big Data and Anticorruption: A Great Fit

There is no shortage of buzz about Big Data in the anticorruption world. It’s everywhere — from public efforts like Transparency International’s public procurement analysis to cutting-edge private-sector FCPA compliance programs implemented by Ernst & Young. TI has blogged about Big Data and corruption, with titles like “Can Big Data Solve the World’s Problems, Including Corruption?” and “The Potential of Fighting Corruption Through Data Mining.” Ernst & Young’s conclusion is more definite: “Anti-Corruption Compliance Now Requires Big Data Analytics.”

In previous posts, contributors to this blog have written about how the anticorruption community was excited about social media-style apps (“crowdsourcing”) in anticorruption efforts. Apps like iPaidABribe allow citizens to report their encounters with corrupt officials, generating a fertile data set for anticorruption activists. Big Data is a related effort: activists can mine huge amounts of data for patterns that reveal corrupt activity, making it a powerful tool for transparency. However, as the name suggests, Big Data requires massive amounts of data in order to be useful. The anticorruption community should throw its weight behind proposals to open up data sets for Big Data analysis. As with crowdsourced anticorruption efforts, the excitement surrounding Big Data could quickly turn into disappointment unless this tool can be integrated into the broader anticorruption effort.