Improving Anti-Money Laundering Models with Synthetic Data

As readers of this blog are well aware, an effective anti-money laundering (AML) regime is crucial for fighting grand corruption, as well as other organized criminal activity. A key part of the AML system is the requirement that banks and other financial institutions identify suspicious transactions and file so-called suspicious activity reports (SARs) with the appropriate government agencies. This is an enormous task, given the volume of financial transactions that banks need to monitor and the challenge of identifying which of those transactions ought to be considered suspicious. Banks spend billions on AML compliance every year, and have developed complex automated systems to assist them in flagging suspect transactions, but existing systems’ ability to efficiently sort suspicious from innocent transactions is limited by the sheer complexity of the task. (False positive rates with current systems, for example, frequently top 90%.)

Many believe that artificial intelligence (AI) systems, such as those employing machine learning (ML), hold enormous promise for improving AML compliance and reducing cost. ML algorithms scrutinize vast datasets to identify patterns that can be used to fashion predictive models. In the AML context, ML algorithms identify those transaction characteristics (or complex combinations of transaction characteristics) that are associated with money laundering, and use these patterns to more efficiently and effectively identify suspicious transactions.  

But some commentators have suggested reasons for skepticism, or at least caution. For example, Mayze Teitler recently wrote on this blog about a number of challenges to operationalizing AI-derived algorithms in the AML context, primarily those arising from limitations in the data on which those algorithms are based. As Mayze correctly pointed out, ML algorithms require vast datasets from which to learn, and the data demands are compounded by the relative rarity of known money laundering cases in the existing datasets.
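To see why the rarity of known laundering cases matters, consider a minimal sketch in Python. The numbers are purely hypothetical (10,000 transactions, 10 of them illicit) and are not drawn from any real dataset. A "model" that never flags anything looks excellent on overall accuracy while catching no laundering at all, which is why a learning algorithm needs enough known positive cases to do better than that baseline.

```python
# Hypothetical, illustrative numbers only; not taken from any real dataset.
TOTAL_TRANSACTIONS = 10_000
ILLICIT_TRANSACTIONS = 10

# Labels: 1 = known money laundering, 0 = legitimate.
labels = [1] * ILLICIT_TRANSACTIONS + [0] * (TOTAL_TRANSACTIONS - ILLICIT_TRANSACTIONS)

# A trivial baseline that never flags any transaction as suspicious.
predictions = [0 for _ in labels]

# Accuracy: share of all transactions labeled correctly.
correct = sum(1 for actual, predicted in zip(labels, predictions) if actual == predicted)
accuracy = correct / len(labels)

# Recall: share of the illicit transactions that the baseline actually caught.
caught = sum(1 for actual, predicted in zip(labels, predictions) if actual == 1 and predicted == 1)
recall = caught / ILLICIT_TRANSACTIONS

print(f"accuracy = {accuracy:.1%}")
print(f"recall   = {recall:.0%}")
```

With so few positive examples, accuracy alone says almost nothing about whether a system has learned what laundering looks like.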

Despite these concerns, I am more bullish than Mayze regarding the promise of AI-based AML systems. Many of the challenges and concerns regarding the development of effective AI systems in the AML context can be overcome through the use of synthetic data.


ML for AML: Is Artificial Intelligence Up to the Task of Anti-Money Laundering Compliance?

Fighting corruption—especially grand corruption—requires effective anti-money laundering (AML) systems capable of efficiently and correctly flagging suspicious transactions. The financial institutions responsible for identifying and reporting suspicious transactions employ automated systems that identify transactions that involve certain red flags—characteristics like transaction amount, location, or deviation from a customer’s typical activity; when the automated system flags a transaction, this triggers further review. But—given the ever-increasing volume and complexity of financial transactions that occur each day, as well as the increasing sophistication of kleptocrats, criminal groups, and others in disguising their illicit activities to avoid the usual red flags—picking out the genuinely suspicious transactions can be extraordinarily difficult. Even the cleverest compliance system designer couldn’t hope to incorporate every potential red flag into the automated system.

The need to stay one step ahead of the bad actors has fueled greater interest in how new advances in data processing technology may help make automated suspicious transaction detection systems more effective. Techno-enthusiasts are particularly interested in deploying deep learning artificial intelligence (AI), as well as classic algorithms that fall under the machine learning (ML) umbrella, in the AML context. ML and AI systems extract patterns from training datasets, and “learn” (by induction) what data patterns are associated with particular identifiable categorizations. Email spam filters provide a simple example. A spam filter performs a task known as classification: it sorts incoming emails into two categories, “spam” and “not spam.” It makes its categorization based on individual characteristics of the emails (such as the sender, body text, etc.). In the AML context, the idea would be to train an algorithm with data on financial transactions, so that the system “learns” to identify suspicious transactions even in cases that might lack the usual red flags that a human designer would program into an automated system. Advocates hope that ML/AI systems could be used both to filter out the false positives (transactions that are flagged as suspicious but turn out, on review, not to raise any concerns; an estimated 99% of all flagged transactions) and to identify unusual, potentially fraudulent behavior that may be overlooked by human regulators (false negatives). Indeed, industry experts are understandably enthusiastic about AI systems that will cut costs while improving accuracy, and proponents claim that “AI holds the keys to a more efficient and transparent AML stance[,]” urging that “[b]anks must take hold of this new [AML] weapon[.]”
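The spam filter example can be made concrete with a minimal sketch in Python. It uses a naive Bayes word-count classifier, chosen here only because it is short; nothing in the discussion above specifies a particular algorithm. The six training emails, the labels, and the function names `train` and `classify` are all invented for illustration.

```python
import math
from collections import Counter

# Tiny, made-up training set: (email body text, label).
TRAINING_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the attached report", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def train(emails):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    label_counts = Counter()
    for text, label in emails:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the higher log naive Bayes score."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["not spam"])
    total_emails = sum(label_counts.values())
    scores = {}
    for label in word_counts:
        # Start from the prior: the share of training emails with this label.
        score = math.log(label_counts[label] / total_emails)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

word_counts, label_counts = train(TRAINING_EMAILS)
print(classify("claim your free cash prize", word_counts, label_counts))
print(classify("agenda for the monday meeting", word_counts, label_counts))
```

On this toy data the first message is labeled "spam" and the second "not spam". The learned "rule" is nothing more than word frequencies observed in the labeled examples, which is the sense in which such a system learns by induction; an AML analogue would replace words with transaction characteristics and depend just as heavily on the labeled training data.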

To the extent that AI tools can improve upon the admittedly clunky automated systems currently in use, they could be a step forward. But ML/AI systems have a less than stellar track record in other contexts, and a model targeted at AML compliance presents some unique challenges.


New Podcast, Featuring Irio Musskopf

A new episode of KickBack: The Global Anticorruption Podcast is now available. In this week’s episode, my collaborators Nils Köbis and Christopher Starke interview Irio Musskopf, a Brazilian software engineer who co-founded and developed an open-data anticorruption project called Operation “Serenata de Amor,” which uses artificial intelligence algorithms to analyze publicly available data to identify and publicize information about suspicious cases involving potential misappropriation of public money. Mr. Musskopf discusses the background of the project, the basic statistical approach to detecting suspicious spending patterns, the reasons for relying exclusively on public data (even when offered access to non-public information), and some of the challenges the project team has encountered. The conversation also covers more general questions regarding the role that intelligent algorithms can play in anticorruption efforts, including whether and where such algorithms might be able to supplant human analysis, and when human decision-making will remain essential.

You can find this episode here. You can also find both this episode and an archive of prior episodes at the following locations:

KickBack is a collaborative effort between GAB and the ICRN. If you like it, please subscribe/follow, and tell all your friends! And if you have suggestions for voices you’d like to hear on the podcast, just send me a message and let me know.

The New Frontier: Using Artificial Intelligence To Help Fight Corruption

In January 2018, scientists from Valladolid, Spain brought a piece of inspiring news to anticorruption advocates: they created an artificial intelligence (AI) system that can predict which Spanish provinces are at higher risk for corruption, and can also identify the variables that are associated with greater corruption (including the real estate tax, inflated housing prices, the opening of bank branches, and the establishment of new companies, among others). This is hardly the first example of computer technology being used in the fight against corruption. Governments, international organizations, and civil society organizations have already been mining “big data” (see, for example, here and here) and using mobile apps to encourage reporting (see, for example, here and here). What makes the recent Spanish innovation notable is its use of AI.

AI is a cluster of technologies that are distinct in their ability to “learn,” rather than relying solely on the instructions specified in advance by human programmers. AI systems come in several types, including “machine learning” (in which a computer analyzes large quantities of data to identify patterns, which in turn enables the machine to perform tasks and make predictions when confronted with new information) and more advanced “deep learning” systems that can find patterns in unstructured data – in hundreds of thousands of dimensions – and can obtain something resembling human cognitive capabilities, though capable of making predictions beyond normal human capacity.

AI is a potentially transformative technology in many fields, including anticorruption. Consider three examples of the anticorruption potential of AI systems:
