Mine Your Way to Combat Money Laundering, Part 2
Data Mining in Anti-Money Laundering (AML) Solutions
As we described in part 1, money laundering operations are characterized by a complex series of financial transactions aimed at obscuring the sources of funds. The large volume of interinstitutional financial transactions and fragmented transaction information, coupled with poorly trained and understaffed enforcement agencies, often result in important alerts not being followed up. Further, alerts may be incomplete and untimely, so that criminal investigations continue well after the illicit proceeds have been successfully laundered.
Data mining has the potential to uncover new scenarios for investigation, leading to the detection of money laundering instances. Data mining is defined as the nontrivial automated process of extracting interesting, significant, implicit, previously unknown and potentially useful information or patterns from data in large databases [1]. The emergence of data warehousing as a viable technology means that enforcement agencies are now able to consolidate financial transactions from multiple institutions across several countries. This gives a consolidated picture of funds transfers that helps in the analysis of transactions. Data mining algorithms and techniques, when applied to such transactions, bring out hidden patterns of funds flow. Coupled with domain knowledge in the form of know your customer (KYC) information and field knowledge from experts, this will enable suspicious transactions to be detected as they occur. Figure 3 is a partial list of data mining techniques that are relevant in AML solutions and their descriptions.
Association rule mining (ARM) reveals hidden relationships based on the co-occurrence of items or attributes. In the case of money laundering, ARM might be used to relate customers' KYC information to their transaction information, revealing the typical patterns of frequent transactions for a particular customer profile. Frequent sequence mining (FSM) takes this one step further, distinguishing sequences of transactions that represent normal business operations from sequences that might represent money laundering.
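As a minimal sketch of the ARM idea, the following computes support and confidence for pairwise rules over a handful of made-up records that mix KYC attributes with transaction traits. The records, the thresholds and the function name `mine_rules` are illustrative assumptions, not taken from any particular AML product.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each is a set of KYC attributes and transaction traits.
records = [
    {"profile=retail", "txn=cash_deposit", "amount=low"},
    {"profile=retail", "txn=cash_deposit", "amount=low"},
    {"profile=retail", "txn=wire_transfer", "amount=low"},
    {"profile=import_export", "txn=wire_transfer", "amount=high"},
    {"profile=import_export", "txn=wire_transfer", "amount=high"},
    {"profile=import_export", "txn=cash_deposit", "amount=high"},
]

def mine_rules(records, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(records)
    item_counts = Counter(item for r in records for item in r)
    pair_counts = Counter(
        pair for r in records for pair in combinations(sorted(r), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of records containing both items
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for ante, cons in ((a, b), (b, a)):
            confidence = count / item_counts[ante]
            if confidence >= min_confidence:
                rules.append((ante, cons, support, confidence))
    return sorted(rules)

rules = mine_rules(records)
for ante, cons, support, confidence in rules:
    print(f"{ante} -> {cons}  support={support:.2f} confidence={confidence:.2f}")
```

A transaction that contradicts a high-confidence rule for its customer profile could then be queued for review. On realistic volumes, an algorithm such as Apriori or FP-growth would replace this brute-force pair count.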
It may not be feasible to monitor all transactions because of computational costs. Typically, monitoring should focus on transactions undertaken by customers whose profiles are classified as "risky." Classification algorithms can be used to identify new customers with risky profiles on the basis of what is known about existing customers and their transaction behavior. Clustering algorithms may be used to segment the account base by criteria such as similarity of activity (volume, value and velocity of transactions). An analysis of the resulting clusters helps enrich the domain knowledge of money laundering experts.
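The clustering step can be illustrated with a small, self-contained k-means sketch over three per-account features corresponding to volume, value and velocity. The account data, the choice of k-means and the farthest-point seeding are all assumptions made for the example; the article does not prescribe a specific clustering algorithm.

```python
import math

# Hypothetical per-account features:
# (transaction count, total value, transactions per day)
accounts = {
    "A1": (12, 3_000.0, 0.4),
    "A2": (15, 3_500.0, 0.5),
    "A3": (10, 2_800.0, 0.3),
    "B1": (240, 950_000.0, 8.0),
    "B2": (260, 990_000.0, 9.0),
}

def standardize(rows):
    """Scale each feature to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [tuple((x - m) / s for x, m, s in zip(row, means, stds)) for row in rows]

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

names = list(accounts)
labels = kmeans(standardize(list(accounts.values())), k=2)
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)
print(clusters)
```

Standardizing first keeps the total-value feature, which has by far the largest numeric range, from dominating the distance calculation.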

Figure 3: Data Mining Techniques for Detecting and Combating Money Laundering Activities
Regression analysis techniques are useful in discovering, validating and quantifying trends from previously solved money laundering cases for use on current cases. For instance, data from previously observed behaviors can be used to estimate which locations (accounts) are most likely to show activity, and on which day and at what time. This can be used to focus future investigative activities.
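To make the trend-quantification idea concrete, here is a minimal ordinary least squares sketch. The weekly counts are invented for illustration, and a single straight-line fit is an assumption; real case data would likely call for richer models.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history from a solved case: week number versus the number of
# sub-threshold cash deposits observed at one account.
weeks = [1, 2, 3, 4, 5]
deposits = [3, 5, 6, 9, 11]

slope, intercept = fit_line(weeks, deposits)
forecast_week_6 = slope * 6 + intercept
print(f"trend: {slope:.2f} deposits/week, week 6 forecast: {forecast_week_6:.1f}")
```

A fitted trend of this kind, validated against several solved cases, could help prioritize which current accounts to watch and when.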
Finally, link analysis and mining help investigators relate a large number of objects of different types, such as people, bank accounts, businesses, wire transfers and cash deposits. The links may be based on transaction activity or on common points of reference, such as transacting with the same customer. An AML system implemented by the U.S. Financial Crimes Enforcement Network (FinCEN), which is based in Virginia, uses link analysis to uncover many instances of previously unknown and potentially high-value transactions for possible investigation [2].
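A very simple form of link analysis is to treat every person, account, business and transfer as a node and to find groups of nodes that are connected to one another. The sketch below does this with a breadth-first search; the entity names and links are hypothetical, and it makes no claim to reflect how the FinCEN system works.

```python
from collections import defaultdict, deque

# Hypothetical links between entities of different types.
links = [
    ("person:Alice", "account:111"),
    ("account:111", "wire:W1"),
    ("wire:W1", "account:222"),
    ("account:222", "business:ShellCo"),
    ("person:Bob", "account:333"),
]

def connected_groups(links):
    """Return the sets of entities that are linked directly or indirectly."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbor in graph[node] - seen:
                seen.add(neighbor)
                queue.append(neighbor)
        groups.append(group)
    return groups

groups = connected_groups(links)
for group in groups:
    print(sorted(group))
```

Here Alice turns out to be connected to ShellCo through two accounts and a wire transfer, a relationship that is not visible from any single record.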
A Data Mining Framework for AML Solutions
Data mining and the subsequent analysis from an AML perspective take place at multiple levels [3]. The framework presented here classifies financial activity and the corresponding information flows into four levels. Each higher level can be thought of as an aggregation of the activities at the lower levels combined with additional domain knowledge.
Mining into Four Levels
The lowermost and the most basic level at which information is available in any financial institution is the transaction level. This consists of individual transactions, such as currency deposits, withdrawals, wire transfers, checks and the like.

Figure 4: A Data Mining Framework for Anti-Money Laundering
The second level is the individual customer or account level. Multiple transactions are associated with specific accounts, and each transaction is associated with at least one account. Accounts may be internal or external to an institution. Aggregating the transactions pertaining to individual accounts gives an account-level picture of financial activity. This picture shows the degree of association between accounts, based on the frequency of the transactions that connect them.
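The move from the transaction level to the account level amounts to an aggregation. As a rough sketch, with invented transactions and hypothetical field names (`from`, `to`, `amount`), the degree of association between two accounts can be approximated by counting and totaling the transactions that connect them:

```python
from collections import Counter

# Hypothetical transaction-level records; EXT-9 stands for an external account.
transactions = [
    {"from": "ACC-1", "to": "ACC-2", "amount": 900.0},
    {"from": "ACC-1", "to": "ACC-2", "amount": 950.0},
    {"from": "ACC-2", "to": "ACC-1", "amount": 400.0},
    {"from": "ACC-2", "to": "ACC-3", "amount": 5000.0},
    {"from": "EXT-9", "to": "ACC-1", "amount": 120.0},
]

def account_links(transactions):
    """Count and total the transactions connecting each unordered account pair."""
    freq, total = Counter(), Counter()
    for t in transactions:
        pair = tuple(sorted((t["from"], t["to"])))
        freq[pair] += 1
        total[pair] += t["amount"]
    return freq, total

freq, total = account_links(transactions)
strongest_pair, strongest_count = freq.most_common(1)[0]
print(strongest_pair, strongest_count, total[strongest_pair])
```

Treating the pair as unordered is a design choice made for the example; keeping direction would instead show the net flow of funds between two accounts.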
The third level can be thought of as the institution level, at which the same customer (business or individual) may have multiple accounts in different financial institutions. Consolidating these accounts may reveal that a business is a front for money laundering involving multiple accounts and multiple individuals. Usually, AML solutions are built for a single institution as part of its vigilance efforts. However, AML solutions with such a local scope are likely to be useful only for monitoring rather than for proactive detection of money laundering. This suggests a need for integrating data from different financial institutions. The AML solutions of central agencies such as FinCEN operate at this level.
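Consolidation at this level can be sketched as grouping account records from several institutions by customer. The example assumes a shared `customer_id` already exists across institutions, which is a strong simplification: in practice, matching the same customer across data sets is itself a hard problem. All records and names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical account records reported by different institutions.
reported_accounts = [
    {"institution": "Bank A", "account": "A-001", "customer_id": "C-17"},
    {"institution": "Bank B", "account": "B-440", "customer_id": "C-17"},
    {"institution": "Bank C", "account": "C-902", "customer_id": "C-17"},
    {"institution": "Bank A", "account": "A-002", "customer_id": "C-23"},
]

def consolidate(records):
    """Group (institution, account) pairs by customer."""
    by_customer = defaultdict(list)
    for rec in records:
        by_customer[rec["customer_id"]].append((rec["institution"], rec["account"]))
    return dict(by_customer)

def multi_institution_customers(consolidated, min_institutions=2):
    """List customers holding accounts at several distinct institutions."""
    return sorted(
        cust
        for cust, accts in consolidated.items()
        if len({inst for inst, _ in accts}) >= min_institutions
    )

consolidated = consolidate(reported_accounts)
candidates = multi_institution_customers(consolidated)
print(candidates)
```

Holding accounts at several institutions is not suspicious in itself; the consolidated view simply gives analysts the complete set of accounts to examine together.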
Finally, we have the ring level, which involves a professional money laundering operation of broad scope consisting of multiple businesses or institutions. Figure 4 depicts a data mining framework for AML solutions. Data is collected from multiple internal and external sources.
A level-based view is useful from various perspectives. Different data mining techniques may be applied at different levels to yield insights into the domain, and a technique that is useful at one level might not yield any result at another. For example, link analysis may be useful at the account or institution level but not at the transaction level. Similarly, it makes no sense to cluster individual transactions, while clustering accounts based on the similarity of their transactions helps to relate accounts. Analysis at any single level may miss indicators of activities at other levels. Domain knowledge from experts is incorporated into the data mining operations and is also used to build the level-based classification of warehouse data.
Challenges and Limitations of Data Mining in AML Operations
Although data mining is a useful technology for money laundering detection, effective implementation has to surmount many challenges and may be limited due to technological and other reasons.
- The nature of money laundering: Building money laundering profiles is not an easy task because money laundering instances are not self-revealing. Unlike credit card fraud, where the victim is likely to report the fraud, reports of money laundering instances are likely to be rare [4]. Therefore, domain knowledge from experts needs to be integrated into AML systems. This is likely to be a challenge because of knowledge acquisition limitations and a paucity of experts.
- Nature of criminal conduct: Distinguishing money laundering activities from legitimate transactions is likely to be challenging for three reasons: criminal conduct takes dynamic and diverse forms; many patterns of criminal conduct differ little from legitimate transactions; and associating patterns in time and space is difficult.
- Political issues: The performance of data mining tools is limited by access to financial data sets. A consolidated and useful analysis requires data collated from multiple jurisdictions, i.e., financial institutions in multiple countries. Consolidating the data in one place for timely analysis requires not just technical capabilities but also smooth political processes such as obtaining permissions, willingness to share information, overcoming organizational inertia and the like.
- Fragmented information: Relevant data required for AML analysis is not usually available at a single place. Data preparation for data mining involves integration from multiple sources, both internal and external, while taking care of issues such as data inconsistency, faulty data, fragmented records and missing data.
- Massive databases: This challenge is a consequence of the huge and growing volume of financial data and the relatively small number of money laundering instances it contains. Analyzing such volumes imposes a heavy computational burden on AML systems.
- Training: Training of investigative analysts in model tuning and results interpretation is likely to place high demand on resources such as time and money. Financial analysts also need to know which data mining approach will yield the desired results given varying case conditions.
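The data preparation work mentioned under fragmented information can be illustrated with a small sketch that merges customer records from an internal system and an external feed, filling in missing values and reporting inconsistencies. The records, field names and the function `merge_records` are hypothetical, and the rule of preferring the internal value is an assumption chosen for simplicity.

```python
# Hypothetical customer records from an internal system and an external feed.
internal = {
    "C-17": {"name": "Acme Trading", "country": "US", "occupation": None},
    "C-23": {"name": "J. Doe", "country": None, "occupation": "dentist"},
}
external = {
    "C-17": {"name": "ACME TRADING", "country": "US", "occupation": "importer"},
    "C-23": {"name": "J. Doe", "country": "CA", "occupation": "teacher"},
}

def merge_records(primary, secondary):
    """Fill gaps in primary from secondary; report values that disagree."""
    merged, conflicts = {}, []
    for key in sorted(set(primary) | set(secondary)):
        a, b = primary.get(key, {}), secondary.get(key, {})
        row = {}
        for field in sorted(set(a) | set(b)):
            va, vb = a.get(field), b.get(field)
            # Prefer the primary value; fall back to the secondary one.
            row[field] = va if va is not None else vb
            # Compare case-insensitively so "Acme" and "ACME" do not conflict.
            if (
                va is not None
                and vb is not None
                and str(va).casefold() != str(vb).casefold()
            ):
                conflicts.append((key, field, va, vb))
        merged[key] = row
    return merged, conflicts

merged, conflicts = merge_records(internal, external)
print(merged)
print(conflicts)
```

Conflicting values are reported rather than silently resolved, so that an analyst can decide which source to trust.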
Ever-increasing volumes of financial data have rendered traditional methods of money laundering detection infeasible and ineffective. A variety of data mining techniques are well-suited to help financial investigators generate timely and accurate alerts. These techniques not only identify existing patterns of money laundering but also help investigators identify new methods used by launderers. A level-based view of the AML system helps in identifying the data mining techniques appropriate at each level and the utility of the knowledge gained in the process. However, there are many challenges to implementing a data mining-based AML system. These challenges are largely related to data and less to technology. In addition, political issues such as data sharing across multiple jurisdictions, permissions and skepticism also need to be addressed.
References:
1. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R. Uthurusamy (Eds.). Advances in Knowledge Discovery and Data Mining. AAAI Press: New York, 1996.
2. T. Senator. "The Financial Crimes Enforcement Network AI System (FAIS)." AI Magazine 4, 1995.
3. M. Sparrow. "The State of the Fraud Control Game; and the Impact of Electronic Claims Processing on Fraud and Fraud Control." Proceedings of the International Symposium on Criminal Justice Information Systems and Technology, 1994.
4. U.S. Congress, Office of Technology Assessment (OTA). "Information Technologies for Control of Money Laundering." OTA-ITC-630. Washington, DC: U.S. Government Printing Office, September 1995.
G. S. Vidyashankar is former director, Data warehousing and Business Intelligence Practice, at Cognizant Technology Solutions. He has more than 16 years of IT and management experience in the software industry, with specific focus on data warehousing and business intelligence. He has extensive experience in developing business analytical models for banking and financial services, retail and the telecom sectors. He headed the Business Analytics Group at Cognizant Technology Solutions. He may be reached at vidyashankar.gs@gmail.com.
Rajesh Natarajan is assistant manager, Projects, Business Analytics Group at Cognizant Technology Solutions. He specializes in applying data mining techniques to real-world problem scenarios. He has published in international conferences and journals. Natarajan may be reached at rajesh.natarajan@cognizant.com.
Subhrangshu Sanyal is assistant manager, Business Development, Banking and Financial Services at Cognizant Technology Solutions. He has over 10 years of IT and rich domain experience. He may be reached at Subhrangshu.Sanyal@cognizant.com.