1 Pre-Crime Data Mining




1.1 Behavioral Profiling

With every call you make on your cell phone and every swipe of your debit or credit card, a digital signature of when, what, and where you call and buy is built incrementally, every second of every day, in the servers of your credit card provider and wireless carrier. Models created with data mining technologies monitor these digital signatures, your DNA-like consumer code, looking for deviations from the norm. Once a deviation is spotted, they instantly issue silent alerts to monitor your card or phone for potential theft. This is nothing new; it has been taking place for years. What is different is that, since 9/11, this use of data mining is set to take an even more active role in criminal detection, security and behavioral profiling.

Behavioral profiling is not racial profiling, which is not only illegal but also crude and ineffective. Racial profiling simply does not work; race is too broad a category to be useful, and it is one-dimensional. What matters is suspicious behavior and the related digital information found in diverse databases, which data mining can be used to analyze and quantify. Behavioral profiling is the capability to recognize patterns of criminal activity, to predict when and where crimes are likely to take place, and to identify their perpetrators. Pre-crime is not science fiction; it is the objective of data mining techniques based on artificial intelligence (AI) technologies. The same data mining technologies that marketers have used to provide personalization, which is the exact placement of the right offer to the right person at the right time, can be used to direct the right inquiry to the right perpetrator at the right time: before they commit the crime.
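The idea of watching a spending signature for deviations from the norm can be sketched in a few lines. This is only an illustration, assuming a simple z-score test on purchase amounts; the function name `flag_deviations`, the threshold of 3 standard deviations, and the sample amounts are all hypothetical, and production fraud models use far richer signals.

```python
from statistics import mean, stdev

def flag_deviations(history, new_amounts, threshold=3.0):
    """Return the new amounts that lie far from the cardholder's usual spending.

    Assumption: 'far' means more than `threshold` sample standard
    deviations away from the mean of the historical amounts.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant history: any different amount is a deviation.
        return [a for a in new_amounts if a != mu]
    flagged = []
    for amount in new_amounts:
        z = (amount - mu) / sigma  # distance from the norm in std. deviations
        if abs(z) > threshold:
            flagged.append(amount)
    return flagged

# Hypothetical purchase history for one card, in dollars.
history = [42.0, 38.5, 51.0, 45.25, 40.0, 47.75, 39.0, 44.5]
alerts = flag_deviations(history, [43.0, 950.0, 48.0])
print(alerts)  # only the 950.0 purchase stands out
```

A single attribute is used here for clarity; the same test could in principle be applied to time of day, location, or merchant type.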
Investigative data mining is the visualization, organization, sorting, clustering, segmenting and prediction of criminal behavior using data attributes such as age, previous arrests, modus operandi, type of building, household income, time of day, geo code, countries visited, housing type, auto make, length of residency, type of license, utility usage, IP address, VISA type, number of children, place of birth, average usage of ATM card, number of credit cards, and so on; the data points can run into the hundreds. Pre-crime is the interactive process of predicting criminal behavior by mining this vast array of data using several artificial intelligence technologies, including:

Link Analysis, for creating graphical networks to view criminal associations and interactions
Intelligent Agents, for retrieving, monitoring, organizing and acting on case-related information
Text Mining, for searching through gigabytes of documents for concepts and key words
Neural Networks, for recognizing the patterns of criminal behavior and anticipating criminal activity
Machine Learning Algorithms, for extracting rules and graphical maps of criminal behavior and perpetrator profiles

1.2 Rivers of Scraps

"It's not going to be a cruise missile or a bomber that will be the determining factor," Defense Secretary Donald Rumsfeld said over and over in the days following September 11. "It's going to be a scrap of information." Make that multiple scraps, millions of them, flowing in a digital river of information at the speed of light from servers networked across the planet. Rumsfeld is right: the landscape of battle has changed irretrievably, and so have the weapons. If commercial airliners can become missiles, then so too must we change how we use one of the most ethereal technologies of human creativity and imagination: AI.
AI now takes the form of text mining robots that scan and translate terabyte-sized databases and can detect deception, 3-D link analysis networks that correlate human associations and interpersonal interactions, biometric identification devices that monitor for suspected chemicals, powerful pattern-recognition neural networks that look for the signature of fraud, silent intrusion detection systems that monitor keystrokes, autonomous intelligent agent software that retrieves e-mails and can sense emotions, and real-time machine-learning profiling systems that sit in chat rooms, all bred from (and fostering) a new type of alien intelligence. These are the weapons and tools of criminal investigation today and tomorrow, whether we like it or not.

Which of the 1.5 million people who cross U.S. borders each day is the courier for a smuggling operation? Which respected merchant on ebay.com is about to abandon successful auction bidders, skipping out with hundreds of thousands of dollars? What tiny shred of the world's $1.5 trillion in daily foreign exchange transactions is the payment from an al-Qaeda cell for a loose Russian nuke? How many failed password attempts to log into a network are a sign of an organized intrusion attack? Finding the needles in these types of moving haystacks, and the answers to these kinds of questions, is where data mining can be used to anticipate crimes and terrorist attacks.

1.3 Data Mining

Data mining is the fusion of statistical modeling, database storage, and artificial intelligence technologies. Statisticians have used computers for decades to prove or disprove hypotheses on collected data. In fact, one of the largest software companies in the world, SAS, rents its statistical programs to nearly every government agency and major corporation in the United States. Linear regressions and other types of modeling analyses are common and have been used in everything from the drug approval process of the Food and Drug Administration to the credit rating of individuals by financial service providers.

Another element in the development of data mining is the growing capacity of data storage. In the 1970s, most data storage depended upon COBOL programs and storage systems that were not conducive to easy data extraction for inductive data analysis. Today, however, organizations can store and query terabytes of information in sophisticated data warehouse systems. In addition, the development of multidimensional data models, such as those used in relational databases, has allowed users to move from a transaction view of customers to a more dynamic and analytical way of marketing to and retaining their most profitable clients.
The final element in data mining's evolution, however, is AI. During the 1980s, machine learning algorithms were developed to enable software to learn, genetic algorithms were designed to evolve and improve autonomously, and neural networks came into acceptance as powerful programs for classification, prediction and profiling. During the last decade, intelligent agents were developed that can autonomously incorporate all of these AI functions and use them to go out over networks and the Internet to scour the planet for the information their masters programmed them to retrieve. When combined, these AI technologies enable the creation of applications designed to listen, learn, act, evolve and identify anything from a potentially fraudulent credit card transaction to tanks seen from satellites and, now more than ever, to prevent potential criminal activity.
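One of the questions posed above, how many failed password attempts signal an organized intrusion, lends itself to a small illustration. The sketch below counts failures per source inside a sliding time window. The function name `find_suspect_sources`, the limit of five failures per 60 seconds, and the sample addresses are assumptions chosen for the example, not values taken from any real intrusion detection product.

```python
from collections import defaultdict, deque

def find_suspect_sources(events, max_failures=5, window_seconds=60):
    """Flag sources with more than `max_failures` failed logins in a window.

    `events` is an iterable of (timestamp_seconds, source, succeeded)
    tuples, assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # source -> timestamps of recent failures
    suspects = set()
    for ts, source, succeeded in events:
        if succeeded:
            continue  # only failed attempts count toward the threshold
        window = recent[source]
        window.append(ts)
        # Drop failures that have fallen out of the time window.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            suspects.add(source)
    return suspects

# Hypothetical log: one source hammers the login every 5 seconds,
# another simply mistypes a password twice, five minutes apart.
events = sorted(
    [(t * 5, "10.0.0.7", False) for t in range(8)]
    + [(0, "10.0.0.9", False), (300, "10.0.0.9", False), (310, "10.0.0.9", True)]
)
suspects = find_suspect_sources(events)
print(suspects)
```

A fixed threshold like this is the crudest possible rule; the learning techniques described in this section would instead derive the boundary from past attack data.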

As a result of these developments, data mining flowered during the late 1990s, with many commercial, medical, marketing and manufacturing applications. Retail companies eagerly applied complex analytical capabilities to their data to increase their customer base. The financial community found trends and patterns to predict fluctuations in stock prices and economic demand. Credit card companies used it to target their offerings, micro-segmenting their customers and prospects and setting the best possible interest rates to maximize their profits. Telecommunication carriers used the technology to develop churn models to predict which customers were about to jump ship and sign with a wireless competitor. The ultimate goal of data mining is the prediction of human behavior, which is by far its most common business application; this goal can easily be adapted to the detection and deterrence of criminals. These and many more applications have demonstrated that, rather than requiring a human to deal with hundreds of descriptive attributes, data mining allows the automatic analysis of databases and the recognition of important trends and behavioral patterns.

Increasingly, crime and terror in our world will be digital by nature. In fact, one of the largest criminal monitoring and detection enterprises in the world is at this very moment using a neural network to look for fraud. The HNC Falcon system uses, in part, a neural network to look for patterns of potential fraud in about 80% of all credit card transactions, every second of every day. So it is that analysts and investigators will come to rely on machines and artificial intelligence to detect and deter crime and terrorism in today's world.
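To give a feel for how a neural network learns a fraud pattern from examples, here is a minimal sketch of a single logistic neuron, the simplest building block of such networks, trained by gradient descent. It is not the HNC Falcon model, whose internals are not described here; the two features (purchase size and distance from home, both scaled to the range 0 to 1), the training data, and the names `train_neuron` and `score` are hypothetical.

```python
import math
import random

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def score(weights, bias, x):
    """Output of the neuron: an estimated probability of fraud."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_neuron(samples, labels, epochs=2000, rate=0.5, seed=0):
    """Fit one logistic neuron with stochastic gradient descent."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in samples[0]]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = score(weights, bias, x) - y  # gradient of the log loss
            weights = [w - rate * err * xi for w, xi in zip(weights, x)]
            bias -= rate * err
    return weights, bias

# Hypothetical, hand-made examples: [relative purchase size, distance from home].
samples = [[0.1, 0.0], [0.2, 0.1], [0.15, 0.2], [0.3, 0.1],
           [0.9, 0.8], [0.8, 0.9], [0.95, 0.7], [0.7, 0.95]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = transaction later confirmed as fraud

weights, bias = train_neuron(samples, labels)
print(score(weights, bias, [0.85, 0.9]))  # large, far-away purchase
print(score(weights, bias, [0.1, 0.1]))   # small, local purchase
```

The first score should come out well above 0.5 and the second well below it. Real systems combine many such units and many more attributes.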
Breakthrough applications are already taking place in which neural networks are used for forensic analysis of chemical compounds to detect arson and illegal drug manufacturing. Coupled with agent technology, sensors can be deployed to detect bio-terror attacks; DARPA has already solicited a prototype for such a system.

1.4 Investigative Data Warehousing

Data warehousing is the practice of compiling transactional data with lifestyle demographics to construct composites of customers, and then decomposing them via segmentation reports and data mining techniques to extract profiles or views of who they are and what they value. Data warehouse techniques have been practiced for a decade in private industry. These same techniques have so far not been applied to criminal detection and security deterrence, but they well could be. Using the same approach, behavioral data from diverse sources such as the Internet (clickstream data captured by Internet mechanisms such as cookies, invisible graphics, and registration forms), demographics from data providers such as ChoicePoint, CACI, Experian, Acxiom, DataQuick, etc., and utility and telecom usage data, coupled with criminal data, could be used to construct composites representing views of perpetrators. These composites would enable the analysis of similarities and traits, which through data mining could yield predictive models for investigators and analysts. As in private industry, better views of perpetrators could be developed, enabling the detection and prevention of criminal and terrorist activity. Effectively combining multiple sources of data can lead law enforcement investigators to discover patterns that help them be proactive in their investigations.

1.5 Link Analysis

Link analysis is a good start in mapping terrorist activity and criminal intelligence by visualizing associations between entities and events. Link analyses often involve seeing, via a chart or a map, the associations between suspects and locations, whether by physical contacts, communications in a network, phone calls, financial transactions, or the Internet and e-mail. Criminal investigators often use link analysis to begin to answer such questions as: who knows whom, and when and where have they been in contact? Intelligence analysts and criminal investigators must often correlate enormous amounts of data about individuals in fraudulent, political, terrorist, narcotics and other criminal organizations. A critical first step in the mining of this data is viewing it in terms of relationships between the people and organizations under investigation. One of the first tasks in data mining and criminal detection is the visualization of these associations, which commonly involves the use of link analysis charts.
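Behind a link analysis chart sits a graph of entities and the contacts between them. The following sketch, under the assumption that contact records arrive as simple (entity, entity, kind of contact) triples, builds such a graph and uses a breadth-first search to answer a "who knows whom" question by finding the shortest chain of associations between two entities. The names `build_links` and `shortest_chain` and all the records are invented for illustration.

```python
from collections import deque

def build_links(records):
    """Turn (entity_a, entity_b, kind) records into an undirected graph.

    The graph maps each entity to a dict of {neighbor: kind of contact}.
    """
    graph = {}
    for a, b, kind in records:
        graph.setdefault(a, {})[b] = kind
        graph.setdefault(b, {})[a] = kind
    return graph

def shortest_chain(graph, start, goal):
    """Breadth-first search for the shortest chain of links, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(graph.get(node, {})):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # the two entities are not connected

# Hypothetical contact records drawn from phone, bank and address data.
records = [
    ("Suspect A", "Suspect B", "phone call"),
    ("Suspect B", "Account 17", "wire transfer"),
    ("Account 17", "Suspect C", "wire transfer"),
    ("Suspect A", "Address X", "residence"),
    ("Suspect D", "Address Y", "residence"),
]
graph = build_links(records)
chain = shortest_chain(graph, "Suspect A", "Suspect C")
print(chain)  # ['Suspect A', 'Suspect B', 'Account 17', 'Suspect C']
print(shortest_chain(graph, "Suspect A", "Suspect D"))  # None
```

Drawing the same graph with nodes and labeled edges yields the kind of chart investigators work from; the stored contact kinds would label the edges.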