Identity Resolution in Criminal Justice Data: An Application of NORA
Queen E. Booker
Minnesota State University, Mankato, 150 Morris Hall, Mankato, Minnesota
Queen.booker@mnsu.edu

Abstract. Identifying aliases is an important component of the criminal justice system. Accurately identifying a person of interest or someone who has been arrested can significantly reduce costs across the entire criminal justice system. This paper examines the problem domain of matching and relating identities, reviews traditional approaches to the problem, and applies the identity resolution approach described by Jeff Jonas [1], together with relationship awareness, to the specific case of client identification for the indigent defense office. The combination of identity resolution and relationship awareness offered improved accuracy in matching identities.

Keywords: Pattern Analysis, Identity Resolution, Text Mining

1 Introduction

Appointing counsel for indigent clients is a complex task with many constraints and variables. The manager responsible for assigning the attorney is limited by the number of attorneys at his/her disposal. If the manager assigns an attorney to a case with which the attorney has a conflict of interest, the office loses the funds already invested in the case by the representing attorney, and additional resources are needed to bring the next attorney up to speed. Thus, it is in the best interest of the manager to accurately identify the client, the victim, and any potential witnesses in order to minimize conflicts of interest. As the number of cases grows, the manager often simply selects the next attorney on the list when assigning the case. This type of assignment can lead to a high number of withdrawals due to a late-identified conflict of interest.
Costs to the office increase due to additional incarceration expenses while the client is held in custody as well as the sunk costs of prior and repeated attorney representation regardless of whether the client is in or out of custody.
These problems are further exacerbated when insufficient systems are in place to manage the data that could be used to make assignments easier. The data on the defendant is maintained separately by the various criminal justice agencies, including the indigent defense service agency itself. This presents a challenge as the number of cases increases without a concomitant increase in the staff available to make the assignments. Thus the individuals responsible for assigning attorneys want not only to assign attorneys better, but also to do so more quickly. Aggregating data from all the information systems in the criminal justice process has been shown to improve the attorney assignment process [2].

Criminal justice systems have many disparate information systems, each with its own data sets. These include systems concerned with arrests, court case scheduling, and the prosecuting attorney's office, to name a few. In many cases, relationships are non-obvious. It is not unusual for a repeat offender to provide an alternative name that is not validated before the arrest data is sent to the indigent defense office. Likewise, it is not unusual for potential witnesses to provide alternative names in an attempt to protect their identities. Further, it is not unusual for a victim to provide yet another name in an attempt to hide a previous interaction with the criminal justice process. Detecting aliases becomes harder as the indigent defense problem grows in complexity.

2 Problems with matching

Matching identities or finding aliases is a difficult process to perform manually. The process relies on institutional knowledge and/or visual cues. For example, if an arrest report is accompanied by a picture, the manager or attorney can easily ascertain the person's identity. But that is usually not the case.
Arrest reports are generally textual, listing the defendant's name, demographic information, arrest charges, victim, and any witness information. With institutional knowledge, the manager or an attorney can review the information on the report and identify the person by a previously used alias or by other pertinent information on the report. So it is possible for humans to identify many aliases, and hence possible for an information system, because the enterprise contains all the necessary knowledge. But the knowledge and the process are trapped across isolated operational systems within the criminal justice agencies.

One approach to improving the indigent defense agency problem is to amass information from as many available data sources as possible, clean the data, and find matches to improve the defense process. Traditional algorithms are not well suited to this process. Matching is further encumbered by the poor quality of the underlying data. Lists containing subjects of interest commonly have typographical errors, deliberate misspellings by defendants who want to frustrate data matching efforts, and legitimate natural variability (Mike versus Michael and 123 Main Street versus 123 S. Maine Street). Dates are often a problem as well. Months and days are sometimes transposed, especially in international settings. Numbers often have transposition errors or might have been entered with a different number of leading zeros.
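The data-quality problems listed above (nickname variants, transposed months and days, differing leading zeros) can be illustrated with a small normalization sketch. This is only an illustration under stated assumptions, not the paper's cleaning procedure: the nickname table and the function names normalize_first_name, normalize_number, and dates_may_match are hypothetical.

```python
import re
from datetime import date

# Hypothetical nickname table; a real system would need a far larger,
# culturally aware list.
NICKNAMES = {"mike": "michael", "rob": "robert"}


def normalize_first_name(name: str) -> str:
    """Map a known nickname to its canonical form (Mike -> michael)."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def normalize_number(value: str) -> str:
    """Keep digits only and drop leading zeros, so '000123' equals '123'."""
    digits = re.sub(r"\D", "", value)
    return digits.lstrip("0") or "0"


def candidate_dates(year: int, first: int, second: int) -> set:
    """Return every valid date the pair could mean (month/day or day/month)."""
    candidates = set()
    for month, day in ((first, second), (second, first)):
        try:
            candidates.add(date(year, month, day))
        except ValueError:
            pass  # not a valid calendar date in this order
    return candidates


def dates_may_match(a: tuple, b: tuple) -> bool:
    """True if two (year, x, y) triples agree under some month/day ordering."""
    return bool(candidate_dates(*a) & candidate_dates(*b))
```

With these helpers, normalize_first_name("Mike") and normalize_first_name("Michael") produce the same key, and dates_may_match((1990, 3, 4), (1990, 4, 3)) treats the two entries as possibly the same date. Address variants such as "Main" versus "Maine" are not handled here and would need fuzzy comparison.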
2.1 Current Identity Matching Approaches

Organizations typically employ three general types of identity matching systems: merge/purge and match/merge, binary matching engines, and centralized identity catalogues.

Merge/purge and match/merge is the process of combining two or more lists or files while simultaneously identifying and eliminating duplicate records. This process was developed by direct marketing organizations to eliminate duplicate customer records in mailing lists.

Binary matching engines test an identity in one data set for its presence in a second data set. These matching engines are also sometimes used to compare one identity with another single identity (versus a list of possibilities), with the output often expected to be a confidence value for the likelihood that the two identity records are the same. These systems were designed to help organizations recognize individuals with whom they had previously done business or, alternatively, recognize that the identity under evaluation is a known subject of interest (that is, on a watch list), thus warranting special handling. [1]

Centralized identity catalogues are systems that collect identity data from disparate and heterogeneous data sources and assemble it into unique identities, while retaining pointers to the original data source and record, with the purpose of creating an index.

Each of the three types of identity matching systems uses either probabilistic or deterministic matching algorithms. Probabilistic techniques rely on training data sets to compute attribute distribution and frequency, looking for both common and uncommon patterns. These statistics are stored and used later to determine confidence levels in record matching. As a result, any record containing similar but uncommon data might be considered a record of the same person with a high degree of probability.
These systems lose accuracy when the underlying data's statistics deviate from the original training set, and they must be retrained frequently to maintain their accuracy.

Deterministic techniques rely on pre-coded expert rules to define when records should be matched. One rule might be that if the names are close (Robert versus Rob) and the social security numbers are the same, the system should consider the records as matching identities. These systems often have complex rules based on itemsets such as name, birthdate, zip code, telephone number, and gender. However, these systems fail as data becomes more complex.

3 NORA

Jeff Jonas introduced a system called NORA, which stands for non-obvious relationship awareness. He developed the system specifically to solve Las Vegas casinos' identity matching problems. NORA accepted data feeds from numerous enterprise information systems and built a model of identities and relationships between identities (such as shared addresses or phone numbers) in real time. If a new identity matched or related to another identity in a manner that warranted human scrutiny (based on basic rules, such as good guy connected to very bad guy), the system would immediately generate an intelligence alert. The system approach for the Las Vegas casinos is very similar to the
needs of the criminal justice system. The data needed to identify aliases and relationships for conflict of interest concerns comes from multiple data sources (arresting agency, probation offices, court systems, prosecuting attorney's office, and the defense agency itself), and the ability to identify a client successfully is needed in real time to reduce costs to the defense office. The NORA system requirements were:

Sequence neutrality. The system needed to react to new data in real time.

Relationship awareness. Relationship awareness was designed into the identity resolution process so that newly discovered relationships could generate real-time intelligence. Discovered relationships also persisted in the database, which is essential for generating alerts beyond one degree of separation.

Perpetual analytics. When the system discovered something of relevance during the identity matching process, it had to publish an alert in real time to secondary systems or users before the opportunity to act was lost.

Context accumulation. Identity resolution algorithms evaluate incoming records against fully constructed identities, which are made up of the accumulated attributes of all prior records. This technique enabled new records to match to known identities in toto, rather than relying on binary matching that could only match records in pairs. Context accumulation improved accuracy and greatly improved the handling of low-fidelity data that might otherwise have been left as a large collection of unmatched orphan records.

Extensible. The system needed to accept new data sources and new attributes through the modification of configuration files, without requiring that the system be taken offline.

Knowledge-based name evaluations. The system needed detailed name evaluation algorithms for high-accuracy name matching.
Ideally, the algorithms would be based on actual names taken from all over the world and developed into statistical models to determine how and how often each name occurred in its variant form. This empirical approach required that the system be able to automatically determine the culture that the name most likely came from, because names vary in predictable ways depending on their cultural origin.

Real time. The system had to handle additions, changes, and deletions from real-time operational business systems. Processing had to be fast enough that matching results and accompanying intelligence (such as whether the person is on a watch list or the address is missing an apartment number based on prior observations) could be returned to the operational systems in under a second.

Scalable. The system had to be able to process records on a standard transaction server, adding information to a repository that holds hundreds of millions of identities. [1]

Like the gaming industry, the defense attorney's office has relatively low daily transactional volumes. Although it receives booking reports on an ongoing basis, initial
court appearances are handled by a specific attorney, and the assignments are made daily, usually the day after the initial court appearance. The attorney at the initial court appearance is not the officially assigned attorney, which gives the manager a window of opportunity between booking and case assignment to identify the client accurately. But the analytical component of accurate identification involves numerous records with accurate linkages, including aliases as well as past relationships and networks related to the case. The legal profession has rules and regulations that define what constitutes a conflict of interest. Lawyers must follow these rules to maintain their license to practice, which makes the assignment process even more critical. [3]

NORA's identity resolution engine is capable of performing in real time against extraordinary data volumes. The gaming industry's requirement of fewer than 1 million affected records a day means that a typical installation might involve a single Intel-based server and any one of several leading SQL database engines. This performance establishes an excellent baseline for application to the defense attorney data, since NORA demonstrated that it could handle multibillion-row databases consisting of hundreds of millions of constructed identities and ingest new identities at a rate of more than 2,000 identity resolutions per second; such ultra-large deployments require 64 or more CPUs and multiple terabytes of storage, and move the performance bottleneck from the analytic engine to the database engine itself. While the defense attorney dataset is not nearly as large, the processing time on the casino data suggests that NORA would be able to handle the defense attorney's needs accurately and easily in real time.
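The context accumulation requirement described in this section can be sketched in a few lines. The sketch below is an assumption-laden illustration, not NORA's algorithm: the Identity class, the resolve function, the three attributes, and the threshold of two shared attributes are all hypothetical. It shows an incoming record being compared against the accumulated attributes of a constructed identity rather than against single prior records.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """A constructed identity: the accumulated attributes of all prior records."""
    names: set = field(default_factory=set)
    phones: set = field(default_factory=set)
    addresses: set = field(default_factory=set)

    def shared_attributes(self, record: dict) -> int:
        """Count how many of the record's attributes this identity already holds."""
        return sum([
            record.get("name") in self.names,
            record.get("phone") in self.phones,
            record.get("address") in self.addresses,
        ])

    def absorb(self, record: dict) -> None:
        """Accumulate the record's attributes into the identity."""
        for key, bucket in (("name", self.names),
                            ("phone", self.phones),
                            ("address", self.addresses)):
            if record.get(key):
                bucket.add(record[key])


def resolve(records, threshold=2):
    """Assign each record to the first identity sharing >= threshold attributes."""
    identities = []
    for record in records:
        match = next((i for i in identities
                      if i.shared_attributes(record) >= threshold), None)
        if match is None:
            match = Identity()
            identities.append(match)
        match.absorb(record)
    return identities


records = [
    {"name": "John Smith", "phone": "555-0100"},
    {"name": "John Smith", "phone": "555-0100", "address": "12 Oak St"},
    {"name": "Jon Smyth", "phone": "555-0100", "address": "12 Oak St"},
]
identities = resolve(records)
```

The third record shares only a phone number with the first record, so a pairwise comparison against that record alone would leave it unmatched; against the accumulated identity it shares both phone and address and is resolved as an alias. This sketch does not re-evaluate earlier decisions when records arrive in a different order, so it does not address the sequence neutrality requirement.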
4 Identity resolution

Identity resolution is an operational intelligence process, typically powered by an identity resolution engine, whereby organizations can connect disparate data sources with a view to understanding possible identity matches and non-obvious relationships across multiple data sources. It analyzes all of the information relating to individuals and/or entities from multiple sources of data, and then applies likelihood and probability scoring to determine which identities are a match and what, if any, non-obvious relationships exist between those identities. These engines are used to uncover risk, fraud, and conflicts of interest.

Identity resolution is designed to assemble i identity records from j data sources into k constructed, persistent identities. The term "persistent" indicates that matching outcomes are physically stored in a database at the moment a match is computed.

Accurately evaluating the similarity of proper names is undoubtedly one of the most complex (and most important) elements of any identity matching system. Dictionary-based approaches fail to handle the complexities of names, for example common names such as Robert Johnson. These approaches fail even more often when cultural influences in naming are involved.

Soundex is an improvement over traditional dictionary approaches. It uses a phonetic algorithm for indexing names by their sound when pronounced in English. The basic aim is for names with the same pronunciation to be encoded to the same string so that
matching can occur despite minor differences in spelling. Such systems attempt to neutralize slight variations in name spelling by assigning some form of reduced "key" to a name (by eliminating vowels or eliminating double consonants), but they frequently fail because of external factors; for example, different fuzzy matching rules are needed for names from different cultures.

Jonas found that the deterministic method is essential for eliminating dependence on training data sets. As such, the system no longer needed periodic reloads to account for statistical changes to the underlying universe of data. However, he also notes many common conditions in which deterministic techniques fail; specifically, certain attributes were so overused that it made more sense to ignore them than to use them for identity matching and detecting relationships. For example, two people with the first name of "Rick" who share the same social security number are probably the same person, unless that number is one shared by many unrelated records. Two people who have the same phone number probably live at the same address, unless that phone number is a travel agency's phone number. He refers to such values as generic because the overuse diminishes the usefulness of the value itself. It is impossible to know all of these generic values a priori (for one reason, they keep changing), so probabilistic-like techniques are used to detect and remember them automatically.

His identity resolution system uses a hybrid matching approach that combines deterministic expert rules with a probabilistic-like component to detect generics in real time (to avoid the drawback of training data sets). The result is expert rules that look something like this:

If the name is similar
AND there is a matching unique identifier
THEN match
UNLESS this unique identifier is generic

In his system, a unique identifier might include social security or credit-card numbers, or a passport number, but would not include such values as phone number or date of birth.
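The hybrid rule above can be sketched as follows. This is a minimal illustration under assumptions, not Jonas's implementation: GenericTracker, names_similar, should_match, the threshold of three identities, and the 0.8 similarity cutoff are hypothetical, and difflib's SequenceMatcher stands in for the knowledge-based name evaluation a production system would use.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical cutoff: an identifier seen on this many distinct
# identities is treated as generic.
GENERIC_THRESHOLD = 3


class GenericTracker:
    """Remembers which identities each identifier value has appeared on."""

    def __init__(self, threshold: int = GENERIC_THRESHOLD):
        self.threshold = threshold
        self.seen = defaultdict(set)  # identifier value -> identity ids

    def observe(self, value: str, identity_id: int) -> None:
        self.seen[value].add(identity_id)

    def is_generic(self, value: str) -> bool:
        return len(self.seen.get(value, ())) >= self.threshold


def names_similar(a: str, b: str, cutoff: float = 0.8) -> bool:
    """Stand-in name comparison based on character-level similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= cutoff


def should_match(record: dict, identity: dict, tracker: GenericTracker) -> bool:
    """IF name similar AND identifier matches THEN match UNLESS identifier is generic."""
    return (names_similar(record["name"], identity["name"])
            and record["uid"] == identity["uid"]
            and not tracker.is_generic(record["uid"]))
```

Because the tracker counts identities as records arrive, a value can become generic over time without any retraining step, which is the property the text attributes to the probabilistic-like component.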
The term "generic" here means the value has become so widely used (across a predefined number of discrete identities) that one can no longer use this same value to disambiguate one identity from another. [1]

However, the approach used in this study for the defense data included a merged itemset that combined date of birth, gender, and ethnicity code, because the social security number could not be used for identification, whether for practical or legal reasons. Thus, an identifier was developed from a merged itemset after using the SUDA algorithm to identify infrequent itemsets based on data mining [4].

The actual deterministic matching rules for NORA as well as the defense attorney system are much more elaborate in practice because they must explicitly address fuzzy matching to scrub and clean the data as well as address transposition errors in numbers, malformed addresses, and other typographical errors. The current defense attorney agency model has thirty-six rules. Once the data is cleansed, it is stored and indexed to provide user-friendly views of the data that make it easy for the user to find specific information
when performing queries and ad hoc reporting. Then, a data-mining algorithm using a combination of binary regression and logit models is run to update patterns for assigning attorneys based on the day's outcomes [5]. The algorithm identifies patterns for the outcomes and a tree structure for attorney and defendant combinations where the attorney completed the case. [6]

Although matching accuracy is highly dependent on the available data, using the techniques described here achieves the goals of identity resolution, which essentially boil down to accuracy, scalability, and sustainability, even in extremely large transactional environments.

5 Relationship awareness

According to Jonas, detecting relationships is vastly simplified when a mechanism for doing so is physically embedded into the identity matching algorithm. Before analyzing meaningful relationships, the system must be able to resolve unique identities, so identity resolution must occur first. Jonas argued that it was computationally efficient to observe relationships at the moment the identity record is resolved, because in-memory residual artifacts (which are required to match an identity) comprise a significant portion of what is needed to determine relevant relationships. Relevant relationships, much like matched identities, were then persisted in the same database.

Notably, some relationships are stronger than others; a relationship score assigned to each relationship pair captures this strength. For example, living at the same address three times over 10 years should yield a higher score than living at the same address once for three months.

As identities are matched and relationships detected, NORA evaluates user-configurable rules to determine whether any new insight warrants publishing an intelligence alert to a specific system or user. One simplistic way to do this is via conflicting roles.
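A conflicting-roles check of this kind might be sketched as below. The RoleRegistry class, the role names, and the alert tuple layout are hypothetical choices for illustration, not NORA's rule engine, and the sketch covers only the case where both roles belong to the same resolved identity.

```python
from collections import defaultdict

# Illustrative roles that conflict with representing the same person as a client.
CONFLICTING_ROLES = {"victim", "witness", "co-defendant"}


class RoleRegistry:
    """Tracks the roles each resolved identity has held across cases."""

    def __init__(self):
        self.roles = defaultdict(list)  # identity id -> [(case id, role)]

    def add(self, identity_id: str, case_id: str, role: str) -> list:
        """Record a role and return any alerts the new observation raises."""
        alerts = []
        for other_case, other_role in self.roles[identity_id]:
            pair = {role, other_role}
            # Alert when one role is "client" and the other is a conflicting role.
            if "client" in pair and pair & CONFLICTING_ROLES:
                alerts.append((identity_id, case_id, role, other_case, other_role))
        self.roles[identity_id].append((case_id, role))
        return alerts


registry = RoleRegistry()
registry.add("id-7", "case-1", "witness")
alerts = registry.add("id-7", "case-2", "client")
```

Here the second call returns one alert, because identity id-7 is about to become a client after having been recorded as a witness in an earlier case. Extending the check to related identities would require the persisted relationship data described in this section.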
A typical rule for the defense attorney might be a notification any time a client role is associated with a role of victim, witness, co-defendant, or previously represented relative, for example. In this case, associated might mean zero degrees of separation (they're the same person) or one degree of separation (they're roommates). Relationships are maintained in the database to one degree of separation; higher degrees are determined by walking the tree. Although the technology supports searching for any degree of separation between identities, higher orders include many insignificant leads and are thus less useful.

6 Comparative Results

This research is an ongoing effort to improve the attorney assignment process in the defense attorney offices. As economic times get harder, crime increases, and as crime increases, so does the number of people who require representation by the public defense offices. The ability to quickly identify conflicts of interest reduces the amount of time a
person stays in the system and also reduces the time needed to process the case.

The original system built to work with alias/identity matching was called the Court Appointed Counsel System, or CACS. CACS identified 83% more conflicts of interest than the indigent defense managers during the initial assignments [Booker]. Using the merged itemset and an algorithm based on NORA's underlying technology, the conflicts identified improved from 83% to 87%. But the real improvement came in the processing time. The key to the success of these systems is the ability to update and provide accurate data at a moment's notice. Utilizing NORA's underlying algorithms improved the updating and matching process significantly, allowing new data to be entered and analyzed within a couple of hours, as opposed to the days it took to process using the CACS algorithms. Further, the merged itemset approach helped to provide a unique identifier in 90% of the cases, significantly increasing automated relationship identifications.

The ability to handle real-time transactional data with sustained accuracy will continue to be of "front and center" importance as organizations seek competitive advantage. The identity resolution technology applied here provides evidence that such technologies can be applied not only to simple fraud detection but also to improve business decision making and intelligence support to entities whose purpose is to ...

References

1. Jonas, J., "Threat and Fraud Intelligence, Las Vegas Style," IEEE Security & Privacy, Vol. 4, No. 6, pp. 28-34 (2006)
2. Booker, Q., Kitchens, F. K., and Rebman, C., "A Rule Based Decision Support System Prototype for Assigning Felony Court Appointed Counsel," Proceedings of the 2004 Decision Sciences Annual Meeting, Boston, MA (2004)
3. Gross, L., "Are Differences Among the Attorney Conflict of Interest Rules Consistent with Principles of Behavioral Economics," Georgetown Journal of Legal Ethics, Vol. 19, p. 111 (2006)
4. Manning, A. M., Haglin, D. J., and Keane, J. A., "A Recursive Search Algorithm for Statistical Disclosure Assessment," Data Mining and Knowledge Discovery (2007), conditionally accepted
5. Kitchens, Fred L., Sharma, S. K., and Harris, T., "Cluster Computers for e-business Applications," Asian Journal of Information Systems (AJIS), 3 (10) (2004)
6. Forgy, C., "Rete: A Fast Algorithm for the Many Pattern/Many Object Pattern Match Problem," Artificial Intelligence 19 (1982)
More informationIllinois. An Assessment of Access to Counsel & Quality of Representation in Delinquency Proceedings
Illinois An Assessment of Access to Counsel & Quality of Representation in Delinquency Proceedings by the Children and Family Justice Center, Bluhm Legal Clinic, Northwestern University School of Law,
More informationMARYLAND RULES OF PROCEDURE TITLE 1 GENERAL PROVISIONS CHAPTER 100 APPLICABILITY AND CITATION
TITLE 1 GENERAL PROVISIONS CHAPTER 100 APPLICABILITY AND CITATION AMEND Rule 1-101 (q) to add collaborative law processes to the applicability of Title 17, as follows: Rule 1-101. APPLICABILITY... (q)
More informationHow In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time
SCALEOUT SOFTWARE How In-Memory Data Grids Can Analyze Fast-Changing Data in Real Time by Dr. William Bain and Dr. Mikhail Sobolev, ScaleOut Software, Inc. 2012 ScaleOut Software, Inc. 12/27/2012 T wenty-first
More informationBetter planning and forecasting with IBM Predictive Analytics
IBM Software Business Analytics SPSS Predictive Analytics Better planning and forecasting with IBM Predictive Analytics Using IBM Cognos TM1 with IBM SPSS Predictive Analytics to build better plans and
More informationViewpoint ediscovery Services
Xerox Legal Services Viewpoint ediscovery Platform Technical Brief Viewpoint ediscovery Services Viewpoint by Xerox delivers a flexible approach to ediscovery designed to help you manage your litigation,
More informationCourse 803401 DSS. Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Oman College of Management and Technology Course 803401 DSS Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization CS/MIS Department Information Sharing
More informationHIGH SPEED DATA RETRIEVAL FROM NATIONAL DATA CENTER (NDC) REDUCING TIME AND IGNORING SPELLING ERROR IN SEARCH KEY BASED ON DOUBLE METAPHONE ALGORITHM
HIGH SPEED DATA RETRIEVAL FROM NATIONAL DATA CENTER (NDC) REDUCING TIME AND IGNORING SPELLING ERROR IN SEARCH KEY BASED ON DOUBLE METAPHONE ALGORITHM Md. Palash Uddin 1, Ashfaque Ahmed 2, Md. Delowar Hossain
More informationComplex, true real-time analytics on massive, changing datasets.
Complex, true real-time analytics on massive, changing datasets. A NoSQL, all in-memory enabling platform technology from: Better Questions Come Before Better Answers FinchDB is a NoSQL, all in-memory
More informationHurwitz ValuePoint: Predixion
Predixion VICTORY INDEX CHALLENGER Marcia Kaufman COO and Principal Analyst Daniel Kirsch Principal Analyst The Hurwitz Victory Index Report Predixion is one of 10 advanced analytics vendors included in
More informationDigging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA
Digging for Gold: Business Usage for Data Mining Kim Foster, CoreTech Consulting Group, Inc., King of Prussia, PA ABSTRACT Current trends in data mining allow the business community to take advantage of
More informationPredictive Analytics
Predictive Analytics How many of you used predictive today? 2015 SAP SE. All rights reserved. 2 2015 SAP SE. All rights reserved. 3 How can you apply predictive to your business? Predictive Analytics is
More informationData Management Implementation Plan
Appendix 8.H Data Management Implementation Plan Prepared by Vikram Vyas CRESP-Amchitka Data Management Component 1. INTRODUCTION... 2 1.1. OBJECTIVES AND SCOPE... 2 2. DATA REPORTING CONVENTIONS... 2
More informationACL WHITEPAPER. Automating Fraud Detection: The Essential Guide. John Verver, CA, CISA, CMC, Vice President, Product Strategy & Alliances
ACL WHITEPAPER Automating Fraud Detection: The Essential Guide John Verver, CA, CISA, CMC, Vice President, Product Strategy & Alliances Contents EXECUTIVE SUMMARY..................................................................3
More informationIntegrating SAP and non-sap data for comprehensive Business Intelligence
WHITE PAPER Integrating SAP and non-sap data for comprehensive Business Intelligence www.barc.de/en Business Application Research Center 2 Integrating SAP and non-sap data Authors Timm Grosser Senior Analyst
More informationIBM SPSS Modeler Premium
IBM SPSS Modeler Premium Improve model accuracy with structured and unstructured data, entity analytics and social network analysis Highlights Solve business problems faster with analytical techniques
More informationUsing LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset.
White Paper Using LSI for Implementing Document Management Systems Turning unstructured data from a liability to an asset. Using LSI for Implementing Document Management Systems By Mike Harrison, Director,
More informationEffecting Data Quality Improvement through Data Virtualization
Effecting Data Quality Improvement through Data Virtualization Prepared for Composite Software by: David Loshin Knowledge Integrity, Inc. June, 2010 2010 Knowledge Integrity, Inc. Page 1 Introduction The
More informationA Knowledge Management Framework Using Business Intelligence Solutions
www.ijcsi.org 102 A Knowledge Management Framework Using Business Intelligence Solutions Marwa Gadu 1 and Prof. Dr. Nashaat El-Khameesy 2 1 Computer and Information Systems Department, Sadat Academy For
More informationDemand Generation vs. Marketing Automation David M. Raab Raab Associates Inc.
Demand Generation vs. Marketing Automation David M. Raab Raab Associates Inc. Demand generation systems help marketers to identify, monitor and nurture potential customers. But so do marketing automation
More informationVictim Services Programs. Core Service Definitions
Victim Services Programs Core Service Definitions EFFECTIVE MAY 2012 1 P a g e Core Services Overview The Criminal Justice Coordinating Council (CJCC) strives to be a responsible and exemplary steward
More information[callout: no organization can afford to deny itself the power of business intelligence ]
Publication: Telephony Author: Douglas Hackney Headline: Applied Business Intelligence [callout: no organization can afford to deny itself the power of business intelligence ] [begin copy] 1 Business Intelligence
More informationData Mining Applications in Fund Raising
Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,
More informationCHAPTER 6: CRIMINAL PROCEDURE MICHIGAN COURT RULES OF 1985
CHAPTER 6: CRIMINAL PROCEDURE MICHIGAN COURT RULES OF 1985 Subchapter 6.000 General Provisions Rule 6.001 Scope; Applicability of Civil Rules; Superseded Rules and Statutes (A) Felony Cases. The rules
More informationLeveraging Big Data for the Next Generation of Health Care Ken Cunningham, VP Analytics Pam Jodock, Director Business Development
Leveraging Big Data for the Next Generation of Health Care Ken Cunningham, VP Analytics Pam Jodock, Director Business Development December 6, 2012 Health care spending to Reach 20% of U.S. Economy by 2020
More informationnot possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
More information<no narration for this slide>
1 2 The standard narration text is : After completing this lesson, you will be able to: < > SAP Visual Intelligence is our latest innovation
More informationKey Factors for Payers in Fraud and Abuse Prevention. Protect against fraud and abuse with a multi-layered approach to claims management.
White Paper Protect against fraud and abuse with a multi-layered approach to claims management. October 2012 Whether an act is technically labeled health insurance fraud or health insurance abuse, the
More informationNine Common Types of Data Mining Techniques Used in Predictive Analytics
1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
More informationREQUEST FOR INFORMATION RFI No.: 15-0006 FOR FRAUD ANALYTICS SOFTWARE
REQUEST FOR INFORMATION RFI No.: 15-0006 FOR FRAUD ANALYTICS SOFTWARE This is a Request for Information (RFI) issued by Citizens Property Insurance Corporation ( Citizens ). Citizens is seeking market
More informationInsider Threat Detection Using Graph-Based Approaches
Cybersecurity Applications & Technology Conference For Homeland Security Insider Threat Detection Using Graph-Based Approaches William Eberle Tennessee Technological University weberle@tntech.edu Lawrence
More informationThe Power of Risk, Compliance & Security Management in SAP S/4HANA
The Power of Risk, Compliance & Security Management in SAP S/4HANA OUR AGENDA Key Learnings Observations on Risk & Compliance Management Current State Current Challenges The SAP GRC and Security Solution
More informationBuilding In-Database Predictive Scoring Model: Check Fraud Detection Case Study
Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study Jay Zhou, Ph.D. Business Data Miners, LLC 978-726-3182 jzhou@businessdataminers.com Web Site: www.businessdataminers.com
More informationIn-Database Analytics
Embedding Analytics in Decision Management Systems In-database analytics offer a powerful tool for embedding advanced analytics in a critical component of IT infrastructure. James Taylor CEO CONTENTS Introducing
More informationSubchapter 6.600 Criminal Procedure in District Court
Subchapter 6.600 Criminal Procedure in District Court Rule 6.610 Criminal Procedure Generally (A) Precedence. Criminal cases have precedence over civil actions. (B) Pretrial. The court, on its own initiative
More informationChapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
More informationLANCASTER COUNTY ADULT DRUG COURT
LANCASTER COUNTY ADULT DRUG COURT Administered by the Lancaster County Department of Community Corrections Judicial Oversight by the Lancaster County District Court www.lancaster.ne.gov keyword: drug court
More informationFirst Line of Defense
First Line of Defense SecureWatch ANALYTICS FIRST LINE OF DEFENSE OVERVIEW KEY BENEFITS Comprehensive Visibility Gain comprehensive visibility into DDoS attacks and cyber-threats with easily accessible
More informationGetting started with a data quality program
IBM Software White Paper Information Management Getting started with a data quality program 2 Getting started with a data quality program The data quality challenge Organizations depend on quality data
More informationSolve Your Toughest Challenges with Data Mining
IBM Software Business Analytics IBM SPSS Modeler Solve Your Toughest Challenges with Data Mining Use predictive intelligence to make good decisions faster Solve Your Toughest Challenges with Data Mining
More informationCloud Computing and Advanced Relationship Analytics
Cloud Computing and Advanced Relationship Analytics Using Objectivity/DB to Discover the Relationships in your Data By Brian Clark Vice President, Product Management Objectivity, Inc. 408 992 7136 brian.clark@objectivity.com
More informationData Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
More informationHexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
More informationIntegrating Netezza into your existing IT landscape
Marco Lehmann Technical Sales Professional Integrating Netezza into your existing IT landscape 2011 IBM Corporation Agenda How to integrate your existing data into Netezza appliance? 4 Steps for creating
More informationEasily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
More informationCourse Syllabus For Operations Management. Management Information Systems
For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third
More informationBig Data Integration: A Buyer's Guide
SEPTEMBER 2013 Buyer s Guide to Big Data Integration Sponsored by Contents Introduction 1 Challenges of Big Data Integration: New and Old 1 What You Need for Big Data Integration 3 Preferred Technology
More informationPredictive Analytics Workshop With IBM SPSS Modeler
Predictive Analytics Workshop With IBM SPSS Modeler Introduction What Makes a Smarter City? Objectives Smarter Public Safety with IBM The Power of Predictive Analytics What IBM Strives to Accomplish in
More informationFinancial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
More informationWHITEPAPER. Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk
WHITEPAPER Creating and Deploying Predictive Strategies that Drive Customer Value in Marketing, Sales and Risk Overview Angoss is helping its clients achieve significant revenue growth and measurable return
More informationMeeting Identity Theft Red Flags Regulations with IBM Fraud, Risk & Compliance Solutions
Leveraging Risk & Compliance for Strategic Advantage IBM Information Management software Meeting Identity Theft Red Flags Regulations with IBM Fraud, Risk & Compliance Solutions XXX Astute financial services
More informationConnecting with clients through authentic interactions that not only satisfy their practical needs, but also their emotional
THE FAMILY LAW PROCESS AND ITS REQUIREMENTS Connecting with clients through authentic interactions that not only satisfy their practical needs, but also their emotional wants.! 1. ARRAIGNMENT: Arraignment
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationWHITEPAPER. Complying with the Red Flag Rules and FACT Act Address Discrepancy Rules
WHITEPAPER Complying with the Red Flag Rules and FACT Act Address Discrepancy Rules May 2008 2 Table of Contents Introduction 3 ID Analytics for Compliance and the Red Flag Rules 4 Comparison with Alternative
More informationSusan J Hyatt President and CEO HYATTDIO, Inc. Lorraine Fernandes, RHIA Global Healthcare Ambassador IBM Information Management
Accurate and Trusted Data- The Foundation for EHR Programs Susan J Hyatt President and CEO HYATTDIO, Inc. Lorraine Fernandes, RHIA Global Healthcare Ambassador IBM Information Management Healthcare priorities
More informationData Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over
More informationA Simplified Framework for Data Cleaning and Information Retrieval in Multiple Data Source Problems
A Simplified Framework for Data Cleaning and Information Retrieval in Multiple Data Source Problems Agusthiyar.R, 1, Dr. K. Narashiman 2 Assistant Professor (Sr.G), Department of Computer Applications,
More informationHow To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
More informationData Quality Assessment. Approach
Approach Prepared By: Sanjay Seth Data Quality Assessment Approach-Review.doc Page 1 of 15 Introduction Data quality is crucial to the success of Business Intelligence initiatives. Unless data in source
More informationEnhancing Education Quality Assurance Using Data Mining. Case Study: Arab International University Systems.
Enhancing Education Quality Assurance Using Data Mining Case Study: Arab International University Systems. Faek Diko Arab International University Damascus, Syria f-diko@aeu.ac.sy Zaidoun Alzoabi Arab
More informationThe Informatica Solution for Improper Payments
The Informatica Solution for Improper Payments Reducing Improper Payments and Improving Fiscal Accountability for Government Agencies WHITE PAPER This document contains Confidential, Proprietary and Trade
More informationKnowledgent White Paper Series. Developing an MDM Strategy WHITE PAPER. Key Components for Success
Developing an MDM Strategy Key Components for Success WHITE PAPER Table of Contents Introduction... 2 Process Considerations... 3 Architecture Considerations... 5 Conclusion... 9 About Knowledgent... 10
More information