Data Mining Techniques and Opportunities for Taxation Agencies


Florida Consultant

In This Session...
- You will learn the data mining techniques below and their application for tax agencies:
  - ABC Analysis
  - Association Analysis
  - Clustering
  - Decision Trees
  - Score Carding Techniques
- You will be provided with descriptions of real-world usage and ideas for practical application in your agency
- You will hear examples of their usage from the State of Florida

Agenda
- What is Data Mining?
- Data Mining Techniques
- How can we use Data Mining in our Agency?
- Usage in Florida
- Recap/Closing

A Few Common Questions
Data mining still holds a mythical reputation.
- Is it very complicated?
  - Advanced techniques are still on paper!
  - Conceptually, the techniques are still a challenge. Application usually is not.
- What can data mining do?
  - Turn tomes of data into small, digestible, ideally actionable amounts of information
- What staff do I need to mine data?
  - In the past it was the actuarial staff, select financial staff, and PhDs.
  - Today's tools allow process power users to mine data effectively.

What is Data Mining? The Answer!
Definitions:
- "the science of extracting useful information from large data sets or databases." D. Hand, H. Mannila, P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge, MA.
- "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data." W. Frawley, G. Piatetsky-Shapiro, and C. Matheus (Fall 1992). "Knowledge Discovery in Databases: An Overview."
- "the statistical and logical analysis of large sets of data, looking for patterns that can aid decision making." Ellen Monk, Bret Wagner (2006). Concepts in Enterprise Resource Planning, Second Edition. Thomson Course Technology, Boston, MA.
- "Intentionally using statistical techniques on a large amount of data to either discover new insight or to convert it into actionable information." Reporting and Analytics 2008 Conference.

What is data mining? Vital Concepts
- Focused ideas, especially the goal
- Data quality/integrity
- Discovery vs. predictive
- Training, benchmarking, and prediction

ABC Analysis - Concept
- ABC Analysis is similar to the Pareto Principle
  - The 80/20 Rule
  - The Law of the Vital Few
  - The Principle of Factor Sparsity
- ABC Analysis is primarily used as a discovery technique.
- Concept: 80% of the outcome comes from 20% of the root causes
  - Call Center: 80% of all incoming calls are in reference to 20% of the potential reasons for calling in
  - Economics: 80% of a country's wealth is owned by 20% of the population.*
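The 80/20 idea can be checked directly on a list of counts or amounts. The sketch below is only an illustration: the call-reason counts are invented, and the class cutoffs passed to `abc_classes` are example parameters that would normally be chosen per process.

```python
def pareto_share(values, top_fraction=0.2):
    """Return the share of the total contributed by the largest items.

    `top_fraction` is the fraction of items (by count) treated as the
    "vital few"; at least one item is always included.
    """
    ranked = sorted(values, reverse=True)
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


def abc_classes(values, a_cut, b_cut):
    """Label each item "A", "B", or "C" by cumulative share of total value.

    Items are ranked from largest to smallest. An item is labelled "A" if
    the cumulative share of the items ranked before it is below `a_cut`,
    "B" if that share is below `b_cut`, and "C" otherwise.
    """
    total = sum(values)
    ranked_indexes = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    labels = [""] * len(values)
    running = 0
    for i in ranked_indexes:
        share_before = running / total
        if share_before < a_cut:
            labels[i] = "A"
        elif share_before < b_cut:
            labels[i] = "B"
        else:
            labels[i] = "C"
        running += values[i]
    return labels


# Hypothetical data: number of calls received for each of ten call reasons.
call_counts = [450, 200, 100, 80, 60, 40, 30, 20, 10, 10]

print(pareto_share(call_counts))           # share of calls from the top 20% of reasons
print(abc_classes(call_counts, 0.6, 0.8))  # example cutoffs, not a standard
```

Ranking first and then walking the cumulative share keeps the logic easy to audit, which matters when the class codes drive a business process.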

ABC Analysis - Concept (cont.)
- ABC Analysis is a slight deviation from the Pareto Principle
- It began as an inventory classification technique and is commonly referenced in supply chain applications
- ABC codes break down as follows:
  - A Class contains 60% of total value
  - B Class contains 20% of total value
  - C Class accounts for the remaining 20% of total value
- Classification codes are very process specific and likely have a different meaning in each process

Association Analysis - Concept
- Association Analysis unveils hidden interactions, correlations, and patterns in your data.
  - Affinity Analysis
  - Market Basket Analysis (MBA)
- Concept: Uncover patterns in the data and then define rules describing the interaction.
  - If a person purchases onions and potatoes, they are 50% more likely to also purchase beef.
  - {Onions, Potatoes} => {Beef}
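A rule such as {Onions, Potatoes} => {Beef} is usually judged by its support, confidence, and lift. The sketch below computes all three from a list of transactions; the baskets are invented for illustration, and reading "50% more likely" as a lift of 1.5 is one interpretation, not something the slides state.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent.

    support    = share of transactions containing both item sets
    confidence = share of antecedent transactions that also contain the consequent
    lift       = confidence divided by the overall share of the consequent
    """
    antecedent, consequent = set(antecedent), set(consequent)
    baskets = [set(t) for t in transactions]
    n = len(baskets)
    n_antecedent = sum(1 for b in baskets if antecedent <= b)
    n_consequent = sum(1 for b in baskets if consequent <= b)
    n_both = sum(1 for b in baskets if (antecedent | consequent) <= b)

    support = n_both / n
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    lift = confidence / (n_consequent / n) if n_consequent else 0.0
    return support, confidence, lift


# Hypothetical market baskets.
baskets = [
    ["onions", "potatoes", "beef"],
    ["onions", "potatoes", "beef"],
    ["onions", "potatoes", "beef", "milk"],
    ["onions", "potatoes", "bread"],
    ["beef", "bread"],
    ["milk", "bread"],
    ["onions", "milk"],
    ["potatoes", "eggs"],
]

support, confidence, lift = rule_metrics(baskets, {"onions", "potatoes"}, {"beef"})
print(support, confidence, lift)
```

A lift above 1 means the consequent appears more often alongside the antecedent than it does overall. Recomputing the metrics on a moving window of recent transactions is one way to let the rules change over time.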

Association Analysis - Concept (cont.)
- Transactions determine the rules
- The rules should change over time (moving timeframe)
- If automated, Association Analysis is typically augmented with other techniques

Clustering - Concept
- Clustering statistically separates records into smaller, distinct groups based on common similarities
- Sometimes described as a classification problem, although clustering groups records without predefined class labels
- Clustering can be used for discovery or prediction.
- The two primary forms of clusters are Hierarchical and Partitional
- [Figure: Raw Data vs. Clustered Data]

Clustering - Concept (cont.)
- [Figure: Hierarchical Clusters]

Clustering - Concept (cont.)
- [Figure: Partitional Clusters]
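To make partitional clustering concrete, here is a minimal k-means sketch. The slides do not name a specific algorithm, so k-means is an assumption chosen because it is a common partitional method; the points and the starting centroids are invented.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Partition `points` into clusters around the given starting centroids.

    Repeats two steps until the centroids stop moving (or `max_iter` is
    reached): assign every point to its nearest centroid, then move each
    centroid to the mean of the points assigned to it.
    Returns (centroids, clusters).
    """
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members)) if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical records described by two numeric attributes.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, initial_centroids=[(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

Passing the starting centroids explicitly keeps the result reproducible. In practice they are often chosen at random and the run is repeated several times. A hierarchical method would instead merge or split groups step by step and produce a tree of clusters.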

Decision Trees
- An easy-to-understand method that classifies possible outcomes of previous decisions based on characteristics in the data, and develops rules to predict future outcomes.
- Once a tree is trained and created, it can be used on new data to predict decisions
- Trees can even be partially defined to meet regulatory requirements

Decision Trees - Concept (cont.)
- Decision Trees can also be used in two ways:
  - Discovery: to find new insight into the data
  - Prediction: to statistically guess what the outcome will be (requires a trained decision tree)
- A prediction tree can automate business processes, but must be periodically retrained
- Commonly used in:
  - Insurance
  - Legal
  - Sales/Marketing
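The train-then-predict cycle can be sketched with the smallest possible tree: a single split, often called a decision stump. Everything below is illustrative. The field names `deduction_ratio` and `years_filing`, the records, and the outcome labels are hypothetical, and the use of Gini impurity to pick the split is an assumption rather than a method taken from the slides.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0.0 means all labels are the same)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def train_stump(rows, labels):
    """Find the single feature/threshold split with the lowest weighted Gini impurity.

    `rows` is a list of dicts mapping feature name to a numeric value.
    Candidate thresholds are midpoints between consecutive distinct values.
    """
    best = None
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best["score"]:
                best = {
                    "score": score,
                    "feature": feature,
                    "threshold": threshold,
                    "left": Counter(left).most_common(1)[0][0],
                    "right": Counter(right).most_common(1)[0][0],
                }
    return best


def predict(stump, row):
    """Apply the learned rule to a new record."""
    if row[stump["feature"]] <= stump["threshold"]:
        return stump["left"]
    return stump["right"]


# Hypothetical past audit outcomes used as training data.
rows = [
    {"deduction_ratio": 0.10, "years_filing": 12},
    {"deduction_ratio": 0.15, "years_filing": 3},
    {"deduction_ratio": 0.20, "years_filing": 8},
    {"deduction_ratio": 0.45, "years_filing": 5},
    {"deduction_ratio": 0.50, "years_filing": 10},
    {"deduction_ratio": 0.60, "years_filing": 2},
]
labels = ["no change", "no change", "no change", "adjustment", "adjustment", "adjustment"]

stump = train_stump(rows, labels)
print(stump["feature"], stump["threshold"])
print(predict(stump, {"deduction_ratio": 0.55, "years_filing": 4}))
```

A full decision tree applies the same split search recursively to each branch. Retraining periodically, as the slides recommend, amounts to calling the training step again on more recent outcomes.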

Score Carding Techniques
- Score Carding is a technique in which a valuation function is applied to value (score) a data set
  - Approximation
- The two most common techniques are weighted scoring and regression
  - Weighted Scoring is a manually defined score
  - Regression (either linear or non-linear) is an automated method for scoring

Score Carding Techniques (cont.)
- Weighted Average
- Linear Regression
- Non-Linear Regression
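The difference between the two scoring approaches can be shown in a few lines. In the sketch below, the weighted score uses manually chosen weights, while the regression derives its coefficients from past data using ordinary least squares with one predictor. The field names, weights, and data points are all hypothetical.

```python
def weighted_score(record, weights):
    """Manually defined score: the sum of each attribute times its chosen weight."""
    return sum(weight * record[name] for name, weight in weights.items())


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Weighted scoring: weights are set by hand (hypothetical attributes and values).
weights = {"late_filings": 2.0, "unreported_income_flags": 5.0}
account = {"late_filings": 3, "unreported_income_flags": 1}
print(weighted_score(account, weights))

# Regression scoring: coefficients are learned from past observations
# (hypothetical pairs of an account attribute and an observed outcome).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope * 6 + intercept)  # score for a new account with attribute value 6
```

Hand-set weights are easy to explain and adjust, while fitted coefficients update automatically when the model is refit on new data.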

How can we use Data Mining?

How can we use Data Mining?

Agenda
- What is Data Mining?
- Data Mining Techniques
- How can we use Data Mining in our Agency?
- Usage in Florida
- Recap/Closing
/Eric Wojcik

Florida's Decision Tree Distributions

Florida's Lead Development

Questions