# Data Mining Algorithms Part 1. Dejan Sarka

1 Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on #DW2015

2 Instructor Bio: Dejan Sarka
- 30 years of experience
- SQL Server MVP, MCT, 13 books
- 7+ courses
- Focus: data modeling, data mining, data quality

3 Agenda
- Introduction
- Naïve Bayes
- Decision Trees
- Neural Network
- Logistic Regression
- Predictive Models Evaluation

4 Data Mining Algorithms
- Data mining, as the most advanced data analysis technique, is gaining popularity
- With modern data mining engines, products and packages, like SQL Server Analysis Services (SSAS), Excel and R, data mining has become a black box
- It is possible to use data mining without knowing how it works
- But: not knowing how the algorithms work might lead to many problems, including using the wrong algorithm for a task, misinterpretation of the results, and more
- Learn how the most popular data mining algorithms work
- When to use which algorithm
- Advantages and drawbacks of each algorithm
- Use the algorithms in SSAS, Excel and R

5 Assumptions
- This is not an introduction to SQL Server, Excel or R
- Nor to tools like Visual Studio, SQL Server Management Studio, Excel, or RStudio
- Basic familiarity with at least one of the tools is assumed
- Focus is on the algorithms

6 What Is Data Mining?
- Michael J. A. Berry and Gordon S. Linoff: Data mining is the process of exploration and analysis, by automatic or semiautomatic means, of large quantities of data in order to discover patterns and rules
- Ralph Kimball: Data mining is a collection of powerful analysis techniques for making sense out of very large datasets
- Bill Inmon: Data Mining / Data Exploration is the usage of historical data to discover and exploit important business relationships

7 What Is Data Mining?
- Deduce knowledge by examining data and then make predictions based on the knowledge extracted
- Examining data means scanning samples of known facts about cases using their attributes, which are called variables
- Knowledge is the patterns, clusters, decision trees, neural networks, and association rules
- On-Line Analytical Processing (OLAP) is model driven, whereas data mining is data driven
- Alternative names include knowledge discovery in databases (KDD) and predictive analytics

8 The Two Types of Data Mining
- Directed (supervised) data mining (top-down approach)
  - Classification
  - Estimation
  - Forecasting
- Undirected (unsupervised) data mining (bottom-up approach)
  - Affinity grouping
  - Clustering
  - Description

9 Typical Business Questions
- What's the credit risk of this customer?
- Are there any groups of my customers?
- What products do customers tend to buy together?
- How much of a specific product can I sell in the next time period?
- What is the potential number of customers shopping in this store?
- What are the major groups of my web-click customers?
- Is this spam?

10 Data Mining Tasks
- Cross-selling: market basket analysis
  - Order of items in a purchase might also be of some interest
- Fraud detection
- Churn detection
- Customer segmentation
- How a website is used
- Forecasting

11 Data Mining Virtuous Cycle Identify Transform Measure Act

12 The CRISP Model
- CRISP = Cross Industry Standard Process for Data Mining

13 Data Mining Data Flow
[Diagram labels: Model Browsing, LOB Apps, Historical Dataset, ETL, Reports, Mining Models, Prediction, Cube, New Dataset]

14 Different Types of Analyses
- Structured reports
  - Some interaction, but not dynamic restructuring
  - Can enable ad-hoc reports with a semantic model
- Structured groupings in OLAP
  - Predefined grouping buckets
  - Report structure is dynamic
- Structured attributes with data mining
  - Predefined attributes
  - Mining model calculates grouping and structure

15 SQL Server Tools
- SQL Server Analysis Services (SSAS) installed in Multidimensional and Data Mining mode
- SQL Server Integration Services (SSIS)
- Full-text search and semantic search

16 Excel Tools
- Microsoft Office Data Mining Add-ins
- Excel does not become a data mining engine
  - Needs a connection to SSAS in multidimensional mode
- An Excel cell range or Excel table can be the data source
- Three add-ins:
  - Data Mining Client for Excel
  - Table Analysis Tools for Excel
  - Data Mining Templates for Visio

17 Introducing R
- R is a free programming language and software environment for statistical computing and graphics
- Free under the GNU General Public License
- Pre-compiled binary versions are provided for various operating systems
- R uses a command line interface; however, several graphical user interfaces are available for use with R
- RStudio is a free and open source integrated development environment (IDE) for R

18 Naïve Bayes
- Naïve Bayes quickly builds mining models that can be used for classification and prediction
  - This makes the model a good option for exploring the data
- It calculates probabilities for each possible state of the input attribute, given each state of the predictable attribute
- The probabilities can later be used to predict an outcome of the predicted attribute based on the known input attributes
- Input attributes are treated as mutually independent

19 Naïve Bayes
[Figure: tree with node labels OK, Faulty, Judged OK, Judged Faulty, Actual, declared; value shown: .56]

20 Naïve Bayes
[Figure: reverse tree with node labels OK and Faulty; value shown: .67; columns: declared (prior) and actual (posterior) probabilities]
- After classification, the posterior probabilities are much more accurate than the prior probabilities

21 Example
- Table of products with Color, Class, and Weight columns
- If Color is missing, 80% of Weight values are missing as well; if Class is missing, 60% of Weight values are missing as well
- Likelihood that Weight is missing: 0.8 (Color missing for Weight missing) * 0.6 (Class missing for Weight missing) = 0.48
- Likelihood that Weight is not missing: 0.2 (Color missing for Weight not missing) * 0.4 (Class missing for Weight not missing) = 0.08
- The likelihood that Weight is missing is much higher than the likelihood that it is not missing when Color and Class are unknown

22 Example
- You can convert the likelihoods to probabilities by normalizing their sum to 1:
  - P (Weight missing if Color and Class are missing) = 0.48 / (0.48 + 0.08) = 0.857
  - P (Weight not missing if Color and Class are missing) = 0.08 / (0.48 + 0.08) = 0.143
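
The normalization in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how any particular engine implements Naïve Bayes. The names `normalize` and `likelihoods` are hypothetical, and the factor 0.2 is inferred from the stated result 0.08 = 0.2 * 0.4. As on the slide, class priors are left out.

```python
def normalize(likelihoods):
    """Scale non-negative likelihoods so that they sum to 1."""
    total = sum(likelihoods.values())
    return {state: value / total for state, value in likelihoods.items()}

# Conditional probabilities from the example; Weight is the predicted attribute.
likelihoods = {
    # P(Color missing | Weight missing) * P(Class missing | Weight missing)
    "weight_missing": 0.8 * 0.6,
    # P(Color missing | Weight not missing) * P(Class missing | Weight not missing)
    # 0.2 is an inferred value (0.08 / 0.4), not stated on the slide.
    "weight_not_missing": 0.2 * 0.4,
}

posterior = normalize(likelihoods)
for state, probability in posterior.items():
    print(state, round(probability, 3))
```

Multiplying the per-attribute factors is exactly where the "naïve" independence assumption enters: each input attribute contributes its own factor, regardless of the others.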

23 Naïve Bayes Usage
- Naïve Bayes is used for classification
  - Assign new cases to predefined classes
- Typical usage scenarios include:
  - Categorizing bank loan applications
  - Assigning customers to predefined segments
  - Quickly obtaining a basic comprehension of the data by checking the correlation between input variables

24 Decision Trees
- Decision Trees assign (classify) each case to one of a few discrete, broad categories of a selected attribute (variable) and explain the classification with a few selected input variables
- Once built, they are easy to understand
- They are used to predict values of the explained variable
- Recursive partitioning is used to build the tree
- Data is split into partitions, which are then split further
- Initially all cases are in one big box

25 Decision Trees
- The algorithm tries all possible breaks in classes using all possible values of each input attribute; it then selects the split that partitions the data into the purest classes of the searched variable
- It uses several measures of purity, such as frequency distribution, entropy, and Bayesian scoring of prior / posterior probabilities
- It then repeats the splitting process for each new class, again testing all possible breaks
- The problem is where to stop
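
A minimal sketch of the split search described here, assuming entropy as the purity measure and numeric input attributes. The function names and the toy rows are hypothetical, and real engines such as SSAS use their own scoring methods. Every distinct value of every input attribute is tried as a break, and the break with the lowest weighted entropy wins.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits; 0.0 means a perfectly pure partition."""
    n = len(labels)
    return sum((c / n) * math.log2(n / c) for c in Counter(labels).values())

def best_split(rows, target):
    """Try every (attribute, threshold) break and keep the purest one."""
    best = None
    attributes = [name for name in rows[0] if name != target]
    for attr in attributes:
        for threshold in sorted({row[attr] for row in rows}):
            left = [row[target] for row in rows if row[attr] < threshold]
            right = [row[target] for row in rows if row[attr] >= threshold]
            if not left or not right:
                continue  # a break must produce two non-empty partitions
            score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, attr, threshold)
    return best

# Hypothetical toy data, not taken from the slides.
rows = [
    {"age": 25, "education": 3, "liked": "Y"},
    {"age": 30, "education": 6, "liked": "Y"},
    {"age": 33, "education": 2, "liked": "Y"},
    {"age": 40, "education": 5, "liked": "N"},
    {"age": 48, "education": 1, "liked": "N"},
    {"age": 55, "education": 4, "liked": "N"},
]
score, attr, threshold = best_split(rows, "liked")
print(attr, threshold)  # age 40
```

Building a full tree would mean calling `best_split` again on each resulting partition, which is the recursive step that makes a stopping rule necessary.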

26 Decision Trees
- A common problem is over-fitting
- Branches of the tree that are not useful can be pre-pruned or post-pruned
- Pre-pruning methods try to stop the growth of the tree before it grows too deep
  - They test each node to see whether a further split would be useful; the tests can be simple (number of cases) or complicated (complexity penalty)
- Post-pruning methods allow the tree to grow and then prune off branches
  - Again the test can be simple (number of cases) or more complex

27 Example
- Interview of the people who watched the famous Woodstock movie
- You have a population aged between 20 and 60 years old
- You gathered data about their education and ranged it into 7 classes (1 = lowest, 7 = highest)
- 55% of them liked the movie
- Can you discover the factors that have an influence on whether they liked the movie?

28 Example
[Figure: cases plotted by education level and age in years. All cases: Liked 55% Y, 45% N]

29 Example
[Figure: the same plot after a split on AGE at 35 (35+ / 35-). The two partitions: Liked 73% Y, 27% N and Liked 33% Y, 67% N]

30 Example
[Figure: the same plot after further splits on EDUCATION (values 2 and 5 marked). The resulting partitions: Liked 87% Y, 13% N; Liked 33% Y, 67% N; Liked 17% Y, 83% N; Liked 67% Y, 33% N]

31 Decision Trees Usage
- Decision Trees are used for classification and prediction
- Typical usage scenarios include:
  - Predicting which customers will leave
  - Targeting the audience for mailings and promotional campaigns
  - Explaining the reasons for a decision
  - Answering questions such as "What movies do young female customers buy?"

32 Neural Network
- A neural network is a data modeling tool that can capture and represent complex input/output relationships
- A neural network resembles the human brain in the following two ways:
  - It acquires knowledge through learning
  - Its knowledge is stored within inter-neuron connection strengths known as synaptic weights
- The Neural Network algorithm explores more possible data relationships than the other algorithms

33 Neural Network
[Diagram labels: Input, Hidden Layer, Output; each Unit computes a Weighted Sum of its inputs and applies a Non-Linear Function*]
* Hyperbolic tangent function in hidden layer and sigmoid function in output layer

34 Backpropagation
- Training a neural network is the process of setting the best weights on the inputs of each of the units
- The backpropagation process:
  - Gets a training example and calculates the outputs
  - Calculates the error: the difference between the calculated and the expected (known) result
  - Adjusts the weights to minimize the error
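
The three backpropagation steps can be sketched for a very small network: tanh units in the hidden layer and one sigmoid output unit, as on the earlier Neural Network slide. Everything else here is an assumption made for illustration: the squared-error loss, the learning rate, the starting weights, the omission of bias terms, and all function names.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Step 1: calculate the outputs for one training example."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def backprop_step(x, target, w_hidden, w_out, rate=0.5):
    hidden, output = forward(x, w_hidden, w_out)
    # Step 2: the error is the difference between calculated and expected result.
    error = output - target
    # Step 3: push the error back through the derivatives of sigmoid and tanh.
    delta_out = error * output * (1.0 - output)
    delta_hidden = [delta_out * w * (1.0 - h * h) for w, h in zip(w_out, hidden)]
    new_w_out = [w - rate * delta_out * h for w, h in zip(w_out, hidden)]
    new_w_hidden = [
        [w - rate * d * xi for w, xi in zip(ws, x)]
        for ws, d in zip(w_hidden, delta_hidden)
    ]
    return new_w_hidden, new_w_out, 0.5 * error ** 2

# Hypothetical starting weights and a single training example.
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.2, -0.1]
x, target = [1.0, 0.5], 1.0

losses = []
for _ in range(20):
    w_hidden, w_out, loss = backprop_step(x, target, w_hidden, w_out)
    losses.append(loss)
print(losses[0] > losses[-1])  # the error shrinks as the weights are adjusted
```

A real training run would loop over many examples and usually many passes through the data; repeating one example only shows that each adjustment moves the weights in the direction that reduces the error.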

35 Logistic Regression
- Sigmoid function is called the logistic function as well
- If a neural network has only input neurons that are directly connected to the output neurons, it is logistic regression
  - No hidden layer

$$f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

$$g(x) = \sigma(x) = \frac{1}{1 + e^{-x}}$$
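
As a quick illustration, the two activation functions and the "no hidden layer" view of logistic regression can be written out directly. The weights and inputs below are made-up values, and `logistic_regression` is a hypothetical helper rather than any library's API.

```python
import math

def tanh(x):
    """Hyperbolic tangent, written out from its definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def sigmoid(x):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_regression(inputs, weights, bias):
    """One sigmoid output unit fed directly by the inputs: no hidden layer."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

# Made-up weights for illustration: the weighted sum is 0, so the output is 0.5.
print(logistic_regression([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```

The two functions are closely related: tanh(x) equals 2σ(2x) − 1, so tanh is a rescaled sigmoid with outputs in (−1, 1) instead of (0, 1).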

36 Logistic Regression and Neural Network Usage
- Like the Decision Trees algorithm, you can use the Neural Network and Logistic Regression algorithms for:
  - Classification
  - Prediction
  - E.g., risk analysis
- Interpretation is more complex
  - Especially for neural networks with many hidden layers
- The Decision Trees algorithm is more popular

37 Evaluating Predictive Models
- Lift chart
- Profit chart
- Classification matrix
- Cross validation

38 Training and Test Sets
- For predictive models, you need to split the data into training and test sets in order to evaluate the models
  - A training set is required to build the model (70% of the data)
  - A test set is used for predictions (30% of the data)
- When you know the value of the predicted variable, you can measure the quality of the predictions
- As with every sampling, it is important to randomly select the data for each set
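
A minimal sketch of such a random 70/30 split, using only the Python standard library. The function name `train_test_split` and the fixed seed are illustrative choices; the seed just makes the shuffle repeatable.

```python
import random

def train_test_split(cases, train_fraction=0.7, seed=42):
    """Shuffle the cases randomly, then cut them into training and test sets."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(range(100))
print(len(train), len(test))  # 70 30
```

Shuffling before cutting is what keeps an ordering in the source data, such as sorting by date or by customer, from leaking into one of the two sets.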

39 Lift Chart
- No target value: overall performance
- Target value: a percentage of the target audience against a specified percentage of the complete audience
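
One point on a lift chart with a target value can be computed as sketched below: rank the cases by predicted probability, take the top share of the complete audience, and measure what share of the actual targets it contains. The function name and the scored cases are hypothetical.

```python
def cumulative_gain(scored_cases, fraction):
    """Share of all actual targets found in the top `fraction` of cases by score."""
    ranked = sorted(scored_cases, key=lambda case: case[0], reverse=True)
    top_n = round(len(ranked) * fraction)
    found = sum(actual for _, actual in ranked[:top_n])
    total = sum(actual for _, actual in ranked)
    return found / total

# (predicted probability, actual outcome) pairs; made-up values.
cases = [
    (0.90, 1), (0.80, 1), (0.70, 0), (0.60, 1), (0.50, 0),
    (0.40, 0), (0.30, 1), (0.20, 0), (0.10, 0), (0.05, 0),
]
print(cumulative_gain(cases, 0.3))  # 0.5
```

Here the top 30% of the audience contains 50% of the targets, whereas random selection would be expected to find about 30%. Plotting this value for every percentage gives the model's curve, with the diagonal as the random baseline.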

40 Profit Chart
- Y = profit
- X = percentage of the population contacted
- Settings:
  - Population
  - Fixed Cost
  - Individual Cost
  - Revenue Per Individual

41 Classification Matrix

| | Predicted Negative | Predicted Positive |
|---|---|---|
| Actual Negative | A | B |
| Actual Positive | C | D |

- The accuracy (AC) is the proportion of the total number of predictions that were correct: $AC = \frac{A + D}{A + B + C + D}$
- The recall or true positive rate (TP) is the proportion of positive cases that were correctly identified: $TP = \frac{D}{C + D}$
- The false positive rate (FP) is the proportion of negative cases that were incorrectly classified as positive: $FP = \frac{B}{A + B}$

42 Classification Matrix

| | Predicted Negative | Predicted Positive |
|---|---|---|
| Actual Negative | A | B |
| Actual Positive | C | D |

- The true negative rate (TN) is the proportion of negative cases that were classified correctly: $TN = \frac{A}{A + B}$
- The false negative rate (FN) is the proportion of positive cases that were incorrectly classified as negative: $FN = \frac{C}{C + D}$
- The precision (P) is the proportion of the predicted positive cases that were correct: $P = \frac{D}{B + D}$
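
The six measures from the two classification matrix slides follow directly from the four cells. The sketch below uses the slide's cell letters as parameter names; the function name and the counts are made up for illustration.

```python
def classification_metrics(a, b, c, d):
    """a, b = actual negatives predicted negative / positive;
    c, d = actual positives predicted negative / positive."""
    return {
        "accuracy": (a + d) / (a + b + c + d),
        "true_positive_rate": d / (c + d),   # recall
        "false_positive_rate": b / (a + b),
        "true_negative_rate": a / (a + b),
        "false_negative_rate": c / (c + d),
        "precision": d / (b + d),
    }

# Hypothetical counts for 100 test cases.
metrics = classification_metrics(a=50, b=10, c=5, d=35)
for name, value in metrics.items():
    print(name, round(value, 3))
```

The rates come in complementary pairs: the true and false positive rates are each computed over one row of the matrix, so TN + FP = 1 and TP + FN = 1.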

43 Cross Validation
- Cross validation shows the robustness of the models
- Splits the training set into folds
- Uses one fold for testing and the others for training
- You can see how models perform over different subsets of data
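
The fold logic can be sketched as a small generator. The name `k_fold_splits` is hypothetical, and the cases are assumed to be shuffled already; each fold serves once as the test set while the remaining folds form the training set.

```python
def k_fold_splits(cases, k):
    """Yield (train, test) pairs, using each of the k folds once for testing."""
    folds = [cases[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [case for j, fold in enumerate(folds) if j != i for case in fold]
        yield train, test

splits = list(k_fold_splits(list(range(10)), k=5))
for train, test in splits:
    print(len(train), len(test))  # 8 2 for every fold
```

Training and scoring a model once per split gives k quality measures instead of one; a large spread between them suggests that the model is sensitive to which cases it was trained on.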

44 Questions? Thank you!

### Social Media Mining. Data Mining Essentials

Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

### Prerequisites. Course Outline

MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,

### from Larson Text By Susan Miertschin

Decision Tree Data Mining Example from Larson Text By Susan Miertschin 1 Problem The Maximum Miniatures Marketing Department wants to do a targeted mailing gpromoting the Mythic World line of figurines.

### Data Mining with SQL Server Data Tools

Data Mining with SQL Server Data Tools Data mining tasks include classification (directed/supervised) models as well as (undirected/unsupervised) models of association analysis and clustering. 1 Data Mining

### Data Mining Techniques

15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

### Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of

### Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal

Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether

### What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling

MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling Lecturer : Dr Iris Yeung Room No : P7509 Tel No : 2788 8566 Email : msiris@cityu.edu.hk 1 Aims To introduce the basic concepts of data mining

### Data Mining Part 5. Prediction

Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification

### Introduction to Data Mining

Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

### Data Mining: Overview. What is Data Mining?

Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

### OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

### The Data Mining Process

Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

### Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

### Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition

### Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry

Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened

### Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

Data Mining with SAS Mathias Lanner mathias.lanner@swe.sas.com Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA

### Data Mining and Predictive Modeling with Excel 2007

Spyridon Ganas Abstract With the release of Excel 2007 and SQL Server 2008, Microsoft has provided actuaries with a powerful and easy to use predictive modeling platform. This paper provides a brief overview

### Data Mining Solutions for the Business Environment

Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over

### Classification and Prediction

Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser

### DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

### not possible or was possible at a high cost for collecting the data.

Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

### Customer Analytics. Turn Big Data into Big Value

Turn Big Data into Big Value All Your Data Integrated in Just One Place BIRT Analytics lets you capture the value of Big Data that speeds right by most enterprises. It analyzes massive volumes of data

### Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

### Data Mining: An Introduction

Data Mining: An Introduction Michael J. A. Berry and Gordon A. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support, 2nd Edition, 2004 Data mining What promotions should be targeted

### International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

### Fraud Detection with the SQL Server Suite

Fraud Detection with the SQL Server Suite Author: Dejan Sarka Reviewer: Matija Lah July 2013 Table of Contents Management Summary... 3 Abstract... 5 Introduction... 5 The SolidQ Approach to Projects...

### Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification

Performance Analysis of Naive Bayes and J48 Classification Algorithm for Data Classification Tina R. Patil, Mrs. S. S. Sherekar Sant Gadgebaba Amravati University, Amravati tnpatil2@gmail.com, ss_sherekar@rediffmail.com

2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously

### Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

### Data mining and statistical models in marketing campaigns of BT Retail

Data mining and statistical models in marketing campaigns of BT Retail Francesco Vivarelli and Martyn Johnson Database Exploitation, Segmentation and Targeting group BT Retail Pp501 Holborn centre 120

### Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

### Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier

International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing

### Data Mining is the process of knowledge discovery involving finding

using analytic services data mining framework for classification predicting the enrollment of students at a university a case study Data Mining is the process of knowledge discovery involving finding hidden

### Using Microsoft Dynamics CRM for Analytical CRM: A Curriculum Package for Business Intelligence or Data Mining Courses

Using Microsoft Dynamics CRM for Analytical CRM: A Curriculum Package for Business Intelligence or Data Mining Courses Huei Lee, Ph.D. Professor Department of Computer Information Systems College of Business

### Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

### Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734

Cleveland State University Tutorials for Project on Building a Business Analytic Model Using Data Mining Tool and Data Warehouse and OLAP Cubes IST 734 SS Chung 14 Build a Data Mining Model using Data

### COURSE SYLLABUS COURSE TITLE:

1 COURSE SYLLABUS COURSE TITLE: FORMAT: CERTIFICATION EXAMS: 55043AC Microsoft End to End Business Intelligence Boot Camp Instructor-led None This course syllabus should be used to determine whether the

### Evaluation & Validation: Credibility: Evaluating what has been learned

Evaluation & Validation: Credibility: Evaluating what has been learned How predictive is a learned model? How can we evaluate a model Test the model Statistical tests Considerations in evaluating a Model

### Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition

Brochure More information from http://www.researchandmarkets.com/reports/2170926/ Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd

### Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III

www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed

### SQL Server 2005 Features Comparison

Page 1 of 10 Quick Links Home Worldwide Search Microsoft.com for: Go : Home Product Information How to Buy Editions Learning Downloads Support Partners Technologies Solutions Community Previous Versions

### Building Data Cubes and Mining Them. Jelena Jovanovic Email: jeljov@fon.bg.ac.yu

Building Data Cubes and Mining Them Jelena Jovanovic Email: jeljov@fon.bg.ac.yu KDD Process KDD is an overall process of discovering useful knowledge from data. Data mining is a particular step in the

### T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577

T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or

### This white paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, AS TO THE INFORMATION IN THIS DOCUMENT.

Data Mining Tutorial Seth Paul Jamie MacLennan Zhaohui Tang Scott Oveson Microsoft Corporation June 2005 Abstract: Microsoft SQL Server 2005 provides an integrated environment for creating and working

### Data Mining Applications in Higher Education

Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

### 8. Machine Learning Applied Artificial Intelligence

8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name

### Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

### FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS

FRAUD DETECTION IN ELECTRIC POWER DISTRIBUTION NETWORKS USING AN ANN-BASED KNOWLEDGE-DISCOVERY PROCESS Breno C. Costa, Bruno. L. A. Alberto, André M. Portela, W. Maduro, Esdras O. Eler PDITec, Belo Horizonte,

### TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM

TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam

### Data Mining Classification: Decision Trees

Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous

### An Introduction to Data Mining

An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

### Chapter 7: Data Mining

Chapter 7: Data Mining Overview Topics discussed: The Need for Data Mining and Business Value The Data Mining Process: Define Business Objectives Get Raw Data Identify Relevant Predictive Variables Gain

### Analytics on Big Data

Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

### Data Mining for Fun and Profit

Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

### What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM

Relationship Management Analytics What is Relationship Management? CRM is a strategy which utilises a combination of Week 13: Summary information technology policies processes, employees to develop profitable

### Data Mining and Marketing Intelligence

Data Mining and Marketing Intelligence Alberto Saccardi 1. Data Mining: a Simple Neologism or an Efficient Approach for the Marketing Intelligence? The streamlining of a marketing campaign, the creation

### NEURAL NETWORKS IN DATA MINING

NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,

### White Paper. Data Mining for Business

White Paper Data Mining for Business January 2010 Contents 1. INTRODUCTION... 3 2. WHY IS DATA MINING IMPORTANT?... 3 FUNDAMENTALS... 3 Example 1...3 Example 2...3 3. OPERATIONAL CONSIDERATIONS... 4 ORGANISATIONAL

### Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

### COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

### SQL Server 2012 Business Intelligence Boot Camp

SQL Server 2012 Business Intelligence Boot Camp Length: 5 Days Technology: Microsoft SQL Server 2012 Delivery Method: Instructor-led (classroom) About this Course Data warehousing is a solution organizations

### Data mining techniques: decision trees

Data mining techniques: decision trees 1/39 Agenda Rule systems Building rule systems vs rule systems Quick reference 2/39 1 Agenda Rule systems Building rule systems vs rule systems Quick reference 3/39

### An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

### Data Mining - The Next Mining Boom?

Howard Ong Principal Consultant Aurora Consulting Pty Ltd Abstract This paper introduces Data Mining to its audience by explaining Data Mining in the context of Corporate and Business Intelligence Reporting.

### International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET

DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand

### Oracle9i Data Warehouse Review. Robert F. Edwards Dulcian, Inc.

Oracle9i Data Warehouse Review Robert F. Edwards Dulcian, Inc. Agenda Oracle9i Server OLAP Server Analytical SQL Data Mining ETL Warehouse Builder 3i Oracle 9i Server Overview 9i Server = Data Warehouse

### Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

### How To Cluster

Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main

### A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model

A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model Twinkle Patel, Ms. Ompriya Kale Abstract: - As the usage of credit card has increased the credit card fraud has also increased

### Nine Common Types of Data Mining Techniques Used in Predictive Analytics

1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

### Data Mining for Knowledge Management. Classification

Data Mining for Knowledge Management Classification Themis Palpanas University of Trento http://disi.unitn.eu/~themis Thanks for slides to: Jiawei Han Eamonn Keogh

### Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

### Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product

Data Mining Application in Direct Marketing: Identifying Hot Prospects for Banking Product Sagarika Prusty Web Data Mining (ECT 584),Spring 2013 DePaul University,Chicago sagarikaprusty@gmail.com Keywords:

### Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

### Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

### EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyright 2013, SAS Institute Inc. All rights reserved.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

### Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

### Role of Neural network in data mining

Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### Data Mining Fundamentals

Part I Data Mining Fundamentals Data Mining: A First View Chapter 1 1.11 Data Mining: A Definition Data Mining The process of employing one or more computer learning techniques to automatically analyze

### Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-17-AUC

### Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study

Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study Tongshan Chang The University of California Office of the President CAIR Conference in Pasadena 11/13/2008

### Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

### A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview

### Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778

Implementing Data Models and Reports with Microsoft SQL Server 2012 MOC 10778 Course Outline Module 1: Introduction to Business Intelligence and Data Modeling This module provides an introduction to Business

### Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr

Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr WEKA Gallirallus australis: Endemic bird (New Zealand) Characteristics Waikato university Weka is a collection

### GETTING AHEAD OF THE COMPETITION WITH DATA MINING

WHITE PAPER GETTING AHEAD OF THE COMPETITION WITH DATA MINING Ultimately, data mining boils down to continually finding new ways to be more profitable, which in today's competitive world means making better

### ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

### Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

### Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

### Assessing Data Mining: The State of the Practice

Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality

### Predicting the Risk of Heart Attacks using Neural Network and Decision Tree

Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,

### Microsoft Data Warehouse in Depth

Microsoft Data Warehouse in Depth Duration What's new Why attend Who should attend Course format and prerequisites 4 days The course materials have been refreshed to align with the second edition

### Business Intelligence: Real ROI Using the Microsoft Business Intelligence Platform. April 6th, 2006

Business Intelligence: Real ROI Using the Microsoft Business Intelligence Platform April 6th, 2006 Agenda Introduction Background Business Goals Microsoft Business Intelligence Platform Examples Conclusions

### How To Use Neural Networks In Data Mining

International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

### DATA MINING METHODS WITH TREES

DATA MINING METHODS WITH TREES Marta Žambochová 1. Introduction The contemporary world is characterized by the explosion of an enormous volume of data deposited into databases. Sharp competition contributes