Data Analytics and Business Intelligence (8696/8697)

Size: px
Start display at page:

Download "Data Analytics and Business Intelligence (8696/8697)"

Transcription

1 http: // togaware. com Copyright 2014, 1/39 Data Analytics and Business Intelligence (8696/8697) Introducing Data Mining Chief Data Scientist Australian Taxation Office Adjunct Professor, Australian National University Adjunct Professor, University of Canberra Fellow, Institute of Analytics Professionals of Australia

2 http: // togaware. com Copyright 2014, 2/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

3 Introductions http: // togaware. com Copyright 2014, 3/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

4 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 4/39 A Little History Database Research where to now? Machine Learning algorithms for symbolic learning. Statistics long heritage and role of sampling. Data Mining is one of the most important core technologies being developed today, in the same league as medical research, energy research, and environmental research. (David Hand, 2006) Data mining is a core technology underlying new developments in medicine, biotechnology, finance, environmental research.

5 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 5/39 Data Mining in Practise Data Mining research - CSIRO 1995 Data Mining in practise - Health Insurance Commission 1995 Data Mining projects with: Esanda Finance NRMA Mount Stromlo Health Insurance Commission Commonwealth Bank Department of Health Australian Taxation Office Australian Customs Service Department of Veteran Affairs...

6 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 5/39 Data Mining in Practise Data Mining research - CSIRO 1995 Data Mining in practise - Health Insurance Commission 1995 Data Mining projects with: Esanda Finance NRMA Mount Stromlo Health Insurance Commission Commonwealth Bank Department of Health Australian Taxation Office Australian Customs Service Department of Veteran Affairs...

7 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 5/39 Data Mining in Practise Data Mining research - CSIRO 1995 Data Mining in practise - Health Insurance Commission 1995 Data Mining projects with: Esanda Finance NRMA Mount Stromlo Health Insurance Commission Commonwealth Bank Department of Health Australian Taxation Office Australian Customs Service Department of Veteran Affairs...

8 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 5/39 Data Mining in Practise Data Mining research - CSIRO 1995 Data Mining in practise - Health Insurance Commission 1995 Data Mining projects with: Esanda Finance NRMA Mount Stromlo Health Insurance Commission Commonwealth Bank Department of Health Australian Taxation Office Australian Customs Service Department of Veteran Affairs...

9 Introductions Beginning Data Mining http: // togaware. com Copyright 2014, 6/39 Data Mining in Use Today Amazon What else should I buy ebay How to support sellers Facebook and LinkedIn Who should I be firends with? Google Who, what, when, how GoolgeNow Finance Bank products and fraud Health care Best treatment Insurance Fraud Government Better Services and Compliance

10 Setting the Scene http: // togaware. com Copyright 2014, 7/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

11 Setting the Scene http: // togaware. com Copyright 2014, 8/39 Data is Fundamental Sherlock Holmes: It is a capital mistake to theorize before one has data. Insensibly, one begins to twist facts to suit theories, instead of theories to suit facts. A Scandal in Bohemia (1891) Arthur Conan Doyle

12 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

13 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

14 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

15 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

16 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

17 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

18 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

19 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

20 Setting the Scene http: // togaware. com Copyright 2014, 9/39 The Economist, 21 September 2006 Combine dozens of clues to spot suspicious patterns in mountains of transactions. Car Insurance: the Monday morning fraudsters separating the genuine from the fraud Claimant nearly injured (driver side impact) low Low resale value high Low resale + own a luxury car low Low resale + own a luxury car expired insurance no Angry call after claim demanding action high Customer calls after the 20th of the month low But so many combinations!

21 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

22 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

23 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

24 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

25 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

26 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

27 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

28 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

29 Setting the Scene http: // togaware. com Copyright 2014, 10/39 Credit Card Fraud Association for Payment Clearing Services Buy petrol and pay by card using a machine hmmm... Soon after buy a diamond ring very high Purchase sports shoes marginally high Purchase two higher Teenage sizes high City has vibrant black market highest Professional fraudsters are innovative and dynamic we need to monitor and score individual cards and transactions. Block cards with sudden spikes in purchases.

30 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

31 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

32 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

33 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

34 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

35 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

36 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

37 Setting the Scene http: // togaware. com Copyright 2014, 11/39 Digital Footprints We leave behind us, every day, a growing digital footprint. Store Purchase - loyalty cards and credit cards Building Access Computer Login etoll Records Mobile Phone Cameras with sophisticated image recognition We need due diligence in collection and analysis of data privacy protocols.

38 Setting the Scene http: // togaware. com Copyright 2014, 12/39 Taxation Records - Historic Perspective Data Collection 5500 years ago Sumerian (Iraq) and Elam (Iran) peoples marked their tax records onto dried mud tablets. Data Analysis Since then people have sought ways to add value to this otherwise statically recorded information either for good or bad! Source

39 Setting the Scene http: // togaware. com Copyright 2014, 12/39 Taxation Records - Historic Perspective Data Collection 5500 years ago Sumerian (Iraq) and Elam (Iran) peoples marked their tax records onto dried mud tablets. Data Analysis Since then people have sought ways to add value to this otherwise statically recorded information either for good or bad! Source

40 Setting the Scene http: // togaware. com Copyright 2014, 12/39 Taxation Records - Historic Perspective Data Collection 5500 years ago Sumerian (Iraq) and Elam (Iran) peoples marked their tax records onto dried mud tablets. Data Analysis Since then people have sought ways to add value to this otherwise statically recorded information either for good or bad! Source

41 Terminology http: // togaware. com Copyright 2014, 13/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

42 Terminology http: // togaware. com Copyright 2014, 14/39 Machine Learning as Model Building Data Mining is a data driven approach to understanding the world. Deploys machine learning and statistical modelling approaches to build models from large data collections. We are aiming to build models of the real world, as an architect would build models to get a better understanding of how things will look.

43 Terminology http: // togaware. com Copyright 2014, 14/39 Machine Learning as Model Building Data Mining is a data driven approach to understanding the world. Deploys machine learning and statistical modelling approaches to build models from large data collections. We are aiming to build models of the real world, as an architect would build models to get a better understanding of how things will look.

44 Terminology http: // togaware. com Copyright 2014, 14/39 Machine Learning as Model Building Data Mining is a data driven approach to understanding the world. Deploys machine learning and statistical modelling approaches to build models from large data collections. We are aiming to build models of the real world, as an architect would build models to get a better understanding of how things will look.

45 Terminology http: // togaware. com Copyright 2014, 15/39 Data Mining as Model Building Build models of the world (regression, decision trees, neural networks, association rules, fuzzy systems, random forests, support vector machines, boosted stumps,... ) from data that represent snippets of information about the world. Use these models to understand and discover patterns of interest that may provide knowledge deployable in improving business processes. The non-trivial extraction of novel, implicit, and actionable knowledge from large databases and in a timely manner. Knowledge from data any way you can. Technology to enable data exploration, data analysis, and data visualisation of very large databases at a high level of abstraction, without a specific hypothesis in mind.

46 Terminology http: // togaware. com Copyright 2014, 15/39 Data Mining as Model Building Build models of the world (regression, decision trees, neural networks, association rules, fuzzy systems, random forests, support vector machines, boosted stumps,... ) from data that represent snippets of information about the world. Use these models to understand and discover patterns of interest that may provide knowledge deployable in improving business processes. The non-trivial extraction of novel, implicit, and actionable knowledge from large databases and in a timely manner. Knowledge from data any way you can. Technology to enable data exploration, data analysis, and data visualisation of very large databases at a high level of abstraction, without a specific hypothesis in mind.

47 Terminology http: // togaware. com Copyright 2014, 15/39 Data Mining as Model Building Build models of the world (regression, decision trees, neural networks, association rules, fuzzy systems, random forests, support vector machines, boosted stumps,... ) from data that represent snippets of information about the world. Use these models to understand and discover patterns of interest that may provide knowledge deployable in improving business processes. The non-trivial extraction of novel, implicit, and actionable knowledge from large databases and in a timely manner. Knowledge from data any way you can. Technology to enable data exploration, data analysis, and data visualisation of very large databases at a high level of abstraction, without a specific hypothesis in mind.

48 Terminology http: // togaware. com Copyright 2014, 15/39 Data Mining as Model Building Build models of the world (regression, decision trees, neural networks, association rules, fuzzy systems, random forests, support vector machines, boosted stumps,... ) from data that represent snippets of information about the world. Use these models to understand and discover patterns of interest that may provide knowledge deployable in improving business processes. The non-trivial extraction of novel, implicit, and actionable knowledge from large databases and in a timely manner. Knowledge from data any way you can. Technology to enable data exploration, data analysis, and data visualisation of very large databases at a high level of abstraction, without a specific hypothesis in mind.

49 Terminology http: // togaware. com Copyright 2014, 16/39 Data Mining Adds Value to Data Competitive business environment knowledge is power Improved knowledge for better decision support Volumes of data unexplored wealth of information Data sources accessible massive data warehouses Available data mining tools and computational power

50 Applications http: // togaware. com Copyright 2014, 17/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

51 Applications http: // togaware. com Copyright 2014, 18/39 Business Problems Customer Profiling Customer Segmentation Direct Marketing Customer Retention Basket Analysis Fraud Detection Compliance Adverse Drug Reactions...

52 Applications http: // togaware. com Copyright 2014, 19/39 Customer Relationship Management All-of-business view of customer E-commerce providing significant data for mining Amazon.com a data mining company! ATO (and other Government Departments) CRM through data mining

53 Applications http: // togaware. com Copyright 2014, 20/39 Fraud Disguised amongst the mass of data A small percentage of all transactions A small percentage of a very large budget 0.1% of $1billion = $1million NRMA: Fraud costs $70 per policy Dob in a fraud: Canberra Times 16 Aug 2003 ATO and serious non-compliance

54 Applications http: // togaware. com Copyright 2014, 21/39 Characteristics Extremely large databases Discovery of the non-obvious No specific hypothesis c.f. statistical hypothesis testing Useful knowledge to improve processes Too large to be done manually

55 Technologies http: // togaware. com Copyright 2014, 22/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

56 Technologies http: // togaware. com Copyright 2014, 23/39 Technologies Machine Learning Database Web Services High Performance Computers Data Mining Statistics Computational Mathematics Intelligent Systems

57 Technologies http: // togaware. com Copyright 2014, 24/39 A Framework for Data Mining A Formal Framework with Three Elements: Knowledge Representation - Language and Sentences Measure of Goodness Heuristic Search - near infinite search space

58 Technologies http: // togaware. com Copyright 2014, 25/39 Machine Learning Representing Knowledge Languages: Decision Trees Classification Rules Neural Networks Support Vector Machine K Nearest Neighbours Cluster Centroids Association Rules Heuristic search for the best sentences (models) in the language by some measure (criteria of best model) Age... Gender Distance Income... GP Visits $ Risk Age Hospital? Support Vectors

59 Technologies Statistics Supporting sampling and uncertainty and modelling: Data, Counting, Probabilities, Hypothesis testing Exploratory data analysis Predictive models: CART, MARS, PRIM Statistical thinking and avoiding data dredging Computational requirements sampling Data mining attempts to avoid sampling Data mining: hypothesis generation c.f. hypothesis testing Be aware of the Bonferroni correction: testing 2 hypotheses, then use a p-value threshold of http: // togaware. com Copyright 2014, Graham.Williams@togaware.com 26/39

60 Technologies http: // togaware. com Copyright 2014, 27/39 Visualisation Data visualisation is key to understanding our data, its distribution, and models Exploratory data analysis has long history in Statistics, now significantly enhanced by computer science Visualisation in Data Mining Understanding the data Visualising the process Visualising the results of mining See Minard s visualisation of Napolean s march on Moscow.

61 Technologies http: // togaware. com Copyright 2014, 28/39 High Performance Computing Most algorithms are computationally expensive Very large datasets versus slow algorithms Parallel computers Multiprocessor computers Moving back to 64bit processors 32bit is limited to 4GB of data 64bit is limited to 16 EB (exabytes) (16,000 PB = 16,000,000 TB = 16,000,000,000 GB)

62 Technologies http: // togaware. com Copyright 2014, 29/39 Software Engineering SE Practices apply to DM practises Model development process Repeatability and Evidence Developing APIs for data and models SOAP Delivery of data mining through web services Interoperability through PMML (XML) Predictive Modelling Markup Language

63 Techniques http: // togaware. com Copyright 2014, 30/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

64 Techniques http: // togaware. com Copyright 2014, 31/39 Techniques Data mining can be descriptive or predictive. Descriptive: Identify human-interpretable patterns. Visualisation Concept Description Cluster Analysis Association Analysis Sequential Patterns Anomaly Detection Predictive: Predict unknown or future outcomes. Regression predict a continuous quantity Classification predict class of entities

65 Techniques http: // togaware. com Copyright 2014, 32/39 Concept Description Characterise and discriminate between different concepts (classes or groups) of entities (customers) Example: Amazon.com interested in the characteristics of people who tend to purchase many books so that they can provide a better service for them. Characterisation identifies general characteristics or features of a target class of data. Discrimination compares features of different classes for those that differentiate.

66 Techniques http: // togaware. com Copyright 2014, 33/39 Cluster Analysis Identify groups within data that maximise intraclass similarity and minimises interclass similarity. Example: Cluster all customers based on personal characteristics and products they ve purchased. Building models from unlabelled data: unsupervised learning.

67 Techniques http: // togaware. com Copyright 2014, 34/39 Association Analysis Identify attribute-value conditions that occur frequently together in a given set of data. Example: Age [20, 30] Income [$20K, $30K] MP3player Support=2%; Confidence=60%

68 Techniques http: // togaware. com Copyright 2014, 35/39 Anomaly Detection Also referred to as Outlier Analysis Identify entities that do not comply with the general behaviour or model of data. Exceptions to the rule! Example: Odd patterns can be easily hidden amongst 10 million transactions, but may be indicative of fraud.... but not likely find those who live at the edge

69 Techniques http: // togaware. com Copyright 2014, 36/39 Classification Find a model which describes and distinguishes data classes or concepts for the purpose of using the model to predict the class of previously unseen entities. Example: Build a model to predict whether customers are likely to purchase a download of a particular MP3 music file. Building models from labelled data: supervised learning.

70 Issues http: // togaware. com Copyright 2014, 37/39 Overview 1 Introductions 2 Setting the Scene 3 Data Mining Terminology 4 Data Mining Applications 5 Data Mining Technologies 6 Data Mining Techniques 7 Issues

71 Issues http: // togaware. com Copyright 2014, 38/39 Data Challenges 1 Scalability 2 Dimensionality 3 Complex and Heterogeneous Data 4 Data Quality 5 Data Matching and Linking 6 Privacy 7 Streaming: Anytime Data Mining give me the answer now

72 Summary Introducing Data Mining http: // togaware. com Copyright 2014, 39/39 Lecture Summary Data Mining is about Analysing Data; Technology that is all pervasive today; Draws on many disciplines; Key disciplines of Statistics and Machine Learning. This document, sourced from DataMiningL.Rnw revision 436, was processed by KnitR version 1.6 of and took 1.1 seconds to process. It was generated by gjw on nyx running Ubuntu LTS with Intel(R) Xeon(R) CPU 2.67GHz having 4 cores and 12.3GB of RAM. It completed the processing :22:53.

How To Understand Data Mining In R And Rattle

How To Understand Data Mining In R And Rattle http: // togaware. com Copyright 2014, Graham.Williams@togaware.com 1/40 Data Analytics and Business Intelligence (8696/8697) Introducing Data Science with R and Rattle Graham.Williams@togaware.com Chief

More information

Delivery of an Analytics Capability

Delivery of an Analytics Capability Delivery of an Analytics Capability Ensembles and Model Delivery for Tax Compliance Graham Williams Senior Director and Chief Data Miner, Analytics Office of the Chief Knowledge Officer Australian Taxation

More information

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health

Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining

More information

Introduction. A. Bellaachia Page: 1

Introduction. A. Bellaachia Page: 1 Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?

Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots? Class 1 Data Mining Data Mining and Artificial Intelligence We are in the 21 st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.

More information

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

More information

Data Science with R. Introducing Data Mining with Rattle and R. Graham.Williams@togaware.com

Data Science with R. Introducing Data Mining with Rattle and R. Graham.Williams@togaware.com http: // togaware. com Copyright 2013, Graham.Williams@togaware.com 1/35 Data Science with R Introducing Data Mining with Rattle and R Graham.Williams@togaware.com Senior Director and Chief Data Miner,

More information

Perspectives on Data Mining

Perspectives on Data Mining Perspectives on Data Mining Niall Adams Department of Mathematics, Imperial College London n.adams@imperial.ac.uk April 2009 Objectives Give an introductory overview of data mining (DM) (or Knowledge Discovery

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Data Mining: Overview. What is Data Mining?

Data Mining: Overview. What is Data Mining? Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,

More information

A Data Mining Tutorial

A Data Mining Tutorial A Data Mining Tutorial Presented at the Second IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 98) 14 December 1998 Graham Williams, Markus Hegland and Stephen

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Data Mining System, Functionalities and Applications: A Radical Review

Data Mining System, Functionalities and Applications: A Radical Review Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

Data Warehousing and Data Mining for improvement of Customs Administration in India. Lessons learnt overseas for implementation in India

Data Warehousing and Data Mining for improvement of Customs Administration in India. Lessons learnt overseas for implementation in India Data Warehousing and Data Mining for improvement of Customs Administration in India Lessons learnt overseas for implementation in India Participants Shailesh Kumar (Group Leader) Sameer Chitkara (Asst.

More information

Data Analytics and Business Intelligence (8696/8697)

Data Analytics and Business Intelligence (8696/8697) http: // togaware. com Copyright 2014, Graham.Williams@togaware.com 1/34 Data Analytics and Business Intelligence (8696/8697) Introducing and Interacting with R Graham.Williams@togaware.com Chief Data

More information

Introduction to Data Mining

Introduction to Data Mining Bioinformatics Ying Liu, Ph.D. Laboratory for Bioinformatics University of Texas at Dallas Spring 2008 Introduction to Data Mining 1 Motivation: Why data mining? What is data mining? Data Mining: On what

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open

More information

Machine Learning using MapReduce

Machine Learning using MapReduce Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous

More information

Data Mining: Benefits for business.

Data Mining: Benefits for business. Data Mining: Benefits for business. Data Mining enables businesses of all types and sizes to empower their data and gain valuable insight into what decisions to take. Since 2005 the Oversea-Chinese Banking

More information

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

In this presentation, you will be introduced to data mining and the relationship with meaningful use. In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine

More information

Data Mining with SAS. Mathias Lanner mathias.lanner@swe.sas.com. Copyright 2010 SAS Institute Inc. All rights reserved.

Data Mining with SAS. Mathias Lanner mathias.lanner@swe.sas.com. Copyright 2010 SAS Institute Inc. All rights reserved. Data Mining with SAS Mathias Lanner mathias.lanner@swe.sas.com Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA

More information

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining a.j.m.m. (ton) weijters (slides are partially based on an introduction of Gregory Piatetsky-Shapiro) Overview Why data mining (data cascade) Application examples Data Mining

More information

AMIS 7640 Data Mining for Business Intelligence

AMIS 7640 Data Mining for Business Intelligence The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2013, Session

More information

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90

Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 FREE echapter C H A P T E R1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 percent of the data in the

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

Role of Social Networking in Marketing using Data Mining

Role of Social Networking in Marketing using Data Mining Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:

More information

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler

Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web

More information

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III

Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed

More information

PLATYPUS SYMPOSIUM BIG DATA. Associate Professor Paul Kennedy University of Technology, Sydney

PLATYPUS SYMPOSIUM BIG DATA. Associate Professor Paul Kennedy University of Technology, Sydney PLATYPUS SYMPOSIUM BIG DATA Associate Professor Paul Kennedy University of Technology, Sydney Big Data Associate Professor Paul Kennedy School of Software Faculty of Engineering & IT, UTS UTS Centre for

More information

Foundations of Artificial Intelligence. Introduction to Data Mining

Foundations of Artificial Intelligence. Introduction to Data Mining Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present

More information

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction

Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration

More information

Database Marketing, Business Intelligence and Knowledge Discovery

Database Marketing, Business Intelligence and Knowledge Discovery Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Winter Semester 2010/2011 Free University of Bozen, Bolzano DW Lecturer: Johann Gamper gamper@inf.unibz.it DM Lecturer: Mouna Kacimi mouna.kacimi@unibz.it http://www.inf.unibz.it/dis/teaching/dwdm/index.html

More information

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Machine Learning, Data Mining, and Knowledge Discovery: An Introduction

Machine Learning, Data Mining, and Knowledge Discovery: An Introduction Machine Learning, Data Mining, and Knowledge Discovery: An Introduction AHPCRC Workshop - 8/17/10 - Dr. Martin Based on slides by Gregory Piatetsky-Shapiro from Kdnuggets http://www.kdnuggets.com/data_mining_course/

More information

Data Mining and Machine Learning in Bioinformatics

Data Mining and Machine Learning in Bioinformatics Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg

More information

Session 10 : E-business models, Big Data, Data Mining, Cloud Computing

Session 10 : E-business models, Big Data, Data Mining, Cloud Computing INFORMATION STRATEGY Session 10 : E-business models, Big Data, Data Mining, Cloud Computing Tharaka Tennekoon B.Sc (Hons) Computing, MBA (PIM - USJ) POST GRADUATE DIPLOMA IN BUSINESS AND FINANCE 2014 Internet

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association

More information

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining

Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees

More information

INTRODUCTION TO MACHINE LEARNING 3RD EDITION

INTRODUCTION TO MACHINE LEARNING 3RD EDITION ETHEM ALPAYDIN The MIT Press, 2014 Lecture Slides for INTRODUCTION TO MACHINE LEARNING 3RD EDITION alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml3e CHAPTER 1: INTRODUCTION Big Data 3 Widespread

More information

Machine Learning: Overview

Machine Learning: Overview Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave

More information

Data Mining for Business Analytics

Data Mining for Business Analytics Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

An Introduction to Advanced Analytics and Data Mining

An Introduction to Advanced Analytics and Data Mining An Introduction to Advanced Analytics and Data Mining Dr Barry Leventhal Henry Stewart Briefing on Marketing Analytics 19 th November 2010 Agenda What are Advanced Analytics and Data Mining? The toolkit

More information

Obtaining Value from Big Data

Obtaining Value from Big Data Obtaining Value from Big Data Course Notes in Transparency Format technology basics for data scientists Spring - 2014 Jordi Torres, UPC - BSC www.jorditorres.eu @JordiTorresBCN Data deluge, is it enough?

More information

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

More information

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 7, July 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information

An Empirical Study of Application of Data Mining Techniques in Library System

An Empirical Study of Application of Data Mining Techniques in Library System An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani

More information

Machine Learning and Data Mining. Fundamentals, robotics, recognition

Machine Learning and Data Mining. Fundamentals, robotics, recognition Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

Social Media Mining. Data Mining Essentials

Social Media Mining. Data Mining Essentials Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers

More information

Dan French Founder & CEO, Consider Solutions

Dan French Founder & CEO, Consider Solutions Dan French Founder & CEO, Consider Solutions CONSIDER SOLUTIONS Mission Solutions for World Class Finance Footprint Financial Control & Compliance Risk Assurance Process Optimization CLIENTS CONTEXT The

More information

ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Data Warehousing and Data Mining

Data Warehousing and Data Mining Data Warehousing and Data Mining Winter Semester 2012/2013 Free University of Bozen, Bolzano DM Lecturer: Mouna Kacimi mouna.kacimi@unibz.it http://www.inf.unibz.it/dis/teaching/dwdm/index.html Organization

More information

The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS?

The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS? Conclusions Paper The Use of Open Source Is Growing. So Why Do Organizations Still Turn to SAS? Insights from a presentation at the 2014 Hadoop Summit Featuring Brian Garrett, Principal Solutions Architect

More information

Distributed forests for MapReduce-based machine learning

Distributed forests for MapReduce-based machine learning Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication

More information

Predictive Analytics Certificate Program

Predictive Analytics Certificate Program Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and

More information

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH

A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH 205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014 RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer

More information

Data Mining Introduction

Data Mining Introduction Data Mining Introduction Organization Lectures Mondays and Thursdays from 10:30 to 12:30 Lecturer: Mouna Kacimi Office hours: appointment by email Labs Thursdays from 14:00 to 16:00 Teaching Assistant:

More information

Mobile Phone APP Software Browsing Behavior using Clustering Analysis

Mobile Phone APP Software Browsing Behavior using Clustering Analysis Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis

More information

Machine Learning Capacity and Performance Analysis and R

Machine Learning Capacity and Performance Analysis and R Machine Learning and R May 3, 11 30 25 15 10 5 25 15 10 5 30 25 15 10 5 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 100 80 60 40 100 80 60 40 100 80 60 40 30 25 15 10 5 25 15 10

More information

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

More information

It's All About Ensembles

It's All About Ensembles It's All About Ensembles Big Data Analytics Using Science resolves the whole into parts, the organism into organs, the obscure into the known. Science gives us knowledge, but only philosophy can give us

More information

Is a Data Scientist the New Quant? Stuart Kozola MathWorks

Is a Data Scientist the New Quant? Stuart Kozola MathWorks Is a Data Scientist the New Quant? Stuart Kozola MathWorks 2015 The MathWorks, Inc. 1 Facts or information used usually to calculate, analyze, or plan something Information that is produced or stored by

More information

Big Data (Adv. Analytics) in 15 Mins. Peter LePine Managing Director Sales Support IM & BI Practice

Big Data (Adv. Analytics) in 15 Mins. Peter LePine Managing Director Sales Support IM & BI Practice Big Data (Adv. Analytics) in 15 Mins. Peter LePine Managing Director Sales Support IM & BI Practice Agenda Big Data in 15 Mins. Goal: Provide a basic understanding of; What is Big Data; Why it s important

More information

Master of Science in Health Information Technology Degree Curriculum

Master of Science in Health Information Technology Degree Curriculum Master of Science in Health Information Technology Degree Curriculum Core courses: 8 courses Total Credit from Core Courses = 24 Core Courses Course Name HRS Pre-Req Choose MIS 525 or CIS 564: 1 MIS 525

More information

Statistics 215b 11/20/03 D.R. Brillinger. A field in search of a definition a vague concept

Statistics 215b 11/20/03 D.R. Brillinger. A field in search of a definition a vague concept Statistics 215b 11/20/03 D.R. Brillinger Data mining A field in search of a definition a vague concept D. Hand, H. Mannila and P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge. Some definitions/descriptions

More information

Anomaly Detection and Predictive Maintenance

Anomaly Detection and Predictive Maintenance Anomaly Detection and Predictive Maintenance Rosaria Silipo Iris Adae Christian Dietz Phil Winters Rosaria.Silipo@knime.com Iris.Adae@uni-konstanz.de Christian.Dietz@uni-konstanz.de Phil.Winters@knime.com

More information

TNS EX A MINE BehaviourForecast Predictive Analytics for CRM. TNS Infratest Applied Marketing Science

TNS EX A MINE BehaviourForecast Predictive Analytics for CRM. TNS Infratest Applied Marketing Science TNS EX A MINE BehaviourForecast Predictive Analytics for CRM 1 TNS BehaviourForecast Why is BehaviourForecast relevant for you? The concept of analytical Relationship Management (acrm) becomes more and

More information

AMIS 7640 Data Mining for Business Intelligence

AMIS 7640 Data Mining for Business Intelligence The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2014, Session

More information

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining

Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.

More information

Statistics for BIG data

Statistics for BIG data Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before

More information

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012

Clustering Big Data. Anil K. Jain. (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Clustering Big Data Anil K. Jain (with Radha Chitta and Rong Jin) Department of Computer Science Michigan State University November 29, 2012 Outline Big Data How to extract information? Data clustering

More information

TIETS34 Seminar: Data Mining on Biometric identification

TIETS34 Seminar: Data Mining on Biometric identification TIETS34 Seminar: Data Mining on Biometric identification Youming Zhang Computer Science, School of Information Sciences, 33014 University of Tampere, Finland Youming.Zhang@uta.fi Course Description Content

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

A RESEARCH STUDY ON DATA MINING TECHNIQUES AND ALGORTHMS

A RESEARCH STUDY ON DATA MINING TECHNIQUES AND ALGORTHMS A RESEARCH STUDY ON DATA MINING TECHNIQUES AND ALGORTHMS Nitin Trivedi, Research Scholar, Manav Bharti University, Solan HP ABSTRACT The purpose of this study is not to delve deeply into the technical

More information

A Big Data Workshop introducing a smarter decision methodology for your organisation through advanced analytics and career opportunity

A Big Data Workshop introducing a smarter decision methodology for your organisation through advanced analytics and career opportunity Advanced Analytics Institute (AAI) http://analytics.uts.edu.au/index.html Big Data Workshop - Smarter Decisions and Careers in Data A Big Data Workshop introducing a smarter decision methodology for your

More information

White Paper. Data Mining for Business

White Paper. Data Mining for Business White Paper Data Mining for Business January 2010 Contents 1. INTRODUCTION... 3 2. WHY IS DATA MINING IMPORTANT?... 3 FUNDAMENTALS... 3 Example 1...3 Example 2...3 3. OPERATIONAL CONSIDERATIONS... 4 ORGANISATIONAL

More information

Challenges and Opportunities in Data Mining: Personalization

Challenges and Opportunities in Data Mining: Personalization Challenges and Opportunities in Data Mining: Big Data, Predictive User Modeling, and Personalization Bamshad Mobasher School of Computing DePaul University, April 20, 2012 Google Trends: Data Mining vs.

More information

Navigating Big Data business analytics

Navigating Big Data business analytics mwd a d v i s o r s Navigating Big Data business analytics Helena Schwenk A special report prepared for Actuate May 2013 This report is the third in a series and focuses principally on explaining what

More information

Assessing Data Mining: The State of the Practice

Assessing Data Mining: The State of the Practice Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality

More information

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010. Title Introduction to Data Mining Dr Arulsivanathan Naidoo Statistics South Africa OECD Conference Cape Town 8-10 December 2010 1 Outline Introduction Statistics vs Knowledge Discovery Predictive Modeling

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Pentaho Data Mining Last Modified on January 22, 2007

Pentaho Data Mining Last Modified on January 22, 2007 Pentaho Data Mining Copyright 2007 Pentaho Corporation. Redistribution permitted. All trademarks are the property of their respective owners. For the latest information, please visit our web site at www.pentaho.org

More information

April 2016 JPoint Moscow, Russia. How to Apply Big Data Analytics and Machine Learning to Real Time Processing. Kai Wähner. kwaehner@tibco.

April 2016 JPoint Moscow, Russia. How to Apply Big Data Analytics and Machine Learning to Real Time Processing. Kai Wähner. kwaehner@tibco. April 2016 JPoint Moscow, Russia How to Apply Big Data Analytics and Machine Learning to Real Time Processing Kai Wähner kwaehner@tibco.com @KaiWaehner www.kai-waehner.de LinkedIn / Xing Please connect!

More information

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management

Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators

More information