# Data Mining: An Overview. David Madigan

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Data Mining: An Overview David Madigan

2 Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Eamples Algorithms: Disease Clusters Algorithms: Model-Based Clustering Algorithms: Frequent Items and Association Rules Future Directions, etc.

3 Of Laws, Monsters, and Giants Moore s law: processing capacity doubles every 18 months : CPU, cache, memory It s more aggressive cousin: Disk storage capacity doubles every 9 months hat do the two laws combined produce? A rapidly growing gap between our ability to generate data, and our ability to make use of it. Disk TB Shipped per Year 1E+7 1E+6 1E+5 1E+4 1E Disk Trend (Jim Porter) disk TB growth: 112%/y EaByte Moore's Law: 58.7%/y

4 hat is Data Mining? Finding interesting structure in data Structure: refers to statistical patterns, predictive models, hidden relationships Eamples of tasks addressed by Data Mining Predictive Modeling (classification, regression) Segmentation (Data Clustering ) Summarization Visualization

5

6

7 Ronny Kohavi, ICML 1998

8 Data Mining Algorithms A data mining algorithm is a well-defined procedure that takes data as input and produces output in the form of models or patterns Hand, Mannila, and Smyth well-defined : can be encoded in software algorithm : must terminate after some finite number of steps

9 Algorithm Components 1. The task the algorithm is used to address (e.g. classification, clustering, etc.) 2. The structure of the model or pattern we are fitting to the data (e.g. a linear regression model) 3. The score function used to judge the quality of the fitted models or patterns (e.g. accuracy, BIC, etc.) 4. The search or optimization method used to search over parameters and/or structures (e.g. steepest descent, MCMC, etc.) 5. The data management technique used for storing, indeing, and retrieving data (critical when data too large to reside in memory)

10

11 Backpropagation data mining algorithm h 1 h 2 y 4 4! # i ii s = i i = 1 2 = 1 i s 1 = ;! " h! si ( s i ) = 1 (1 + e ) 4 2 y =! = w h i 1 i i 4 2 vector of p input values multiplied by p d 1 weight matri resulting d 1 values individually transformed by non-linear function resulting d 1 values multiplied by d 1 d 2 weight matri 1

12 Backpropagation (cont.) Parameters: " 1,," 4,! 1,,! 4, w1, w2 Score: SSSE =!( y( i) " n i= 1 yˆ( i)) 2 Search: steepest descent; search for structure?

13 Models and Patterns Models Prediction Probability Distributions Structured Data Linear regression Piecewise linear

14 Prediction Models Probability Distributions Structured Data Linear regression Piecewise linear Nonparamatric regression

15

16 Prediction Models Probability Distributions Structured Data Linear regression Piecewise linear Nonparametric regression Classification logistic regression naïve bayes/tan/bayesian networks NN support vector machines Trees etc.

17 Prediction Models Probability Distributions Structured Data Linear regression Piecewise linear Nonparametric regression Classification Parametric models Mitures of parametric models Graphical Markov models (categorical, continuous, mied)

18 Prediction Linear regression Piecewise linear Nonparametric regression Classification Models Probability Distributions Parametric models Mitures of parametric models Graphical Markov models (categorical, continuous, mied) Structured Data Time series Markov models Miture Transition Distribution models Hidden Markov models Spatial models

19 Markov Models First-order:! = " = T t t t t T y y p y p y y p ) ( ) ( ),, ( e.g.: ) ( 2 1 ep 2 1 ) (! " # \$ % & ' ' = ' ' ( )( t t t t y g y y y p g linear standard first-order auto-regressive model e y y t t + + =!1 0 " 1 " ) (0, ~ 2! N e y 1 y 2 y 3 y T

20 First-Order HMM/Kalman Filter T y 1 y 2 y 3 y T! = " = T t t t t t T T p y p y p p y y p ) ( ) ( ) ( ) ( ),,,,, ( Note: to compute p(y 1,,y T ) need to sum/integrate over all possible state sequences...

21 Bias-Variance Tradeoff High Bias - Low Variance Score function should embody the compromise Low Bias - High Variance overfitting - modeling the random component

22 The Curse of Dimensionality X ~ MVN p (0, I) Gaussian kernel density estimation Bandwidth chosen to minimize MSE at the mean Suppose want: E[( pˆ( )! p( )) 2 p( ) 2 < 0.1 = 0 Dimension # data points , ,000

23 Patterns Global Local Clustering via partitioning Hierarchical Clustering Miture Models Outlier detection Changepoint detection Bump hunting Scan statistics Association rules

24 Scan Statistics via Permutation Tests The curve represents a road Each marks an accident Red denotes an injury accident Black means no injury Is there a stretch of road where there is an unusually large fraction of injury accidents?

25 Scan with Fied indow If we know the length of the stretch of road that we seek, e.g., we could slide this window long the road and find the most unusual window location

26 How Unusual is a indow? Let p and p denote the true probability of being red inside and outside the window respectively. Let (,n ) and (,n ) denote the corresponding counts Use the GLRT for comparing H 0 : p = p versus H 1 : p p n n n n n n n n n n n n!!!! + +!! + +! + + = )] / ( [1 ) / ( )] / ( [1 ) / ( ))] ) /( (( [1 )] ) /( [( " 2 log λ here has an asymptotic chi-square distribution with 1df lambda measures how unusual a window is

27 Permutation Test Since we look at the smallest λ over all window locations, need to find the distribution of smallest-λ under the null hypothesis that there are no clusters Look at the distribution of smallest-λ over say 999 random relabellings of the colors of the s smallest-λ Look at the position of observed smallest-λ in this distribution to get the scan statistic p-value (e.g., if observed smallest-λ is 5 th smallest, p-value is 0.005)

28 Variable Length indow No need to use fied-length window. Eamine all possible windows up to say half the length of the entire road O = fatal accident O = non-fatal accident

29 Spatial Scan Statistics Spatial scan statistic uses, e.g., circles instead of line segments

30

31 Spatial-Temporal Scan Statistics Spatial-temporal scan statistic use cylinders where the height of the cylinder represents a time window

32 Other Issues Poisson model also common (instead of the bernoulli model) Covariate adjustment Andrew Moore s group at CMU: efficient algorithms for scan statistics

33 Software: SaTScan + others

34 Association Rules: Support and Confidence Customer buys beer Customer buys both Customer buys diaper Find all the rules Y Z with minimum confidence and support support, s, probability that a transaction contains {Y & Z} confidence, c, conditional probability that a transaction having Y also contains Z Transaction ID Items Bought 2000 A,B,C 1000 A,C 4000 A,D 5000 B,E,F Let minimum support 50%, and minimum confidence 50%, we have A C (50%, 66.6%) C A (50%, 100%)

35 Mining Association Rules An Eample Transaction ID Items Bought 2000 A,B,C 1000 A,C 4000 A,D 5000 B,E,F For rule A C: support = support({a &C}) = 50% confidence = support({a &C})/support({A}) = 66.6% The Apriori principle: Min. support 50% Min. confidence 50% Any subset of a frequent itemset must be frequent Frequent Itemset Support {A} 75% {B} 50% {C} 50% {A,C} 50%

36 Mining Frequent Itemsets: the Key Step Find the frequent itemsets: the sets of items that have minimum support A subset of a frequent itemset must also be a frequent itemset i.e., if {AB} is a frequent itemset, both {A} and {B} should be a frequent itemset Iteratively find frequent itemsets with cardinality from 1 to k (k-itemset) Use the frequent itemsets to generate association rules.

37 The Apriori Algorithm Join Step: C k is generated by joining L k-1 with itself Prune Step: Any (k-1)-itemset that is not frequent cannot be a subset of a frequent k-itemset Pseudo-code: C k : Candidate itemset of size k L k : frequent itemset of size k L 1 = {frequent items}; for (k = 1; L k!= ; k++) do begin C k+1 = candidates generated from L k ; for each transaction t in database do increment the count of all candidates in C k+1 that are contained in t L k+1 = candidates in C k+1 with min_support end return k L k ;

38 Database D TID Items The Apriori Algorithm Scan D itemset sup. {1} 2 {2} 3 {3} 3 {4} 1 {5} 3 Eample C 1 L 1 itemset sup {1 2} 1 {1 3} 2 {1 5} 1 {2 3} 2 {2 5} 3 {3 5} 2 C 2 C 2 L 2 itemset sup Scan D {1 3} 2 {2 3} 2 {2 5} 3 {3 5} 2 C 3 itemset Scan D L 3 {2 3 5} itemset sup {2 3 5} 2 itemset sup. {1} 2 {2} 3 {3} 3 {5} 3 itemset {1 2} {1 3} {1 5} {2 3} {2 5} {3 5}

39 Association Rule Mining: A Road Map Boolean vs. quantitative associations (Based on the types of values handled) buys(, SQLServer ) ^ buys(, DMBook ) buys(, DBMiner ) [0.2%, 60%] age(, ) ^ income(, K ) buys(, PC ) [1%, 75%] Single dimension vs. multiple dimensional associations (see e. Above) Single level vs. multiple-level analysis hat brands of beers are associated with what brands of diapers? Various etensions (thousands!)

40

### Data Mining: Foundation, Techniques and Applications

Data Mining: Foundation, Techniques and Applications Lesson 1b :A Quick Overview of Data Mining Li Cuiping( 李 翠 平 ) School of Information Renmin University of China Anthony Tung( 鄧 锦 浩 ) School of Computing

### Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

### Cross-validation for detecting and preventing overfitting

Cross-validation for detecting and preventing overfitting Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures.

### Association Rules Apriori Algorithm. Machine Learning Overview Sales Transaction and Association Rules Aprori Algorithm Example

Association Rules Apriori Algorithm Machine Learning Overview Sales Transaction and Association Rules Aprori Algorithm Example 1 Machine Learning Common ground of presented methods Statistical Learning

### Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

### New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

### ECS289- Data Mining Lecture Outline

ECS289- Data Mining Lecture Outline What is Data Mining? Absurdly Simple Student Grade Example Fundamental Tasks in Data Mining Course Structure Overview Emphasis on social networks and graphs Our aim

Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

### Data Mining. Nonlinear Classification

Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15

### CART 6.0 Feature Matrix

CART 6.0 Feature Matri Enhanced Descriptive Statistics Full summary statistics Brief summary statistics Stratified summary statistics Charts and histograms Improved User Interface New setup activity window

### Yiming Peng, Department of Statistics. February 12, 2013

Regression Analysis Using JMP Yiming Peng, Department of Statistics February 12, 2013 2 Presentation and Data http://www.lisa.stat.vt.edu Short Courses Regression Analysis Using JMP Download Data to Desktop

### STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

### Neural Networks. CAP5610 Machine Learning Instructor: Guo-Jun Qi

Neural Networks CAP5610 Machine Learning Instructor: Guo-Jun Qi Recap: linear classifier Logistic regression Maximizing the posterior distribution of class Y conditional on the input vector X Support vector

### LCs for Binary Classification

Linear Classifiers A linear classifier is a classifier such that classification is performed by a dot product beteen the to vectors representing the document and the category, respectively. Therefore it

### Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier

Data Mining: Concepts and Techniques Jiawei Han Micheline Kamber Simon Fräser University К MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF Elsevier Contents Foreword Preface xix vii Chapter I Introduction I I.

### HT2015: SC4 Statistical Data Mining and Machine Learning

HT2015: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Bayesian Nonparametrics Parametric vs Nonparametric

### Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones

Introduction to machine learning and pattern recognition Lecture 1 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 What is machine learning? Data description and interpretation

### Data, Measurements, Features

Data, Measurements, Features Middle East Technical University Dep. of Computer Engineering 2009 compiled by V. Atalay What do you think of when someone says Data? We might abstract the idea that data are

### KINGS COLLEGE OF ENGINEERING

KINGS COLLEGE OF ENGINEERING DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING ACADEMIC YEAR 2011-2012 / ODD SEMESTER SUBJECT CODE\NAME: CS1011-DATA WAREHOUSE AND DATA MINING YEAR / SEM: IV / VII UNIT I BASICS

### The Artificial Prediction Market

The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory

### Data Mining Techniques Chapter 6: Decision Trees

Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................

### Introduction to Machine Learning

Introduction to Machine Learning Prof. Alexander Ihler Prof. Max Welling icamp Tutorial July 22 What is machine learning? The ability of a machine to improve its performance based on previous results:

### Knowledge Discovery and Data Mining. Structured vs. Non-Structured Data

Knowledge Discovery and Data Mining Unit # 2 1 Structured vs. Non-Structured Data Most business databases contain structured data consisting of well-defined fields with numeric or alphanumeric values.

### Linear Model Selection and Regularization

Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In

### Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

### Machine Learning and Data Analysis overview. Department of Cybernetics, Czech Technical University in Prague. http://ida.felk.cvut.

Machine Learning and Data Analysis overview Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz psyllabus Lecture Lecturer Content 1. J. Kléma Introduction,

### Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction

Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration

### Introduction to Machine Learning. Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011

Introduction to Machine Learning Speaker: Harry Chao Advisor: J.J. Ding Date: 1/27/2011 1 Outline 1. What is machine learning? 2. The basic of machine learning 3. Principles and effects of machine learning

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Introduction to Data Mining

Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

### Data Preprocessing. Week 2

Data Preprocessing Week 2 Topics Data Types Data Repositories Data Preprocessing Present homework assignment #1 Team Homework Assignment #2 Read pp. 227 240, pp. 250 250, and pp. 259 263 the text book.

### Perspectives on Data Mining

Perspectives on Data Mining Niall Adams Department of Mathematics, Imperial College London n.adams@imperial.ac.uk April 2009 Objectives Give an introductory overview of data mining (DM) (or Knowledge Discovery

### Regression analysis. MTAT Data Mining Anna Leontjeva

Regression analysis MTAT.03.183 Data Mining 2016 Anna Leontjeva Previous lecture Supervised vs. Unsupervised Learning? Previous lecture Supervised vs. Unsupervised Learning? Iris setosa Iris versicolor

### Data Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression

Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction

### Introduction. A. Bellaachia Page: 1

Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.

### Introduction to Learning & Decision Trees

Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing

### CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.

Lecture Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott

### SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

### Analytics on Big Data

Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

### Principles of Dat Da a t Mining Pham Tho Hoan hoanpt@hnue.edu.v hoanpt@hnue.edu. n

Principles of Data Mining Pham Tho Hoan hoanpt@hnue.edu.vn References [1] David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, MIT press, 2002 [2] Jiawei Han and Micheline Kamber,

### Basics of Statistical Machine Learning

CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

Lecture 15: Loess Regression III: Advanced Methods Bill Jacoby Michigan State University http://polisci.msu.edu/jacoby/icpsr/regress3 Goals of the lecture Introduce nonparametric regression: Bivariate

### Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví. Pavel Kříž. Seminář z aktuárských věd MFF 4.

Insurance Analytics - analýza dat a prediktivní modelování v pojišťovnictví Pavel Kříž Seminář z aktuárských věd MFF 4. dubna 2014 Summary 1. Application areas of Insurance Analytics 2. Insurance Analytics

### WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

### Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar

Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of

Neural Network Add-in Version 1.5 Software User s Guide Contents Overview... 2 Getting Started... 2 Working with Datasets... 2 Open a Dataset... 3 Save a Dataset... 3 Data Pre-processing... 3 Lagging...

### Data Mining Cluster Analysis: Advanced Concepts and Algorithms. ref. Chapter 9. Introduction to Data Mining

Data Mining Cluster Analysis: Advanced Concepts and Algorithms ref. Chapter 9 Introduction to Data Mining by Tan, Steinbach, Kumar 1 Outline Prototype-based Fuzzy c-means Mixture Model Clustering Density-based

### Data Mining for Model Creation. Presentation by Paul Below, EDS 2500 NE Plunkett Lane Poulsbo, WA USA 98370 paul.below@eds.

Sept 03-23-05 22 2005 Data Mining for Model Creation Presentation by Paul Below, EDS 2500 NE Plunkett Lane Poulsbo, WA USA 98370 paul.below@eds.com page 1 Agenda Data Mining and Estimating Model Creation

### Chapter 4 Data Mining A Short Introduction. 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1

Chapter 4 Data Mining A Short Introduction 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1 1 Today's Question 1. Data Mining Overview 2. Association Rule Mining

### Final Exam, Spring 2007

10-701 Final Exam, Spring 2007 1. Personal info: Name: Andrew account: E-mail address: 2. There should be 16 numbered pages in this exam (including this cover sheet). 3. You can use any material you brought:

### Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

### Privacy-preserving Data Mining: current research and trends

Privacy-preserving Data Mining: current research and trends Stan Matwin School of Information Technology and Engineering University of Ottawa, Canada stan@site.uottawa.ca Few words about our research Universit[é

### Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health

Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining

### SAS Certificate Applied Statistics and SAS Programming

SAS Certificate Applied Statistics and SAS Programming SAS Certificate Applied Statistics and Advanced SAS Programming Brigham Young University Department of Statistics offers an Applied Statistics and

Server Load Prediction Suthee Chaidaroon (unsuthee@stanford.edu) Joon Yeong Kim (kim64@stanford.edu) Jonghan Seo (jonghan@stanford.edu) Abstract Estimating server load average is one of the methods that

### CLASSIFICATION AND CLUSTERING. Anveshi Charuvaka

CLASSIFICATION AND CLUSTERING Anveshi Charuvaka Learning from Data Classification Regression Clustering Anomaly Detection Contrast Set Mining Classification: Definition Given a collection of records (training

### An Introduction to Statistical Machine Learning - Overview -

An Introduction to Statistical Machine Learning - Overview - Samy Bengio bengio@idiap.ch Dalle Molle Institute for Perceptual Artificial Intelligence (IDIAP) CP 592, rue du Simplon 4 1920 Martigny, Switzerland

### Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

### Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

### Introduction to nonparametric regression: Least squares vs. Nearest neighbors

Introduction to nonparametric regression: Least squares vs. Nearest neighbors Patrick Breheny October 30 Patrick Breheny STA 621: Nonparametric Statistics 1/16 Introduction For the remainder of the course,

### Predict Influencers in the Social Network

Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

### Statistical Models in Data Mining

Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of

### Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

### Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

### Local classification and local likelihoods

Local classification and local likelihoods November 18 k-nearest neighbors The idea of local regression can be extended to classification as well The simplest way of doing so is called nearest neighbor

### 3F3: Signal and Pattern Processing

3F3: Signal and Pattern Processing Lecture 3: Classification Zoubin Ghahramani zoubin@eng.cam.ac.uk Department of Engineering University of Cambridge Lent Term Classification We will represent data by

### Generalized Linear Models

Generalized Linear Models We have previously worked with regression models where the response variable is quantitative and normally distributed. Now we turn our attention to two types of models where the

### Statistical Models in R

Statistical Models in R Some Examples Steven Buechler Department of Mathematics 276B Hurley Hall; 1-6233 Fall, 2007 Outline Statistical Models Structure of models in R Model Assessment (Part IA) Anova

### Data Mining Apriori Algorithm

10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)

### Supervised Feature Selection & Unsupervised Dimensionality Reduction

Supervised Feature Selection & Unsupervised Dimensionality Reduction Feature Subset Selection Supervised: class labels are given Select a subset of the problem features Why? Redundant features much or

### Association Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 6

Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Look for accompanying R code on the course web site. Topics Definition

### Data Mining Part 5. Prediction

Data Mining Part 5. Prediction 5.7 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Linear Regression Other Regression Models References Introduction Introduction Numerical prediction is

### Question 2 Naïve Bayes (16 points)

Question 2 Naïve Bayes (16 points) About 2/3 of your email is spam so you downloaded an open source spam filter based on word occurrences that uses the Naive Bayes classifier. Assume you collected the

### DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS

DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### CS 688 Pattern Recognition Lecture 4. Linear Models for Classification

CS 688 Pattern Recognition Lecture 4 Linear Models for Classification Probabilistic generative models Probabilistic discriminative models 1 Generative Approach ( x ) p C k p( C k ) Ck p ( ) ( x Ck ) p(

### EFFECTIVE USE OF THE KDD PROCESS AND DATA MINING FOR COMPUTER PERFORMANCE PROFESSIONALS

EFFECTIVE USE OF THE KDD PROCESS AND DATA MINING FOR COMPUTER PERFORMANCE PROFESSIONALS Susan P. Imberman Ph.D. College of Staten Island, City University of New York Imberman@postbox.csi.cuny.edu Abstract

### Why do statisticians "hate" us?

Why do statisticians "hate" us? David Hand, Heikki Mannila, Padhraic Smyth "Data mining is the analysis of (often large) observational data sets to find unsuspected relationships and to summarize the data

### Foundations of Artificial Intelligence. Introduction to Data Mining

Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present

### Classification and Regression Trees

Classification and Regression Trees Bob Stine Dept of Statistics, School University of Pennsylvania Trees Familiar metaphor Biology Decision tree Medical diagnosis Org chart Properties Recursive, partitioning

### Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm

Laboratory Module 8 Mining Frequent Itemsets Apriori Algorithm Purpose: key concepts in mining frequent itemsets understand the Apriori algorithm run Apriori in Weka GUI and in programatic way 1 Theoretical

### DATA ANALYTICS USING R

DATA ANALYTICS USING R Duration: 90 Hours Intended audience and scope: The course is targeted at fresh engineers, practicing engineers and scientists who are interested in learning and understanding data

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

### PROPOSAL TO INTRODUCE A NEW COURSE

PROPOSAL TO INTRODUCE A NEW COURSE (formerly known as subject) 1. COURSE DETAILS 1.1 Course ID COMP9318 1.2 Course name - Long Data Warehousing and Data Mining 1.3 Course name - Abbreviated Data Warehousing

### Chapter 20: Data Analysis

Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

### Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014

Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about

### itesla Project Innovative Tools for Electrical System Security within Large Areas

itesla Project Innovative Tools for Electrical System Security within Large Areas Samir ISSAD RTE France samir.issad@rte-france.com PSCC 2014 Panel Session 22/08/2014 Advanced data-driven modeling techniques

### Learning is a very general term denoting the way in which agents:

What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

### 15.564 Information Technology I. Business Intelligence

15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

### What is Data mining?

STAT : DATA MIIG Javier Cabrera Fall Business Question Answer Business Question What is Data mining? Find Data Data Processing Extract Information Data Analysis Internal Databases Data Warehouses Internet

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### Generating Gaussian Mixture Models by Model Selection For Speech Recognition

Generating Gaussian Mixture Models by Model Selection For Speech Recognition Kai Yu F06 10-701 Final Project Report kaiy@andrew.cmu.edu Abstract While all modern speech recognition systems use Gaussian

### Binary Logistic Regression

Binary Logistic Regression Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here s a simple model including

Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL

### Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Clustering Algorithms. Data Mining Clustering. Distance. Example. More Than One Mean. Mean Clustering

Clustering Algorithms Data Mining Clustering Kevin Swingler Organise data into a number of distinct groups (clusters) according to the similarity of their members and their differences from other clusters