Foundations of Artificial Intelligence. Introduction to Data Mining
|
|
|
- Antony Charles
- 9 years ago
- Views:
Transcription
1 Foundations of Artificial Intelligence Introduction to Data Mining
2 Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present some real life data mining applications. Student should gain the knowledge on how these techniques are used for a range of problems.
3 Data Mining Overview of data mining Data mining tasks Classification (supervised learning) Clustering (unsupervised learning) Association rule discovery Summary Note: These lecture materials are based on invited lectures from Dr Li in the School of Computer Science, University of Nottingham, 2008.
4 Data Mining Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially useful, and ultimately understandable patterns in data. o Valid: hold on new data with some certainty o Novel: non-obvious to the system o Useful: should be possible to act on the item o Understandable: humans should be able to interpret the pattern Also known as Knowledge Discovery in Databases (KDD)
5 Data Mining Draws ideas from machine learning/ai, pattern recognition, statistics, and database systems Statistics/ AI Machine Learning/Pattern Recognition Data Mining Database systems
6 Data Mining Integration Interpretation & Evaluation Raw Data DATA Ware house Target Data Transformed Data Patterns and Rules Knowledge
7 Data Mining What is not Data Mining? Look up phone number in phone directory Query a Web search engine for information about Amazon What is Data Mining? Certain names are more prevalent in certain US locations (O Brien, O Rurke, O Reilly in Boston area) Group together similar documents returned by search engine according to their context (e.g. Amazon rainforest, Amazon.com)
8 Data Mining Why data mining? Lots of data is being collected and warehoused Web data, e-commerce Purchases at department/ grocery stores Bank/Credit Card transactions Computers have become cheaper and more powerful Competition pressure is getting stronger Provide better, customized services (e.g. in Customer Relationship Management)
9 Data Mining Why data mining? Data collected and stored at enormous speeds Remote sensors on a satellite Telescopes scanning the sky Microarrays generating gene expression data Scientific simulations generating terabytes of data Traditional techniques infeasible for raw data Data mining may help scientists in Classifying and segmenting data Hypothesis Formation
10 Data Mining Applications Banking: loan/credit card approval predict good customers based on old customers Fraud detection: network security, financial transactions use historical data to build models of fraudulent behavior and use data mining to help identify similar instances Customer relationship management Which of my customers are likely the most loyal Which are most likely to leave for a competitor? Identify likely responders to sales promotions
11 Data Mining Applications Medicine: disease outcome, effectiveness of treatments analyze patient disease history & find relationship between diseases Astronomy: scientific data analysis identify new galaxies by the stages of formation Web site design and promotion: find affinity of visitor to pages and modify layout
12 Data Mining Overview of data mining Data mining tasks Classification (supervised learning) Clustering (unsupervised learning) Association rule discovery Summary
13 Data Mining Predictive: use some variables to predict unknown or future values of other variables Descriptive: Find human-interpretable patterns that describe the data Classification -- Predictive Clustering -- Descriptive Association rule discovery -- Descriptive
14 Data Mining Overview of data mining Data mining tasks Classification (supervised learning) Clustering (unsupervised learning) Association rule discovery Summary
15 Classification (supervised learning) Learn a method to predict the instance class from prelabeled (classified) instances Many approaches: Regression, Decision Trees, Neural Networks,...
16 Classification (supervised learning) Given a collection of records Each record contains a set of attributes One of the attributes is the class attribute Find a model for class attribute as a function of the values of other attributes
17 Classification (supervised learning) Goal: assign a class to unseen records correctly Approach Divide the given data set into training & test sets Use training set to build the model Use test set to validate the model
18 Classification (supervised learning) Goal: Predict class yi = f(x1, x2,.. Xn) Regression: (linear or any other polynomial) a*x1 + b*x2 + c = yi Decision trees: divide decision space into piecewise constant regions. Neural networks: partition by non-linear boundaries
19 Classification (supervised learning) Regression Predict the value of a given variable Y based on the values of other variables X assuming a linear or nonlinear model of dependency. Examples: Predicting sales amounts of new product based on advertising expenditure. Predicting wind velocities as a function of temperature, humidity, air pressure, etc. Time series prediction of stock market indices
20 Classification (supervised learning) Regression Linear Regression w 0 + w 1 x + w 2 y >= 0 Regression computes wi from data to minimize squared error to fit the data Not flexible enough
21 Classification (supervised learning) Decision Trees Y 3 if X > 5 then green else if Y > 3 then green else if X > 2 then red else green 2 5 X
22 Classification (supervised learning) Decision Trees Internal node: a simple decision rule on one or more attributes Leaf node: a predicted class label Salary < 1 M Prof = teacher Age < 30 Good Bad Bad Good
23 Classification (supervised learning) Decision Trees Pros + Reasonable training time + Easy to interpret + Easy to implement + Can handle large number of attributes Cons Simple decision boundaries Cannot handle complicated relationship between attributes Problems with lots of missing data
24 Classification (supervised learning) Neural Networks
25 Classification (supervised learning) Neural Networks Set of neurons connected by directed weighted edges x1 x2 x3 w2 w1 w3 Basic NN unit A more typical NN x1 x2 x3 Output nodes Hidden nodes
26 Classification (supervised learning) Neural Networks Useful for learning complex data like speech, image and handwriting recognition Decision boundaries: Linear regression Decision tree Neural network
27 Classification (supervised learning) Neural Networks Pros + Can learn more complicated class boundaries + Can be more accurate + Can handle large number of features Cons Slow training time Hard to interpret Hard to implement: trial and error for choosing parameters and network structure Can overfit the data: find patterns in random noise
28 Classification (supervised learning) Target marketing Applications Goal: Reduce cost of mailing by targeting consumers who are likely to buy a new cell-phone product. Approach: o Find the old data for a similar product. o Collect information of all customers. Business type, where they stay, how much they earn, o We know previous customers decision. This {buy, don t buy} decision forms the class attribute. o Use this information to learn a classifier model.
29 10 10 Classification (supervised learning) Banking Applications Tid Refund Marital Status Taxable Income Cheat Refund Marital Status Taxable Income Cheat 1 Yes Single 125K No 2 No Married 100K No 3 No Single 70K No 4 Yes Married 120K No 5 No Divorced 95K Yes 6 No Married 60K No 7 Yes Divorced 220K No No Single 75K? Yes Married 50K? No Married 150K? Yes Divorced 90K? No Single 40K? No Married 80K? Test Set 8 No Single 85K Yes 9 No Married 75K No 10 No Single 90K Yes Training Set Learn Classifier Model
30 Classification (supervised learning) Applications Banking: Loan approval: given old data about customers and payments, predict new applicant s loan eligibility Previous customers Classifier Decision rules Age Salary Profession Location Customer type Salary > 50K Prof. = Exec Good/ bad New applicant s data
31 Classification (supervised learning) Applications Fraud detection: Goal: Predict fraudulent cases in credit card transactions. Approach: o Use credit card transactions and the information on its account-holder as attributes. When a customer buys, what he buys, how often he pays on time, etc o Label past transactions as fraud or fair. This forms the class attribute. o Learn a model for fraudulence. o Use this model to detect fraud by observing credit card transactions on an account.
32 Classification (supervised learning) Applications Customer relationship management Goal: To predict whether a customer is likely to be lost to a competitor. Approach: o Use detailed record of transactions of each past and present customers, to find attributes. How often the customer calls, where he calls, what time of the day he calls most, his financial status, marital status, etc. o Label the customers as loyal or disloyal. o Find a model for loyalty.
33 Classification (supervised learning) Applications Classifying galaxies Early Intermediate Late Data Size: 20M galaxies Over 10K images 600M pixels per image Class: Stages of Formation {Early, Intermediate, Late} Attributes (over 40): Image features Features such as light waves received,...
34 Data Mining Overview of data mining Data mining tasks Classification (supervised learning) Clustering (unsupervised learning) Association rule discovery Summary
35 Clustering Supervised Learning F(x): true function (usually not known) D: training sample (x, F(x)) 57,M,195,0,125,95,39,25,0,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0 0 78,M,160,1,130,100,37,40,1,0,0,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0 1 69,F,180,0,115,85,40,22,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0 0 18,M,165,0,110,80,41,30,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 0 54,F,135,0,115,95,39,35,1,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0 1 G(x): model learned from D 71,M,160,1,130,105,38,20,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0? Goal: E[(F(x)-G(x)) 2 ] is small (near zero) for future samples
36 Training dataset: Clustering Un-supervised Learning 57,M,195,0,125,95,39,25,0,1,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0 78,M,160,1,130,100,37,40,1,0,0,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0 69,F,180,0,115,85,40,22,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0 18,M,165,0,110,80,41,30,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 54,F,135,0,115,95,39,35,1,1,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,1,0,0,0,0 84,F,210,1,135,105,39,24,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0 89,F,135,0,120,95,36,28,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,1,0,0 49,M,195,0,115,85,39,32,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0 40,M,205,0,115,90,37,18,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 74,M,250,1,130,100,38,26,1,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0 77,F,140,0,125,100,40,30,1,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1, Test dataset: 71,M,160,1,130,105,38,20,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0?
37 Clustering Supervised vs. Un-supervised Learning Supervised y=f(x): true function D: labeled training set D: {x i, F(x i )} Learn: G(x): model trained to predict labels of new cases Goal: E[(F(x)-G(x)) 2 ] 0 Well defined criteria: Mean square error Un-supervised y=?: no true function D: unlabeled data set D: {x i } Learn?????????? Goal:?????????? Well defined criteria:??????????
38 Clustering (Un-supervised Learning) What we have: Data Set D Similarity/distance metric What we need to do: Find partitioning of data, or groups of similar/close items
39 Clustering (Un-supervised Learning) Find natural grouping of instances given unlabeled data
40 Clustering (Un-supervised Learning) Similarity Groups of similar customers Similar buying behavior Similar health Similar products Similar cost Similar function Similar store Similarity usually is domain/problem specific
41 Clustering (Un-supervised Learning) Distance functions Numeric data: Euclidean distance Manhattan distance Categorical data (0 / 1 indicating presence / absence): Hamming distance (# dissimilarity) Combined numeric and categorical data: weighted normalized distance
42 Clustering (Un-supervised Learning) Distance functions Consider two records x=(x 1,,x d ), y=(y 1,,y d ): Special cases: p=1: Manhattan distance p=2: Euclidean distance p p d d p p y x y x y x y x d... ), ( ), ( p x p y y x y x y x d ) (... ) ( ) ( ), ( d d y x y x y x y x d
43 Clustering (Un-supervised Learning) Distance functions Euclidean: dist(x,y) = ( ) = 5 x = (5,5) y = (9,8) Manhattan: dist(x,y) = 4+3 = 7
44 Clustering (Un-supervised Learning) Partition-based clustering: k-mean Goal: minimize sum of square of distance o Between each point and centers of the cluster. o Between each pair of points in the cluster Algorithm: Initialize K cluster centers Random K separated points o Repeat until stabilization: Assign each point to closest cluster center Generate new cluster centers Adjust clusters by merging or splitting Centers
45 Clustering (Un-supervised Learning) Density-based clustering A cluster: a connected dense component Density: the number of neighbors of a point Can find clusters of arbitrary shape
46 Clustering (Un-supervised Learning) Applications Market Segmentation: Goal: divide a market into distinct subsets of customers, where any subset may conceivably be selected as a market target. Approach: Collect different attributes of customers, based on their demographical and lifestyle related information. Find clusters of similar customers. Evaluate the clustering quality by observing buying patterns of customers in the same cluster vs. those from different clusters.
47 Clustering (Un-supervised Learning) Applications Document Clustering Clustering Points: 3204 Articles of Evening Post. Similarity Measure: How many words are common in these documents. Category Total Articles Financial 555 Foreign 341 National 273 Metro 943 Sports 738 Entertainment 354
48 Is ASIMO intelligent? ASIMO
49 Data Mining Overview of data mining Data mining tasks Classification (supervised learning) Clustering (unsupervised learning) Association rule discovery Summary
50 Association Rules Market Basket Analysis Co-occurrences 80% of all customers purchase items X, Y and Z together. Association rules 60% of all customers who purchase X and Y also buy Z. Sequential patterns 40% of customers who first buy X also purchase Y within three weeks.
51 Association Rules Market Basket Analysis Consider shopping cart filled with several items Market basket analysis tries to answer the following questions: Who makes purchases? What do customers buy together? In what order do customers purchase items?
52 Association Rules Giving a set of records, each of which contain some number of items Produce dependency rules, which predict occurrence of an item based on occurrences of other items. Market-Basket transactions TID Items 1 Bread, Milk 2 Bread, Diaper, Beer, Eggs 3 Milk, Diaper, Beer, Coke 4 Bread, Milk, Diaper, Beer 5 Bread, Milk, Diaper, Coke Example Association Rules {Diaper} {Beer}, {Beer, Bread} {Milk}, {Milk, Bread} {Eggs, Coke}
53 Association Rules Applications Supermarket shelf management. Goal: To identify items that are bought together by sufficiently many customers. Approach: Process the point-of-sale data collected with barcode scanners. Find dependencies among items. A classic rule: If a customer buys diaper and milk, then he is very likely to buy beer. So, don t be surprised if you find six-packs stacked next to diapers! Beer and Nappies
54 Summary Data Mining Introduced a range of data mining techniques used in AI systems including : Neural networks Decision trees Presented a range of data mining tasks and their applications : Classification Clustering Association rule
55 Can machine be intelligent?
56 Reading Material Data Mining: A Tutorial-based Primer. Richard Roiger and Michael Geatz Addison Wesley, (2002) In our library And many more from the internet
Introduction to Artificial Intelligence G51IAI. An Introduction to Data Mining
Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees
Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web
Introduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining
Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.
Data Mining. Yeow Wei Choong Anne Laurent
Data Mining Yeow Wei Choong Anne Laurent Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused Web data, e-commerce purchases at department/ grocery stores Bank/Credit Card
Quick Introduction of Data Mining Techniques
Quick Introduction of Data Mining Techniques *Sources partially from Introduction to Data Mining, by P.-N. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 2005. Main Data Mining Techniques Link Analysis Associations
Data Mining: Introduction
Data Mining: Introduction Introducing the course How the course is organized How students are evaluated Deadlines Data Mining [Chapt. 1 of course book] What is it about? The KDD process Relations to other
Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21 st century So where are the robots?
Class 1 Data Mining Data Mining and Artificial Intelligence We are in the 21 st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.
Data Mining on Social Networks. Dionysios Sotiropoulos Ph.D.
Data Mining on Social Networks Dionysios Sotiropoulos Ph.D. 1 Contents What are Social Media? Mathematical Representation of Social Networks Fundamental Data Mining Concepts Data Mining Tasks on Digital
DATA MINING - 1DL105, 1Dl111
1 DATA MINING - 1DL105, 1Dl111 Fall 2006 An introductory class in data mining http://www.it.uu.se/edu/course/homepage/infoutv/ht06 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology,
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
Clustering. Adrian Groza. Department of Computer Science Technical University of Cluj-Napoca
Clustering Adrian Groza Department of Computer Science Technical University of Cluj-Napoca Outline 1 Cluster Analysis What is Datamining? Cluster Analysis 2 K-means 3 Hierarchical Clustering What is Datamining?
Data Mining Techniques
15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses
Marta Zorrilla Universidad de Cantabria
Tipos de problemas Marta Zorrilla Universidad de Cantabria Slides from Tan, P., Steinbach, M., Kumar, V. Introduction to data mining. Pearson Prentice Hall. 2006 Data Mining Tasks Prediction Methods Use
Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Data Mining and Machine Learning in Bioinformatics
Data Mining and Machine Learning in Bioinformatics PRINCIPAL METHODS AND SUCCESSFUL APPLICATIONS Ruben Armañanzas http://mason.gmu.edu/~rarmanan Adapted from Iñaki Inza slides http://www.sc.ehu.es/isg
CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining. Chengkai Li University of Texas at Arlington Spring 2016
CSE4334/5334 Data Mining Lecturer 2: Introduction to Data Mining Chengkai Li University of Texas at Arlington Spring 2016 Big Data http://dilbert.com/strip/2012-07-29 Big Data http://www.ibmbigdatahub.com/infographic/four-vs-big-data
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms
Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING)
ARTIFICIAL INTELLIGENCE (CSCU9YE) LECTURE 6: MACHINE LEARNING 2: UNSUPERVISED LEARNING (CLUSTERING) Gabriela Ochoa http://www.cs.stir.ac.uk/~goc/ OUTLINE Preliminaries Classification and Clustering Applications
Machine Learning: Overview
Machine Learning: Overview Why Learning? Learning is a core of property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture Machine Learning Milos Hauskrecht [email protected] 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs75/ Administration Instructor: Milos Hauskrecht [email protected] 539 Sennott
Chapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Classification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
Data Mining Classification: Decision Trees
Data Mining Classification: Decision Trees Classification Decision Trees: what they are and how they work Hunt s (TDIDT) algorithm How to select the best split How to handle Inconsistent data Continuous
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
Big Data. Introducción. Santiago González <[email protected]>
Big Data Introducción Santiago González Contenidos Por que BIG DATA? Características de Big Data Tecnologías y Herramientas Big Data Paradigmas fundamentales Big Data Data Mining
Machine Learning and Data Mining. Fundamentals, robotics, recognition
Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Unsupervised Learning and Data Mining. Unsupervised Learning and Data Mining. Clustering. Supervised Learning. Supervised Learning
Unsupervised Learning and Data Mining Unsupervised Learning and Data Mining Clustering Decision trees Artificial neural nets K-nearest neighbor Support vectors Linear regression Logistic regression...
Chapter 20: Data Analysis
Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification
Role of Neural network in data mining
Role of Neural network in data mining Chitranjanjit kaur Associate Prof Guru Nanak College, Sukhchainana Phagwara,(GNDU) Punjab, India Pooja kapoor Associate Prof Swami Sarvanand Group Of Institutes Dinanagar(PTU)
A Data Mining Tutorial
A Data Mining Tutorial Presented at the Second IASTED International Conference on Parallel and Distributed Computing and Networks (PDCN 98) 14 December 1998 Graham Williams, Markus Hegland and Stephen
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Introduction to Learning & Decision Trees
Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
Classification algorithm in Data mining: An Overview
Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
Analytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
Classification Techniques (1)
10 10 Overview Classification Techniques (1) Today Classification Problem Classification based on Regression Distance-based Classification (KNN) Net Lecture Decision Trees Classification using Rules Quality
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
MBA 8473 - Data Mining & Knowledge Discovery
MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 1 Learning Objectives 55. Explain what is data mining? 56. Explain two basic types of applications of data mining. 55.1. Compare and contrast various
UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS
UNSUPERVISED MACHINE LEARNING TECHNIQUES IN GENOMICS Dwijesh C. Mishra I.A.S.R.I., Library Avenue, New Delhi-110 012 [email protected] What is Learning? "Learning denotes changes in a system that enable
Data Mining Fundamentals
Part I Data Mining Fundamentals Data Mining: A First View Chapter 1 1.11 Data Mining: A Definition Data Mining The process of employing one or more computer learning techniques to automatically analyze
Grid Density Clustering Algorithm
Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2
Data Mining: Foundation, Techniques and Applications
Data Mining: Foundation, Techniques and Applications Lesson 1b :A Quick Overview of Data Mining Li Cuiping( 李 翠 平 ) School of Information Renmin University of China Anthony Tung( 鄧 锦 浩 ) School of Computing
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining
Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 by Tan, Steinbach, Kumar 1 What is Cluster Analysis? Finding groups of objects such that the objects in a group will
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
from Larson Text By Susan Miertschin
Decision Tree Data Mining Example from Larson Text By Susan Miertschin 1 Problem The Maximum Miniatures Marketing Department wants to do a targeted mailing gpromoting the Mythic World line of figurines.
Knowledge Discovery in Databases (KDD) - Data Mining (DM)
Knowledge Discovery in Databases (KDD) - Data Mining (DM) Vera Goebel Department of Informatics, University of Oslo 2015 1! Why Data Mining / KDD? Applications Banking: loan/credit card approval predict
ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies
ISSN: 2321-7782 (Online) Volume 3, Issue 4, April 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
Introduction to Data Mining
Bioinformatics Ying Liu, Ph.D. Laboratory for Bioinformatics University of Texas at Dallas Spring 2008 Introduction to Data Mining 1 Motivation: Why data mining? What is data mining? Data Mining: On what
Machine Learning CS 6830. Lecture 01. Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected]
Machine Learning CS 6830 Razvan C. Bunescu School of Electrical Engineering and Computer Science [email protected] What is Learning? Merriam-Webster: learn = to acquire knowledge, understanding, or skill
Data Mining Applications in Fund Raising
Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,
Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
Chapter 26: Data Mining
Chapter 6: Data Mining (Some slides courtesy of Rich Caruana, Cornell University) Definition Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, potentially
How To Cluster
Data Clustering Dec 2nd, 2013 Kyrylo Bessonov Talk outline Introduction to clustering Types of clustering Supervised Unsupervised Similarity measures Main clustering algorithms k-means Hierarchical Main
Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
Hexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
Data Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
Supervised and unsupervised learning - 1
Chapter 3 Supervised and unsupervised learning - 1 3.1 Introduction The science of learning plays a key role in the field of statistics, data mining, artificial intelligence, intersecting with areas in
Neural Networks and Back Propagation Algorithm
Neural Networks and Back Propagation Algorithm Mirza Cilimkovic Institute of Technology Blanchardstown Blanchardstown Road North Dublin 15 Ireland [email protected] Abstract Neural Networks (NN) are important
Lecture 10: Regression Trees
Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry
Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
Lecture 6 - Data Mining Processes
Lecture 6 - Data Mining Processes Dr. Songsri Tangsripairoj Dr.Benjarath Pupacdi Faculty of ICT, Mahidol University 1 Cross-Industry Standard Process for Data Mining (CRISP-DM) Example Application: Telephone
Principles of Dat Da a t Mining Pham Tho Hoan [email protected] [email protected]. n
Principles of Data Mining Pham Tho Hoan [email protected] References [1] David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, MIT press, 2002 [2] Jiawei Han and Micheline Kamber,
Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction
Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
MBA - INFORMATION TECHNOLOGY MANAGEMENT (MBAITM) Term-End Examination December, 2014 MBMI-012 : BUSINESS INTELLIGENCE SECTION I
No. of Printed Pages : 8 I MBMI-012 I MBA - INFORMATION TECHNOLOGY MANAGEMENT (MBAITM) Term-End Examination December, 2014 Time : 3 hours Note : (i) (ii) (iii) (iv) (v) MBMI-012 : BUSINESS INTELLIGENCE
Cluster Analysis. Alison Merikangas Data Analysis Seminar 18 November 2009
Cluster Analysis Alison Merikangas Data Analysis Seminar 18 November 2009 Overview What is cluster analysis? Types of cluster Distance functions Clustering methods Agglomerative K-means Density-based Interpretation
Data Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
Data Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
Introduction to data mining. Example of remote sensing image analysis
Ocean's Big Data Mining, 2014 (Data mining in large sets of complex oceanic data: new challenges and solutions) 8-9 Sep 2014 Brest (France) Monday, September 8, 2014, 4:00 pm - 5:30 pm Introduction to
Practical Introduction to Machine Learning and Optimization. Alessio Signorini <[email protected]>
Practical Introduction to Machine Learning and Optimization Alessio Signorini Everyday's Optimizations Although you may not know, everybody uses daily some sort of optimization
NEURAL NETWORKS IN DATA MINING
NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
Nine Common Types of Data Mining Techniques Used in Predictive Analytics
1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
Chapter 4 Data Mining A Short Introduction. 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1
Chapter 4 Data Mining A Short Introduction 2006/7, Karl Aberer, EPFL-IC, Laboratoire de systèmes d'informations répartis Data Mining - 1 1 Today's Question 1. Data Mining Overview 2. Association Rule Mining
Statistics and Data Mining
Statistics and Data Mining A B M Shawkat Ali PowerPoint permissions Cengage Learning Australia hereby permits the usage and posting of our copyright controlled PowerPoint slide content for all courses
Machine Learning and Data Mining. Regression Problem. (adapted from) Prof. Alexander Ihler
Machine Learning and Data Mining Regression Problem (adapted from) Prof. Alexander Ihler Overview Regression Problem Definition and define parameters ϴ. Prediction using ϴ as parameters Measure the error
A Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
Index Contents Page No. Introduction . Data Mining & Knowledge Discovery
Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.
Neural Networks Lesson 5 - Cluster Analysis
Neural Networks Lesson 5 - Cluster Analysis Prof. Michele Scarpiniti INFOCOM Dpt. - Sapienza University of Rome http://ispac.ing.uniroma1.it/scarpiniti/index.htm [email protected] Rome, 29
ADVANCED MACHINE LEARNING. Introduction
1 1 Introduction Lecturer: Prof. Aude Billard ([email protected]) Teaching Assistants: Guillaume de Chambrier, Nadia Figueroa, Denys Lamotte, Nicola Sommer 2 2 Course Format Alternate between: Lectures
Data Mining for Business Analytics
Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical
Introduction to Data Mining
Introduction to Data Mining a.j.m.m. (ton) weijters (slides are partially based on an introduction of Gregory Piatetsky-Shapiro) Overview Why data mining (data cascade) Application examples Data Mining
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for
