Statistics and Data Mining
- Eleanore Smith
1 Statistics and Data Mining
A B M Shawkat Ali
PowerPoint permissions: Cengage Learning Australia hereby permits the usage and posting of our copyright controlled PowerPoint slide content for all courses wherein the associated text has been adopted. PowerPoint slides may be placed on course management systems that operate under a controlled environment (access restricted to enrolled students, instructors and content administrators). Cengage Learning Australia does not require a copyright clearance form for the usage of PowerPoint slides as outlined above. Copyright 2007 Cengage Learning Australia Pty Limited
2 Objectives
On completion of this lecture you should know:
- What Data Mining is and how it relates to Statistics
- The basic ramifications of Data Mining
- KDD, Data Query and Data Mining
- Basic understanding of the PDCA cycle
- Current applications of Data Mining
3 Data mining: A definition
Ask yourself: What is gold mining?
4 Data mining (DM)
The process of employing one or more computer learning techniques to automatically analyze and extract knowledge from data (Roiger and Geatz, 2003).
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data using machine learning, statistical and visualization techniques (Frawley et al., 1992).
Many experts agree that data mining should not be fully automatic: human intervention and interpretation are essential.
5 Knowledge discovery in databases (KDD)
Data Mining (DM) is one step of the KDD process. DM is an information extraction process and KDD is making sense of the information. In current usage, however, the two terms are no longer distinguished. The application of the scientific method occurs in DM.
6 Steps of Data Mining
7 An example
Example 1.1 A leading Australian supermarket chain employs a data mining expert to analyse local buying patterns.
Analysis: When a customer buys honey on Friday or Sunday, they also usually buy bread.
8 An example (cont.)
Observation: More people buy honey and bread together on Friday and Sunday.
Business Benefit: The supermarket chain can use this information in various ways to increase revenue. For instance, they can move the bread shelf closer to the honey shelf and make sure that bread and honey are sold at full price on those days.
9 Example: Amazon.com purchase suggestion Amazon.com increased sales by 15%, using data/text mining generated purchase suggestions
10 Plan-Do-Check-Act (PDCA) cycle
Figure 1.1 Plan-Do-Check-Act (PDCA) cycle of Scientific method
11 How Can We Do Data Mining?
12 Data mining lifecycle
Figure 1.2 KDD or data mining lifecycle in the framework of PDCA cycle.
Figure labels: Plan, Do, Check, Act; Problem identification; Collation of data; Data preprocessing; Choosing an algorithm; Model construction and Evaluation; Iteration; Interpretation of the Discovered knowledge; Taking Action.
13 Data mining and its branches
Statistics: The model is king (Hand)
Data Mining: The data is king
14 Statistics vs. Data Mining: Concepts
Feature | Statistics | Data Mining
Type of problem | Well structured | Unstructured / Semi-structured
Inference role | Explicit inference plays a great role in any analysis | No explicit inference
Objective of the analysis and data collection | First objective formulation, and then data collection | Data rarely collected for the objective of the analysis/modeling
Size of data set | Data set is small and hopefully homogeneous | Data set is large and heterogeneous
Paradigm/Approach | Theory-based (deductive) | Synergy of theory-based and heuristic-based approaches (inductive)
Signal-to-noise ratio | STNR > 3 | 0 < STNR <= 3
Type of analysis | Confirmative | Explorative
Number of variables | Small | Large
15 Statistics vs. Data Mining: Regression Modeling
Feature | Statistics | Data Mining
Number of inputs | Small | Large
Type of inputs | Interval scaled and categorical with small number of categories (percentage of categorical variables is small) | Any mixture of interval scaled, categorical, and text variables
Multicollinearity | Wide range of degree of multicollinearity with intolerance to multicollinearity | Severe multicollinearity is always there, tolerance to multicollinearity
Distributional assumptions, homoscedasticity, outliers, missing values | Intolerance to distributional assumption violation, homoscedasticity, outliers/leverage points, missing values | Tolerance to distributional assumption violation, outliers/leverage points, and missing values
Type of model | Linear / Non-linear / Parametric / Non-parametric in low dimensional X-space (intolerance to uncharacterizable non-linearities) | Non-linear and nonparametric in high dimensional X-space with tolerance to uncharacterizable non-linearities
16 Steps of DM: Problem identification
The problem should be meaningful. We also need to set the level of expectation for the solution, say 80% or 98% satisfaction. Without business understanding and requirements, useful data mining cannot be done.
17 Collation of data
The problem definition provides us with the scope of relevant data. A data mining technique may require millions and often billions of cases of data. However, typically, a data mining technique is applied to a few hundred or a few thousand instances of data.
18 Data preprocessing
Preprocessing depends on the source. If the data comes from a data warehouse, no preprocessing is usually required because the warehouse data has already been filtered and cleaned, and missing values have been taken care of. Transactional data needs to be organised and cleaned so that a data mining technique can be readily applied.
19 Data preprocessing (cont.)
The data has to be made consistent across sources. For example, in one database male and female may be represented as M and F, and in another database as 1 and 0. Such anomalies have to be removed and the representation has to be made uniform.
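The M/F versus 1/0 example can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the record layout, the field name `sex`, the function `harmonise`, and in particular the choice of which of 1 and 0 stands for male are all assumptions, since the slide does not specify them.

```python
# Hypothetical code tables. The slide only says one source uses M/F and the
# other uses 1/0; mapping 1 to "male" is an assumption made for illustration.
SOURCE_A_CODES = {"M": "male", "F": "female"}
SOURCE_B_CODES = {1: "male", 0: "female"}


def harmonise(records, code_map, field="sex"):
    """Return copies of the records with `field` rewritten to shared labels."""
    cleaned = []
    for record in records:
        value = record[field]
        if value not in code_map:
            # Fail loudly rather than silently passing through an unknown code.
            raise ValueError(f"unexpected code {value!r} in field {field!r}")
        cleaned.append({**record, field: code_map[value]})
    return cleaned


# Made-up records from the two sources.
source_a = [{"id": 1, "sex": "M"}, {"id": 2, "sex": "F"}]
source_b = [{"id": 3, "sex": 0}, {"id": 4, "sex": 1}]

combined = harmonise(source_a, SOURCE_A_CODES) + harmonise(source_b, SOURCE_B_CODES)
print([r["sex"] for r in combined])
```

Raising an error on an unknown code is one possible design choice; a real pipeline might instead log the record and treat the value as missing.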
20 Algorithm selection
Nowadays a good number of data mining algorithms are available for public use. In general, parametric algorithms are relatively more suited for the data mining task. Using them involves choosing the optimal parameters to receive the best solution.
21 Data processing
This may involve data normalisation, data transformation or data integration. Some algorithms cannot work with categorical data, some cannot work with numerical data, and some others cannot work with either unless the values meet certain criteria.
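As a rough sketch of what normalisation and transformation can look like, the Python below rescales a numeric column to the range 0 to 1 and turns a categorical column into 0/1 indicator columns. Min-max scaling and one-hot encoding are common choices picked here for illustration; the slide does not name specific techniques, and the function names and sample values are hypothetical.

```python
def min_max_normalise(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def one_hot(values):
    """Turn category labels into 0/1 indicator rows, one column per distinct label."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


# Made-up sample columns.
incomes = [40_000, 50_000, 60_000]
outlook = ["sunny", "rainy", "sunny"]

print(min_max_normalise(incomes))
print(one_hot(outlook))
```

The first function suits algorithms that expect numeric inputs on a comparable scale; the second is one way to feed categorical data to an algorithm that only accepts numbers.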
22 Another important part of this task is data splitting, which is about deciding which part of data is to be used for model building (training data) and which part for model testing (test data). This step is identified as data preparation in CRISP-DM.
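Data splitting can be sketched as a shuffle followed by a cut. The 30% test fraction, the fixed seed and the function name `split_train_test` below are assumptions for illustration; the slides do not recommend a particular ratio.

```python
import random


def split_train_test(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of the rows and cut it into (training, test) parts."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Ten made-up instances, identified by number.
rows = list(range(10))
train, test = split_train_test(rows)
print(len(train), len(test))
```

Shuffling before cutting avoids a test set made up only of the last records collected, which matters when the data is stored in some meaningful order.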
23 Model construction and evaluation
Model evaluation or testing is an important step for maximising the amount of information that can be extracted from the dataset. If the model's performance is unacceptable, we follow the iterative path of choosing a different data mining algorithm or a different set of features from the dataset.
24 Discovering knowledge
Final stage of DM. Verify the quality of knowledge. If satisfied, go ahead for implementation.
25 Taking action
We may act based on the discovered knowledge, which could bring rewards. Taking action can simply mean applying the model to new instances. This step is identified as deployment in CRISP-DM.
26 Types of knowledge
Shallow knowledge: It is simply what makes up a computer's response. For example, we may learn that the Australian Stock Exchange generally follows the lead of Wall Street, but we wouldn't necessarily know why.
Deep knowledge: It is the underlying reason behind such relationships. For example, which gene is responsible for diabetes.
27 Steps of data mining for business
Cross-Industry Standard Process for Data Mining (CRISP-DM):
- Business Understanding
- Data Understanding
- Data Preparation
- Modelling
- Evaluation
- Deployment
28 Steps of data mining for business (cont.)
We identified 8 steps considering all possible applications of data mining including the business sector. These 8 steps have been described within the framework of the PDCA (Plan-Do-Check-Act) cycle, highlighting the highly iterative aspect of the process.
29 Data query versus data mining
Data Query
- A list of customers who used MasterCard to buy medicine from a pharmacy.
- A list of employees who will reach retiring age next year.
- A list of residents in a locality who became diabetic before reaching the age of 50.
- Find all customers who have purchased diapers.
30 Data Mining
- Develop a profile of MasterCard holders who will take advantage of the forthcoming sale promotion of the pharmacy.
- Develop a list of employees who are likely to avail themselves of the voluntary early retirement scheme when they reach the retiring age.
- Construct some rules about the lifestyle of residents of a locality which may reduce the occurrence of diabetes at an early age.
- Find all items which are frequently purchased with diapers.
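The diapers pair of examples can be contrasted in a short Python sketch. The baskets below are made up for illustration, and simple co-occurrence counting is only one basic way to approach "frequently purchased with"; it is not presented in the slides as the method to use.

```python
from collections import Counter

# Made-up purchase baskets, one set of items per transaction.
baskets = [
    {"diapers", "wipes", "milk"},
    {"diapers", "wipes"},
    {"bread", "milk"},
    {"diapers", "milk", "wipes"},
]

# Data query: retrieve the records that match a stated condition.
with_diapers = [basket for basket in baskets if "diapers" in basket]

# Data mining flavour: look for a pattern that nobody spelled out in advance,
# here by counting which other items turn up in the diaper baskets.
co_counts = Counter(
    item for basket in with_diapers for item in basket if item != "diapers"
)

print(len(with_diapers))
print(co_counts.most_common())
```

The query returns exactly what was asked for, while the counting step surfaces a relationship (which items accompany diapers, and how often) that still needs human interpretation.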
31 The learning process
What is Learning? It's a process to gather knowledge.
Four Levels of Learning:
- Facts: simple truths
- Concepts: relationships
- Procedures: algorithms
- Principles: all pervading truths
32 Types of learning
Supervised Learning: Learning with the help of a supervisor
Example 1.2 In a biomedical study, medical records for a set of healthy patients and a set of patients with heart disease have been collected.
33 Types of learning (cont.)
The data mining task in this study would be to learn what combination of attributes (obesity, high cholesterol, smoking habit, etc.) characterises patients with heart disease and distinguishes them from healthy patients.
34 Types of learning (cont.)
Table 1.1 Supervised learning data structure
 | Obesity | High-Cholesterol | Smoker | Class
Patient 1 | Yes | Yes | Yes | Sick
Patient m | No | No | No | Healthy
35 Types of learning (cont.)
Unsupervised Learning: Learning without a supervisor
Example 1.3 A credit card company wants to promote credit card insurance.
36 Types of learning (cont.)
Table 1.2 Unsupervised learning data structure
 | Home insurance | Life insurance | Income range
Person 1 | Yes | Yes | 50-60K
Person m | Yes | No | 40-50K
37 Reinforcement Learning: Learning from incidents
Example 1.4 Some players have trouble arriving on time to the practice match. To lift the team spirit, the coach orders all the players to run 5 extra laps in the stadium. The coach claims that this measure had to be applied only once a year.
38 The history of data mining
: First Generation of Data Mining. It was based on Statistics.
: Second Generation of Data Mining. First introduction of Artificial Intelligence (AI) in Data Mining.
1960s: Data Mining starts the real journey. The late 1960s saw the introduction of clustering techniques (Unsupervised Learning) in the field of Information Retrieval.
onwards: Third Generation of Data Mining. People introduced better techniques by combining Statistics and AI.
39 Data mining strategies
Classification
Example 1.5 A bank wishes to determine the credit risk of a credit card applicant. The application is either approved or rejected.
40 Classification (cont.)
[Figure: classification shown in a two-feature space, with axes F1 and F2]
41 Association
Example 1.6 A leading supermarket chain had 100,000 point-of-sale transactions last month. An association rule miner observes that 25,000 of these transactions include both banana and bread, and 8,000 transactions include all three items: banana, bread and honey.
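The counts in Example 1.6 are enough to work out the two measures usually attached to an association rule. The Python below is a sketch under one assumption: that the rule of interest is "banana and bread imply honey". The slide gives only the counts, so the rule direction and the variable names are illustrative.

```python
# Counts taken from Example 1.6.
total_transactions = 100_000
banana_and_bread = 25_000
banana_bread_and_honey = 8_000

# Support: the share of all transactions that contain the itemset.
support_pair = banana_and_bread / total_transactions
support_triple = banana_bread_and_honey / total_transactions

# Confidence of the assumed rule {banana, bread} -> {honey}: among transactions
# that contain banana and bread, the share that also contain honey.
confidence = banana_bread_and_honey / banana_and_bread

print(f"support(banana, bread)        = {support_pair:.0%}")
print(f"support(banana, bread, honey) = {support_triple:.0%}")
print(f"confidence(-> honey)          = {confidence:.0%}")
```

Under that reading, banana and bread appear together in 25% of transactions, all three items in 8%, and 32% of banana-and-bread baskets also contain honey.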
42 Cont.
43 Clustering
Example 1.7 Clustering could be used by an insurance company to group important customers according to age, types of policies purchased, duration of membership, and prior claims history.
44 Cont.
45 Estimation
Example 1.8 We are interested in estimating the blood sugar level of a new hospital patient.
46 Cont.
47 Novelty Detection
Example 1.8 The heartbeat record of a healthy patient to an untrained eye is either plain noise or full of features or spikes.
48 Cont.
49 Sequence Detection
Example 1.9 Thrombosis is a potential complication of collagen diseases.
50 Cont.
51 Popular data mining techniques
- Function Estimation-Based Algorithms: Neural Networks, Support Vector Machines etc.
- Lazy Learning-Based Algorithms: K-Nearest Neighbors, Lazy Bayesian Rules etc.
- Meta Learning-Based Algorithms: AdaBoost, Bagging, MetaCost etc.
- Probability-Based Algorithms: Naive Bayes, BayesNet etc.
- Tree-Based Algorithms: C4.5, Classification and Regression Tree (CART), CHAID etc.
52 Neural Network
53 Support Vector Machine (SVM)
54 Decision Tree
Outlook
  sunny -> humidity
    high -> No
    normal -> Yes
  overcast -> Yes
  rainy -> windy
    false -> Yes
    true -> No
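A decision tree of this kind is a chain of nested tests, which the Python sketch below spells out. It assumes one reading of the slide's diagram: sunny branches on humidity (high gives No, normal gives Yes), overcast gives Yes, and rainy branches on windy (false gives Yes, true gives No). The function name `decide` is hypothetical, and the slide does not say what the Yes/No answer refers to.

```python
def decide(outlook, humidity=None, windy=None):
    """Walk the tree from the root (outlook) down to a Yes/No leaf."""
    if outlook == "overcast":
        return "Yes"
    if outlook == "sunny":
        # Under "sunny" only humidity is tested.
        if humidity == "high":
            return "No"
        if humidity == "normal":
            return "Yes"
        raise ValueError(f"unknown humidity: {humidity!r}")
    if outlook == "rainy":
        # Under "rainy" only the windy flag is tested.
        if windy is None:
            raise ValueError("windy must be True or False when outlook is rainy")
        return "No" if windy else "Yes"
    raise ValueError(f"unknown outlook: {outlook!r}")


print(decide("sunny", humidity="high"))
print(decide("overcast"))
print(decide("rainy", windy=False))
```

Each path from the root to a leaf reads as one rule, for example "if the outlook is sunny and humidity is high, then No", which is why tree models are often considered easy to interpret.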
55 Data mining applications
- Common with insurance agencies and banks. For example, Bank of America.
- Common in the gambling industry. For example, Harrah's Entertainment Inc.
- Common with large businesses. For example, Wal-Mart.
56 Data mining applications (cont.)
- Banking (loan/credit card approval): Predict good customers based on old customer profiles.
- Customer relationship management (CRM): Identify those who are likely to leave for a competitor.
- Targeted marketing: Identify likely respondents to promotions.
57 Data mining applications (cont.)
- Fraud detection (telecommunications, financial transactions): Identify fraudulent transactions from an online stream of events.
- Manufacturing and production: Automatically adjust knobs when process parameters change.
- Medicine (disease outcome, effectiveness of treatments): Analyse patient disease history; find relationships between diseases and symptoms.
58 Data mining applications (cont.)
- Molecular/Pharmaceutical: Identify new drugs.
- Scientific data analysis: Identify new galaxies by searching for sub clusters.
- Website/store design and promotion: Find preferences of website/store visitors and modify the layout accordingly.
59 Challenges of data mining
- Size of dataset
- High dimensionality
- Over-fitting
- Missing and noisy data
- Rapidly changing data
- Mixed dataset
- Human intervention and interpretation
60 Future of data mining
- Credit risk assessment
- Customer relationship management
- Attrition of small business customers
- Early weather warning
- Stock price forecast
- Quick machinery fault detection
- Brain tumor prediction
61 These and other such issues are already seeing the introduction of data mining technology in their solution strategies. The long-term prospects are truly exciting. Data mining technology has already opened a new dimension in medical research. For example, a gene data analyst can tell us who has breast cancer and who does not.
62 Privacy in Data Mining
Mining of public and government databases is done, though people have raised, and continue to raise, concerns.
Wiki quote: "data mining gives information that would not be available otherwise. It must be properly interpreted to be useful. When the data collected involves individual people, there are many questions concerning privacy, legality, and ethics."
63 Prevalence of Data Mining
Your data is already being mined, whether you like it or not. Many web services require that you allow access to your information [for data mining] in order to use the service.
Google mines data in Gmail accounts to present account owners with ads.
Facebook requires users to allow access to info from non-Facebook pages. Facebook privacy policy: "We may use information about you that we collect from other sources, including but not limited to newspapers and Internet sources such as blogs, instant messaging services and other users of Facebook, to supplement your profile." This allows access to your blog RSS feed (rather innocuous), as well as information obtained through partner sites (worthy of concern).
64 Key learning outcomes
- What is Data Mining?
- The basic ramifications of Data Mining
- KDD, Data Query and Data Mining
- Basic understanding of the PDCA cycle
- Current applications of Data Mining
65
Data Mining in Pharmaceutical Marketing and Sales Analysis. Andrew Chabak Rembrandt Group
Data Mining in Pharmaceutical Marketing and Sales Analysis Andrew Chabak Rembrandt Group 1 Contents What is Data Mining? Data Mining vs. Statistics: what is the difference? Why Data Mining is important
More informationIntroduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationData Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction
Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationData Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
More informationIntroduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
More informationFoundations of Artificial Intelligence. Introduction to Data Mining
Foundations of Artificial Intelligence Introduction to Data Mining Objectives Data Mining Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees Present
More informationIntroduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationChapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
More informationData Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
More informationData Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms
Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge
More informationIntroduction of Information Visualization and Visual Analytics. Chapter 4. Data Mining
Introduction of Information Visualization and Visual Analytics Chapter 4 Data Mining Books! P. N. Tan, M. Steinbach, V. Kumar: Introduction to Data Mining. First Edition, ISBN-13: 978-0321321367, 2005.
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationIndex Contents Page No. Introduction . Data Mining & Knowledge Discovery
Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.
More informationMachine Learning and Data Mining. Fundamentals, robotics, recognition
Machine Learning and Data Mining Fundamentals, robotics, recognition Machine Learning, Data Mining, Knowledge Discovery in Data Bases Their mutual relations Data Mining, Knowledge Discovery in Databases,
More informationExample application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationKnowledge-based systems and the need for learning
Knowledge-based systems and the need for learning The implementation of a knowledge-based system can be quite difficult. Furthermore, the process of reasoning with that knowledge can be quite slow. This
More informationData Mining On Diabetics
Data Mining On Diabetics Janani Sankari.M 1,Saravana priya.m 2 Assistant Professor 1,2 Department of Information Technology 1,Computer Engineering 2 Jeppiaar Engineering College,Chennai 1, D.Y.Patil College
More informationLecture 6 - Data Mining Processes
Lecture 6 - Data Mining Processes Dr. Songsri Tangsripairoj Dr.Benjarath Pupacdi Faculty of ICT, Mahidol University 1 Cross-Industry Standard Process for Data Mining (CRISP-DM) Example Application: Telephone
More information8. Machine Learning Applied Artificial Intelligence
8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
More informationIn this presentation, you will be introduced to data mining and the relationship with meaningful use.
In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine
More informationData Mining for Fun and Profit
Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools
More informationIntroduction to Artificial Intelligence G51IAI. An Introduction to Data Mining
Introduction to Artificial Intelligence G51IAI An Introduction to Data Mining Learning Objectives Introduce a range of data mining techniques used in AI systems including : Neural networks Decision trees
More informationPredicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
More informationData Mining: Introduction
Data Mining: Introduction Introducing the course How the course is organized How students are evaluated Deadlines Data Mining [Chapt. 1 of course book] What is it about? The KDD process Relations to other
More informationIntroduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
More informationData Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
More informationData Mining: An Introduction
Data Mining: An Introduction Michael J. A. Berry and Gordon A. Linoff. Data Mining Techniques for Marketing, Sales and Customer Support, 2nd Edition, 2004 Data mining What promotions should be targeted
More informationDiscovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III
www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed
More informationData Mining Techniques
15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses
More informationPredictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
More informationBIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376
Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.
More informationData Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationMA2823: Foundations of Machine Learning
MA2823: Foundations of Machine Learning École Centrale Paris Fall 2015 Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe agathe.azencott@mines paristech.fr TAs: Jiaqian Yu jiaqian.yu@centralesupelec.fr
More informationMS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
More informationUsing reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management
Using reporting and data mining techniques to improve knowledge of subscribers; applications to customer profiling and fraud management Paper Jean-Louis Amat Abstract One of the main issues of operators
More informationData Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationA STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
More informationMachine Learning, Data Mining, and Knowledge Discovery: An Introduction
Machine Learning, Data Mining, and Knowledge Discovery: An Introduction AHPCRC Workshop - 8/17/10 - Dr. Martin Based on slides by Gregory Piatetsky-Shapiro from Kdnuggets http://www.kdnuggets.com/data_mining_course/
More informationECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam
ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open
More informationSupervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
More information2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: simmibagga12@gmail.com) Dr. G.N. Singh Department of Physics and
More informationMachine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer
Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next
More informationImportance or the Role of Data Warehousing and Data Mining in Business Applications
Journal of The International Association of Advanced Technology and Science Importance or the Role of Data Warehousing and Data Mining in Business Applications ATUL ARORA ANKIT MALIK Abstract Information
More informationData Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
More informationBOR 6335 Data Mining. Course Description. Course Bibliography and Required Readings. Prerequisites
BOR 6335 Data Mining Course Description This course provides an overview of data mining and fundamentals of using RapidMiner and OpenOffice open access software packages to develop data mining models.
More informationData Warehousing and Data Mining for improvement of Customs Administration in India. Lessons learnt overseas for implementation in India
Data Warehousing and Data Mining for improvement of Customs Administration in India Lessons learnt overseas for implementation in India Participants Shailesh Kumar (Group Leader) Sameer Chitkara (Asst.
Analytics on Big Data
Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis
INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR. ankitanandurkar2394@gmail.com
IJFEAT INTERNATIONAL JOURNAL FOR ENGINEERING APPLICATIONS AND TECHNOLOGY DATA MINING IN HEALTHCARE SECTOR Bharti S. Takey 1, Ankita N. Nandurkar 2,Ashwini A. Khobragade 3,Pooja G. Jaiswal 4,Swapnil R.
An Introduction to Data Mining
An Introduction to Data Mining Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Data Mining for Business Analytics
Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical
Machine Learning Capacity and Performance Analysis and R
Machine Learning and R May 3, 11
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Data Mining. SPSS Clementine 12.0. 1. Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine
Data Mining SPSS Clementine 12.0 1. Clementine Overview Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Types of Models Interface Projects References Outline Introduction Introduction Three of the common data mining
Lluis Belanche + Alfredo Vellido. Intelligent Data Analysis and Data Mining
Lluis Belanche + Alfredo Vellido Intelligent Data Analysis and Data Mining a.k.a. Data Mining II Office 319, Omega, BCN EET, office 107, TR 2, Terrassa avellido@lsi.upc.edu skype, gtalk: avellido Tels.:
Banking Analytics Training Program
Banking Analytics Training (BAT) is a set of courses and workshops developed by Cognitro Analytics team designed to assist banks in making smarter lending, marketing and credit decisions. Analyze Data, Discover Information,
CS 2750 Machine Learning. Lecture 1. Machine Learning. http://www.cs.pitt.edu/~milos/courses/cs2750/ CS 2750 Machine Learning.
Lecture 1 Machine Learning Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square, x5 http://www.cs.pitt.edu/~milos/courses/cs2750/ Administration Instructor: Milos Hauskrecht milos@cs.pitt.edu 539 Sennott
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Perspectives on Data Mining
Perspectives on Data Mining Niall Adams Department of Mathematics, Imperial College London n.adams@imperial.ac.uk April 2009 Objectives Give an introductory overview of data mining (DM) (or Knowledge Discovery
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
Class 10. Data Mining and Artificial Intelligence. Data Mining. We are in the 21st century So where are the robots?
Class 10 Data Mining Data Mining and Artificial Intelligence We are in the 21st century So where are the robots? Data mining is the one really successful application of artificial intelligence technology.
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP's numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
TIETS34 Seminar: Data Mining on Biometric identification
TIETS34 Seminar: Data Mining on Biometric identification Youming Zhang Computer Science, School of Information Sciences, 33014 University of Tampere, Finland Youming.Zhang@uta.fi Course Description Content
Learning Example. Machine learning and our focus. Another Example. An example: data (loan application) The data and the goal
Learning Example Chapter 18: Learning from Examples 22c:145 An emergency room in a hospital measures 17 variables (e.g., blood pressure, age, etc) of newly admitted patients. A decision is needed: whether
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
What is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO
What is Data Mining? Data Mining (Knowledge discovery in database) Data Mining: "The non trivial extraction of implicit, previously unknown, and potentially useful information from data" William J Frawley,
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for
REVIEW ON PREDICTION SYSTEM FOR HEART DIAGNOSIS USING DATA MINING TECHNIQUES
International Journal of Latest Research in Engineering and Technology (IJLRET) ISSN: 2454-5031(Online) ǁ Volume 1 Issue 5ǁOctober 2015 ǁ PP 09-14 REVIEW ON PREDICTION SYSTEM FOR HEART DIAGNOSIS USING
Predictive Modeling Techniques in Insurance
Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics
Lecture Slides for INTRODUCTION TO. ETHEM ALPAYDIN The MIT Press, 2004. Lab Class and literature. Friday, 9.00 10.00, Harburger Schloßstr.
Lecture Slides for INTRODUCTION TO Machine Learning ETHEM ALPAYDIN The MIT Press, 2004 alpaydin@boun.edu.tr http://www.cmpe.boun.edu.tr/~ethem/i2ml Lab Class and literature Friday, 9.00 10.00, Harburger
DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM
INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 DATA MINING TECHNIQUES SUPPORT TO KNOWLEGDE OF BUSINESS INTELLIGENT SYSTEM M. Mayilvaganan 1, S. Aparna 2 1 Associate
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM's Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
Quick Introduction of Data Mining Techniques
Quick Introduction of Data Mining Techniques *Sources partially from Introduction to Data Mining, by P.-N. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 2005. Main Data Mining Techniques Link Analysis Associations
Chapter ML:XI. XI. Cluster Analysis
Chapter ML:XI XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained Cluster
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Inner Classification of Clusters for Online News
Inner Classification of Clusters for Online News Harmandeep Kaur 1, Sheenam Malhotra 2 1 (Computer Science and Engineering Department, Shri Guru Granth Sahib World University Fatehgarh Sahib) 2 (Assistant
HELSINKI UNIVERSITY OF TECHNOLOGY 26.1.2005 T-86.141 Enterprise Systems Integration, 2001. Data warehousing and Data mining: an Introduction
HELSINKI UNIVERSITY OF TECHNOLOGY 26.1.2005 T-86.141 Enterprise Systems Integration, 2001. Data warehousing and Data mining: an Introduction Federico Facca, Alessandro Gallo, federico@grafedi.it sciack@virgilio.it
Introduction to Data Mining
Introduction to Data Mining a.j.m.m. (ton) weijters (slides are partially based on an introduction of Gregory Piatetsky-Shapiro) Overview Why data mining (data cascade) Application examples Data Mining
How To Perform An Ensemble Analysis
Charu C. Aggarwal IBM T J Watson Research Center Yorktown, NY 10598 Outlier Ensembles Keynote, Outlier Detection and Description Workshop, 2013 Based on the ACM SIGKDD Explorations Position Paper: Outlier
Data Mining Applications in Manufacturing
Data Mining Applications in Manufacturing Dr Jenny Harding Senior Lecturer Wolfson School of Mechanical & Manufacturing Engineering, Loughborough University Identification of Knowledge - Context Intelligent
What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM
Customer Relationship Management Analytics What is Customer Relationship Management? CRM is a strategy which utilises a combination of Week 13: Summary information technology policies processes, employees to develop profitable
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
Machine Learning: Overview
Machine Learning: Overview Why Learning? Learning is a core property of being intelligent. Hence Machine learning is a core subarea of Artificial Intelligence. There is a need for programs to behave
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
Predictive Analytics Certificate Program
Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and
CPSC 340: Machine Learning and Data Mining. Mark Schmidt University of British Columbia Fall 2015
CPSC 340: Machine Learning and Data Mining Mark Schmidt University of British Columbia Fall 2015 Outline 1) Intro to Machine Learning and Data Mining: Big data phenomenon and types of data. Definitions
Sanjeev Kumar. contribute
RESEARCH ISSUES IN DATA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledge discovery is emerging as a
Concept and Applications of Data Mining. Week 1
Concept and Applications of Data Mining Week 1 Topics Introduction Syllabus Data Mining Concepts Team Organization Introduction Session Your name and major The definition of data mining Your
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges