Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining
Dr. Osmar R. Zaïane, Fall 2007

Data Mining in a nutshell
Extract interesting knowledge (rules, regularities, patterns, constraints) from data in large collections.
Data, Knowledge, Actionable knowledge

KDD at the Confluence of Many Disciplines
- Database Systems: DBMS, query processing, data warehousing, OLAP
- Artificial Intelligence: machine learning, neural networks, agents, knowledge representation
- Information Retrieval: indexing, inverted files
- High Performance Computing: parallel and distributed computing
- Statistics: statistical and mathematical modeling
- Visualization: computer graphics, human-computer interaction, 3D representation
- Other
Who Am I?
UNIVERSITY OF ALBERTA
Osmar R. Zaïane, Ph.D., Associate Professor
Department of Computing Science, Database Laboratory
221 Athabasca Hall, Edmonton, Alberta, Canada T6G 2E8
Telephone: Office +1 (780) Fax +1 (780)
PhD on Web Mining & Multimedia Mining with Dr. Jiawei Han at Simon Fraser University, Canada
Research Interests: Data Mining, Web Mining, Multimedia Mining, Data Visualization, Information Retrieval.
Applications: Analytic Tools, Adaptive Systems, Intelligent Systems, Diagnostic and Categorization, Recommender Systems
Achievements (in last 6 years): 2 PhD and 18 MSc, 90+ publications, ICDM PC co-chair 2007, ADMA co-chair 2007, WEBKDD and MDM/KDD co-chair (2000 to 2003)
Currently: 3 PhD + 1 MSc. Since 1999 (3 Ph.D., 1 M.Sc.)

Who Am I?
Research Activities: Pattern Discovery for Intelligent Systems
Currently: 4 graduate students
UNIVERSITY OF ALBERTA
Osmar R. Zaïane, Ph.D., Associate Professor
Department of Computing Science, Database Laboratory
352 Athabasca Hall, Edmonton, Alberta, Canada T6G 2E8
Telephone: Office +1 (780) Fax +1 (780) [email protected]
[Table: MSc and PhD students supervised over the past 6 years, with totals; the individual values are not recoverable from the transcription.]
Research Interests: Data Mining, Web Mining, Multimedia Mining, Data Visualization, Information Retrieval.
Where I Stand: Artificial Intelligence, HCI, Graphics, Database Management Systems
Applications: Analytic Tools, Adaptive Systems, Intelligent Systems, Diagnostic and Categorization, Recommender Systems
Achievements: 90+ publications, IEEE-ICDM PC-chair & ADMA chair, WEBKDD and MDM/KDD co-chair (2000 to 2003), WEBKDD 05 co-chair, Associate Editor for ACM-SIGKDD Explorations

Thanks To
Without my students my research work wouldn't have been possible.
Current (no particular order): Maria-Luiza Antonie, Jiyang Chen, Andrew Foss, Seyed-Vahid Jazayeri
Past (no particular order): Stanley Oliveira, Yang Wang, Lisheng Sun, Jia Li, Alex Strilets, William Cheung, Andrew Foss, Yue Zhang, Chi-Hoon Lee, Weinan Wang, Ayman Ammoura, Hang Cui, Jun Luo, Jiyang Chen, Yuan Ji, Yi Li, Yaling Pei, Yan Jin, Mohammad El-Hajj, Maria-Luiza Antonie
Class and Office Hours
Class: Tuesdays and Thursdays from 9:30 to 10:50
But I prefer: Class once a week from 9:00 to 11:50 (with one break)
Office Hours: Tuesdays from 13:00 to 14:00, or by mutually agreed upon appointment
Tel:
Office: ATH 3-52

Course Requirements
- Understand the basic concepts of database systems
- Understand the basic concepts of artificial intelligence and machine learning
- Be able to develop applications in C/C++ and/or Java

Course Objectives
To provide an introduction to knowledge discovery in databases and complex data repositories, and to present basic concepts relevant to real data mining applications, as well as reveal important research issues germane to the knowledge discovery domain and advanced mining applications.
Students will understand the fundamental concepts underlying knowledge discovery in databases and gain hands-on experience with implementation of some data mining algorithms applied to real world cases.

Evaluation and Grading
There is no final exam for this course, but there are assignments, presentations, a midterm and a project. I will evaluate all these activities out of 100% and give a final grade based on that evaluation. The midterm is either a take-home exam or an oral exam.
- Assignments: 20% (2 assignments)
- Midterm: 25%
- Project: 39%. Quality of presentation + quality of report and proposal + quality of demos. The preliminary project demo (week 11) and the final project demo (week 15, could be week 16) have the same weight.
- Class presentations: 16%. Quality of presentation + quality of slides + peer evaluation.
A+ will be given only for outstanding achievement.
Projects
Choice: Implement a data mining project.
Deliverables: Project proposal + project pre-demo + final demo + project report.
Examples and details of data mining projects will be posted on the course web site.

Assignments
1- Competition in one algorithm implementation (in C/C++)
2- Devising exercises with solutions

More About Projects
Students should write a project proposal (1 or 2 pages) covering: project topic; implementation choices; approach; references; schedule.
All projects are demonstrated to the whole class at the end of the semester, in December.
Preliminary project demos are private demos given to the instructor in the week of November 19.
Implementations: C/C++ or Java. OS: Linux, Windows XP/2000, or other systems.

More About Evaluation
Re-examination. None, except as per regulation.
Collaboration. Collaborate on assignments and projects, etc.; do not merely copy.
Plagiarism. Work submitted by a student that is the work of another student or any other person is considered plagiarism. Read Sections and of the calendar. Cases of plagiarism are immediately referred to the Dean of Science, who determines what course of action is appropriate.

About Plagiarism
Plagiarism, cheating, misrepresentation of facts and participation in such offences are viewed as serious academic offences by the University and by the Campus Law Review Committee (CLRC) of General Faculties Council. Sanctions for such offences range from a reprimand to suspension or expulsion from the University.
Notes and Textbook
Course home page:
We will also have a mailing list and newsgroup for the course.
No textbook, but recommended books:
- Data Mining: Concepts and Techniques, Jiawei Han and Micheline Kamber, Morgan Kaufmann Publisher, 2001

Other Books
- Principles of Data Mining, David Hand, Heikki Mannila, Padhraic Smyth, MIT Press, 2001, ISBN X, 546 pages
- Data Mining: Introductory and Advanced Topics, Margaret H. Dunham, Prentice Hall, 2003, ISBN , 315 pages
- Dealing with the data flood: Mining data, text and multimedia, edited by Jeroen Meij, SST Publications, 2002, ISBN , 896 pages
- Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Addison Wesley, ISBN: , 769 pages

Course Schedule (tentative, subject to changes)
There are 13 weeks from Sept 6th to December 4th.
Week 1: Sept 6 : Introduction to Data Mining
Week 2: Sept : Association Rules
Week 3: Sept : Association Rules (advanced topics)
Week 4: Sept : Sequential Pattern Analysis
Week 5: Oct 2-4 : Classification (Neural Networks)
Week 6: Oct 9-11 : Classification (Decision Trees and +)
Week 7: Oct : Data Clustering
Week 8: Oct : Outlier Detection
Week 9: Oct 30-Nov 1 : Data Clustering in subspaces
Week 10: Nov 6-8 : Contrast sets + Web Mining
Week 11: Nov : Web Mining + Class Presentations
Week 12: Nov : Class Presentations
Week 12: Nov : Class Presentations
Week 13: Dec 4 : Class Presentations
Week 15: Dec 11 : Project Demos

Due dates
- Midterm: week 8
- Assignment 1: week 6
- Assignment 2: variable dates

Course Content
- Introduction to Data Mining
- Association analysis
- Sequential Pattern Analysis
- Classification and prediction
- Contrast Sets
- Data Clustering
- Outlier Detection
- Web Mining
- Other topics if time permits (spatial data, biomedical data, etc.)
For those of you who watch what you eat...
Here's the final word on nutrition and health. It's a relief to know the truth after all those conflicting medical studies.
- The Japanese eat very little fat and suffer fewer heart attacks than the British or Americans.
- The Mexicans eat a lot of fat and suffer fewer heart attacks than the British or Americans.
- The Japanese drink very little red wine and suffer fewer heart attacks than the British or Americans.
- The Italians drink excessive amounts of red wine and suffer fewer heart attacks than the British or Americans.
- The Germans drink a lot of beer and eat lots of sausages and fats and suffer fewer heart attacks than the British or Americans.
CONCLUSION: Eat and drink what you like. Speaking English is apparently what kills you.

Topics: Association Rules, Clustering, Classification, Outlier Detection

What Is Association Mining?
Association rule mining searches for relationships between items in a dataset: finding association, correlation, or causal structures among sets of items or objects in transaction databases, relational databases, and other information repositories.
Rule form: Body → Head [support, confidence].
Examples:
buys(x, "bread") → buys(x, "milk") [0.6%, 65%]
major(x, "CS") ^ takes(x, "DB") → grade(x, "A") [1%, 75%]
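The two numbers attached to a rule can be computed directly from a list of transactions. The sketch below is only an illustration: the function names `support` and `confidence` and the toy baskets are assumptions made for this example, not part of the course material. It reads support as the fraction of transactions containing both body and head, and confidence as that fraction divided by the fraction containing the body.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, body, head):
    """Estimate P(head | body) as support(body and head) / support(body)."""
    body_support = support(transactions, body)
    if body_support == 0:
        return 0.0  # the rule never fires, so no confidence can be estimated
    return support(transactions, set(body) | set(head)) / body_support


# Hypothetical toy baskets, invented for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# Rule: buys(x, "bread") -> buys(x, "milk")
rule_support = support(baskets, {"bread", "milk"})          # 2 of 4 baskets
rule_confidence = confidence(baskets, {"bread"}, {"milk"})  # 2 of the 3 bread baskets
print(f"[{rule_support:.0%}, {rule_confidence:.0%}]")
```

On these four baskets the rule would be written as bread → milk with support 50% and confidence about 67%.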
Association Rule Mining: Basic Concepts
A transaction is a set of items: $T = \{i_a, i_b, \ldots, i_t\}$, with $T \subseteq I$, where $I = \{i_1, i_2, \ldots, i_n\}$ is the set of all possible items.
$D$, the task-relevant data, is a set of transactions.
An association rule is of the form $P \rightarrow Q$, where $P \subset I$, $Q \subset I$, and $P \cap Q = \emptyset$.
$P \rightarrow Q$ holds in $D$ with support $s$, and $P \rightarrow Q$ has a confidence $c$ in the transaction set $D$.
$\mathrm{Support}(P \rightarrow Q) = \mathrm{Probability}(P \cup Q)$
$\mathrm{Confidence}(P \rightarrow Q) = \mathrm{Probability}(Q \mid P)$

Association Rule Mining in two steps
1. Frequent Itemset Mining (FIM): bound by a support threshold.
2. Association Rules Generation: bound by a confidence threshold. Example: from the frequent itemset abc, rules such as ab → c and b → ac.
Frequent itemset generation is still computationally expensive.

Frequent Itemset Generation
Itemset lattice for the items A, B, C, D, E:
null
A B C D E
AB AC AD AE BC BD BE CD CE DE
ABC ABD ABE ACD ACE ADE BCD BCE BDE CDE
ABCD ABCE ABDE ACDE BCDE
ABCDE
Given $d$ items, there are $2^d$ possible candidate itemsets.

Frequent Itemset Generation: brute-force approach (basic approach)
- Each itemset in the lattice is a candidate frequent itemset.
- Count the support of each candidate by scanning the database.
- Match each transaction against every candidate.

Transactions ($N$ transactions, maximum width $w$), matched against a list of $M$ candidates:
TID  Items
1    Bread, Milk
2    Bread, Diaper, Beer, Eggs
3    Milk, Diaper, Beer, Coke
4    Bread, Milk, Diaper, Beer
5    Bread, Milk, Diaper, Coke

Complexity ~ $O(NMw)$ => expensive, since $M = 2^d$.
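The brute-force approach can be sketched in a few lines on the five transactions from the table above. This is a minimal illustration, not the course's reference implementation: the function name `brute_force_frequent_itemsets` and the threshold of 3 transactions are assumptions chosen for the example. Every non-empty itemset in the lattice is generated as a candidate and matched against every transaction, which is exactly what makes the method expensive.

```python
from itertools import combinations

# The five transactions from the slide's table.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]


def brute_force_frequent_itemsets(transactions, min_count):
    """Enumerate every non-empty itemset and keep those found in at least
    `min_count` transactions. Returns {itemset: count}."""
    items = sorted(set().union(*transactions))
    frequent = {}
    candidates_tested = 0
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            candidates_tested += 1
            cand = frozenset(candidate)
            # One pass over the database per candidate: the O(N * M * w) cost.
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_count:
                frequent[cand] = count
    return frequent, candidates_tested


# Assumed threshold for illustration: an itemset must occur in 3 of 5 transactions.
frequent, candidates_tested = brute_force_frequent_itemsets(transactions, min_count=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
print("candidates tested:", candidates_tested)
```

With 6 distinct items the loop tests 2^6 - 1 = 63 non-empty candidates to find 8 frequent itemsets; adding one more item doubles the number of candidates, which is why algorithms that prune the lattice are preferred in practice.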
Grouping, Clustering, Partitioning
- We need a notion of similarity or closeness (what features?)
- Should we know a priori how many clusters exist?
- How do we characterize members of groups?
- How do we label groups?
- What about objects that belong to different groups?

Classification, Categorization
Predefined buckets, i.e. known labels.

What is Classification?
The goal of data classification is to organize and categorize data in distinct classes.
- A model is first created based on the data distribution.
- The model is then used to classify new data.
- Given the model, a class can be predicted for new data.
With classification, I can predict in which bucket to put the ball, but I can't predict the weight of the ball.
Classification = Learning a Model
Training Set (labeled) → Classification Model → labeling (classification) of new unlabeled data

Framework
- Labeled data is split into training data and testing data.
- Training data: derive the classifier (model).
- Testing data: estimate accuracy.
- New unlabeled data is then labeled by the model.

Classification Methods
- Decision Tree Induction
- Neural Networks
- Bayesian Classification
- K-Nearest Neighbour
- Support Vector Machines
- Associative Classifiers
- Case-Based Reasoning
- Genetic Algorithms
- Rough Set Theory
- Fuzzy Sets
- Etc.

Outlier Detection
To find exceptional data in various datasets and uncover the implicit patterns of rare cases.
- Inherent variability: reflects the natural variation
- Measurement error (inaccuracy and mistakes)
Long been studied in statistics; an active area in data mining in the last decade.
Many applications:
- Detecting credit card fraud
- Discovering criminal activities in e-commerce
- Identifying network intrusion
- Monitoring video surveillance
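The classification framework (derive a model from training data, estimate its accuracy on testing data, then label new data) can be sketched as follows. This is an illustration under stated assumptions: it uses a 1-nearest-neighbour classifier, one of the methods listed above, and the data points, labels and function names are invented for the example.

```python
import math


def nearest_neighbour_classifier(training):
    """Derive a classifier: label a point with the class of its closest training point."""
    def classify(point):
        _, label = min(training, key=lambda pair: math.dist(pair[0], point))
        return label
    return classify


def accuracy(classify, testing):
    """Fraction of held-out labeled examples that the classifier labels correctly."""
    correct = sum(1 for features, label in testing if classify(features) == label)
    return correct / len(testing)


# Hypothetical labeled data: (features, class). Values are invented for illustration.
labeled = [
    ((1.0, 1.0), "small"), ((1.2, 0.8), "small"), ((0.9, 1.1), "small"),
    ((5.0, 5.0), "large"), ((5.2, 4.9), "large"), ((4.8, 5.1), "large"),
]

# Split the labeled data into training and testing sets.
training = labeled[0:2] + labeled[3:5]
testing = [labeled[2], labeled[5]]

model = nearest_neighbour_classifier(training)  # derive classifier (model)
estimated_accuracy = accuracy(model, testing)   # estimate accuracy on unseen labeled data
new_label = model((4.5, 4.7))                   # label new, unlabeled data
print(estimated_accuracy, new_label)
```

The accuracy is measured only on the testing examples, which the model never saw during training; measuring it on the training data would overstate how well the model labels new data.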
Outliers are Everywhere
Outliers are data values that appear inconsistent with the rest of the data.
Some types of outliers
We will see: statistical methods; distance-based methods; density-based methods; resolution-based methods, etc.

Contrasting Sequence Sets
Example domains: protein localization, collaborative e-learning, e-commerce sites, Brassica napus (canola).
[Figure: sequences over symbols A, B, C, ... assigned to two classes.]
- What contrasts genes or proteins?
- What makes a buyer buy and a non-buyer leave empty handed?
Finding emerging sequences and contrasting the sets of sequences would give insight about what behaviour leads to success and what doesn't.
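Returning to the outlier methods listed above, the slides name the distance-based family without giving a definition at this point. One common formulation, assumed here for illustration, flags a point as an outlier when fewer than a chosen number of other points lie within a chosen radius of it. The function name, parameters and data below are hypothetical.

```python
import math


def distance_based_outliers(points, radius, min_neighbours):
    """Return the points that have fewer than `min_neighbours` other points
    within distance `radius` (a simple quadratic-time sketch)."""
    outliers = []
    for i, p in enumerate(points):
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if i != j and math.dist(p, q) <= radius
        )
        if neighbours < min_neighbours:
            outliers.append(p)
    return outliers


# Hypothetical data: four points close together and one far away.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (8.0, 8.0)]
flagged = distance_based_outliers(points, radius=1.0, min_neighbours=2)
print(flagged)
```

The result depends entirely on the radius and neighbour count, which is one reason density-based and resolution-based methods are also studied.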
Mammography: Breast Cancer
Pipeline: automatic cropping and enhancement; automatic segmentation and feature extraction; training; classification model.
Transaction: (IID, class, F_1, F_2, F_3, ..., F_f)
Features F_α, F_β, F_γ, F_δ associated with a class.
Classes: normal, malignant, benign.

Recommender Systems
[Figure: a question goes through query expansion over ranked result lists; the system recommends brief answers and concise summaries.]

Text Summarization

WebViz

Clustering with Constraints
- How to model constraints? Polygons, lines?
- How to include the constraint verification in the clustering phase?
How to Apply This?
Crime events with descriptors d_1, d_2, ..., d_n
Identification of serial crimes
We need to define the notion of similarity.

DIVE-ON Project: Data mining in an Immersed Virtual Environment
Four components: 1 Construction, 2 Communication, 3 Visualization, 4 Interaction.
- Distributed databases and federated multidimensional data warehouses, accessed over a network with CORBA+XML
- Local 3-dimensional data cube
- Visualization room (CAVE) with an immersed user
- 3D rotating torus menu
- Hand signs to operate the CAVE with OLAP operations

VWV Project: Virtual Web View
Mediator, WebML, VWV 1, VWV 2, ..., VWV n, private ontology
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
Data Warehousing and Data Mining
Data Warehousing and Data Mining Winter Semester 2010/2011 Free University of Bozen, Bolzano DW Lecturer: Johann Gamper [email protected] DM Lecturer: Mouna Kacimi [email protected] http://www.inf.unibz.it/dis/teaching/dwdm/index.html
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Syllabus. HMI 7437: Data Warehousing and Data/Text Mining for Healthcare
Syllabus HMI 7437: Data Warehousing and Data/Text Mining for Healthcare 1. Instructor Illhoi Yoo, Ph.D Office: 404 Clark Hall Email: [email protected] Office hours: TBA Classroom: TBA Class hours: TBA
Data Mining: Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar
Data Mining: Association Analysis Partially from: Introduction to Data Mining by Tan, Steinbach, Kumar Association Rule Mining Given a set of transactions, find rules that will predict the occurrence of
Data Mining Apriori Algorithm
10 Data Mining Apriori Algorithm Apriori principle Frequent itemsets generation Association rules generation Section 6 of course book TNM033: Introduction to Data Mining 1 Association Rule Mining (ARM)
Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier
Data Mining: Concepts and Techniques Jiawei Han Micheline Kamber Simon Fräser University К MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF Elsevier Contents Foreword Preface xix vii Chapter I Introduction I I.
Objectives of Lecture 1. Labs and TAs. Class and Office Hours. CMPUT 391: Introduction. Introduction
Database Management Systems Winter 2003 CMPUT 391: Introduction Dr. Osmar R. Zaïane Objectives of Lecture 1 Introduction Get a rough initial idea about the content of the course: Lectures Resources Activities
Data Mining Introduction
Data Mining Introduction Organization Lectures Mondays and Thursdays from 10:30 to 12:30 Lecturer: Mouna Kacimi Office hours: appointment by email Labs Thursdays from 14:00 to 16:00 Teaching Assistant:
Knowledge Discovery from Data Bases Proposal for a MAP-I UC
Knowledge Discovery from Data Bases Proposal for a MAP-I UC P. Brazdil 1, João Gama 1, P. Azevedo 2 1 Universidade do Porto; 2 Universidade do Minho; 1 Knowledge Discovery from Data Bases We are deluged
Data Warehousing and Data Mining
Data Warehousing and Data Mining Winter Semester 2012/2013 Free University of Bozen, Bolzano DM Lecturer: Mouna Kacimi [email protected] http://www.inf.unibz.it/dis/teaching/dwdm/index.html Organization
Data Mining and Business Intelligence CIT-6-DMB. http://blackboard.lsbu.ac.uk. Faculty of Business 2011/2012. Level 6
Data Mining and Business Intelligence CIT-6-DMB http://blackboard.lsbu.ac.uk Faculty of Business 2011/2012 Level 6 Table of Contents 1. Module Details... 3 2. Short Description... 3 3. Aims of the Module...
Introduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
CAS CS 565, Data Mining
CAS CS 565, Data Mining Course logistics Course webpage: http://www.cs.bu.edu/~evimaria/cs565-10.html Schedule: Mon Wed, 4-5:30 Instructor: Evimaria Terzi, [email protected] Office hours: Mon 2:30-4pm,
Dynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
Objectives of Lecture 1. Class and Office Hours. Labs and TAs. CMPUT 391: Introduction. Introduction
Database Management Systems Winter 2004 CMPUT 391: Introduction Dr. Osmar R. Zaïane Objectives of Lecture 1 Introduction Get a rough initial idea about the content of the course: Lectures Resources Activities
Data Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
AMIS 7640 Data Mining for Business Intelligence
The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2013, Session
CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall 2015 http://digital.cs.usu.edu/~kyumin/cs5890/
CS 5890: Introduction to Data Science Syllabus, Utah State University, Fall 2015 http://digital.cs.usu.edu/~kyumin/cs5890/ 1. Credits: 3 a. Class Meets: Tuesday and Thursday 1:30pm - 2:45pm, Old Main (MAIN)
Introduction to Data Mining Techniques
Introduction to Data Mining Techniques Dr. Rajni Jain 1 Introduction The last decade has experienced a revolution in information availability and exchange via the internet. In the same spirit, more and
Association Analysis: Basic Concepts and Algorithms
6 Association Analysis: Basic Concepts and Algorithms Many business enterprises accumulate large quantities of data from their dayto-day operations. For example, huge amounts of customer purchase data
How To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
Static Data Mining Algorithm with Progressive Approach for Mining Knowledge
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive
Data Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
DATA MINING CONCEPTS AND TECHNIQUES. Marek Maurizio E-commerce, winter 2011
DATA MINING CONCEPTS AND TECHNIQUES Marek Maurizio E-commerce, winter 2011 INTRODUCTION Overview of data mining Emphasis is placed on basic data mining concepts Techniques for uncovering interesting data
International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET
DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand
Unique column combinations
Unique column combinations Arvid Heise Guest lecture in Data Profiling and Data Cleansing Prof. Dr. Felix Naumann Agenda 2 Introduction and problem statement Unique column combinations Exponential search
DATA MINING FOR BUSINESS INTELLIGENCE. Data Mining For Business Intelligence: MIS 382N.9/MKT 382 Professor Maytal Saar-Tsechansky
DATA MINING FOR BUSINESS INTELLIGENCE PROFESSOR MAYTAL SAAR-TSECHANSKY Data Mining For Business Intelligence: MIS 382N.9/MKT 382 Professor Maytal Saar-Tsechansky This course provides a comprehensive introduction
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Mining Association Rules. Mining Association Rules. What Is Association Rule Mining? What Is Association Rule Mining? What is Association rule mining
Mining Association Rules What is Association rule mining Mining Association Rules Apriori Algorithm Additional Measures of rule interestingness Advanced Techniques 1 2 What Is Association Rule Mining?
Introduction to Data Mining
Introduction to Data Mining José Hernández ndez-orallo Dpto.. de Systems Informáticos y Computación Universidad Politécnica de Valencia, Spain [email protected] Horsens, Denmark, 26th September 2005
College of Health and Human Services. Fall 2013. Syllabus
College of Health and Human Services Fall 2013 Syllabus information placement Instructor description objectives HAP 780 : Data Mining in Health Care Time: Mondays, 7.20pm 10pm (except for 3 rd lecture
Data Mining Association Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 6. Introduction to Data Mining
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/24
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
Data Mining and Soft Computing. Francisco Herrera
Francisco Herrera Research Group on Soft Computing and Information Intelligent Systems (SCI 2 S) Dept. of Computer Science and A.I. University of Granada, Spain Email: [email protected] http://sci2s.ugr.es
Principles of Dat Da a t Mining Pham Tho Hoan [email protected] [email protected]. n
Principles of Data Mining Pham Tho Hoan [email protected] References [1] David Hand, Heikki Mannila and Padhraic Smyth, Principles of Data Mining, MIT press, 2002 [2] Jiawei Han and Micheline Kamber,
Integrated Data Mining and Knowledge Discovery Techniques in ERP
Integrated Data Mining and Knowledge Discovery Techniques in ERP I Gandhimathi Amirthalingam, II Rabia Shaheen, III Mohammad Kousar, IV Syeda Meraj Bilfaqih I,III,IV Dept. of Computer Science, King Khalid
A Web-based Interactive Data Visualization System for Outlier Subspace Analysis
A Web-based Interactive Data Visualization System for Outlier Subspace Analysis Dong Liu, Qigang Gao Computer Science Dalhousie University Halifax, NS, B3H 1W5 Canada [email protected] [email protected] Hai
2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: [email protected]) Dr. G.N. Singh Department of Physics and
Market Basket Analysis and Mining Association Rules
Market Basket Analysis and Mining Association Rules 1 Mining Association Rules Market Basket Analysis What is Association rule mining Apriori Algorithm Measures of rule interestingness 2 Market Basket
College information system research based on data mining
2009 International Conference on Machine Learning and Computing IPCSIT vol.3 (2011) (2011) IACSIT Press, Singapore College information system research based on data mining An-yi Lan 1, Jie Li 2 1 Hebei
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH
MAXIMAL FREQUENT ITEMSET GENERATION USING SEGMENTATION APPROACH M.Rajalakshmi 1, Dr.T.Purusothaman 2, Dr.R.Nedunchezhian 3 1 Assistant Professor (SG), Coimbatore Institute of Technology, India, [email protected]
CPSC 340: Machine Learning and Data Mining. Mark Schmidt University of British Columbia Fall 2015
CPSC 340: Machine Learning and Data Mining Mark Schmidt University of British Columbia Fall 2015 Outline 1) Intro to Machine Learning and Data Mining: Big data phenomenon and types of data. Definitions
How To Use Data Mining For Knowledge Management In Technology Enhanced Learning
Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
(b) How data mining is different from knowledge discovery in databases (KDD)? Explain.
Q2. (a) List and describe the five primitives for specifying a data mining task. Data Mining Task Primitives (b) How data mining is different from knowledge discovery in databases (KDD)? Explain. IETE
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis
, 23-25 October, 2013, San Francisco, USA Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis John David Elijah Sandig, Ruby Mae Somoba, Ma. Beth Concepcion and Bobby D. Gerardo,
A New Approach for Evaluation of Data Mining Techniques
181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty
Association Rule Mining
Association Rule Mining Association Rules and Frequent Patterns Frequent Pattern Mining Algorithms Apriori FP-growth Correlation Analysis Constraint-based Mining Using Frequent Patterns for Classification
Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction
Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining
1. What are the uses of statistics in data mining? Statistics is used to Estimate the complexity of a data mining problem. Suggest which data mining techniques are most likely to be successful, and Identify
Quick Introduction of Data Mining Techniques
Quick Introduction of Data Mining Techniques *Sources partially from Introduction to Data Mining, by P.-N. Tan, M. Steinbach, V. Kumar, Addison-Wesley, 2005. Main Data Mining Techniques Link Analysis Associations
Introduction to Data Mining. Lijun Zhang [email protected] http://cs.nju.edu.cn/zlj
Introduction to Data Mining Lijun Zhang [email protected] http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming
Data Mining: Introduction. Lecture Notes for Chapter 1. Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler
Data Mining: Introduction Lecture Notes for Chapter 1 Slides by Tan, Steinbach, Kumar adapted by Michael Hahsler Why Mine Data? Commercial Viewpoint Lots of data is being collected and warehoused - Web
Prediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
DATA MINING - SELECTED TOPICS
DATA MINING - SELECTED TOPICS Peter Brezany Institute for Software Science University of Vienna E-mail : [email protected] 1 MINING SPATIAL DATABASES 2 Spatial Database Systems SDBSs offer spatial
CHAPTER-24 Mining Spatial Databases
CHAPTER-24 Mining Spatial Databases 24.1 Introduction 24.2 Spatial Data Cube Construction and Spatial OLAP 24.3 Spatial Association Analysis 24.4 Spatial Clustering Methods 24.5 Spatial Classification
Search and Data Mining: Techniques. Applications Anya Yarygina Boris Novikov
Search and Data Mining: Techniques Applications Anya Yarygina Boris Novikov Introduction Data mining applications Data mining system products and research prototypes Additional themes on data mining Social
Knowledge Discovery from Databases
Encyclopedia of Database Technologies and Applications Knowledge Discovery from Databases Jose Hernandez-Orallo Dep. of Information Systems and Computation Technical University of Valencia, Spain [email protected]
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala'a AL_Hamami PHD Student, Lecturer m_ah_1@yahoo.com Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoo.com
Data mining in the e-learning domain
Data mining in the e-learning domain The author is Education Liaison Officer for e-learning, Knowsley Council and University of Liverpool, Wigan, UK. Keywords Higher education, Classification, Data encapsulation,
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Cleveland State University
Cleveland State University CIS 612 Modern Database Programming & Big Data Processing (3-0-3) Fall 2014 Section 50 Class Nbr. 2670. Tues, Thur 4:00-5:15 PM Prerequisites: CIS 505 and CIS 530. CIS 611 Preferred.
AMIS 7640 Data Mining for Business Intelligence
The Ohio State University The Max M. Fisher College of Business Department of Accounting and Management Information Systems AMIS 7640 Data Mining for Business Intelligence Autumn Semester 2014, Session
City University of Hong Kong. Information on a Course offered by the Department of Management Sciences with effect from Semester A in 2012 / 2013
City University of Hong Kong Information on a Course offered by the Department of Management Sciences with effect from Semester A in 2012 / 2013 Part I Course Title: Customer Relationship Management with
Classification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
Association Rule Mining: A Survey
Association Rule Mining: A Survey Qiankun Zhao Nanyang Technological University, Singapore and Sourav S. Bhowmick Nanyang Technological University, Singapore 1. DATA MINING OVERVIEW Data mining [Chen et
CSCI-599 DATA MINING AND STATISTICAL INFERENCE
CSCI-599 DATA MINING AND STATISTICAL INFERENCE Course Information Course ID and title: CSCI-599 Data Mining and Statistical Inference Semester and day/time/location: Spring 2013/ Mon/Wed 3:30-4:50pm Instructor:
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining, Addison Wesley; and Cios / Pedrycz / Swiniarski
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7-9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Subject Description Form
Subject Description Form Subject Code Subject Title COMP417 Data Warehousing and Data Mining Techniques in Business and Commerce Credit Value 3 Level 4 Pre-requisite / Co-requisite/ Exclusion Objectives
Data Warehousing and Data Mining. A.A. 04-05 Datawarehousing & Datamining 1
Data Warehousing and Data Mining A.A. 04-05 Datawarehousing & Datamining 1 Outline 1. Introduction and Terminology 2. Data Warehousing 3. Data Mining Association rules Sequential patterns Classification
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
Principles of Data Mining
Principles of Data Mining Instructor: Sargur N. 1 University at Buffalo The State University of New York [email protected] Introduction: Topics 1. Introduction to Data Mining 2. Nature of Data
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM
MINING THE DATA FROM DISTRIBUTED DATABASE USING AN IMPROVED MINING ALGORITHM J. Arokia Renjit Asst. Professor/ CSE Department, Jeppiaar Engineering College, Chennai, TamilNadu,India 600119. Dr.K.L.Shunmuganathan
Mining an Online Auctions Data Warehouse
Proceedings of MASPLAS'02 The Mid-Atlantic Student Workshop on Programming Languages and Systems Pace University, April 19, 2002 Mining an Online Auctions Data Warehouse David Ulmer Under the guidance
Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms
Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms Y.Y. Yao, Y. Zhao, R.B. Maguire Department of Computer Science, University of Regina Regina,
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
ISQS 3358 BUSINESS INTELLIGENCE FALL 2014
ISQS 3358 BUSINESS INTELLIGENCE FALL 2014 Instructor: Dr. Miguel. I. Aguirre-Urreta, Ph.D. Office: BA E322 Phone: 806.834.0765 Email: [email protected] Office Hours Tuesdays and Thursdays from
COURSE SYLLABUS. Enterprise Information Systems and Business Intelligence
MASTER PROGRAMS Autumn Semester 2008/2009 COURSE SYLLABUS Enterprise Information Systems and Business Intelligence Instructor: Malov Andrew, Master of Computer Sciences, Assistant,[email protected] Organization
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Data Mining: An Overview. David Madigan http://www.stat.columbia.edu/~madigan
Data Mining: An Overview David Madigan http://www.stat.columbia.edu/~madigan Overview Brief Introduction to Data Mining Data Mining Algorithms Specific Examples Algorithms: Disease Clusters Algorithms:
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP's numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 5, Issue 4, April-2014 442 Over viewing issues of data mining with highlights of data warehousing Rushabh H. Baldaniya, Prof H.J.Baldaniya,
University of Southern California MARSHALL SCHOOL OF BUSINESS Spring, 2004 Course Guidelines & Syllabus
University of Southern California MARSHALL SCHOOL OF BUSINESS Spring, 2004 Course Guidelines & Syllabus IOM 528 DATA WAREHOUSING, BUSINESS INTELLIGENCE AND DATA MINING Instructor: Dr. Arif Ansari Office:
Data Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.1 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Classification vs. Numeric Prediction Prediction Process Data Preparation Comparing Prediction Methods References Classification
