Sutee Sujitparapitaya, Ph.D. Institutional Effectiveness and Analytics San José State University


1 Sutee Sujitparapitaya, Ph.D. Associate Vice President for Institutional Effectiveness and Analytics San José State University Copyright Sutee Sujitparapitaya,

2 Data mining techniques are widely used for data analysis. While data mining may be viewed as expensive, time consuming, and too technical to understand and apply, it is an institutional research tool for efficiently managing and extracting data from large databases and for expediting reporting through the use of statistical algorithms. This workshop will introduce the basic foundations of data mining and identify the types of data typically found in large institutional databases, research questions to consider before mining data, and issues of data quality. It will also address how to mix traditional institutional research tools with data mining, and field additional questions typically posed by novices. The emphasis is on a beginner's (novice) perspective and on institutional research data applications.

3
- Describe the basic foundations of data mining from an institutional research (IR) perspective.
- Explain the principal components of IR data and research questions.
- Describe why the data mining process (CRISP-DM methodology) and primary techniques are valuable for IR.
- Describe how data quality and data selection work.
- Explain the primary features of data mining tools.
- Describe the relevant resources that are available to help data mining projects.

5
- Strategic decision making
- Wealth generation
- Analyzing trends
- Security

6 Data mining is a process of finding hidden trends, patterns, and relationships in data that are not immediately apparent from summarizing the data. It examines data in large databases and infers rules in order to a) obtain insight and b) predict future behavior. For example: finding patterns in student data to explain student attrition, or to identify students at risk of dropping out of school.
Motivation for data mining:
1. There is an important need to turn data into useful information.
2. The fast-growing amount of data collected and stored in large and numerous databases has exceeded the human ability to comprehend it without powerful tools.
3. We are drowning in data, but starving for knowledge!

7
- Traditional statistics (distributions, mathematics, etc.)
- Machine learning: the discipline concerned with the design and development of algorithms that give computers the ability to learn without being explicitly programmed (computer science, heuristics, and induction algorithms).
- Artificial intelligence: the study and design of intelligent agents that emulate human intelligence.
- Neural networks: a mathematical model in which an interconnected group of artificial neurons processes information between inputs and outputs or finds patterns in data. It is an adaptive model that changes its structure during a learning phase (biological models, psychology, and engineering).

8
Evolutionary Step | Business Question | Enabling Technologies | Characteristics
Data Collection (1960s) | What was the number of new applications for the last five years? | Computers, tapes, disks | Retrospective, static data delivery
Data Access (1980s) | What was the number of new applications for the College of Business last March? | Relational databases, SQL, ODBC | Retrospective, dynamic data delivery at record level
Data Warehousing & Decision Support (1990s) | What was the number of new applications for the College of Business last March? Drill down to Accounting majors. | OLAP, multidimensional databases, data warehouses | Retrospective, dynamic data delivery at multiple levels
Data Mining (at present time) | What is likely to happen to the number of new Accounting applications next month? Why? | Advanced algorithms, multiprocessor computers, massive databases | Prospective, proactive information delivery

9
Statistics: Conceptual Model (Hypothesis) + Statistical Reasoning = Proof (Validation of Hypothesis)
Data Mining: Data + Data Mining Algorithm based on Interestingness = Pattern Discovery (Model, Rule)

10 Association rules describe a method for discovering interesting relations between variables in large databases. The method produces dependency rules that predict the occurrence of an item based on the occurrences of other items.
Example 1: Which products are frequently bought together by customers? (Basket analysis) DataTable = Receipts x Products. Results could be used to change the placement of products.
Example 2: Which courses tend to be attended together? DataTable = Students x Courses. Results could be used to avoid scheduling conflicts.

11 Market basket analysis identifies customers' purchasing habits. It provides insight into the combination of products within a customer's "basket". Ultimately, the purchasing insights provide the potential to create cross-sell propositions: which product combinations are bought, when they are purchased, and in what sequence.
Observation | Items
1 | Bread, Coke, Milk
2 | Beer, Bread
3 | Beer, Coke, Diapers, Milk
4 | Beer, Bread, Diapers, Milk
5 | Coke, Diapers, Milk
Rules discovered:
{Milk} -> {Coke}
{Diapers, Milk} -> {Beer}
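To make the two discovered rules concrete, the sketch below computes their support and confidence over the five baskets, reading the first basket as Bread, Coke, Milk. The helper names `support` and `confidence` are illustrative and do not come from the slides; this is a minimal hand calculation, not the output of any particular mining tool.

```python
# The five observations from the market basket slide, one set per basket.
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diapers", "Milk"},
    {"Beer", "Bread", "Diapers", "Milk"},
    {"Coke", "Diapers", "Milk"},
]


def support(itemset, transactions):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


for antecedent, consequent in [({"Milk"}, {"Coke"}), ({"Diapers", "Milk"}, {"Beer"})]:
    s = support(antecedent | consequent, transactions)
    c = confidence(antecedent, consequent, transactions)
    print(f"{sorted(antecedent)} -> {sorted(consequent)}: support={s:.2f}, confidence={c:.2f}")
```

On this data, {Milk} -> {Coke} holds in 3 of the 4 baskets that contain Milk, and {Diapers, Milk} -> {Beer} holds in 2 of the 3 baskets that contain both Diapers and Milk.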

12 The government's data mining projects fall into two broad categories: 1. Subject-based data mining, which retrieves data that could help an analyst follow a lead, and 2. Pattern-based data mining, which looks for suspicious behaviors across a spread of activities. Most data mining experts consider the former a version of traditional police work (chasing down leads), but instead of a police officer examining a list of phone numbers of suspect calls, a computer does it. One subject-based data mining technique gaining traction among government practitioners and academics is called link analysis. Link analysis uses data to make connections between seemingly unconnected people or events.

13 Data visualization is the study of the visual representation of data, meaning "information that has been abstracted in some schematic form". It refers to techniques for communicating information clearly and effectively through graphical means (e.g., creating images, diagrams, or animations). Source: Bradbury Science Museum, Los Alamos, NM

16 Data + Interestingness or Criteria = Hidden Patterns (Slice)

18
- Interaction data: offers, results, context, click streams, notes
- Attitudinal data: opinions, preferences, needs, desires
- Descriptive data: attributes, characteristics, self-declared info, (geo)demographics
- Behavioral data: orders, transactions, usage history, payment history
Source: SPSS BI

19
- Too many records
- Too many variables
- Complex non-linear relationships
- Multi-variable combinations
- Proactive and prospective approach
Source: Abbot, Data Mining: Level II

20
Traditional IR work: Data file => Descriptive/Regression Analysis => Tabulations/Reports (Historical, Predictive)
Data mining driven IR work: Database => Data Mining (Visualization, Association, Clustering, Predictive Modeling) => Immediate Actions (Historical, Predictive)

22 Type of Interestingness
- Frequency
- Correlation
- Length of occurrence (for sequence)
- Consistency
- Repeating/periodicity
- Abnormal behaviors
- Other patterns of interestingness

23 Typical DBMS Approach
- What were the total applications during the last 3 years?
- What is the first-year retention of the fall 2006 first-time freshmen from underrepresented minority groups?
- How many freshmen attended the freshman orientation in November over the last 5 years?
- What were the total pledges for California alumni donations last year?
- How many "agree" and "strongly agree" responses did we receive from the 2008 student/faculty satisfaction surveys?
Data Mining Approach
- Which inquiries are most likely to turn into actual applications?
- What are the most important parameters for predicting first-year attrition among next year's entering freshmen?
- Who is likely to enroll in the freshman orientation during the month of November?
- Who is likely to make pledges for alumni donations?
- What are the main clusters found in the student/faculty satisfaction surveys?

24 What do we know about our students?
DBMS Approach:
- List of students who passed the English Proficiency Exam in the spring
- Summary of student profiles for those who failed and dropped out last semester
- How many students enrolled in the Business Policy course last fall semester?
Data Mining Approach:
- What factors contribute to learning?
- Who is likely to fail or drop out at the end of their 6th year?
- What courses provide high FTES and better use of space?
- What are the course-taking patterns?

25 DBMS Approach:
- List all items that were sold in the last month.
- List all the items purchased by Sandy Smith.
- What were the total sales of the last month, grouped by branch?
- How many sales transactions occurred in the month of December?
Data Mining Approach:
- Which items are sold together?
- What items to stock?
- How to place items?
- What discounts to offer?
- How best to target customers to increase sales?
- Which clients are most likely to respond to my next promotional mailing, and why?

27 Supervised data mining is used when there is prior knowledge of what outcomes exist in the data.
- Classification and prediction describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown.
Unsupervised data mining is used when the researcher has no idea what hidden patterns there are in the vast database.
- Clustering identifies group membership by maximizing the intraclass similarity and minimizing the interclass similarity.
- Associations and sequences identify relationships between events that occur at one time, determining which things go together or which sequential patterns appear in the data.

28
- Categorize your students (Clustering): cafeteria meal planning; student housing planning
- Predict student retention/alumni donations (Neural Nets/Regression): identify high-risk students; estimate/predict alumni contributions; predict new student application rate
- Group similar students (Segmentation): course planning; academic scheduling; identify student preferences for clubs and social organizations
- Identify courses that are taken together (Association): faculty teaching load estimation; course planning; academic scheduling
- Find patterns and trends over time (Sequence): predict alumni donations; predict potential demand for library resources
Source: Thulasi Kumar, 2004

29
- Classification and Prediction: Decision Trees (C&RT, C5.0, CHAID, and QUEST); Neural Networks; Regressions (Linear and Logistic)
- Clustering: K-Means, TwoStep, and Kohonen SOM
- Association Rule/Affinity Analysis: Generalized Rule Induction (GRI); CARMA (Continuous Association Rule Mining Algorithm); APRIORI

31 Decision trees are tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. The model predicts the value of a target variable based on several input variables.
Two primary types of decision trees:
1. Classification tree analysis is used when the predicted outcome is the class to which the data belongs.
2. Regression tree analysis is used when the predicted outcome can be considered a real number (e.g., the price of a house, or a patient's length of stay in a hospital).
Advantages: fast; simple to understand and interpret; validation using statistical tests.
Disadvantages: inherently unstable; can become large and complex.

32 Dependent variable: the target classification is "should we play baseball?", which can be yes or no.
Input variables: the weather attributes are outlook, temperature, humidity, and wind speed. They can have the following values:
- outlook = { sunny, overcast, rain }
- temperature = { hot, mild, cool }
- humidity = { high, normal }
- wind = { weak, strong }
Day | Outlook | Temperature | Humidity | Wind | Play ball
D1 | Sunny | Hot | High | Weak | No
D2 | Sunny | Hot | High | Strong | No
D3 | Overcast | Hot | High | Weak | Yes
D4 | Rain | Mild | High | Weak | Yes
D5 | Rain | Cool | Normal | Weak | Yes
D6 | Rain | Cool | Normal | Strong | No
D7 | Overcast | Cool | Normal | Strong | Yes
D8 | Sunny | Mild | High | Weak | No
D9 | Sunny | Cool | Normal | Weak | Yes
D10 | Rain | Mild | Normal | Weak | Yes
D11 | Sunny | Mild | Normal | Strong | Yes
D12 | Overcast | Mild | High | Strong | Yes
D13 | Overcast | Hot | Normal | Weak | Yes
D14 | Rain | Mild | High | Strong | No

33
- C5.0 (multiple split, no continuous targets) uses the C5.0 algorithm to build either a decision tree or a rule set. A C5.0 model works by splitting the sample based on the field that provides the maximum information gain.
- The Classification and Regression (C&R) Tree node (binary split, continuous target) is a tree-based classification and prediction method. Similar to C5.0, this method uses recursive partitioning to split the training records into segments with similar output field values.
- QUEST, or Quick, Unbiased, Efficient Statistical Tree, is a binary classification method for building decision trees. A major motivation in its development was to reduce the processing time required for large C&RT analyses with either many variables or many cases.
- CHAID, or Chi-squared Automatic Interaction Detection, is a classification method for building decision trees by using chi-square statistics to identify optimal splits. CHAID first examines the cross-tabulations between each of the predictor variables and the outcome, and tests for significance using a chi-square independence test.
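The idea of splitting on the field with the maximum information gain can be checked by hand on the 14-day play-ball table. The sketch below computes plain entropy-based information gain for each weather attribute; it is a simplified illustration and not the C5.0 implementation, and the names `entropy` and `information_gain` are chosen here for illustration.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ("Outlook", "Temperature", "Humidity", "Wind")

# (Outlook, Temperature, Humidity, Wind, Play ball) for days D1..D14.
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy of the target minus the weighted entropy after splitting on `column`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder


gains = {name: information_gain(ROWS, i) for i, name in enumerate(ATTRIBUTES)}
best = max(gains, key=gains.get)
for name, gain in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {gain:.3f}")
print("First split:", best)
```

On this table, Outlook gives the largest gain, so a gain-based tree would split on it first; the Overcast branch is already pure (all "Yes").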

34 A neural network is a model that emulates the human biological neural system to solve prediction and classification problems. It provides solutions for linear and non-linear relationships between input and output variables, and it does not assume any particular data distribution.

35 Advantages
- Has a mathematical foundation
- Robust with noisy data
- Detects relationships and trends in data that traditional methods overlook
- Can fit complex non-linear models
- Able to detect all possible interactions between predictor variables
Disadvantages
- "Black box" nature that is not easy to analyze and interpret
- Greater computational burden
- Virtually impossible to "interpret" the solution in traditional, analytic terms, such as those used to build theories that explain phenomena

36 Linear regression is an approach to modeling the relationship between a scalar dependent variable (y) and one or more predictor variables (X). The case of one predictor variable is called simple regression; more than one predictor variable is multiple regression. The regression equation represents a straight line or plane that minimizes the squared differences between predicted and actual output values. This is a very common statistical technique for summarizing data and making predictions: y = f(x).
Advantages: available in most software; widely accepted statistical technique.
Disadvantages: not appropriate for many non-linear problems; must meet underlying assumptions.
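As a small illustration of "the line that minimizes the squared differences", the sketch below fits a simple (one-predictor) regression with the closed-form least-squares formulas. The function name `fit_simple_regression` and the GPA figures are hypothetical and only serve as an example.

```python
def fit_simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: high school GPA (x) and first-year college GPA (y).
hs_gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
first_year_gpa = [1.9, 2.6, 2.8, 3.4, 3.8]

slope, intercept = fit_simple_regression(hs_gpa, first_year_gpa)
predicted = intercept + slope * 3.2  # prediction for a new student with a 3.2 GPA
print(f"slope={slope:.3f}, intercept={intercept:.3f}, prediction={predicted:.2f}")
```

Multiple regression follows the same principle with more than one predictor, but is normally fitted with a statistics package rather than by hand.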

37 Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable based on one or more predictor variables, which may be either continuous or categorical.
1. Binomial or binary logistic regression refers to the instance in which the observed outcome can have only two possible types (e.g., "dead" vs. "alive", "success" vs. "failure", or "yes" vs. "no").
2. Multinomial logistic regression refers to cases where the outcome can have three or more possible types (e.g., "better" vs. "no change" vs. "worse").
For example, logistic regression might be used to predict whether a new student will graduate within 6 years, based on observed characteristics of the student (test score, age, gender, pre school preparation, etc.).
Advantages: well-established statistical procedure; simple and easy to interpret; very fast to train and build; can be used with small sample sizes.
Disadvantages: strong sensitivity to outliers; multicollinearity.
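A binary logistic regression can be sketched in a few lines: a linear score is passed through the logistic (sigmoid) function to give a probability, and the coefficients are adjusted to reduce the prediction error. The example below uses plain gradient descent on a tiny, made-up dataset; the variable names, the data, the learning rate, and the step count are all assumptions for illustration, and production work would rely on a statistics package instead.

```python
from math import exp


def sigmoid(z):
    """Logistic function: maps any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


# Hypothetical data: courses passed in the first term, and whether the
# student graduated within 6 years (1 = yes, 0 = no).
courses_passed = [1, 2, 3, 4, 5, 6]
graduated = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(courses_passed, graduated)


def predict_probability(x):
    """Estimated probability of graduating within 6 years."""
    return sigmoid(w * x + b)


for x in (1, 3, 4, 6):
    print(x, round(predict_probability(x), 2))
```

The fitted probabilities rise with the predictor and cross 0.5 between the last non-graduate and the first graduate in this toy data.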

38 Cluster analysis is an exploratory (unsupervised) data analysis tool for solving classification problems. Its object is to sort cases (people, things, events, etc.) into groups, or clusters, so that the degree of association is strong between members of the same cluster and weak between members of different clusters. It is not an automatic task, but an iterative process of knowledge discovery (interactive multi-objective optimization) that involves trial and failure until the result achieves the desired properties.
Figure: the result of a cluster analysis, shown as the coloring of the squares into three clusters.
Types of clustering: K-Means, Two Step, Kohonen.
Advantages: reveals the makeup of groups in attitudinal or behavioral tests.
Disadvantages: individual group members may still differ.

40 K-Means clustering is an algorithm that classifies or groups objects, based on their attributes/features, into K groups, where K is a positive integer. The grouping is done by minimizing the sum of squared distances between the data points and the corresponding cluster centroid. Thus the purpose is to classify the data by partitioning n observations into k clusters in which each observation belongs to the cluster with the nearest mean.
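The assign-then-update loop behind K-Means can be sketched as follows. This is a minimal version of the standard iterative procedure (often called Lloyd's algorithm) with caller-supplied starting centroids; the function names and the six sample points are made up for illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and
    recomputing each centroid, until the centroids stop changing."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [centroid(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical two-feature observations forming two obvious groups (K = 2).
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[(1, 1), (9, 8)])
print(centroids)
print(clusters)
```

In practice the result depends on the starting centroids and on the choice of K, which is why tools usually run the procedure several times or ask the analyst to compare solutions.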

41 Two-step cluster analysis is a technique that groups cases into pre-clusters that are treated as single cases. Standard hierarchical clustering is then applied to the pre-clusters in the second step. It is appropriate for large datasets or datasets that have a mixture of continuous and categorical variables (not interval or dichotomous). It processes the data in one pass through the dataset. Therefore, it does not require a proximity table (like hierarchical classification) or an iterative process (like K-Means clustering).

43 Kohonen networks are a type of neural network that performs clustering, also known as a knet or a self-organizing map. A Kohonen network seeks to describe a dataset in terms of natural clusters of cases. This type of network can be used to cluster the dataset into distinct groups when you don't know what those groups are at the beginning. You don't even need to know the number of groups to look for. Kohonen networks start with a large number of units, and as training progresses, the units gravitate toward the natural clusters in the data. Source: SPSS BI

45 Association or affinity analysis is a data mining technique that discovers co-occurrence relationships among activities performed by specific individuals or groups. These relationships are then expressed as a collection of association rules. Association rules are statements of the form "if antecedent(s) then consequent(s)". The technique is used to perform market basket analysis, in which retailers seek to understand the purchase behavior of customers.
Types of association: GRI, Apriori, CARMA.

46
Transactional format:
Customer | Purchase
1 | jam
2 | milk
3 | jam
3 | bread
4 | jam
4 | bread
4 | milk
Tabular format:
Customer | Jam | Bread | Milk
1 | T | F | F
2 | F | F | T
3 | T | T | F
4 | T | T | T
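The same purchases appear here in two layouts: one row per customer-item pair, and one row per customer with a true/false flag per item. A minimal sketch of the conversion is shown below; the function name `to_tabular` is illustrative only.

```python
# One (customer, item) pair per purchase, as in the transactional layout.
purchases = [
    (1, "jam"), (2, "milk"), (3, "jam"), (3, "bread"),
    (4, "jam"), (4, "bread"), (4, "milk"),
]
ITEMS = ("jam", "bread", "milk")


def to_tabular(purchases, items):
    """Pivot (customer, item) pairs into one row of True/False flags per customer."""
    baskets = {}
    for customer, item in purchases:
        baskets.setdefault(customer, set()).add(item)
    return {
        customer: tuple(item in basket for item in items)
        for customer, basket in sorted(baskets.items())
    }


table = to_tabular(purchases, ITEMS)
for customer, flags in table.items():
    print(customer, ["T" if flag else "F" for flag in flags])
```

Association algorithms typically accept one of these two layouts, so this reshaping step is a common part of data preparation.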

48 Business Understanding => Data Understanding => Data Preparation => Modeling => Evaluation => Deployment. Source: dm.org

49
Business Understanding
- Determine Business Objectives: Background; Business Objectives; Business Success Criteria
- Situation Assessment: Inventory of Resources; Requirements, Assumptions, and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
- Determine Data Mining Goal: Data Mining Goals; Data Mining Success Criteria
- Produce Project Plan: Project Plan; Initial Assessment of Tools and Techniques
Data Understanding
- Collect Initial Data: Initial Data Collection Report
- Describe Data: Data Description Report
- Explore Data: Data Exploration Report
- Verify Data Quality: Data Quality Report
Data Preparation
- Data Set: Data Set Description
- Select Data: Rationale for Inclusion / Exclusion
- Clean Data: Data Cleaning Report
- Construct Data: Derived Attributes; Generated Records
- Integrate Data: Merged Data
- Format Data: Reformatted Data
Modeling
- Select Modeling Technique: Modeling Technique; Modeling Assumptions
- Generate Test Design: Test Design
- Build Model: Parameter Settings; Models; Model Description
- Assess Model: Model Assessment; Revised Parameter Settings
Evaluation
- Evaluate Results: Assessment of Data Mining Results w.r.t. Business Success Criteria; Approved Models
- Review Process: Review of Process
- Determine Next Steps: List of Possible Actions; Decision
Deployment
- Plan Deployment: Deployment Plan
- Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
- Produce Final Report: Final Report; Final Presentation
- Review Project: Experience Documentation
Source: SPSS BI

51 Good data = better decisions = more profit. Bad data = risky decisions = potential disaster. Bad data = errors = losses.
- "We cannot offer enough courses" = angry students who drop out or transfer to another institution.
- "You're not admitted to your intended major" = angry students and parents, lost revenue.
- "We have more rooms in the dorm for new students" = a bad decision if the number of students is inflated by bad data.

54 Scalar refers to a quantity consisting of a single real number used to measure magnitude (size).
- Interval = scale with a fixed and defined interval, e.g., temperature or time.
- Ordinal = scale for ordering observations from low to high, with any ties attributed to lack of measurement sensitivity, e.g., a score from a questionnaire.
- Nominal with order = scale for grouping into categories with order, e.g., mild, moderate, or severe. This can be difficult to separate from ordinal.
- Nominal without order = scale for grouping into unique categories, e.g., eye color.
- Dichotomous = as for nominal, but with two categories only, e.g., male/female.
Non-scalar data contain more than one value (e.g., lists, arrays, records).

55
- Casewise or listwise deletion
- Pairwise deletion
- Single value substitution (by the mean, median, or mode of the variable)
- Regression substitution (using values of other variables in the same row, or taking the overall relationships among variables into account)
- Marking with a dummy variable
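Three of these missing-value treatments are easy to show on a handful of records: listwise deletion, single value substitution by the mean, and marking with a dummy variable. The sketch below uses a made-up list of student records in which `None` stands for a missing value; the field names and helper functions are hypothetical.

```python
from statistics import mean

# Hypothetical student records; None marks a missing value.
records = [
    {"gpa": 3.2, "units": 15},
    {"gpa": None, "units": 12},
    {"gpa": 2.8, "units": None},
    {"gpa": 3.6, "units": 9},
]


def listwise_delete(records):
    """Drop every record that has at least one missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def mean_substitute(records, field):
    """Replace missing values of `field` with the mean of the observed values."""
    fill = mean(r[field] for r in records if r[field] is not None)
    return [dict(r, **{field: fill if r[field] is None else r[field]}) for r in records]


def flag_missing(records, field):
    """Add a 0/1 dummy variable that marks where `field` was missing."""
    return [dict(r, **{f"{field}_missing": int(r[field] is None)}) for r in records]


print(listwise_delete(records))
print(mean_substitute(records, "gpa"))
print(flag_missing(records, "units"))
```

Listwise deletion shrinks the sample, while mean substitution keeps every record but understates the variable's spread, so the choice depends on how much data is missing and why.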

56
- Identify outliers (Anomaly Detection Node)
- Verify distributions (Data Audit Node)
- Relationship of variables
- Predictive power of variables (Auto Data Prep Node)
- Data reduction

57
- Data audit/data distribution charts
- Number of variables
- Number of records
- Information content/predictive power

60 A successful data mining strategy involves:
1. Make data mining models comprehensible to business users.
2. Translate the user's questions into a data mining problem, with well-defined goals, project objectives, and questions.
3. Ensure the use of sufficient and relevant data.
4. Close the loop: identify causality, suggest actions, and measure their effect. Domain expertise in institutional research is needed to build, test, validate, and deploy models.
5. Careful consideration and selection of software and analysts (technical and domain experts).
6. Support from senior administrators (VPs and the President).
7. Cope with privacy and security issues.
8. Guard against misuse of information and inaccurate information.

62 Free open-source data mining software and applications:
- R
- RapidMiner
- WEKA
Commercial data mining software and applications:
- PASW Modeler (IBM)
- STATISTICA Data Miner (StatSoft)
- Enterprise Miner (SAS)
- Oracle Data Mining
- CART/MARS (Salford Systems)
Low price:
- XLMiner ($199)

66 Information: www 01.ibm.com/software/analytics/spss/products/modeler Training: modeling agency.com canada.html


More information

Course Syllabus For Operations Management. Management Information Systems

Course Syllabus For Operations Management. Management Information Systems For Operations Management and Management Information Systems Department School Year First Year First Year First Year Second year Second year Second year Third year Third year Third year Third year Third

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

ANALYTICS CENTER LEARNING PROGRAM

ANALYTICS CENTER LEARNING PROGRAM Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals

More information

Data Mining for Fun and Profit

Data Mining for Fun and Profit Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools

More information

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland

Data Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data

More information

An Overview of Predictive Analytics for Practitioners. Dean Abbott, Abbott Analytics

An Overview of Predictive Analytics for Practitioners. Dean Abbott, Abbott Analytics An Overview of Predictive Analytics for Practitioners Dean Abbott, Abbott Analytics Thank You Sponsors Empower users with new insights through familiar tools while balancing the need for IT to monitor

More information

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO

More information

IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

More information

IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

More information

Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI

Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI Introduction to Data Mining and Business Intelligence Lecture 1/DMBI/IKI83403T/MTI/UI Yudho Giri Sucahyo, Ph.D, CISA (yudho@cs.ui.ac.id) Faculty of Computer Science, University of Indonesia Objectives

More information

Data Mining Techniques

Data Mining Techniques 15.564 Information Technology I Business Intelligence Outline Operational vs. Decision Support Systems What is Data Mining? Overview of Data Mining Techniques Overview of Data Mining Process Data Warehouses

More information

Data Mining Analytics for Business Intelligence and Decision Support

Data Mining Analytics for Business Intelligence and Decision Support Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing

More information

Hexaware E-book on Predictive Analytics

Hexaware E-book on Predictive Analytics Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,

More information

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010

Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Decision Support Optimization through Predictive Analytics - Leuven Statistical Day 2010 Ernst van Waning Senior Sales Engineer May 28, 2010 Agenda SPSS, an IBM Company SPSS Statistics User-driven product

More information

not possible or was possible at a high cost for collecting the data.

not possible or was possible at a high cost for collecting the data. Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day

More information

CS590D: Data Mining Chris Clifton

CS590D: Data Mining Chris Clifton CS590D: Data Mining Chris Clifton March 10, 2004 Data Mining Process Reminder: Midterm tonight, 19:00-20:30, CS G066. Open book/notes. Thanks to Laura Squier, SPSS for some of the material used How to

More information

An Empirical Study of Application of Data Mining Techniques in Library System

An Empirical Study of Application of Data Mining Techniques in Library System An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani

More information

A Review of Data Mining Techniques

A Review of Data Mining Techniques Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,

More information

Foundations of Business Intelligence: Databases and Information Management

Foundations of Business Intelligence: Databases and Information Management Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of

More information

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010.

Title. Introduction to Data Mining. Dr Arulsivanathan Naidoo Statistics South Africa. OECD Conference Cape Town 8-10 December 2010. Title Introduction to Data Mining Dr Arulsivanathan Naidoo Statistics South Africa OECD Conference Cape Town 8-10 December 2010 1 Outline Introduction Statistics vs Knowledge Discovery Predictive Modeling

More information

Data Warehousing and Data Mining in Business Applications

Data Warehousing and Data Mining in Business Applications 133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business

More information

What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM

What is Customer Relationship Management? Customer Relationship Management Analytics. Customer Life Cycle. Objectives of CRM. Three Types of CRM Relationship Management Analytics What is Relationship Management? CRM is a strategy which utilises a combination of Week 13: Summary information technology policies processes, employees to develop profitable

More information

What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling

What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling Lecturer : Dr Iris Yeung Room No : P7509 Tel No : 2788 8566 Email : msiris@cityu.edu.hk 1 Aims To introduce the basic concepts of data mining

More information

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP

OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key

More information

Nine Common Types of Data Mining Techniques Used in Predictive Analytics

Nine Common Types of Data Mining Techniques Used in Predictive Analytics 1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better

More information

Data Mining + Business Intelligence. Integration, Design and Implementation

Data Mining + Business Intelligence. Integration, Design and Implementation Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution

More information

Data Mining Applications in Fund Raising

Data Mining Applications in Fund Raising Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,

More information

Application of Predictive Analytics to Higher Degree Research Course Completion Times

Application of Predictive Analytics to Higher Degree Research Course Completion Times Application of Predictive Analytics to Higher Degree Research Course Completion Times Application of Decision Theory to PhD Course Completions (2006 2013) Rachna 1 I Dhand, Senior Strategic Information

More information

MBA 8473 - Data Mining & Knowledge Discovery

MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 - Data Mining & Knowledge Discovery MBA 8473 1 Learning Objectives 55. Explain what is data mining? 56. Explain two basic types of applications of data mining. 55.1. Compare and contrast various

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study

Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study Use Data Mining Techniques to Assist Institutions in Achieving Enrollment Goals: A Case Study Tongshan Chang The University of California Office of the President CAIR Conference in Pasadena 11/13/2008

More information

Data Mining Methods: Applications for Institutional Research

Data Mining Methods: Applications for Institutional Research Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014

More information

Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry

Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened

More information

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS

A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant

More information

Data Mining with SAS. Mathias Lanner mathias.lanner@swe.sas.com. Copyright 2010 SAS Institute Inc. All rights reserved.

Data Mining with SAS. Mathias Lanner mathias.lanner@swe.sas.com. Copyright 2010 SAS Institute Inc. All rights reserved. Data Mining with SAS Mathias Lanner mathias.lanner@swe.sas.com Copyright 2010 SAS Institute Inc. All rights reserved. Agenda Data mining Introduction Data mining applications Data mining techniques SEMMA

More information

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and

STATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table

More information

Customer Classification And Prediction Based On Data Mining Technique

Customer Classification And Prediction Based On Data Mining Technique Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor

More information

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d. EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

More information

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam

ECLT 5810 E-Commerce Data Mining Techniques - Introduction. Prof. Wai Lam ECLT 5810 E-Commerce Data Mining Techniques - Introduction Prof. Wai Lam Data Opportunities Business infrastructure have improved the ability to collect data Virtually every aspect of business is now open

More information

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS.

PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS. PSG College of Technology, Coimbatore-641 004 Department of Computer & Information Sciences BSc (CT) G1 & G2 Sixth Semester PROJECT DETAILS Project Project Title Area of Abstract No Specialization 1. Software

More information

Data Mining Techniques in CRM

Data Mining Techniques in CRM Data Mining Techniques in CRM Inside Customer Segmentation Konstantinos Tsiptsis CRM 6- Customer Intelligence Expert, Athens, Greece Antonios Chorianopoulos Data Mining Expert, Athens, Greece WILEY A John

More information

TDWI Best Practice BI & DW Predictive Analytics & Data Mining

TDWI Best Practice BI & DW Predictive Analytics & Data Mining TDWI Best Practice BI & DW Predictive Analytics & Data Mining Course Length : 9am to 5pm, 2 consecutive days 2012 Dates : Sydney: July 30 & 31 Melbourne: August 2 & 3 Canberra: August 6 & 7 Venue & Cost

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

IBM SPSS Modeler Professional

IBM SPSS Modeler Professional IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model

More information

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction

COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised

More information

COURSE RECOMMENDER SYSTEM IN E-LEARNING

COURSE RECOMMENDER SYSTEM IN E-LEARNING International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand

More information

Analytics on Big Data

Analytics on Big Data Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

More information

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of

More information

Using Data Mining for Mobile Communication Clustering and Characterization

Using Data Mining for Mobile Communication Clustering and Characterization Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer

More information

Data Mining and Neural Networks in Stata

Data Mining and Neural Networks in Stata Data Mining and Neural Networks in Stata 2 nd Italian Stata Users Group Meeting Milano, 10 October 2005 Mario Lucchini e Maurizo Pisati Università di Milano-Bicocca mario.lucchini@unimib.it maurizio.pisati@unimib.it

More information

The KDD Process: Applying Data Mining

The KDD Process: Applying Data Mining The KDD Process: Applying Nuno Cavalheiro Marques (nmm@di.fct.unl.pt) Spring Semester 2010/2011 MSc in Computer Science Outline I 1 Knowledge Discovery in Data beyond the Computer 2 by Visualization Lift

More information

Learning outcomes. Knowledge and understanding. Competence and skills

Learning outcomes. Knowledge and understanding. Competence and skills Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges

More information

Sunnie Chung. Cleveland State University

Sunnie Chung. Cleveland State University Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:

More information

Make Better Decisions Through Predictive Intelligence

Make Better Decisions Through Predictive Intelligence IBM SPSS Modeler Professional Make Better Decisions Through Predictive Intelligence Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly

More information

Data Mining for Business Analytics

Data Mining for Business Analytics Data Mining for Business Analytics Lecture 2: Introduction to Predictive Modeling Stern School of Business New York University Spring 2014 MegaTelCo: Predicting Customer Churn You just landed a great analytical

More information

Course Syllabus. Purposes of Course:

Course Syllabus. Purposes of Course: Course Syllabus Eco 5385.701 Predictive Analytics for Economists Summer 2014 TTh 6:00 8:50 pm and Sat. 12:00 2:50 pm First Day of Class: Tuesday, June 3 Last Day of Class: Tuesday, July 1 251 Maguire Building

More information

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms

Data Mining is sometimes referred to as KDD and DM and KDD tend to be used as synonyms Data Mining Techniques forcrm Data Mining The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets. Extremely large datasets Discovery of the non-obvious Useful knowledge

More information

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.

DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM. DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,

More information

Banking Analytics Training Program

Banking Analytics Training Program Training (BAT) is a set of courses and workshops developed by Cognitro Analytics team designed to assist banks in making smarter lending, marketing and credit decisions. Analyze Data, Discover Information,

More information

CUSTOMER RELATIONSHIP MANAGEMENT (CRM) CII Institute of Logistics

CUSTOMER RELATIONSHIP MANAGEMENT (CRM) CII Institute of Logistics CUSTOMER RELATIONSHIP MANAGEMENT (CRM) CII Institute of Logistics Session map Session1 Session 2 Introduction The new focus on customer loyalty CRM and Business Intelligence CRM Marketing initiatives Session

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Achieve Better Insight and Prediction with Data Mining

Achieve Better Insight and Prediction with Data Mining Clementine 12.0 Specifications Achieve Better Insight and Prediction with Data Mining Data mining provides organizations with a clearer view of current conditions and deeper insight into future events.

More information

IBM SPSS Direct Marketing 19

IBM SPSS Direct Marketing 19 IBM SPSS Direct Marketing 19 Note: Before using this information and the product it supports, read the general information under Notices on p. 105. This document contains proprietary information of SPSS

More information

KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE

KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE KNOWLEDGE BASE DATA MINING FOR BUSINESS INTELLIGENCE Dr. Ruchira Bhargava 1 and Yogesh Kumar Jakhar 2 1 Associate Professor, Department of Computer Science, Shri JagdishPrasad Jhabarmal Tibrewala University,

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

8. Machine Learning Applied Artificial Intelligence

8. Machine Learning Applied Artificial Intelligence 8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name

More information

Using multiple models: Bagging, Boosting, Ensembles, Forests

Using multiple models: Bagging, Boosting, Ensembles, Forests Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or

More information