CRISP-DM: The life cycle of a data mining project. KDD Process
- Cecily Hunter
- 9 years ago
Transcription
1 CRISP-DM: The life cycle of a data mining project. KDD Process
2 Business understanding: understanding the project objectives and requirements from a business perspective, then converting this knowledge into a data mining problem definition and a preliminary plan.
- Determine the business objectives
- Determine requirements for the business objectives
- Translate business questions into data mining objectives
3 Phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Business Understanding tasks and outputs:
- Determine Business Objectives: Background; Business Objectives; Business Success Criteria
- Assess Situation: Inventory of Resources; Requirements, Assumptions and Constraints; Risks and Contingencies; Terminology; Costs and Benefits
- Determine Data Mining Goals: Data Mining Goals; Data Mining Success Criteria
- Produce Project Plan: Project Plan; Assessment of Tools and Techniques
4 Data understanding: characterize the data available for modeling. Provide assessment and verification of the data.
5 Data Understanding tasks and outputs:
- Collect Initial Data: Initial Data Collection Report
- Describe Data: Data Description Report
- Explore Data: Data Exploration Report
- Verify Data Quality: Data Quality Report
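As an illustration of the describe and verify-quality tasks, the sketch below builds a very small data quality report in plain Python. The field names (`age`, `city`), the sample rows, and the function name `data_quality_report` are hypothetical and are not part of CRISP-DM; a real project would normally use its own tooling and far richer checks.

```python
from typing import Any, Dict, List


def data_quality_report(
    records: List[Dict[str, Any]], fields: List[str]
) -> Dict[str, Dict[str, int]]:
    """Summarize each field: row count, missing values, distinct values."""
    report: Dict[str, Dict[str, int]] = {}
    for field in fields:
        # A field counts as missing when it is absent or explicitly None.
        values = [record.get(field) for record in records]
        present = [v for v in values if v is not None]
        report[field] = {
            "count": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
    return report


# Hypothetical sample data, for illustration only.
rows = [
    {"age": 34, "city": "Ulm"},
    {"age": None, "city": "Ulm"},
    {"age": 51, "city": None},
    {"age": 34},
]

report = data_quality_report(rows, ["age", "city"])
for field, stats in report.items():
    print(field, stats)
```

A report like this only flags where the data is incomplete; deciding whether the gaps matter is still a judgment made against the business objectives.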
6 Data Preparation tasks and outputs:
- Select Data: Rationale for Inclusion/Exclusion
- Format Data: Reformatted Data
- Clean Data: Data Cleaning Report
- Resulting Data Set: Data Set Description
- Construct Data: Derived Attributes; Generated Records
- Integrate Data: Merged Data
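The clean, construct, and integrate tasks can be sketched as three small functions chained together. This is a minimal sketch under stated assumptions: the record layout (`customer_id`, `amount`, `segment`), the derived attribute `is_large`, and the threshold of 100 are all hypothetical choices made for the example.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def clean(records: List[Record]) -> List[Record]:
    """Clean: drop records whose 'amount' is missing."""
    return [r for r in records if r.get("amount") is not None]


def construct(records: List[Record]) -> List[Record]:
    """Construct: add a derived attribute 'is_large' (threshold is an assumption)."""
    return [{**r, "is_large": r["amount"] >= 100} for r in records]


def integrate(orders: List[Record], customers: List[Record]) -> List[Record]:
    """Integrate: merge customer attributes into each order by customer_id."""
    by_id = {c["customer_id"]: c for c in customers}
    merged = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is not None:  # keep only orders with a known customer
            merged.append({**customer, **order})
    return merged


# Hypothetical sample data, for illustration only.
orders = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 2, "amount": None},
    {"customer_id": 3, "amount": 40.0},
]
customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 3, "segment": "business"},
]

prepared = integrate(construct(clean(orders)), customers)
for row in prepared:
    print(row)
```

Keeping each task as a separate function makes it easier to record what was excluded and why, which is what the rationale and cleaning reports are meant to capture.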
7 Modeling: In this phase, various modeling techniques are selected and applied and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, stepping back to the data preparation phase is often necessary.
8 Modeling tasks and outputs:
- Select Modeling Technique: Modeling Technique; Modeling Assumptions
- Generate Test Design: Test Design
- Build Model: Parameter Settings; Models; Model Description
- Assess Model: Model Assessment; Revised Parameter Settings
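The modeling tasks (test design, parameter setting, model assessment) can be illustrated with a deliberately tiny example. The sketch below assumes a one-feature threshold classifier, a fixed hold-out split, and a hand-picked list of candidate thresholds; none of these choices come from CRISP-DM itself, and the data is invented.

```python
from typing import List, Tuple

Record = Tuple[float, int]  # (feature value, class label 0 or 1)


def split_holdout(
    records: List[Record], test_every: int = 3
) -> Tuple[List[Record], List[Record]]:
    """Test design: every `test_every`-th record is held out for testing."""
    train = [r for i, r in enumerate(records) if i % test_every != 0]
    test = [r for i, r in enumerate(records) if i % test_every == 0]
    return train, test


def accuracy(threshold: float, records: List[Record]) -> float:
    """Assess: share of records where 'x >= threshold' predicts the label."""
    correct = sum(1 for x, y in records if (1 if x >= threshold else 0) == y)
    return correct / len(records)


def calibrate_threshold(train: List[Record], candidates: List[float]) -> float:
    """Parameter setting: pick the candidate with the best training accuracy."""
    return max(candidates, key=lambda t: accuracy(t, train))


# Hypothetical sample data, for illustration only.
records = [
    (1.0, 0), (2.0, 0), (3.0, 0), (4.0, 0), (6.0, 1),
    (7.0, 1), (8.0, 1), (9.0, 1), (5.5, 1),
]

train, test = split_holdout(records)
best = calibrate_threshold(train, [2.5, 5.0, 7.5])
print("chosen threshold:", best)
print("hold-out accuracy:", accuracy(best, test))
```

The parameter is calibrated on the training records only, and the hold-out records are used once for assessment; if the assessment is poor, the parameter candidates or the prepared data are revised and the loop is repeated.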
9 Evaluation: At this stage in the project you have built a model (or models) that appears to have high quality from a data analysis perspective. Evaluate the model and review the steps executed to construct the model to be certain it properly achieves the business objectives. A key objective is to determine if there is some important business issue that has not been sufficiently considered.
10 Evaluation tasks and outputs:
- Evaluate Results: Assessment of Data Mining Results; Approved Models
- Review Process: Review of Process
- Determine Next Steps: List of Possible Actions; Decisions
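One way to picture the evaluate-results task is as a check of model results against the business success criteria set during business understanding. The sketch below is a simplification under stated assumptions: the metric names (`accuracy`, `uplift`), the minimum values, and the two outcomes (`approve`, `revise`) are hypothetical, and real evaluations also involve qualitative review.

```python
from typing import Dict, List, Tuple


def evaluate_results(
    results: Dict[str, float], criteria: Dict[str, float]
) -> Tuple[str, List[str]]:
    """Compare measured results with minimum values from the success criteria.

    Returns a decision and the list of criteria that were not met.
    A metric that was never measured counts as not met.
    """
    failures = [
        name
        for name, minimum in criteria.items()
        if results.get(name, float("-inf")) < minimum
    ]
    decision = "approve" if not failures else "revise"
    return decision, failures


# Hypothetical metrics and thresholds, for illustration only.
results = {"accuracy": 0.91, "uplift": 0.03}
criteria = {"accuracy": 0.85, "uplift": 0.05}

decision, failures = evaluate_results(results, criteria)
print(decision, failures)
```

A "revise" outcome feeds the list of possible actions: return to an earlier phase, adjust the criteria with the business owner, or stop the project.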
11 Deployment: The knowledge gained will need to be organized and presented in a way that the customer can use. It often involves applying live models within an organization's decision-making processes, for example in real-time personalization of Web pages or repeated scoring of marketing databases.
12 Deployment: It can be as simple as generating a report or as complex as implementing a repeatable data mining process across the enterprise. In many cases it is the customer, not the data analyst, who carries out the deployment steps.
13 Deployment tasks and outputs:
- Plan Deployment: Deployment Plan
- Plan Monitoring and Maintenance: Monitoring and Maintenance Plan
- Produce Final Report: Final Report; Final Presentation
- Review Project: Experience Documentation
