CRISP-DM: The life cicle of a data mining project. KDD Process



Similar documents
CRISP-DM, which stands for Cross-Industry Standard Process for Data Mining, is an industry-proven way to guide your data mining efforts.

Introduction to Data Mining

An Overview of Predictive Analytics for Practitioners. Dean Abbott, Abbott Analytics

CS590D: Data Mining Chris Clifton

CRISP - DM. Data Mining Process. Process Standardization. Why Should There be a Standard Process? Cross-Industry Standard Process for Data Mining

Performing a data mining tool evaluation

Predictive Analytics Workshop With IBM SPSS Modeler

CRISP-DM: Towards a Standard Process Model for Data Mining

CRISP-DM 1.0. Step-by-step data mining guide

Introduction to Data Mining

Step-by-step data mining guide

Database Marketing, Business Intelligence and Knowledge Discovery

SEER for Software - Going Beyond Out of the Box. David DeWitt Director of Software and IT Consulting

EXTENDING UML FOR MODELLING OF DATA MINING CASES

APPENDIX E THE ASSESSMENT PHASE OF THE DATA LIFE CYCLE

IBM SPSS Modeler CRISP-DM Guide

Customer Analysis - Customer analysis is done by analyzing the customer's buying preferences, buying time, budget cycles, etc.

ALACC Frequently Asked Questions (FAQs)

Data Mining. Dr. Saed Sayad. University of Toronto

CDC UNIFIED PROCESS PRACTICES GUIDE

MEDICAL LABORATORY TECHNICIAN COMPETENCY PROFILE

Project Management Concepts and Strategies

Quality System: Design Control Procedure - Appendix

This alignment chart was designed specifically for the use of Red River College. These alignments have not been verified or endorsed by the IIBA.

Data Mining with Microsoft SQL Server 2005

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone:

Business Intelligence and Decision Support Systems

The Software Development Life Cycle: An Overview. Last Time. Session 8: Security and Evaluation. Information Systems Security Engineering

An Introduction to the ECSS Software Standards

TASPO-ATS-L System Test Plan

MilsVPN VPN Tunnel Port Translation. Table of Contents Introduction VPN Tunnel Settings...2

Project Management Plan for

An Introduction to Advanced Analytics and Data Mining

QUALITY MANAGEMENT SYSTEM REQUIREMENTS General Requirements. Documentation Requirements. General. Quality Manual. Control of Documents

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

Making Business Rules operational. Knut Hinkelmann

Project Implementation Process (PIP)

Customer and Business Analytic

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

Practice Overview. REQUIREMENTS DEFINITION Issue Date: <mm/dd/yyyy> Revision Date: <mm/dd/yyyy>

Process for Sales Projection Process Documentation Template: Description Sales Projection (Sales Forecasting) Process

Engineering a EIA - 632

EXHIBIT L. Application Development Processes

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

ExCPT Certified Pharmacy Technician (CPhT) Detailed Test Plan* 100 scored items, 20 pretest items Exam time: 2 hours 10 minutes

VAIL-Plant Asset Integrity Management System. Software Development Process

THE THREE "Rs" OF PREDICTIVE ANALYTICS

Overview. Data Mining Algorithms. Going Through Loops. The CRISP-DM Model. Six Steps. Business Understanding. Data Mining End to End.

XTRADE XFR Financial Limited CIF 108/10

Conclusion and Future Directions

ASSET MANAGEMENT PLANNING PROCESS

Hazard Analysis and Critical Control Points (HACCP) 1 Overview

Tom Khabaza. Hard Hats for Data Miners: Myths and Pitfalls of Data Mining

MOTION NO. M Contract for Transit Scheduling Software PROPOSED ACTION

72. Ontology Driven Knowledge Discovery Process: a proposal to integrate Ontology Engineering and KDD

WHITE PAPER ON TECHNICAL DEVELOPMENT APPROACH FOR SAP IMPLEMENTION PROJECTS

American Society of Heating Refrigeration and Air Conditioning Engineers (ASHRAE) Procedures for Commercial Building Energy Audits (2004)

System Development and Life-Cycle Management (SDLCM) Methodology. Approval CISSCO Program Director

Morris Animal Foundation

UL Qualified Firestop Contractor Program Management System Elements. March 13, 2013

Summary of GAO Cost Estimate Development Best Practices and GAO Cost Estimate Audit Criteria

IT Project Management Methodology. Project Scope Management Support Guide

5.4 Solving Percent Problems Using the Percent Equation

Planning successful data mining projects

TDWI Best Practice BI & DW Predictive Analytics & Data Mining

Information Technology Project Management

AUDITING A BCP PLAN. Thomas Bronack Auditing a BCP Plan presentation Page: 1

MOF MSF. Unitek. Microsoft Operations Framework. Microsoft Solutions Framework. Train. Certify. Succeed.

Software Test Plan (STP) Template

Advanced Analytics. The Way Forward for Businesses. Dr. Sujatha R Upadhyaya

EHR Standards Landscape

Three proven methods to achieve a higher ROI from data mining

BECS Pre-Trade Analytics. An Overview

Systems Development Life Cycle (SDLC)

4 Testing General and Automated Controls

Machine Learning and Data Mining. Fundamentals, robotics, recognition

TDWI Project Management for Business Intelligence

The Application Readiness Level Metric

Product Accounting & Reporting Standard ICT Sector Guidance Document

DATA ANALYSIS USING BUSINESS INTELLIGENCE TOOL. A Thesis. Presented to the. Faculty of. San Diego State University. In Partial Fulfillment

Il Project Cycle Management :A Technical Guide The Logical Framework Approach

Transcription:

CRISP-DM: The life cicle of a data mining project KDD Process

Business understanding the project objectives and requirements from a business perspective. then converting this knowledge into a data mining problem definition and a preliminary plan. Determine the Business Objectives Determine requirements for Business Objectives Translate Business questions into Mining Objective

Business Preparation Modeling Evaluation Deployment Determine Business Objective Background Business Objective Business Success Criteria Assess Situation Inventory of Resources Requirements Assumptions Constraints Risk and Contingencies Terminology Costs & Benefits Determine Mining Goals Mining Goals Mining Success Criteria Produce Project Plan Project Plan Assessment Of Tools and Techiniques

understanding understanding: characterize data available for modelling. Provide assessment and verification for data.

Business Preparation Modeling Evaluation Deployment Collect Initial Initial Collection Describe Description Explore Exploration Verify Quality Quality

Business Preparation Modeling Evaluation Deployment Select Rationale for Inclusion Exclusion Format Reformatted Clean Cleaning Resulting set Description Construct Derived Attributes Generated Records Integrate Merged

Modeling: In this phase, various modeling techniques are selected and applied and their parameters are calibrated to optimal values. Typically, there are several techniques for the same data mining problem type. Some techniques have specific requirements on the form of data. Therefore, stepping back to the data preparation phase is often necessary.

Business Preparation Modeling Evaluation Deployment Selecting Modeling Technique Modeling Technique Modeling Assumptions Generate Test Design Test Design Build Model Parameter Setting Models Model Description Assess Model Model Assessment Revised Parameter Setting

Evaluation At this stage in the project you have built a model (or models) that appears to have high quality from a data analysis perspective. Evaluate the model and review the steps executed to construct the model to be certain it properly achieves the business objectives. A key objective is to determine if there is some important business issue that has not been sufficiently considered.

Business Preparation Modeling Evaluation Deployment Evaluate Results Assessment Of DMining Results Approved Models Review Process Review of Process Determining Next Steps List of Possible Actions Decisions

Deployment: The knowledge gained will need to be organized and presented in a way that the customer can use it. It often involves applying live models within an organization s decision making processes, for example in real time personalization of Web pages or repeated scoring of marketing databases.

Deployment: It can be as simple as generating a report or as complex as implementing a repeatable data mining process across the enterprise. In many cases it is the customer, not the data analyst, who carries out the deployment steps.

Business Preparation Modeling Evaluation Deployment Plan Deployment Deployment Plan Plan onitoring and Maintenance Produce Final Monitoring and Maintenance Plan Final Final Presentation Review Project Experience Documentation