TDWI Best Practice BI & DW Predictive Analytics & Data Mining

Course Length: 9am to 5pm, 2 consecutive days
2012 Dates:
  Sydney: July 30 & 31
  Melbourne: August 2 & 3
  Canberra: August 6 & 7
Venue & Cost: Visit C3 Education at www.c3businesssolutions.com
Inclusions: Morning tea, lunch & afternoon tea each day
Course Outline: Course workbook & presentation notes

This two-day, hands-on, methodical workshop offers a comprehensive project-level orientation to:
1. Predictive analytics solutions, from project assessment and preparation to industry-standard process, case demonstrations, pragmatic exercises, and model lifecycle management.
2. Data mining methods and process at the tactical level.

Day 1: Predictive analytics. Low-risk strategies for high-impact projects

If you are looking for an intensive, vendor-neutral and non-promotional introduction to data mining best practices and an approach to predictive analytics that is critical to modeling success, then this course is designed for you. Attendees will actively step through the industry-standard process for data mining and see why an advanced degree in statistics, mathematics, or computer science is not required to establish a productive internal predictive analytics practice. Live working sessions reveal real-world obstacles and breakthroughs from which to interpret, learn, and apply.

You Will Learn
Process, principles and terminology for predictive analytics
Who is utilizing predictive analytics, and why
Common project pitfalls and how to avoid them
Project performance and maintenance issues
How to define business objectives for a decision-support system
Hands-on exposure to the natural messiness of data mining
How to get started

Day 2: Data mining methods & techniques.
Data preparation, model-building and evaluation

Attendees will observe demonstrations of machine learning methods and computer-guided analytical techniques for extracting and interpreting complex patterns and relationships from large volumes of data. If you desire an intensive functional orientation to data mining concepts, tools, techniques, and supporting methods, this session is designed for you.
This vendor-neutral course broadly covers data-driven information discovery techniques and model-building tactics free of bias to any particular modeling tool or method. Popular open source and commercial packages are leveraged to illustrate methods, but not to showcase the tools.

You Will Learn
The data mining process and general implementation
How to prepare raw data and benefit from visualization
Key data mining methods and how they compare
How to validate models and assess their value
Data mining product selection
Solution integration, ongoing performance, and maintenance
Where to begin and how to obtain resources and support

Ideal for
IT professionals who wish to expand their business intelligence skills
IT/IS executives and managers: CIOs, CKOs, CTOs, technical directors
Project leaders who must extract value from their data
Line of business executives and functional managers, analysts and forecasters
Decision support system architects
Technology planners who survey emerging technologies in order to prioritize corporate investment
Consultants requiring competency in data mining and related emerging information technologies

Presenter
Thomas A. (Tony) Rathburn
Senior Consultant, The Modeling Agency

Tony Rathburn has more than 20 years of experience in the business utilization of predictive analytics technologies. Mr. Rathburn taught MIS and statistics while an instructor in the College of Business at Kent State University. He also served as vice president of applied technologies for NeuralWare, Incorporated, a neural network tools and consulting company. Mr. Rathburn is a senior consultant with The Modeling Agency, a Pennsylvania company that provides guidance and results for those who are data rich, yet information poor.

Registration
Please register your interest on the Education page to secure your place and receive date confirmation notifications.
About TDWI
TDWI, a division of 1105 Media, is the premier provider of in-depth, high-quality education and research in the business intelligence and data warehousing industry. Starting in 1995 with a single conference, TDWI is now a comprehensive resource for industry information and professional development opportunities. TDWI sponsors and promotes quarterly World Conferences, topical seminars, onsite education, a worldwide Membership program, business intelligence certification, resourceful publications, industry news, an in-depth research program, and a comprehensive website, www.tdwi.org.
Course Detail: TDWI Best Practice BI & DW Predictive Analytics & Data Mining

Day 1: Predictive analytics. Low-risk strategies for high-impact projects

Core Concepts
Beyond Traditional Statistics
  o Assumptions of Traditional Statistics
  o Shift your thinking
Behaviours of Interest
Goal of Modeling
Modeling Human Behaviour
Components of Mathematical Models
Uses of Formulas
Winning at the game we call business
Attributes of a game
Project Success Survey
Predictive Analytics ROI Survey
Predictive Analytics Business Goals
Analytic Goals
Why Predictive Analytics?
Lab 1: Introduction

Types of Models
Response
Risk
Attrition
Activation
Cross-Sell and Up-Sell
Profile Analysis
Segmentation
Net Present Value
Lifetime Value
Why Predictive Analytics?
Low-Risk / High-ROI Project Design
Low-Risk / High-ROI Projects
Phased Development Cycle
Positive Impact Behaviour Modeling
Negative Impact Behaviour Modeling
Conflict Resolution Modeling
Ranking Across the Continuum
Dimensionality Enhancement
Refining Precision
Forecasting
Lab 2: Opportunity Conceptualization

The CRISP-DM Process Model
Environment Development
Data Sandbox
CRISP Development
  o Business Understanding
  o Relationship Solution Space
  o Determine Modeling Objectives
  o Data Understanding
  o Data Preparation
  o Modeling
      A Sampling of Commercial Data Mining Software Products
  o Validation
  o Implementation
Predictive Analytics is Analysis not Engineering
Business Understanding
Project Team
Performance Metrics
  o Determine Business Objectives
  o Conversion: Objectives to Metrics
  o Handling Multiple Metrics
  o Lift and Gains Chart Interpretation
  o Custom Performance Charts
  o Enhancing Performance with Threshold Evaluation
  o Calculating the Current Baseline: Uplift Analysis
Lab 3: Performance Metrics

Modeling Objectives
  o The Case for Classification
  o Prioritize the Dependent Variable
  o Precision Requirements
  o Training for what should be done not what was done
  o Confirming Compatibility
  o Defining Modeling Objectives
  o Resource Availability
Experimental Design
  o How Much Data is Needed to Develop a Model?
  o Training Data for Classification
  o Training Data for Prediction
  o How Many Variables?
  o Purpose of Experimental Design
  o Experimental Design: Statistics vs. Predictive Analytics
  o Data Sets Used
  o Type of Data Distribution
Lab 4: Experimental Design & Data Sandbox Construction

Data Understanding
Data Set Determination
Availability
Requirements Planning
Data Quality Issues
  o Data Errors
  o Outliers
  o Missing Data
Data Types: Behavioural Characteristics
  o Demographic Data
  o Behavioural Data
  o Psychographic Data
Data Types: Mathematical Characteristics
  o Qualitative Variables
      Categorical Data
      Nominal Data
  o Quantitative Variables
      Ordinal Data
      Interval Data
      Continuous Data
Lab 5: Data Understanding

Data Preparation
Data Representation Expectations
  o Natural Values
  o Binning
  o Bin Boundary Determination
  o Open Ended Ranges
  o Collapsed Sets
  o 1-of-N Representations
  o Thermometer Representation
  o Bipolar Representation
  o Fuzzy Boundaries
  o Multiple Boundary Strategies
  o Controlling Error
Data Transformation Expectations
  o Conversion to Linear
  o Converting the Shape of the Distribution
  o Ratios
  o Roll-ups
  o Domain Specific Transformations
Data Resource Consumption Considerations
Data Extraction for Replicability
Lab 6: Data Preparation

Modeling
Matching Techniques to the Project Goals
  o Classification Modeling Techniques
  o Forecasting Modeling Techniques
Variable Selection
Candidate Model Evaluation
Lab 7: Model Development

Validation & Evaluation
Lab 8: Model Evaluation & Validation

Deployment
End User Interface
Model Run Cycle
Model Maintenance
Summary and Next Steps
Formal Project Assessment
  o Business Understanding
  o Data Understanding
  o Report of Findings & Recommendations
Resources

Day 2: Data mining methods & techniques. Data preparation, model-building and evaluation

Introduction
Why build predictive models?
Why use advanced technologies?
Why project definition is critical
Why standard statistical analysis is not enough
Shifting our focus

Data Understanding and Preparation
The Data Sandbox
  o The opportunity development space
  o Supporting phased development efforts
  o Capturing extract detail
  o Documenting representation and transformation options
  o Data sets developed
  o Training
  o Test
  o Validation
Data Types: Content attributes
  o Demographic variables
  o Psychographic variables
  o Behavioural variables
  o RFM variables
Data Types: Analytic Attributes
  o Qualitative variables
      Categorical variables
      Nominal variables
  o Quantitative variables
      Ordinal variables
      Interval variables
      Continuous variables
Data Errors
  o Identification of data errors
  o Treatment of data errors
Outliers
  o Identification of outliers
  o Treatment of outliers
  o How PA differs from traditional statistics
      Traditional statistics works
  o Ensemble solution development
  o Multiple models for different areas of the solution space
  o What works where
Basic statistics in the general perspective (demo)
Data representation strategies (demo)
Data transformation strategies (demo)

Model Development & Evaluation
Solution Types
Supervised training techniques
  o Classification
      Single tail models
      Two-tailed models
      Ranking across the continuum
  o Forecasting models
  o Related behaviour modeling
Unsupervised training techniques
  o Segmentation and clustering
Candidate Model Development
Technique Overview
Linear regression
Logistic regression
Decision trees
Clustering & segmentation
Neural networks
Evaluating Performance
Analytic metric evaluation
Business metric evaluation
Model Selection
Model Validation
Phase 1 Positive Benefit Model (demo)
Phase 2 Negative impact model (demo)
False positive reduction (demo)

Summary and Next Steps
The Complementary Strategic Course
Resources