CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

Size: px
Start display at page:

Download "CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19"

Transcription

1 PREFACE xi 1 INTRODUCTION Overview Definition Preparation Overview Accessing Tabular Data Accessing Unstructured Data Understanding the Variables and Observations Data Cleaning Transformation Variable Reduction Segmentation Preparing Data to Apply Analysis Data Mining Tasks Optimization Evaluation Model Forensics Deployment Outline of Book Overview Data Visualization Clustering Predictive Analytics Applications Software Summary Further Reading 17 2 DATA VISUALIZATION Overview Visualization Design Principles General Principles Graphics Design Anatomy of a Graph 28 v

2 vi CONTENTS 2.3 Tables Simple Tables Summary Tables Two-Way Contingency Tables Supertables Univariate Data Visualization Bar Chart Histograms Frequency Polygram Box Plots Dot Plot Stem-and-Leaf Plot Quantile Plot Quantile Quantile Plot Bivariate Data Visualization Scatterplot Multivariate Data Visualization Histogram Matrix Scatterplot Matrix Multiple Box Plot Trellis Plot Visualizing Groups Dendrograms Decision Trees Cluster Image Maps Dynamic Techniques Overview Data Brushing Nearness Selection Sorting and Rearranging Searching and Filtering Summary Further Reading 66 3 CLUSTERING Overview Distance Measures Overview Numeric Distance Measures Binary Distance Measures Mixed Variables Other Measures Agglomerative Hierarchical Clustering Overview Single Linkage Complete Linkage Average Linkage Other Methods Selecting Groups 96

3 vii 3.4 Partitioned-Based Clustering Overview k-means Worked Example Miscellaneous Partitioned-Based Clustering Fuzzy Clustering Overview Fuzzy k-means Worked Examples Summary Further Reading PREDICTIVE ANALYTICS Overview Predictive Modeling Testing Model Accuracy Evaluating Regression Models Predictive Accuracy Evaluating Classification Models Predictive Accuracy Evaluating Binary Models Predictive Accuracy ROC Charts Lift Chart Principal Component Analysis Overview Principal Components Generating Principal Components Interpretation of Principal Components Multiple Linear Regression Overview Generating Models Prediction Analysis of Residuals Standard Error Coefficient of Multiple Determination Testing the Model Significance Selecting and Transforming Variables Discriminant Analysis Overview Discriminant Function Discriminant Analysis Example Logistic Regression Overview Logistic Regression Formula Estimating Coefficients Assessing and Optimizing Results Naive Bayes Classifiers Overview Bayes Theorem and the Independence Assumption Independence Assumption Classification Process 159

4 viii CONTENTS 4.7 Summary Further Reading APPLICATIONS Overview Sales and Marketing Industry-Specific Data Mining Finance Insurance Retail Telecommunications Manufacturing Entertainment Government Pharmaceuticals Healthcare microrna Data Analysis Case Study Defining the Problem Preparing the Data Analysis Credit Scoring Case Study Defining the Problem Preparing the Data Analysis Deployment Data Mining Nontabular Data Overview Data Mining Chemical Data Data Mining Text Further Reading 213 APPENDIX A MATRICES 215 A.1 Overview of Matrices 215 A.2 Matrix Addition 215 A.3 Matrix Multiplication 216 A.4 Transpose of a Matrix 217 A.5 Inverse of a Matrix 217 APPENDIX B SOFTWARE 219 B.1 Software Overview 219 B.1.1 Software Objectives 219 B.1.2 Access and Installation 221 B.1.3 User Interface Overview 221 B.2 Data Preparation 223 B.2.1 Overview 223 B.2.2 Reading in Data 224 B.2.3 Searching the Data 225

5 ix B.2.4 Variable Characterization 227 B.2.5 Removing Observations and Variables 228 B.2.6 Cleaning the Data 228 B.2.7 Transforming the Data 230 B.2.8 Segmentation 235 B.2.9 Principal Component Analysis 236 B.3 Tables and Graphs 238 B.3.1 Overview 238 B.3.2 Contingency Tables 239 B.3.3 Summary Tables 240 B.3.4 Graphs 242 B.3.5 Graph Matrices 246 B.4 Statistics 246 B.4.1 Overview 246 B.4.2 Descriptive Statistics 248 B.4.3 Confidence Intervals 248 B.4.4 Hypothesis Tests 249 B.4.5 Chi-Square Test 250 B.4.6 ANOVA 251 B.4.7 Comparative Statistics 251 B.5 Grouping 253 B.5.1 Overview 253 B.5.2 Clustering 254 B.5.3 Associative Rules 257 B.5.4 Decision Trees 258 B.6 Prediction 261 B.6.1 Overview 261 B.6.2 Linear Regression 263 B.6.3 Discriminant Analysis 265 B.6.4 Logistic Regression 266 B.6.5 Naive Bayes 267 B.6.6 knn 269 B.6.7 CART 269 B.6.8 Neural Networks 270 B.6.9 Apply Model 271 BIBLIOGRAPHY 273 INDEX 279

Customer and Business Analytic

Customer and Business Analytic Customer and Business Analytic Applied Data Mining for Business Decision Making Using R Daniel S. Putler Robert E. Krider CRC Press Taylor &. Francis Group Boca Raton London New York CRC Press is an imprint

More information

Exploratory Data Analysis with MATLAB

Exploratory Data Analysis with MATLAB Computer Science and Data Analysis Series Exploratory Data Analysis with MATLAB Second Edition Wendy L Martinez Angel R. Martinez Jeffrey L. Solka ( r ec) CRC Press VV J Taylor & Francis Group Boca Raton

More information

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat

WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition

Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd Edition Brochure More information from http://www.researchandmarkets.com/reports/2170926/ Data Mining for Business Intelligence. Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner. 2nd

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Business Analytics. Methods, Models, and Decisions. James R. Evans : University of Cincinnati PEARSON

Business Analytics. Methods, Models, and Decisions. James R. Evans : University of Cincinnati PEARSON Business Analytics Methods, Models, and Decisions James R. Evans : University of Cincinnati PEARSON Boston Columbus Indianapolis New York San Francisco Upper Saddle River Amsterdam Cape Town Dubai London

More information

Data Mining. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/

Data Mining. Dr. Saed Sayad. University of Toronto 2010 saed.sayad@utoronto.ca. http://chem-eng.utoronto.ca/~datamining/ Data Mining Dr. Saed Sayad University of Toronto 2010 saed.sayad@utoronto.ca http://chem-eng.utoronto.ca/~datamining/ 1 Data Mining Data mining is about explaining the past and predicting the future by

More information

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

More information

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING

Practical Applications of DATA MINING. Sang C Suh Texas A&M University Commerce JONES & BARTLETT LEARNING Practical Applications of DATA MINING Sang C Suh Texas A&M University Commerce r 3 JONES & BARTLETT LEARNING Contents Preface xi Foreword by Murat M.Tanik xvii Foreword by John Kocur xix Chapter 1 Introduction

More information

Azure Machine Learning, SQL Data Mining and R

Azure Machine Learning, SQL Data Mining and R Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:

More information

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING

More information

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p.

Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. Introduction p. xvii Introduction to Big Data Analytics p. 1 Big Data Overview p. 2 Data Structures p. 5 Analyst Perspective on Data Repositories p. 9 State of the Practice in Analytics p. 11 BI Versus

More information

A fast, powerful data mining workbench designed for small to midsize organizations

A fast, powerful data mining workbench designed for small to midsize organizations FACT SHEET SAS Desktop Data Mining for Midsize Business A fast, powerful data mining workbench designed for small to midsize organizations What does SAS Desktop Data Mining for Midsize Business do? Business

More information

Alabama Department of Postsecondary Education

Alabama Department of Postsecondary Education Date Adopted 1998 Dates reviewed 2007, 2011, 2013 Dates revised 2004, 2008, 2011, 2013, 2015 Alabama Department of Postsecondary Education Representing Alabama s Public Two-Year College System Jefferson

More information

Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier

Data Mining: Concepts and Techniques. Jiawei Han. Micheline Kamber. Simon Fräser University К MORGAN KAUFMANN PUBLISHERS. AN IMPRINT OF Elsevier Data Mining: Concepts and Techniques Jiawei Han Micheline Kamber Simon Fräser University К MORGAN KAUFMANN PUBLISHERS AN IMPRINT OF Elsevier Contents Foreword Preface xix vii Chapter I Introduction I I.

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Neil H. Timm Applied Multivariate Analysis With 42 Figures Springer Contents Preface Acknowledgments List of Tables List of Figures vii ix xix xxiii 1 Introduction 1 1.1 Overview 1 1.2 Multivariate Models

More information

Advanced In-Database Analytics

Advanced In-Database Analytics Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??

More information

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics. Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

More information

Marketing Analytics. Data-Driven Techniques with Microsoft Excel

Marketing Analytics. Data-Driven Techniques with Microsoft Excel Brochure More information from http://www.researchandmarkets.com/reports/2638581/ Marketing Analytics. Data-Driven Techniques with Microsoft Excel Description: Helping tech-savvy marketers and data analysts

More information

Computer-Aided Multivariate Analysis

Computer-Aided Multivariate Analysis Computer-Aided Multivariate Analysis FOURTH EDITION Abdelmonem Af if i Virginia A. Clark and Susanne May CHAPMAN & HALL/CRC A CRC Press Company Boca Raton London New York Washington, D.C Contents Preface

More information

Principles of Data Mining by Hand&Mannila&Smyth

Principles of Data Mining by Hand&Mannila&Smyth Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs

Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs 1.1 Introduction Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs For brevity, the Lavastorm Analytics Library (LAL) Predictive and Statistical Analytics Node Pack will be

More information

Multivariate Statistical Inference and Applications

Multivariate Statistical Inference and Applications Multivariate Statistical Inference and Applications ALVIN C. RENCHER Department of Statistics Brigham Young University A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Weinheim

More information

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

The Visual Statistics System. ViSta

The Visual Statistics System. ViSta Visual Statistical Analysis using ViSta The Visual Statistics System Forrest W. Young & Carla M. Bann L.L. Thurstone Psychometric Laboratory University of North Carolina, Chapel Hill, NC USA 1 of 21 Outline

More information

Schneps, Leila; Colmez, Coralie. Math on Trial : How Numbers Get Used and Abused in the Courtroom. New York, NY, USA: Basic Books, 2013. p i.

Schneps, Leila; Colmez, Coralie. Math on Trial : How Numbers Get Used and Abused in the Courtroom. New York, NY, USA: Basic Books, 2013. p i. New York, NY, USA: Basic Books, 2013. p i. http://site.ebrary.com/lib/mcgill/doc?id=10665296&ppg=2 New York, NY, USA: Basic Books, 2013. p ii. http://site.ebrary.com/lib/mcgill/doc?id=10665296&ppg=3 New

More information

Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

More information

Data Mining and Visualization

Data Mining and Visualization Data Mining and Visualization Jeremy Walton NAG Ltd, Oxford Overview Data mining components Functionality Example application Quality control Visualization Use of 3D Example application Market research

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments

COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for

More information

Instructions for SPSS 21

Instructions for SPSS 21 1 Instructions for SPSS 21 1 Introduction... 2 1.1 Opening the SPSS program... 2 1.2 General... 2 2 Data inputting and processing... 2 2.1 Manual input and data processing... 2 2.2 Saving data... 3 2.3

More information

R Graphics Cookbook. Chang O'REILLY. Winston. Tokyo. Beijing Cambridge. Farnham Koln Sebastopol

R Graphics Cookbook. Chang O'REILLY. Winston. Tokyo. Beijing Cambridge. Farnham Koln Sebastopol R Graphics Cookbook Winston Chang Beijing Cambridge Farnham Koln Sebastopol O'REILLY Tokyo Table of Contents Preface ix 1. R Basics 1 1.1. Installing a Package 1 1.2. Loading a Package 2 1.3. Loading a

More information

The Data Mining Process

The Data Mining Process Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

Data Algorithms. Mahmoud Parsian. Tokyo O'REILLY. Beijing. Boston Farnham Sebastopol

Data Algorithms. Mahmoud Parsian. Tokyo O'REILLY. Beijing. Boston Farnham Sebastopol Data Algorithms Mahmoud Parsian Beijing Boston Farnham Sebastopol Tokyo O'REILLY Table of Contents Foreword xix Preface xxi 1. Secondary Sort: Introduction 1 Solutions to the Secondary Sort Problem 3 Implementation

More information

430 Statistics and Financial Mathematics for Business

430 Statistics and Financial Mathematics for Business Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions

More information

THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK

THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK THE CERTIFIED SIX SIGMA BLACK BELT HANDBOOK SECOND EDITION T. M. Kubiak Donald W. Benbow ASQ Quality Press Milwaukee, Wisconsin Table of Contents list of Figures and Tables Preface to the Second Edition

More information

Course Syllabus. Purposes of Course:

Course Syllabus. Purposes of Course: Course Syllabus Eco 5385.701 Predictive Analytics for Economists Summer 2014 TTh 6:00 8:50 pm and Sat. 12:00 2:50 pm First Day of Class: Tuesday, June 3 Last Day of Class: Tuesday, July 1 251 Maguire Building

More information

2015 Workshops for Professors

2015 Workshops for Professors SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market

More information

MTH 140 Statistics Videos

MTH 140 Statistics Videos MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

More information

Determining optimum insurance product portfolio through predictive analytics BADM Final Project Report

Determining optimum insurance product portfolio through predictive analytics BADM Final Project Report 2012 Determining optimum insurance product portfolio through predictive analytics BADM Final Project Report Dinesh Ganti(61310071), Gauri Singh(61310560), Ravi Shankar(61310210), Shouri Kamtala(61310215),

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

The Comparisons. Grade Levels Comparisons. Focal PSSM K-8. Points PSSM CCSS 9-12 PSSM CCSS. Color Coding Legend. Not Identified in the Grade Band

The Comparisons. Grade Levels Comparisons. Focal PSSM K-8. Points PSSM CCSS 9-12 PSSM CCSS. Color Coding Legend. Not Identified in the Grade Band Comparison of NCTM to Dr. Jim Bohan, Ed.D Intelligent Education, LLC Intel.educ@gmail.com The Comparisons Grade Levels Comparisons Focal K-8 Points 9-12 pre-k through 12 Instructional programs from prekindergarten

More information

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics

KATE GLEASON COLLEGE OF ENGINEERING. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM KATE GLEASON COLLEGE OF ENGINEERING John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE (KGCOE- CQAS- 747- Principles of

More information

UNIT 1: COLLECTING DATA

UNIT 1: COLLECTING DATA Core Probability and Statistics Probability and Statistics provides a curriculum focused on understanding key data analysis and probabilistic concepts, calculations, and relevance to real-world applications.

More information

MS1b Statistical Data Mining

MS1b Statistical Data Mining MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics

COLLEGE OF SCIENCE. John D. Hromi Center for Quality and Applied Statistics ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE John D. Hromi Center for Quality and Applied Statistics NEW (or REVISED) COURSE: COS-STAT-747 Principles of Statistical Data Mining

More information

Directions for using SPSS

Directions for using SPSS Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

More information

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP

Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP Improving the Performance of Data Mining Models with Data Preparation Using SAS Enterprise Miner Ricardo Galante, SAS Institute Brasil, São Paulo, SP ABSTRACT In data mining modelling, data preparation

More information

MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing!

MATH BOOK OF PROBLEMS SERIES. New from Pearson Custom Publishing! MATH BOOK OF PROBLEMS SERIES New from Pearson Custom Publishing! The Math Book of Problems Series is a database of math problems for the following courses: Pre-algebra Algebra Pre-calculus Calculus Statistics

More information

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014

Maximierung des Geschäftserfolgs durch SAP Predictive Analytics. Andreas Forster, May 2014 Maximierung des Geschäftserfolgs durch SAP Predictive Analytics Andreas Forster, May 2014 Legal Disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e e.edu.in http://www.insof LIST OF COURSES Essential Business Skills for a Data Scientist...

More information

What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling

What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling Lecturer : Dr Iris Yeung Room No : P7509 Tel No : 2788 8566 Email : msiris@cityu.edu.hk 1 Aims To introduce the basic concepts of data mining

More information

Quick Start. Creating a Scoring Application. RStat. Based on a Decision Tree Model

Quick Start. Creating a Scoring Application. RStat. Based on a Decision Tree Model Creating a Scoring Application Based on a Decision Tree Model This Quick Start guides you through creating a credit-scoring application in eight easy steps. Quick Start Century Corp., an electronics retailer,

More information

A Correlation of. to the. South Carolina Data Analysis and Probability Standards

A Correlation of. to the. South Carolina Data Analysis and Probability Standards A Correlation of to the South Carolina Data Analysis and Probability Standards INTRODUCTION This document demonstrates how Stats in Your World 2012 meets the indicators of the South Carolina Academic Standards

More information

lop Building Machine Learning Systems with Python en source

lop Building Machine Learning Systems with Python en source Building Machine Learning Systems with Python Master the art of machine learning with Python and build effective machine learning systems with this intensive handson guide Willi Richert Luis Pedro Coelho

More information

Predict Influencers in the Social Network

Predict Influencers in the Social Network Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons

More information

Data analysis process

Data analysis process Data analysis process Data collection and preparation Collect data Prepare codebook Set up structure of data Enter data Screen data for errors Exploration of data Descriptive Statistics Graphs Analysis

More information

How to Get More Value from Your Survey Data

How to Get More Value from Your Survey Data Technical report How to Get More Value from Your Survey Data Discover four advanced analysis techniques that make survey research more effective Table of contents Introduction..............................................................2

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

A Property & Casualty Insurance Predictive Modeling Process in SAS

A Property & Casualty Insurance Predictive Modeling Process in SAS Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing

More information

Analytics on Big Data

Analytics on Big Data Analytics on Big Data Riccardo Torlone Università Roma Tre Credits: Mohamed Eltabakh (WPI) Analytics The discovery and communication of meaningful patterns in data (Wikipedia) It relies on data analysis

More information

Week 1. Exploratory Data Analysis

Week 1. Exploratory Data Analysis Week 1 Exploratory Data Analysis Practicalities This course ST903 has students from both the MSc in Financial Mathematics and the MSc in Statistics. Two lectures and one seminar/tutorial per week. Exam

More information

From The Little SAS Book, Fifth Edition. Full book available for purchase here.

From The Little SAS Book, Fifth Edition. Full book available for purchase here. From The Little SAS Book, Fifth Edition. Full book available for purchase here. Acknowledgments ix Introducing SAS Software About This Book xi What s New xiv x Chapter 1 Getting Started Using SAS Software

More information

What is Data mining?

What is Data mining? STAT : DATA MIIG Javier Cabrera Fall Business Question Answer Business Question What is Data mining? Find Data Data Processing Extract Information Data Analysis Internal Databases Data Warehouses Internet

More information

Business Intelligence. Data Mining and Optimization for Decision Making

Business Intelligence. Data Mining and Optimization for Decision Making Brochure More information from http://www.researchandmarkets.com/reports/2325743/ Business Intelligence. Data Mining and Optimization for Decision Making Description: Business intelligence is a broad category

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

Big Ideas in Mathematics

Big Ideas in Mathematics Big Ideas in Mathematics which are important to all mathematics learning. (Adapted from the NCTM Curriculum Focal Points, 2006) The Mathematics Big Ideas are organized using the PA Mathematics Standards

More information

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

More information

Guidebook to R Graphics Using Microsoft Windows

Guidebook to R Graphics Using Microsoft Windows Brochure More information from http://www.researchandmarkets.com/reports/2162901/ Guidebook to R Graphics Using Microsoft Windows Description: Introduces the graphical capabilities of R to readers new

More information

Unit 1: Introduction to Quality Management

Unit 1: Introduction to Quality Management Unit 1: Introduction to Quality Management Definition & Dimensions of Quality Quality Control vs Quality Assurance Small-Q vs Big-Q & Evolution of Quality Movement Total Quality Management (TQM) & its

More information

Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics

Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics Brochure More information from http://www.researchandmarkets.com/reports/2171080/ Data Mining and Statistics for Decision Making. Wiley Series in Computational Statistics Description: Data Mining and Statistics

More information

Big Data Analytics and Optimization

Big Data Analytics and Optimization Big Data Analytics and Optimization C e r t i f i c a t e P r o g r a m i n E n g i n e e r i n g E x c e l l e n c e C e r t i f i c a t e P r o g r a m s i n A c c e l e r a t e d E n g i n e e r i n

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

MANAGEMENT STUDIES (MBA) DETAILED SYLLABUS FOR PART A & B PART A GENERAL PAPER ON TEACHING AND RESEARCH APTITUDE

MANAGEMENT STUDIES (MBA) DETAILED SYLLABUS FOR PART A & B PART A GENERAL PAPER ON TEACHING AND RESEARCH APTITUDE MANAGEMENT STUDIES (MBA) DETAILED SYLLABUS FOR PART A & B PART A GENERAL PAPER ON TEACHING AND RESEARCH APTITUDE PART-B I - Managerial Economics Nature and scope of Managerial Economics. Importance of

More information

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining

Data Mining Cluster Analysis: Basic Concepts and Algorithms. Lecture Notes for Chapter 8. Introduction to Data Mining Data Mining Cluster Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 8 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/8/2004 Hierarchical

More information

SPSS Explore procedure

SPSS Explore procedure SPSS Explore procedure One useful function in SPSS is the Explore procedure, which will produce histograms, boxplots, stem-and-leaf plots and extensive descriptive statistics. To run the Explore procedure,

More information

Statgraphics Centurion XVII (Version 17)

Statgraphics Centurion XVII (Version 17) Statgraphics Centurion XVII (currently in beta test) is a major upgrade to Statpoint's flagship data analysis and visualization product. It contains 32 new statistical procedures and significant upgrades

More information

Machine Learning with MATLAB David Willingham Application Engineer

Machine Learning with MATLAB David Willingham Application Engineer Machine Learning with MATLAB David Willingham Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB Streamlining the

More information

CHAPTER 1 THE CERTIFIED QUALITY ENGINEER EXAM. 1.0 The Exam. 2.0 Suggestions for Study. 3.0 CQE Examination Content. Where shall I begin your majesty?

CHAPTER 1 THE CERTIFIED QUALITY ENGINEER EXAM. 1.0 The Exam. 2.0 Suggestions for Study. 3.0 CQE Examination Content. Where shall I begin your majesty? QReview 1 CHAPTER 1 THE CERTIFIED QUALITY ENGINEER EXAM 1.0 The Exam 2.0 Suggestions for Study 3.0 CQE Examination Content Where shall I begin your majesty? The White Rabbit Begin at the beginning, and

More information

Glencoe. correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 3-3, 5-8 8-4, 8-7 1-6, 4-9

Glencoe. correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 3-3, 5-8 8-4, 8-7 1-6, 4-9 Glencoe correlated to SOUTH CAROLINA MATH CURRICULUM STANDARDS GRADE 6 STANDARDS 6-8 Number and Operations (NO) Standard I. Understand numbers, ways of representing numbers, relationships among numbers,

More information

Ira J. Haimowitz Henry Schwarz

Ira J. Haimowitz Henry Schwarz From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Clustering and Prediction for Credit Line Optimization Ira J. Haimowitz Henry Schwarz General

More information

Chapter 20: Data Analysis

Chapter 20: Data Analysis Chapter 20: Data Analysis Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 20: Data Analysis Decision Support Systems Data Warehousing Data Mining Classification

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition

Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition Brochure More information from http://www.researchandmarkets.com/reports/2171322/ Data Mining. Concepts, Models, Methods, and Algorithms. 2nd Edition Description: This book reviews state-of-the-art methodologies

More information

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com Louise.francis@data-mines.cm

More information

Data Mining Techniques Chapter 6: Decision Trees

Data Mining Techniques Chapter 6: Decision Trees Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................

More information

Body of Knowledge for Six Sigma Green Belt

Body of Knowledge for Six Sigma Green Belt Body of Knowledge for Six Sigma Green Belt What to Prepare For: The following is the Six Sigma Green Belt Certification Body of Knowledge that the exam will cover. We strongly encourage you to study and

More information

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days

Delivering Business Intelligence With Microsoft SQL Server 2005 or 2008 HDT922 Five Days or 2008 Five Days Prerequisites Students should have experience with any relational database management system as well as experience with data warehouses and star schemas. It would be helpful if students

More information

INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

More information

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction

New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.

More information

The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA

The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA Paper 156-2010 The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA Abstract JMP has a rich set of visual displays that can help you see the information

More information

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

More information