Data Science with R. Introducing Data Mining with Rattle and R.
|
|
|
- William Ramsey
- 10 years ago
- Views:
Transcription
1 http: // togaware. com Copyright 2013, 1/35 Data Science with R Introducing Data Mining with Rattle and R [email protected] Senior Director and Chief Data Miner, Analytics Australian Taxation Office Visiting Professor, SIAT, Chinese Academy of Sciences Adjunct Professor, Australian National University Adjunct Professor, University of Canberra Fellow, Institute of Analytics Professionals of Australia [email protected]
2 http: // togaware. com Copyright 2013, 2/35 Overview 1 An Introduction to Data Mining 2 The Rattle Package for Data Mining 3 Moving Into R 4 Getting Started with Rattle
3 An Introduction to Data Mining http: // togaware. com Copyright 2013, 3/35 Overview 1 An Introduction to Data Mining 2 The Rattle Package for Data Mining 3 Moving Into R 4 Getting Started with Rattle
4 An Introduction to Data Mining http: // togaware. com Copyright 2013, 4/35 Data Mining and Big Data Application of Machine Learning Statistics Software Engineering and Programming with Data Intuition To Big Data Volume, Velocity, Variety... to discover new knowledge... to improve business outcomes... to deliver better tailored services
5 An Introduction to Data Mining http: // togaware. com Copyright 2013, 5/35 The Business of Data Mining Australian Taxation Office Lodgment ($110M) Tax Havens ($150M) Tax Fraud ($250M) IBM Buys SPSS for $1.2B in 2009 SAS has annual revenue approaching $3B Analytics is >$100B business Amazon, ebay/paypal, Google...
6 An Introduction to Data Mining http: // togaware. com Copyright 2013, 6/35 Basic Tools: Data Mining Algorithms Linear Discriminant Analysis (lda) Logistic Regression (glm) Decision Trees (rpart, wsrpart) Random Forests (randomforest, wsrf) Boosted Stumps (ada) Neural Networks (nnet) Support Vector Machines (kernlab)... That s a lot of tools to learn in R! Many with different interfaces and options.
7 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 7/35 Overview 1 An Introduction to Data Mining 2 The Rattle Package for Data Mining 3 Moving Into R 4 Getting Started with Rattle
8 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 8/35 Why a GUI? Statistics can be complex and traps await So many tools in R to deliver insights Effective analyses should be scripted Scripting also required for repeatability R is a language for programming with data How to remember how to do all of this in R? How to skill up 150 data analysts with Data Mining?
9 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 9/35 Users of Rattle Today, Rattle is used world wide in many industries Health analytics Customer segmentation and marketing Fraud detection Government It is used by Consultants and Analytics Teams across business Universities to teach Data Mining It is and will remain freely available. CRAN and
10 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 10/35 A Tour Thru Rattle: Startup
11 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 11/35 A Tour Thru Rattle: Loading Data
12 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 12/35 A Tour Thru Rattle: Explore Distribution
13 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 13/35 A Tour Thru Rattle: Explore Correlations
14 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 14/35 A Tour Thru Rattle: Hierarchical Cluster
15 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 15/35 A Tour Thru Rattle: Decision Tree
16 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 16/35 A Tour Thru Rattle: Decision Tree Plot
17 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 17/35 A Tour Thru Rattle: Random Forest
18 The Rattle Package for Data Mining http: // togaware. com Copyright 2013, 18/35 A Tour Thru Rattle: Risk Chart
19 Moving Into R http: // togaware. com Copyright 2013, [email protected] 19/35 Overview 1 An Introduction to Data Mining 2 The Rattle Package for Data Mining 3 Moving Into R 4 Getting Started with Rattle
20 Moving Into R http: // togaware. com Copyright 2013, [email protected] 20/35 Data Miners are Programmers of Data Data miners are programmers of data A GUI can only do so much R is a powerful statistical language Professional data mining Scripting Transparency Repeatability
21 Moving Into R http: // togaware. com Copyright 2013, [email protected] 21/35 From GUI to CLI Rattle s Log Tab
22 Moving Into R http: // togaware. com Copyright 2013, [email protected] 22/35 From GUI to CLI Rattle s Log Tab
23 Moving Into R http: // togaware. com Copyright 2013, [email protected] 23/35 Step 1: Identify the Data dsname <- "weather" target <- "RainTomorrow" risk <- "RISK_MM" form <- formula(paste(target, "~.")) ds <- get(dsname) dim(ds) ## [1] names(ds) ## [1] "Date" "Location" "MinTemp" "... ## [5] "Rainfall" "Evaporation" "Sunshine" "... ## [9] "WindGustSpeed" "WindDir9am" "WindDir3pm" "... ## [13] "WindSpeed3pm" "Humidity9am" "Humidity3pm" "......
24 Moving Into R http: // togaware. com Copyright 2013, [email protected] 24/35 Step 2: Observe the Data head(ds) ## Date Location MinTemp MaxTemp Rainfall Evapora... ## Canberra ## Canberra ## Canberra ## Canberra ## Canberra tail(ds) ## Date Location MinTemp MaxTemp Rainfall Evapo... ## Canberra ## Canberra ## Canberra ## Canberra ## Canberra
25 Moving Into R http: // togaware. com Copyright 2013, [email protected] 25/35 Step 2: Observe the Data str(ds) ## 'data.frame': 366 obs. of 24 variables: ## $ Date : Date, format: " " " ## $ Location : Factor w/ 46 levels "Adelaide","Alba... ## $ MinTemp : num ## $ MaxTemp : num ## $ Rainfall : num summary(ds) ## Date Location MinTemp... ## Min. : Canberra :366 Min. : ## 1st Qu.: Adelaide : 0 1st Qu.: ## Median : Albany : 0 Median : ## Mean : Albury : 0 Mean : ## 3rd Qu.: AliceSprings : 0 3rd Qu.:
26 Moving Into R http: // togaware. com Copyright 2013, [email protected] 26/35 Step 3: Clean the Data Identify Variables (ignore <- c(names(ds)[c(1,2)], risk)) ## [1] "Date" "Location" "RISK_MM" (vars <- setdiff(names(ds), ignore)) ## [1] "MinTemp" "MaxTemp" "Rainfall" "... ## [5] "Sunshine" "WindGustDir" "WindGustSpeed" "... ## [9] "WindDir3pm" "WindSpeed9am" "WindSpeed3pm" "... ## [13] "Humidity3pm" "Pressure9am" "Pressure3pm" " dim(ds[vars]) ## [1]
27 Moving Into R http: // togaware. com Copyright 2013, [email protected] 27/35 Step 3: Clean the Data Remove Missing dim(ds[vars]) ## [1] sum(is.na(ds[vars])) ## [1] 47 ds <- na.omit(ds[vars]) sum(is.na(ds)) ## [1] 0 dim(ds) ## [1]
28 Moving Into R http: // togaware. com Copyright 2013, [email protected] 28/35 Step 3: Clean the Data Target as Categoric summary(ds[target]) ## RainTomorrow ## Min. :0.000 ## 1st Qu.:0.000 ## Median :0.000 ## Mean :0.183 ## 3rd Qu.:0.000 ## Max. : ds[target] <- as.factor(ds[[target]]) levels(ds[target]) <- c("no", "Yes") summary(ds[target]) ## RainTomorrow ## 0:268 ## 1: 60
29 Moving Into R http: // togaware. com Copyright 2013, [email protected] 29/35 Step 4: Build the Model Train/Test (n <- nrow(ds)) ## [1] 328 train <- sample(1:n, 0.70*n) length(train) ## [1] 229 test <- setdiff(1:n, train) length(test) ## [1] 99
30 Moving Into R http: // togaware. com Copyright 2013, [email protected] 30/35 Step 4: Build the Model Random Forest library(randomforest) ## randomforest ## Type rfnews() to see new features/changes/bug fixes. m <- randomforest(form, ds[train,]) m ## ## Call: ## randomforest(formula=form, data=ds[train, ]) ## Type of random forest: classification ## Number of trees: 500 ## No. of variables tried at each split: 4 ## ## OOB estimate of error rate: 12.23% ## Confusion matrix:...
31 Moving Into R http: // togaware. com Copyright 2013, [email protected] 31/35 Step 5: Evaluate the Model Risk Chart pr <- predict(m, ds[test,], type="prob")[,2] ev <- evaluaterisk(pr, ds[test, target], ds[test, risk]) riskchart(ev) 100 Score Lift Performance (%) % 1 Recall (70%) 0 Precision Caseload (%)
32 Getting Started with Rattle http: // togaware. com Copyright 2013, 32/35 Overview 1 An Introduction to Data Mining 2 The Rattle Package for Data Mining 3 Moving Into R 4 Getting Started with Rattle
33 Getting Started with Rattle http: // togaware. com Copyright 2013, 33/35 Installation Rattle is built using R Need to download and install R from cran.r-project.org Recommend also install RStudio from Then start up RStudio and install Rattle: install.packages("rattle") Then we can start up Rattle: rattle() Required packages are loaded as needed.
34 Getting Started with Rattle http: // togaware. com Copyright 2013, 34/35 Resources and References Rattle: OnePageR: Guides: Practise: Book: Data Mining using Rattle/R Chapter: Rattle and Other Tales Paper: A Data Mining GUI for R R Journal, Volume 1(2)
35 That Is All Folks Time For Questions http: // togaware. com Copyright 2013, 35/35 Thank You
How To Understand Data Mining In R And Rattle
http: // togaware. com Copyright 2014, [email protected] 1/40 Data Analytics and Business Intelligence (8696/8697) Introducing Data Science with R and Rattle [email protected] Chief
Data Science with R Ensemble of Decision Trees
Data Science with R Ensemble of Decision Trees [email protected] 3rd August 2014 Visit http://handsondatascience.com/ for more Chapters. The concept of building multiple decision trees to produce
1 R: A Language for Data Mining. 2 Data Mining, Rattle, and R. 3 Loading, Cleaning, Exploring Data in Rattle. 4 Descriptive Data Mining
A Data Mining Workshop Excavating Knowledge from Data Introducing Data Mining using R [email protected] Data Scientist Australian Taxation Office Adjunct Professor, Australian National University
Data Analytics and Business Intelligence (8696/8697)
http: // togaware. com Copyright 2014, [email protected] 1/34 Data Analytics and Business Intelligence (8696/8697) Introducing and Interacting with R [email protected] Chief Data
WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS the most widely utilized business intelligence platform connects to any enterprise
Hands-On Data Science with R Dealing with Big Data. [email protected]. 27th November 2014 DRAFT
Hands-On Data Science with R Dealing with Big Data [email protected] 27th November 2014 Visit http://handsondatascience.com/ for more Chapters. In this module we explore how to load larger datasets
Didacticiel Études de cas
1 Theme Data Mining with R The rattle package. R (http://www.r project.org/) is one of the most exciting free data mining software projects of these last years. Its popularity is completely justified (see
Data Science Using Open Souce Tools Decision Trees and Random Forest Using R
Data Science Using Open Souce Tools Decision Trees and Random Forest Using R Jennifer Evans Clickfox [email protected] January 14, 2014 Jennifer Evans (Clickfox) Twitter: JenniferE CF January
Azure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
Data Mining. Dr. Saed Sayad. University of Toronto 2010 [email protected]. http://chem-eng.utoronto.ca/~datamining/
Data Mining Dr. Saed Sayad University of Toronto 2010 [email protected] http://chem-eng.utoronto.ca/~datamining/ 1 Data Mining Data mining is about explaining the past and predicting the future by
R Tools Evaluation. A review by Analytics @ Global BI / Local & Regional Capabilities. Telefónica CCDO May 2015
R Tools Evaluation A review by Analytics @ Global BI / Local & Regional Capabilities Telefónica CCDO May 2015 R Features What is? Most widely used data analysis software Used by 2M+ data scientists, statisticians
2015 Workshops for Professors
SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market
Data Analytics and Business Intelligence (8696/8697)
http: // togaware. com Copyright 2014, [email protected] 1/39 Data Analytics and Business Intelligence (8696/8697) Introducing Data Mining [email protected] Chief Data Scientist Australian
COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
What is R? R s Advantages R s Disadvantages Installing and Maintaining R Ways of Running R An Example Program Where to Learn More
Bob Muenchen, Author R for SAS and SPSS Users, Co-Author R for Stata Users [email protected], http://r4stats.com What is R? R s Advantages R s Disadvantages Installing and Maintaining R Ways of Running
Predictive Modeling and Big Data
Predictive Modeling and Presented by Eileen Burns, FSA, MAAA Milliman Agenda Current uses of predictive modeling in the life insurance industry Potential applications of 2 1 June 16, 2014 [Enter presentation
Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs
1.1 Introduction Lavastorm Analytic Library Predictive and Statistical Analytics Node Pack FAQs For brevity, the Lavastorm Analytics Library (LAL) Predictive and Statistical Analytics Node Pack will be
Classification and Regression by randomforest
Vol. 2/3, December 02 18 Classification and Regression by randomforest Andy Liaw and Matthew Wiener Introduction Recently there has been a lot of interest in ensemble learning methods that generate many
ANALYTICS CENTER LEARNING PROGRAM
Overview of Curriculum ANALYTICS CENTER LEARNING PROGRAM The following courses are offered by Analytics Center as part of its learning program: Course Duration Prerequisites 1- Math and Theory 101 - Fundamentals
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification
Big Data Analytics. Benchmarking SAS, R, and Mahout. Allison J. Ames, Ralph Abbey, Wayne Thompson. SAS Institute Inc., Cary, NC
Technical Paper (Last Revised On: May 6, 2013) Big Data Analytics Benchmarking SAS, R, and Mahout Allison J. Ames, Ralph Abbey, Wayne Thompson SAS Institute Inc., Cary, NC Accurate and Simple Analysis
The R pmmltransformations Package
The R pmmltransformations Package Tridivesh Jena Alex Guazzelli Wen-Ching Lin Michael Zeller Zementis, Inc.* Zementis, Inc. Zementis, Inc. Zementis, Inc. Tridivesh.Jena@ Alex.Guazzelli@ Wenching.Lin@ Michael.Zeller@
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar
Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com [email protected]
Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD
Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,
Predictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath [email protected] National Institute of Industrial Engineering (NITIE) Vihar
Data Science with R Cluster Analysis
Data Science with R Cluster Analysis [email protected] 22nd June 2014 Visit http://onepager.togaware.com/ for more OnePageR s. We focus on the unsupervised method of cluster analysis in this
Chapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining
Oracle Advanced Analytics Oracle R Enterprise & Oracle Data Mining R The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated
Data Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES
HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within
Data Analysis Bootcamp - What To Expect. Damian Herrick Founder, Principal Consultant Lake Hill Analytics, LLC
Data Analysis Bootcamp - What To Expect Damian Herrick Founder, Principal Consultant Lake Hill Analytics, LLC Why Are Companies Using Data and Analytics Today? Data + Predictive Ability + Optimization
Maximize Revenues on your Customer Loyalty Program using Predictive Analytics
Maximize Revenues on your Customer Loyalty Program using Predictive Analytics 27 th Feb 14 Free Webinar by Before we begin... www Q & A? Your Speakers @parikh_shachi Technical Analyst @tatvic Loves js
Using multiple models: Bagging, Boosting, Ensembles, Forests
Using multiple models: Bagging, Boosting, Ensembles, Forests Bagging Combining predictions from multiple models Different models obtained from bootstrap samples of training data Average predictions or
CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19
PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations
SEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. BIG DATA CONFERENCE 2015 Boston August 10-13 Predicting and reducing deforestation
Predictive modelling around the world 28.11.13
Predictive modelling around the world 28.11.13 Agenda Why this presentation is really interesting Introduction to predictive modelling Case studies Conclusions Why this presentation is really interesting
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
How to Optimize Your Data Mining Environment
WHITEPAPER How to Optimize Your Data Mining Environment For Better Business Intelligence Data mining is the process of applying business intelligence software tools to business data in order to create
Index Contents Page No. Introduction . Data Mining & Knowledge Discovery
Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.
Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90
FREE echapter C H A P T E R1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. In relative terms, this means 90 percent of the data in the
Hands-On Data Science with R Exploring Data with GGPlot2. [email protected]. 22nd May 2015 DRAFT
Hands-On Data Science with R Exploring Data with GGPlot2 [email protected] 22nd May 215 Visit http://handsondatascience.com/ for more Chapters. The ggplot2 (Wickham and Chang, 215) package implements
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning
How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume
IBM SPSS Modeler Professional
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
IBM SPSS Modeler Premium
IBM SPSS Modeler Premium Improve model accuracy with structured and unstructured data, entity analytics and social network analysis Highlights Solve business problems faster with analytical techniques
M1 in Economics and Economics and Statistics Applied multivariate Analysis - Big data analytics Worksheet 3 - Random Forest
Nathalie Villa-Vialaneix Année 2014/2015 M1 in Economics and Economics and Statistics Applied multivariate Analysis - Big data analytics Worksheet 3 - Random Forest This worksheet s aim is to learn how
Big Data and Marketing
Big Data and Marketing Professor Venky Shankar Coleman Chair in Marketing Director, Center for Retailing Studies Mays Business School Texas A&M University http://www.venkyshankar.com [email protected]
Make Better Decisions Through Predictive Intelligence
IBM SPSS Modeler Professional Make Better Decisions Through Predictive Intelligence Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly
Package acrm. R topics documented: February 19, 2015
Package acrm February 19, 2015 Type Package Title Convenience functions for analytical Customer Relationship Management Version 0.1.1 Date 2014-03-28 Imports dummies, randomforest, kernelfactory, ada Author
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
Data Mining. SPSS Clementine 12.0. 1. Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine
Data Mining SPSS 12.0 1. Overview Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Types of Models Interface Projects References Outline Introduction Introduction Three of the common data mining
The Predictive Data Mining Revolution in Scorecards:
January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms
Get to Know the IBM SPSS Product Portfolio
IBM Software Business Analytics Product portfolio Get to Know the IBM SPSS Product Portfolio Offering integrated analytical capabilities that help organizations use data to drive improved outcomes 123
Up Your R Game. James Taylor, Decision Management Solutions Bill Franks, Teradata
Up Your R Game James Taylor, Decision Management Solutions Bill Franks, Teradata Today s Speakers James Taylor Bill Franks CEO Chief Analytics Officer Decision Management Solutions Teradata 7/28/14 3 Polling
SEIZE THE DATA. 2015 SEIZE THE DATA. 2015
1 Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. Deep dive into Haven Predictive Analytics Powered by HP Distributed R and
Risk pricing for Australian Motor Insurance
Risk pricing for Australian Motor Insurance Dr Richard Brookes November 2012 Contents 1. Background Scope How many models? 2. Approach Data Variable filtering GLM Interactions Credibility overlay 3. Model
Course Syllabus. Purposes of Course:
Course Syllabus Eco 5385.701 Predictive Analytics for Economists Summer 2014 TTh 6:00 8:50 pm and Sat. 12:00 2:50 pm First Day of Class: Tuesday, June 3 Last Day of Class: Tuesday, July 1 251 Maguire Building
BIG DATA What it is and how to use?
BIG DATA What it is and how to use? Lauri Ilison, PhD Data Scientist 21.11.2014 Big Data definition? There is no clear definition for BIG DATA BIG DATA is more of a concept than precise term 1 21.11.14
ElegantJ BI. White Paper. The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis
ElegantJ BI White Paper The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis Integrated Business Intelligence and Reporting for Performance Management, Operational
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2
Oracle 11g DB Data Warehousing ETL OLAP Statistics Anomaly and Fraud Detection with Oracle Data Mining 11g Release 2 Data Mining Charlie Berger Sr. Director Product Management, Data
What is Data Mining? MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling. MS4424 Data Mining & Modelling
MS4424 Data Mining & Modelling MS4424 Data Mining & Modelling Lecturer : Dr Iris Yeung Room No : P7509 Tel No : 2788 8566 Email : [email protected] 1 Aims To introduce the basic concepts of data mining
Easily Identify Your Best Customers
IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do
Big Data and Data Science: Behind the Buzz Words
Big Data and Data Science: Behind the Buzz Words Peggy Brinkmann, FCAS, MAAA Actuary Milliman, Inc. April 1, 2014 Contents Big data: from hype to value Deconstructing data science Managing big data Analyzing
Leveraging Ensemble Models in SAS Enterprise Miner
ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank
Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through
Package bigrf. February 19, 2015
Version 0.1-11 Date 2014-05-16 Package bigrf February 19, 2015 Title Big Random Forests: Classification and Regression Forests for Large Data Sets Maintainer Aloysius Lim OS_type
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
An Overview and Evaluation of Decision Tree Methodology
An Overview and Evaluation of Decision Tree Methodology ASA Quality and Productivity Conference Terri Moore Motorola Austin, TX [email protected] Carole Jesse Cargill, Inc. Wayzata, MN [email protected]
IN THE CITY OF NEW YORK Decision Risk and Operations. Advanced Business Analytics Fall 2015
Advanced Business Analytics Fall 2015 Course Description Business Analytics is about information turning data into action. Its value derives fundamentally from information gaps in the economic choices
How To Make A Credit Risk Model For A Bank Account
TRANSACTIONAL DATA MINING AT LLOYDS BANKING GROUP Csaba Főző [email protected] 15 October 2015 CONTENTS Introduction 04 Random Forest Methodology 06 Transactional Data Mining Project 17 Conclusions
In this presentation, you will be introduced to data mining and the relationship with meaningful use.
In this presentation, you will be introduced to data mining and the relationship with meaningful use. Data mining refers to the art and science of intelligent data analysis. It is the application of machine
Modeling Lifetime Value in the Insurance Industry
Modeling Lifetime Value in the Insurance Industry C. Olivia Parr Rud, Executive Vice President, Data Square, LLC ABSTRACT Acquisition modeling for direct mail insurance has the unique challenge of targeting
IBM SPSS Direct Marketing
IBM Software IBM SPSS Statistics 19 IBM SPSS Direct Marketing Understand your customers and improve marketing campaigns Highlights With IBM SPSS Direct Marketing, you can: Understand your customers in
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Prerequisites. Course Outline
MS-55040: Data Mining, Predictive Analytics with Microsoft Analysis Services and Excel PowerPivot Description This three-day instructor-led course will introduce the students to the concepts of data mining,
Data Mining with R. Decision Trees and Random Forests. Hugh Murrell
Data Mining with R Decision Trees and Random Forests Hugh Murrell reference books These slides are based on a book by Graham Williams: Data Mining with Rattle and R, The Art of Excavating Data for Knowledge
The Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
Starting Smart with Oracle Advanced Analytics
Starting Smart with Oracle Advanced Analytics Great Lakes Oracle Conference Tim Vlamis Thursday, May 19, 2016 Vlamis Software Solutions Vlamis Software founded in 1992 in Kansas City, Missouri Developed
Predictive Analytics Certificate Program
Information Technologies Programs Predictive Analytics Certificate Program Accelerate Your Career Offered in partnership with: University of California, Irvine Extension s professional certificate and
Data are everywhere. IBM projects that every day we generate 2.5
C HAPTER 1 Big Data and Analytics Data are everywhere. IBM projects that every day we generate 2.5 quintillion bytes of data. 1 In relative terms, this means 90 percent of the data in the world has been
Data Mining Lab 5: Introduction to Neural Networks
Data Mining Lab 5: Introduction to Neural Networks 1 Introduction In this lab we are going to have a look at some very basic neural networks on a new data set which relates various covariates about cheese
IBM SPSS Modeler 15 In-Database Mining Guide
IBM SPSS Modeler 15 In-Database Mining Guide Note: Before using this information and the product it supports, read the general information under Notices on p. 217. This edition applies to IBM SPSS Modeler
Role of Social Networking in Marketing using Data Mining
Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:
Oracle Data Miner (Extension of SQL Developer 4.0)
An Oracle White Paper September 2013 Oracle Data Miner (Extension of SQL Developer 4.0) Integrate Oracle R Enterprise Mining Algorithms into a workflow using the SQL Query node Denny Wong Oracle Data Mining
White Paper. How Streaming Data Analytics Enables Real-Time Decisions
White Paper How Streaming Data Analytics Enables Real-Time Decisions Contents Introduction... 1 What Is Streaming Analytics?... 1 How Does SAS Event Stream Processing Work?... 2 Overview...2 Event Stream
MS1b Statistical Data Mining
MS1b Statistical Data Mining Yee Whye Teh Department of Statistics Oxford http://www.stats.ox.ac.uk/~teh/datamining.html Outline Administrivia and Introduction Course Structure Syllabus Introduction to
Hexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
Our Raison d'être. Identify major choice decision points. Leverage Analytical Tools and Techniques to solve problems hindering these decision points
Analytic 360 Our Raison d'être Identify major choice decision points Leverage Analytical Tools and Techniques to solve problems hindering these decision points Empowerment through Intelligence Our Suite
Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign
Role of Customer Response Models in Customer Solicitation Center s Direct Marketing Campaign Arun K Mandapaka, Amit Singh Kushwah, Dr.Goutam Chakraborty Oklahoma State University, OK, USA ABSTRACT Direct
Sunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
