Data Mining: STATISTICA



Similar documents
Example 3: Predictive Data Mining and Deployment for a Continuous Output Variable

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

How to Deploy Models using Statistica SVB Nodes

Data Mining III: Numeric Estimation

COC131 Data Mining - Clustering

Predicting Student Persistence Using Data Mining and Statistical Analysis Methods

In this presentation, you will be introduced to data mining and the relationship with meaningful use.

Data Mining Part 5. Prediction

2 Decision tree + Cross-validation with R (package rpart)

not possible or was possible at a high cost for collecting the data.

from Larson Text By Susan Miertschin

The Predictive Data Mining Revolution in Scorecards:

Deployment of Predictive Models. Sumit Kumar Bardhan

2. A typical business process

Data Mining with Weka

Introduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing

Data Mining. Nonlinear Classification

EXCEL XML SPREADSHEET TUTORIAL

Predictive Modeling and Big Data

Predictive Analytics: The Postsecondary Use Case

Community College Transfer Students Persistence at University

How To Understand How Weka Works

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Data Mining. SPSS Clementine Clementine Overview. Spring 2010 Instructor: Dr. Masoud Yaghini. Clementine

Data Mining Part 5. Prediction

Knowledge Discovery and Data Mining

Data Mining for Customer Service Support. Senioritis Seminar Presentation Megan Boice Jay Carter Nick Linke KC Tobin

ID-Spec Large boosts your performance in electrical distribution design :

testo dello schema Secondo livello Terzo livello Quarto livello Quinto livello

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

What is Data Mining? Data Mining (Knowledge discovery in database) Data mining: Basic steps. Mining tasks. Classification: YES, NO

Keywords Data mining, Classification Algorithm, Decision tree, J48, Random forest, Random tree, LMT, WEKA 3.7. Fig.1. Data mining techniques.

Audit Analytics. --An innovative course at Rutgers. Qi Liu. Roman Chinchila

Benchmarking of different classes of models used for credit scoring

Overview, Goals, & Introductions

Promises and Pitfalls of Big-Data-Predictive Analytics: Best Practices and Trends

Creating a Distribution List from an Excel Spreadsheet

Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA

Université de Montpellier 2 Hugo Alatrista-Salas : hugo.alatrista-salas@teledetection.fr

Deep Learning For Text Processing

Oracle Advanced Analytics 12c & SQLDEV/Oracle Data Miner 4.0 New Features

Using Microsoft Excel to Plot and Analyze Kinetic Data

Creating an Excel Database for a Mail Merge on a PC. Excel Spreadsheet Mail Merge. 0 of 8 Mail merge (PC)

DATA MINING TOOL FOR INTEGRATED COMPLAINT MANAGEMENT SYSTEM WEKA 3.6.7

Tutorial Segmentation and Classification

Weight of Evidence Module

Oracle Data Miner (Extension of SQL Developer 4.0)

Data Mining and Visualization

Building In-Database Predictive Scoring Model: Check Fraud Detection Case Study

APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING

Record Keeping FOR A SMALL BUSINESS

Quantrix & Excel: 3 Key Differences A QUANTRIX WHITE PAPER

Data Mining Applications in Higher Education

Using multiple models: Bagging, Boosting, Ensembles, Forests

Introduction to Microsoft Access 2003

Data analysis and regression in Stata

Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report

Merging Labels, Letters, and Envelopes Word 2013

Equity forecast: Predicting long term stock price movement using machine learning

Estimating a market model: Step-by-step Prepared by Pamela Peterson Drake Florida Atlantic University

Ensembles and PMML in KNIME

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

About Dell Statistica

Chapter 6. The stacking ensemble approach

WPS USER GUIDE FOR TRAVEL AGENCIES

A Guide to Getting Started with the AmeriCorps VISTA Applicant Tracking Tool

A Case of Study on Hadoop Benchmark Behavior Modeling Using ALOJA-ML

Data mining techniques: decision trees

Azure Machine Learning, SQL Data Mining and R

Budgeting in QuickBooks

Knowledge Discovery and Data Mining

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics

Data Mining: Overview. What is Data Mining?

Data Mining with SQL Server Data Tools

Qi Liu Rutgers Business School ISACA New York 2013

Data Warehousing and Data Mining in Business Applications

Virginia s Department of Minority Business Enterprise

How to Use the Cash Flow Template

2011 Data Miner Survey Highlights

Excel Tutorial. Bio 150B Excel Tutorial 1

New Work Item for ISO Predictive Analytics (Initial Notes and Thoughts) Introduction

Today's Topics. COMP 388/441: Human-Computer Interaction. simple 2D plotting. 1D techniques. Ancient plotting techniques. Data Visualization:

Improving Demand Forecasting

Chapter Quarterly Financial Report Instructions (Quarterly Format)

Creating a Gradebook in Excel

Management Decision Making. Hadi Hosseini CS 330 David R. Cheriton School of Computer Science University of Waterloo July 14, 2011

Lowering social cost of car accidents by predicting high-risk drivers

Studying Auto Insurance Data

BOR 6335 Data Mining. Course Description. Course Bibliography and Required Readings. Prerequisites

EXCEL Tutorial: How to use EXCEL for Graphs and Calculations.

In this tutorial, we try to build a roc curve from a logistic regression.

DECISION TREE ANALYSIS: PREDICTION OF SERIOUS TRAFFIC OFFENDING

Transcription:

Data Mining: STATISTICA Outline Prepare the data Classification and regression 1

Prepare the Data Statistica can read from Excel,.txt and many other types of files Compared with WEKA, Statistica is much easier in terms of data preparing Open an Excel File Click the Import selected sheet to Spreadsheet Select the desired Excel sheet where your data is stored Get variable names from the first row 2

Open an Excel File Change variable type Open an Excel File Change variable type 3

Classification and Regression C&RT Boosting tree Neural Networks C&RT Classification Iris data is used as a example data set 4

C&RT Classification Click Data Mining menu and find the Interactive Trees C&RT Classification View the final tree and understand the results 5

C&RT---Regression Use the CPU data set and select the regression analysis Don t check it Regression tree structure C&RT---Regression 6

C&RT---Regression Predicted values Boosting tree Classification In Data Mining menu and find the Boosted Trees 7

Boosting tree Classification See the results and predictor s importance Boosting tree Classification See the results and predictor s importance 8

CPU data set Boosting tree Regression Boosting tree Regression See the results and predictor s importance Predicted values 9

Boosting tree Regression See the results of Observed values vs. Predicted values Boosting tree Regression See the results and predictor s importance 10

Neural Networks Classification In Data Mining menu and find the Automated Neural Networks Neural Networks Classification Choose Classification, then select variables 11

Neural Networks Classification Statistica will try a set of different neural networks and keep the best ones Neural Networks Classification See the classification results 12

Neural Networks Classification See the classification results---predictions Neural Networks Classification See the classification results---predictions 13

Neural Networks Classification See the classification results---confusion matrix Neural Networks Regression CPU data set 14

Neural Networks Regression CPU data set, select variables Neural Networks Regression Training and results 15

Neural Networks Regression Predictions Neural Networks Regression Some statistics about the predictions 16