Application of Predictive Analytics to Higher Degree Research Course Completion Times


1 Application of Predictive Analytics to Higher Degree Research Course Completion Times
Application of Decision Theory to PhD Course Completions
Rachna Dhand, Senior Strategic Information Analyst

2 Application of Predictive Analytics to Higher Degree Research Course Completion Times

Higher Degree Research (HDR) students carry significant costs for universities. When students fail to complete on time, or at all, resources are used sub-optimally and government grant allocations and ratings are affected.

Objective: The objective of this project was to analyse historical completion times for ECU PhD candidates and identify their primary determinants, so that intervention strategies can be implemented to help future students complete on time.

Methodology: Classification models from decision science are used to predict completion times for HDR candidates. The shortlisted models are CHAID, QUEST and C5.0.

3 What is Predictive Analytics?

4 Data Mining Introduction: Modelling and Prediction Accuracy

Using data-mining strategies to identify students who are at risk of dropping out, or who are likely to take longer to finish their degree, is not a new subject in Institutional Research (IR). Explanatory models based on regression and path analysis have contributed substantially to our understanding of student retention (Adam and Gaither, 2005; Pascarella and Terenzini, 2005; Braxton, 2000). Published studies on the use and prediction accuracy of data-mining approaches in IR are few, however. Luan (2002) explained the application of neural-network and decision-tree analysis to predicting the transfer of college students to four-year institutions. Byers Gonzalez and DesJardins (2002) showed that a neural-network model predicts with better accuracy than binary logistic regression.

Prediction accuracy does not depend solely on the type of model chosen; it also depends on the independent variables selected, their measurement levels, and the size of the data set.

5 Predicting Completion Status for Higher Degree Research Candidates

Is it possible to predict the completion status of an HDR student given the variable set of:
1) Demographic and course information
2) Research experience and nature of the research project
3) Faculty and school information
4) Supervisor information...

6 HDR Completion Status Prediction: Analysis and Modelling Approach

1 Objective: The main objective of the model is to predict the Completion Time for future applicants, where Completion Time is defined as the period between a candidate's commencement of the HDR degree and its completion.
2 Analysis: The analysis uses PhD students from 2006 to 2013, with their research performance and demographic information from all faculties of ECU, as the historical data set. The completion time is estimated.
3 Information Value Analysis (IVA): The research performance and demographic variables are screened through IVA to retain the variables most strongly correlated with Completion Time; these form the modelling dataset.
4 Methodology: The dataset and the estimated Completion Time are modelled using decision-tree-based predictive modelling. The tested models are C5.0, CHAID and QUEST, following a classification paradigm for modelling and scoring.

7 HDR Completion Status Prediction: Completion Time Prediction Modelling

1 Historical data analysis filters PhD candidacy outcomes for ECU with reference years from 2006 to 2013.
2 Information Value Analysis screens for the primary determinants correlated with Completion Time, which is the target for model building.
3 Target definition is based on the historical outcome, and model building is initiated using the classification models CHAID, C5.0 and QUEST.
4 The model with the best result and the most accurate emulation of the actual target is chosen to score future candidates.

8 Predicting Completion Status for Higher Degree Research Candidates: Historical Analysis and Understanding Data Used for Prediction

9 HDR Cohort Analysis: Count by Citizenship

Note: The small cohort size for HDR enrolments constrains the modelling process, which makes classification models more suitable for building and training.

10 HDR Candidacy Time Distribution: Domestic Enrolled Candidates

Candidacy Time is the time a student has spent since commencing the PhD degree until reaching a final state: completion, discontinuation, or continued enrolment for a longer duration.

11 HDR Course Attempt Distribution by Final Outcome Domestic Vs International

12 HDR Completion Status Estimation: Scope of Modelling

Data sources:
HDR Research Report Data (set of 45 variables related to student research experience and candidature progress)
Student Course Details (set of 215 variables)
Research Data (milestone status, ABS research classifications and scholarship status data)
Student Demographic and Course Information

Processing steps:
1) Keep domestic PhD and international PhD candidates only
2) Define the target for modelling (Completion Outcome)
3) Discard Inactive and Intermittent statuses
4) Build the modelling dataset
5) Split into Training Dataset (90%) and Testing & Scoring Dataset (10%) (2013)
6) Fit a C5.0, CHAID or QUEST decision tree model

13 HDR Completion Status Estimation: Target Definition

Target: T_ATTEMPT_STATUS (actual candidacy status)

Completion Time (calculated in years):
Completed candidates: date_years_difference(D_COURSE_COMMENCEMENT_DT, D_COURSE_COMPLETION_DT)
Discontinued candidates: date_years_difference(D_COURSE_COMMENCEMENT_DT, T_COURSE_DISCONTINUED_DT)
Enrolled candidates: T_COURSE_ATTEMPT_STATUS matches "ENROL*" then D_INTAKE_PERIOD

Target classes:
1. WILL_COMPLETE (Candidacy <= 4 Years)
2. WILL_COMPLETE_LATE (Candidacy > 4 Years)
3. STILL_ENROLLED (Candidacy > 3.5 Years)
4. WILL_DISCONTINUE (Attrition Flags set for all Teaching Periods)
5. IMMATURE VINTAGE (Candidacy < 3.5 Years) [Discarded]
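The target rules on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: `date_years_difference` is assumed to divide elapsed days by 365.25, discontinuation is represented here by a discontinuation date rather than by the attrition flags named on the slide, and the `as_of` reference date and the function name `attempt_status` are hypothetical.

```python
from datetime import date
from typing import Optional

DAYS_PER_YEAR = 365.25  # assumption: how elapsed days are converted to years


def date_years_difference(start: date, end: date) -> float:
    """Elapsed time between two dates, in fractional years."""
    return (end - start).days / DAYS_PER_YEAR


def attempt_status(commenced: date,
                   completed: Optional[date],
                   discontinued: Optional[date],
                   as_of: date) -> Optional[str]:
    """Return a T_ATTEMPT_STATUS class, or None for an immature vintage."""
    if completed is not None:
        years = date_years_difference(commenced, completed)
        return "WILL_COMPLETE" if years <= 4 else "WILL_COMPLETE_LATE"
    if discontinued is not None:
        # Simplification: the slide uses attrition flags, not a date check.
        return "WILL_DISCONTINUE"
    # Still enrolled: candidacy is measured up to the reference date.
    years = date_years_difference(commenced, as_of)
    return "STILL_ENROLLED" if years > 3.5 else None  # None = discarded


# Hypothetical candidate: commenced March 2006, completed June 2011.
print(attempt_status(date(2006, 3, 1), date(2011, 6, 1), None, date(2013, 12, 31)))
```

Records that return None correspond to the discarded IMMATURE VINTAGE group and would be dropped before training.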

14 Predicting Completion Status for Higher Degree Research Candidates: Using Classification Models for Prediction

15 Decision Tree Models

Because the indicators have non-linear relationships with the target, and the target outcome is nominal, decision tree models were selected to predict Completion Time outcomes for currently enrolled domestic and international students.

The outcome from the model is:
1) Target prediction (STILL_ENROLLED, WILL_COMPLETE, WILL_DISCONTINUE, WILL_COMPLETE_LATE).
2) Confidence score for each enrolled student (ranges between 0 and 1).

Rule induction models fall into four main types: C5.0, Chi-Square Automatic Interaction Detection (CHAID), QUEST, and Classification and Regression (C&R) Tree. The C5.0 model handles nominal or flag targets with all predictor types (nominal, continuous, or flag).

16 Decision Tree Models

Model Criteria                          C5.0                  CHAID                               QUEST
Type of Split for Categorical Targets   Multiple              Multiple                            Binary
Continuous Target                       No                    Yes                                 No
Continuous Predictors                   Yes                   No                                  Yes
Criteria for Predictor Selection        Information Measure   Chi-Square, F-Test for Continuous   Statistical
Supports Bagging/Boosting               Yes                   Yes                                 Yes
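To make the "Criteria for Predictor Selection" row concrete, the sketch below computes two of the named measures for one categorical predictor against a categorical target: entropy-based information gain (the basic form of an information measure; C5.0-style implementations typically use a normalised variant) and the Pearson chi-square statistic that CHAID-style tests build on. The function names and the toy data are hypothetical and are not taken from the project.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(predictor, target):
    """Reduction in target entropy after splitting on the predictor."""
    total = len(target)
    groups = {}
    for x, y in zip(predictor, target):
        groups.setdefault(x, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(target) - remainder


def chi_square(predictor, target):
    """Pearson chi-square statistic of the predictor-by-target table."""
    total = len(target)
    row, col = Counter(predictor), Counter(target)
    observed = Counter(zip(predictor, target))
    stat = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / total
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: milestone status against final outcome.
milestone = ["met", "met", "missed", "missed"]
outcome = ["WILL_COMPLETE", "WILL_COMPLETE",
           "WILL_DISCONTINUE", "WILL_DISCONTINUE"]
print(information_gain(milestone, outcome), chi_square(milestone, outcome))
```

Both measures rank a predictor higher when its categories separate the target classes more cleanly; they differ in how they penalise predictors with many categories, which is one reason the three algorithms can select different variables.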

17 Predictor Importance: F-Test Association with Target (CHAID)

Predictors, in the order shown: Milestones Achieved; School Name; Load Completed; Field of Education; Course Fraction Completed; Meeting Frequency; Basis of Admission; Research Literature; Funding Category Changed; Annual Leaves Availed.

CHAID predicts the target with the best accuracy. CHAID performs Chi-Square tests for predictor importance and variable reduction. The test tends to give higher importance to continuous variables than to nominal or categorical ones.

18 Predictor Importance: Information Value Analysis (C5.0)

Predictors: Course Fraction Completed; Literature Review Feedback; Field of Education; Milestone Achieved; Mode of Attendance; Age at Enrolment; NESB Indicator.

C5.0 uses the Information Value (IV) and Weight of Evidence (WoE) method for variable reduction. WoE analyses the predictive power of a variable in relation to the targeted outcome, while IV assesses the overall predictive power of the variable being considered.
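A common textbook formulation of WoE and IV for a binary outcome is sketched below. It is an illustration only: the project's target has four classes, so the binary event used here (for example, "completed on time" versus everything else) is an assumption, as are the function name `woe_iv`, the optional smoothing term, and the toy data.

```python
import math
from collections import defaultdict


def woe_iv(categories, events, smoothing=0.5):
    """Return ({category: WoE}, IV) for one categorical predictor.

    events: iterable of booleans, True where the outcome of interest occurred.
    smoothing: added to each count so empty cells do not give log(0);
               this adjustment is an assumption, not part of the slide.
    """
    counts = defaultdict(lambda: [0, 0])  # category -> [events, non-events]
    for cat, is_event in zip(categories, events):
        counts[cat][0 if is_event else 1] += 1
    total_event = sum(c[0] for c in counts.values())
    total_non = sum(c[1] for c in counts.values())
    k = len(counts)
    woe, iv = {}, 0.0
    for cat, (ev, non) in counts.items():
        pct_event = (ev + smoothing) / (total_event + smoothing * k)
        pct_non = (non + smoothing) / (total_non + smoothing * k)
        woe[cat] = math.log(pct_event / pct_non)
        iv += (pct_event - pct_non) * woe[cat]
    return woe, iv


# Hypothetical toy data: mode of attendance against on-time completion.
mode = ["FT", "FT", "FT", "FT", "PT", "PT", "PT", "PT"]
on_time = [True, True, True, False, True, False, False, False]
woe, iv = woe_iv(mode, on_time, smoothing=0.0)
print(woe, iv)
```

WoE is reported per category, while IV sums the category contributions into a single score per variable, which is what makes it usable for ranking and filtering candidate predictors.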

19 Completion Status Modeling: Decision Tree Models Used

Completion Status Estimation: C5.0, CHAID, QUEST

20 C5.0 Decision Tree Model: Actual vs Predicted Candidature Status (Target)

Status               Actual %   Actual count   Predicted %   Predicted count
STILL_ENROLLED       18.0%      101            33.51%        188
WILL_COMPLETE        14.08%     79             8.91%         50
WILL_COMPLETE_LATE   14.8%      83             4.46%         25
WILL_DISCONTINUE     53.12%     298            53.12%        -
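The percentages on this slide follow directly from the counts. The short check below recomputes the class shares from the actual counts and from the three predicted counts that survive in the transcript; the helper name `share` is hypothetical.

```python
# Actual class counts as shown on the slide.
actual = {
    "STILL_ENROLLED": 101,
    "WILL_COMPLETE": 79,
    "WILL_COMPLETE_LATE": 83,
    "WILL_DISCONTINUE": 298,
}
# Predicted counts that are legible in the transcript (C5.0 model).
predicted = {
    "STILL_ENROLLED": 188,
    "WILL_COMPLETE": 50,
    "WILL_COMPLETE_LATE": 25,
}

total = sum(actual.values())  # cohort size used for evaluation


def share(count, total):
    """Class share as a percentage, rounded to two decimals."""
    return round(100 * count / total, 2)


for status, count in actual.items():
    print(f"{status:<20} actual {share(count, total):>6}%")
for status, count in predicted.items():
    print(f"{status:<20} predicted {share(count, total):>6}%")
```

The total of the actual counts matches the "Total 561" reported on the evaluation slide for this model.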

21 C5.0 Decision Tree Model: Model Evaluation & Analysis

Results for output field T_ATTEMPT_STATUS
Comparing $C-T_ATTEMPT_STATUS with T_ATTEMPT_STATUS (Correct %, Wrong %): Total 561
Performance Evaluation: STILL_ENROLLED 1.0, WILL_COMPLETE, WILL_COMPLETE_LATE, WILL_DISCONTINUE
Confidence Values Report for $CC-T_ATTEMPT_STATUS (Range, Mean Correct, Mean Incorrect):
Always Correct Above (0% of cases)
Always Incorrect Below 0.35 (0% of cases)
85.56% Accuracy Above
Fold Correct Above 0.86 (53.57% of cases)

22 CHAID Decision Tree Model: Actual vs Predicted Candidature Status (Target)

Status               Actual %   Actual count   Predicted %   Predicted count
STILL_ENROLLED       18.0%      101            11.51%        65
WILL_COMPLETE        14.08%     79             14.26%        80
WILL_COMPLETE_LATE   14.8%      83             13.9%         78
WILL_DISCONTINUE     53.12%     298            60.25%        -

23 CHAID Decision Tree Model: Model Evaluation & Analysis

Results for output field T_ATTEMPT_STATUS
Comparing $R-T_ATTEMPT_STATUS with T_ATTEMPT_STATUS (Correct %, Wrong %): Total 561
Performance Evaluation: STILL_ENROLLED, WILL_COMPLETE, WILL_COMPLETE_LATE, WILL_DISCONTINUE
Confidence Values Report for $RC-T_ATTEMPT_STATUS (Range, Mean Correct, Mean Incorrect):
Always Correct Above (0% of cases)
Always Incorrect Below 0.3 (0% of cases)
76.05% Accuracy Above
Fold Correct Above (42.4% of cases)

24 QUEST Decision Tree Model: Actual vs Predicted Candidature Status (Target)

Status               Actual %   Actual count   Predicted %   Predicted count
STILL_ENROLLED       18.0%      101            25.18%        141
WILL_COMPLETE        14.08%     79             12.32%        69
WILL_COMPLETE_LATE   14.8%      83             2.5%          14
WILL_DISCONTINUE     53.12%     298            60.0%         -

25 QUEST Decision Tree Model: Model Evaluation & Analysis

Results for output field T_ATTEMPT_STATUS
Comparing $R-T_ATTEMPT_STATUS with T_ATTEMPT_STATUS (Correct %, Wrong %): Total 560
Performance Evaluation: STILL_ENROLLED, WILL_COMPLETE, WILL_COMPLETE_LATE, WILL_DISCONTINUE
Confidence Values Report for $RC-T_ATTEMPT_STATUS (Range, Mean Correct, Mean Incorrect):
Always Correct Above (0% of cases)
Always Incorrect Below (0% of cases)
77% Accuracy Above

26 Predicting Completion Status for Higher Degree Research Candidates: Validating Prediction Accuracy

27 Model Comparison: Confidence Level Distributions

CHAID (Strong): Predicts the target with the best accuracy.
QUEST (Weak): Predicts the target with the weakest accuracy of the three models used.
C5.0 (Average): Predicts the target with average accuracy.

28 Completion Status Estimation: Prediction Accuracy by Target Values

Table columns: Field, STILL_ENROLLED*, WILL_COMPLETE*, WILL_COMPLETE_LATE*, WILL_DISCONTINUE*, Importance
QUEST: field $RC-T_ATTEMPT_STATUS, Importance: Important
C5.0: field $CC-T_ATTEMPT_STATUS, Importance: Important
CHAID: field $RC-T_ATTEMPT_STATUS, Importance: Important

29 Completion Status Modeling: Conclusion

HDR data quality (DQ) standards need to be raised. The data has good predictor strength, but it should be populated consistently over the time span used for prediction.

The model has good prediction accuracy, even though ECU's HDR cohort is very small (approximately 700 students).

A limitation of the modelling process was that only classification models could be used because of the limited size of the cohort; neural network and logistic regression modelling could not be applied.

The next phase will be to design the reporting standards and intervention strategies, so that the modelling outcome can be used effectively to reduce the completion time for future students.

30 References

1. Adam, J., and Gaither, G. H. Retention in Higher Education: A Selective Resource Guide. In G. H. Gaither (ed.), Minority Retention: What Works? New Directions for Institutional Research. San Francisco: Jossey-Bass.
2. Pascarella, E., and Terenzini, P. How College Affects Students. San Francisco: Jossey-Bass.
3. Braxton, J. Reworking the Student Departure Puzzle. Nashville, Tenn.: Vanderbilt University Press.
4. Luan, J. Data Mining and its Applications in Higher Education. In A. M. Serban and J. Luan (eds.), Knowledge Management: Building a Competitive Advantage in Higher Education. New Directions for Institutional Research. San Francisco: Jossey-Bass.
5. Byers Gonzalez, J., and DesJardins, S. Artificial Neural Networks: A New Approach for Predicting Application Behaviour. Research in Higher Education, 2002, 43(2).

31 Questions

Data Mining Applications in Higher Education

Data Mining Applications in Higher Education Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2

More information

Estimating Student Retention and Degree-Completion Time: Decision Trees and Neural Networks Vis-à-Vis Regression

Estimating Student Retention and Degree-Completion Time: Decision Trees and Neural Networks Vis-à-Vis Regression 2 Focusing on student retention and time to degree completion, this study illustrates how institutional researchers may benefit from the power of predictive analyses associated with data-mining tools.

More information

What is Predictive Analytics?

What is Predictive Analytics? What is Predictive Analytics? Firstly, Analytics is the use of data, statistical analysis, and explanatory and predictive models to gain insights and act on complex issues. EDUCAUSE Center for Applied

More information

Predictive Modeling in Enrollment Management: New Insights and Techniques. uversity.com/research

Predictive Modeling in Enrollment Management: New Insights and Techniques. uversity.com/research Predictive Modeling in Enrollment Management: New Insights and Techniques uversity.com/research Table of Contents Introduction 1. The Changing Paradigm of College Choice: Every Student Is Different 2.

More information

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100

Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Identifying At-Risk Students Using Machine Learning Techniques: A Case Study with IS 100 Erkan Er Abstract In this paper, a model for predicting students performance levels is proposed which employs three

More information

Start-up Companies Predictive Models Analysis. Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov

Start-up Companies Predictive Models Analysis. Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov Start-up Companies Predictive Models Analysis Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov Abstract: A quantitative research is performed to derive a model for predicting the success of Bulgarian start-up

More information

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century

An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO

More information

Predicting Student Persistence Using Data Mining and Statistical Analysis Methods

Predicting Student Persistence Using Data Mining and Statistical Analysis Methods Predicting Student Persistence Using Data Mining and Statistical Analysis Methods Koji Fujiwara Office of Institutional Research and Effectiveness Bemidji State University & Northwest Technical College

More information

Gerry Hobbs, Department of Statistics, West Virginia University

Gerry Hobbs, Department of Statistics, West Virginia University Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit

More information

Dawn Broschard, EdD Senior Research Analyst Office of Retention and Graduation Success [email protected]

Dawn Broschard, EdD Senior Research Analyst Office of Retention and Graduation Success dbroscha@fiu.edu Using Decision Trees to Analyze Students at Risk of Dropping Out in Their First Year of College Based on Data Gathered Prior to Attending Their First Semester Dawn Broschard, EdD Senior Research Analyst

More information

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets

Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification

More information

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery

Index Contents Page No. Introduction . Data Mining & Knowledge Discovery Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.

More information

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d.

EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER. Copyr i g ht 2013, SAS Ins titut e Inc. All rights res er ve d. EXPLORING & MODELING USING INTERACTIVE DECISION TREES IN SAS ENTERPRISE MINER ANALYTICS LIFECYCLE Evaluate & Monitor Model Formulate Problem Data Preparation Deploy Model Data Exploration Validate Models

More information

A Property & Casualty Insurance Predictive Modeling Process in SAS

A Property & Casualty Insurance Predictive Modeling Process in SAS Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing

More information

Predictive Analytics in Action

Predictive Analytics in Action Predictive Analytics in Action San Diego County San Diego County 4,083 sq. miles 8 th largest county by size 3 million residents 5 th largest by population 725,000 children 13.9% living below poverty level

More information

THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell

THE HYBRID CART-LOGIT MODEL IN CLASSIFICATION AND DATA MINING. Dan Steinberg and N. Scott Cardell THE HYBID CAT-LOGIT MODEL IN CLASSIFICATION AND DATA MINING Introduction Dan Steinberg and N. Scott Cardell Most data-mining projects involve classification problems assigning objects to classes whether

More information

Data Mining Methods: Applications for Institutional Research

Data Mining Methods: Applications for Institutional Research Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014

More information

Prediction of Stock Performance Using Analytical Techniques

Prediction of Stock Performance Using Analytical Techniques 136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University

More information

Predictive Modeling Techniques in Insurance

Predictive Modeling Techniques in Insurance Predictive Modeling Techniques in Insurance Tuesday May 5, 2015 JF. Breton Application Engineer 2014 The MathWorks, Inc. 1 Opening Presenter: JF. Breton: 13 years of experience in predictive analytics

More information

Predicting Student Performance by Using Data Mining Methods for Classification

Predicting Student Performance by Using Data Mining Methods for Classification BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 13, No 1 Sofia 2013 Print ISSN: 1311-9702; Online ISSN: 1314-4081 DOI: 10.2478/cait-2013-0006 Predicting Student Performance

More information

THE PREDICTIVE MODELLING PROCESS

THE PREDICTIVE MODELLING PROCESS THE PREDICTIVE MODELLING PROCESS Models are used extensively in business and have an important role to play in sound decision making. This paper is intended for people who need to understand the process

More information

Promoting Student Retention Through Classroom Practice * Vincent Tinto Syracuse University USA

Promoting Student Retention Through Classroom Practice * Vincent Tinto Syracuse University USA Promoting Student Retention Through Classroom Practice * Vincent Tinto Syracuse University USA Introduction Many universities in the United States speak of the importance of increasing student retention.

More information

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and

STATISTICA. Financial Institutions. Case Study: Credit Scoring. and Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT

More information

Chapter 12 Discovering New Knowledge Data Mining

Chapter 12 Discovering New Knowledge Data Mining Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to

More information

Weight of Evidence Module

Weight of Evidence Module Formula Guide The purpose of the Weight of Evidence (WoE) module is to provide flexible tools to recode the values in continuous and categorical predictor variables into discrete categories automatically,

More information

Data Mining: A Magic Technology for College Recruitment. Tongshan Chang, Ed.D.

Data Mining: A Magic Technology for College Recruitment. Tongshan Chang, Ed.D. Data Mining: A Magic Technology for College Recruitment Tongshan Chang, Ed.D. Principal Administrative Analyst Admissions Research and Evaluation The University of California Office of the President [email protected]

More information

Data Mining Techniques Chapter 6: Decision Trees

Data Mining Techniques Chapter 6: Decision Trees Data Mining Techniques Chapter 6: Decision Trees What is a classification decision tree?.......................................... 2 Visualizing decision trees...................................................

More information

Newcastle University. Educational Partnerships. Framework for Joint and Dual PhDs

Newcastle University. Educational Partnerships. Framework for Joint and Dual PhDs Newcastle University Educational Partnerships Framework for Joint and Dual PhDs These principles are provided as a guide for the development of joint or dual PhD programmes and should be read in conjunction

More information

Community College Transfer Students Persistence at University

Community College Transfer Students Persistence at University Community College Transfer Students Persistence at University Alexandra List Denise Nadasen University of Maryland University College Presented at Northeast Association for Institutional Research, November

More information

Graduate Student Perceptions of the Use of Online Course Tools to Support Engagement

Graduate Student Perceptions of the Use of Online Course Tools to Support Engagement International Journal for the Scholarship of Teaching and Learning Volume 8 Number 1 Article 5 January 2014 Graduate Student Perceptions of the Use of Online Course Tools to Support Engagement Stephanie

More information

USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS

USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS USING PREDICTIVE ANALYTICS TO UNDERSTAND HOUSING ENROLLMENTS Heather Kelly, Ed.D., University of Delaware Karen DeMonte, M.Ed., University of Delaware Darlena Jones, Ph.D., EBI MAP-Works Predictive Analytics:

More information

T-61.6010 Non-discriminatory Machine Learning

T-61.6010 Non-discriminatory Machine Learning T-61.6010 Non-discriminatory Machine Learning Seminar 1 Indrė Žliobaitė Aalto University School of Science, Department of Computer Science Helsinki Institute for Information Technology (HIIT) University

More information

CATHOLIC UNIVERSITY OF HEALTH AND ALLIED SCIENCES - GUIDELINES FOR JOINT PHD PROGRAMMES

CATHOLIC UNIVERSITY OF HEALTH AND ALLIED SCIENCES - GUIDELINES FOR JOINT PHD PROGRAMMES 2014 CATHOLIC UNIVERSITY OF HEALTH AND ALLIED SCIENCES - GUIDELINES FOR JOINT PHD PROGRAMMES School of Graduate Studies 1/8/2014 Contents 1.0 Introduction... 3 2.0 Guidelines for Joint PhD Programs...

More information

Machine Learning Algorithms and Predictive Models for Undergraduate Student Retention

Machine Learning Algorithms and Predictive Models for Undergraduate Student Retention , 225 October, 2013, San Francisco, USA Machine Learning Algorithms and Predictive Models for Undergraduate Student Retention Ji-Wu Jia, Member IAENG, Manohar Mareboyana Abstract---In this paper, we have

More information

Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario [email protected]

Data Mining in CRM & Direct Marketing. Jun Du The University of Western Ontario jdu43@uwo.ca Data Mining in CRM & Direct Marketing Jun Du The University of Western Ontario [email protected] Outline Why CRM & Marketing Goals in CRM & Marketing Models and Methodologies Case Study: Response Model Case

More information

In the past two decades, the federal government has dramatically

In the past two decades, the federal government has dramatically Degree Attainment of Undergraduate Student Borrowers in Four-Year Institutions: A Multilevel Analysis By Dai Li Dai Li is a doctoral candidate in the Center for the Study of Higher Education at Pennsylvania

More information

Application for Admission to a Higher Degree by Research International

Application for Admission to a Higher Degree by Research International Application for Admission to a Higher Degree by Research International Instructions for applicants Before lodging this form please discuss your research proposal with a potential supervisor in the faculty

More information

Data Mining Algorithms Part 1. Dejan Sarka

Data Mining Algorithms Part 1. Dejan Sarka Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses

More information

Data Mining is the process of knowledge discovery involving finding

Data Mining is the process of knowledge discovery involving finding using analytic services data mining framework for classification predicting the enrollment of students at a university a case study Data Mining is the process of knowledge discovery involving finding hidden

More information

Learning Analytics: Targeting Instruction, Curricula and Student Support

Learning Analytics: Targeting Instruction, Curricula and Student Support Learning Analytics: Targeting Instruction, Curricula and Student Support Craig Bach Office of the Provost, Drexel University Philadelphia, PA 19104, USA ABSTRACT For several decades, major industries have

More information

PhD and Research Master Degree Scholarship Guidelines

PhD and Research Master Degree Scholarship Guidelines PhD and Research Master Degree Scholarship Guidelines Document Number: RTSC15:042 Date Approved: 4 May 2015 Date of Commencement: 5 May 2015 For official Use only 1. Context/Overview These Guidelines support

More information

Evaluation in Online STEM Courses

Evaluation in Online STEM Courses Evaluation in Online STEM Courses Lawrence O. Flowers, PhD Assistant Professor of Microbiology James E. Raynor, Jr., PhD Associate Professor of Cellular and Molecular Biology Erin N. White, PhD Assistant

More information

EARLY VS. LATE ENROLLERS: DOES ENROLLMENT PROCRASTINATION AFFECT ACADEMIC SUCCESS? 2007-08

EARLY VS. LATE ENROLLERS: DOES ENROLLMENT PROCRASTINATION AFFECT ACADEMIC SUCCESS? 2007-08 EARLY VS. LATE ENROLLERS: DOES ENROLLMENT PROCRASTINATION AFFECT ACADEMIC SUCCESS? 2007-08 PURPOSE Matthew Wetstein, Alyssa Nguyen & Brianna Hays The purpose of the present study was to identify specific

More information

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL

BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University

More information

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com

Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING

More information

Easily Identify Your Best Customers

Easily Identify Your Best Customers IBM SPSS Statistics Easily Identify Your Best Customers Use IBM SPSS predictive analytics software to gain insight from your customer database Contents: 1 Introduction 2 Exploring customer data Where do

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Persistence in University Continuing Education Online Classes

Persistence in University Continuing Education Online Classes International Review of Research in Open and Distance Learning Volume 8, Number 3. ISSN: 1492-3831 November 2007 Persistence in University Continuing Education Online Classes Jia University of California

More information

School of Psychology and Counselling PhD Scholarship 2014

School of Psychology and Counselling PhD Scholarship 2014 School of Psychology and Counselling PhD Scholarship 2014 1. BACKGROUND This scholarship will be awarded to a student of exceptional research potential undertaking a Higher Degree by Research (HDR) PhD

More information

Attrition in Online and Campus Degree Programs

Attrition in Online and Campus Degree Programs Attrition in Online and Campus Degree Programs Belinda Patterson East Carolina University [email protected] Cheryl McFadden East Carolina University [email protected] Abstract The purpose of this study

More information

A Basic Guide to Modeling Techniques for All Direct Marketing Challenges

Allison Cornia Database Marketing Manager Microsoft Corporation C. Olivia Rud Executive Vice President Data Square, LLC Overview

Combining Linear and Non-Linear Modeling Techniques: EMB America. Getting the Best of Two Worlds

Combining Linear and Non-Linear Modeling Techniques: Getting the Best of Two Worlds Outline Who is EMB? Insurance industry predictive modeling applications EMBLEM our GLM tool How we have used CART with

Using Predictive Analytics to Improve the Bottom Line ***** http://www.unr.edu/ia/research

Serge Herzog, PhD Director, Institutional Analysis Consultant, CRDA StatLab University of Nevada, Reno Reno, NV

DECISION TREE ANALYSIS: PREDICTION OF SERIOUS TRAFFIC OFFENDING

ABSTRACT The objective was to predict whether an offender would commit a traffic offence involving death, using decision tree analysis. Four

Predictive Analytics Techniques: What to Use For Your Big Data. March 26, 2014 Fern Halper, PhD

Predictive Analytics Techniques: What to Use For Your Big Data March 26, 2014 Fern Halper, PhD Presenter Proven Performance Since 1995 TDWI helps business and IT professionals gain insight about data warehousing,

Variable Selection in the Credit Card Industry Moez Hababou, Alec Y. Cheng, and Ray Falk, Royal Bank of Scotland, Bridgeport, CT

ABSTRACT The credit card industry is particular in its need for a wide variety

Predictive Modeling of Titanic Survivors: a Learning Competition

SAS Analytics Day Predictive Modeling of Titanic Survivors: a Learning Competition Linda Schumacher Problem Introduction On April 15, 1912, the RMS Titanic sank resulting in the loss of 1502 out of 2224

How To Get A Degree From Une

HOW TO APPLY The following application form can be used by potential international students to apply for admission to postgraduate research/higher degree research programs

Predictive modelling around the world 28.11.13

Agenda Why this presentation is really interesting Introduction to predictive modelling Case studies Conclusions Why this presentation is really interesting

How to use Big Data in Industry 4.0 implementations. LAURI ILISON, PhD Head of Big Data and Machine Learning

How to use Big Data in Industry 4.0 implementations LAURI ILISON, PhD Head of Big Data and Machine Learning Big Data definition? Big Data is about structured vs unstructured data Big Data is about Volume

Benchmarking of different classes of models used for credit scoring

We use this competition as an opportunity to compare the performance of different classes of predictive models. In particular we want

IBM SPSS Direct Marketing 22

Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

TNS EX A MINE BehaviourForecast Predictive Analytics for CRM. TNS Infratest Applied Marketing Science

TNS EX A MINE BehaviourForecast Predictive Analytics for CRM 1 TNS BehaviourForecast Why is BehaviourForecast relevant for you? The concept of analytical Relationship Management (acrm) becomes more and

Master of Clinical Psychology (Program 7601- coursework) Doctor of Philosophy (Clinical Psychology) (Program 9064 research)

Graduate Programs in Clinical Psychology Master of Clinical Psychology (Program 7601- coursework) Doctor of Philosophy (Clinical Psychology) (Program 9064 research) In 2015, the Research School of Psychology

Application of SAS! Enterprise Miner in Credit Risk Analytics. Presented by Minakshi Srivastava, VP, Bank of America

Application of SAS! Enterprise Miner in Credit Risk Analytics Presented by Minakshi Srivastava, VP, Bank of America 1 Table of Contents Credit Risk Analytics Overview Journey from DATA to DECISIONS Exploratory

Potential Value of Data Mining for Customer Relationship Marketing in the Banking Industry

Advances in Natural and Applied Sciences, 3(1): 73-78, 2009 ISSN 1995-0772 2009, American Eurasian Network for Scientific Information This is a refereed journal and all articles are professionally screened

Binary Logistic Regression

Main Effects Model Logistic regression will accept quantitative, binary or categorical predictors and will code the latter two in various ways. Here's a simple model including

Data Mining Practical Machine Learning Tools and Techniques

Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea

PharmaSUG2011 Paper HS03

Using SAS Predictive Modeling to Investigate the Asthma's Patient Future Hospitalization Risk Yehia H. Khalil, University of Louisville, Louisville, KY, US ABSTRACT The focus of

Paper AA-08-2015. Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM

Paper AA-08-2015 Get the highest bangs for your marketing bucks using Incremental Response Models in SAS Enterprise Miner TM Delali Agbenyegah, Alliance Data Systems, Columbus, Ohio 0.0 ABSTRACT Traditional

Microsoft Azure Machine learning Algorithms

Tomaž KAŠTRUN @tomaz_tsql http://tomaztsql.wordpress.com Our Sponsors Speaker info https://tomaztsql.wordpress.com Agenda Focus on explanation

CONTENTS PREFACE 1 INTRODUCTION 1 2 DATA VISUALIZATION 19

PREFACE xi 1 INTRODUCTION 1 1.1 Overview 1 1.2 Definition 1 1.3 Preparation 2 1.3.1 Overview 2 1.3.2 Accessing Tabular Data 3 1.3.3 Accessing Unstructured Data 3 1.3.4 Understanding the Variables and Observations

Sutee Sujitparapitaya, Ph.D. Institutional Effectiveness and Analytics San José State University

Sutee Sujitparapitaya, Ph.D. Associate Vice President for Institutional Effectiveness and Analytics San José State University Copyright Sutee Sujitparapitaya, 2011

Alex Vidras, David Tysinger. Merkle Inc.

Using PROC LOGISTIC, SAS MACROS and ODS Output to evaluate the consistency of independent variables during the development of logistic regression models. An example from the retail banking industry ABSTRACT

CHAID Decision Tree: Reverse Mortgage Loan Termination Example

Business Context Reverse Mortgage Loan (RML) enables Senior Citizens to avail of periodical payments from a lender against the mortgage of

Strategies for Promoting Gatekeeper Course Success Among Students Needing Remediation: Research Report for the Virginia Community College System

Josipa Roksa Davis Jenkins Shanna Smith Jaggars Matthew

WILLIS A. JONES. [email protected] Office: (859) 257-1607

WILLIS A. JONES University of Kentucky, College of Education, Educational Policy Studies and Evaluation, 131 Taylor Education Building, Lexington, KY 40506 Office: (859) 257-1607