WHEN LEARNING ANALYTICS MEET BIG DATA The Predictive Analytics Reporting (PAR) Framework November 8, 2012
THE PAR FRAMEWORK Moderator: Ellen Wagner, WICHE Cooperative for Educational Technologies (WCET) Presenter: Pearl Imada Iboshi, University of Hawaii System Presenter: Mike Sharkey, University of Phoenix Presenter: Jonathan Sherrill, Colorado Community College Online
IN TODAY'S SESSION Orientation to the PAR Framework; Analyst Perspectives on the PAR Framework Proof of Concept: role on project team, biggest aha about the data, biggest aha about the process; PAR Implementation Directions; Summary and Conclusions
WHAT IS THE PAR (PREDICTIVE ANALYTICS REPORTING) FRAMEWORK? A big data analysis project that informs student loss prevention and identifies drivers related to loss and momentum. WCET member institutions voluntarily contribute deidentified student records to create a single federated database. Descriptive, inferential, and predictive analysis techniques are applied to look for patterns that inform our understanding of loss and momentum and improve student success.
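The deck does not spell out how records are deidentified before federation, but a minimal sketch of one common approach is a salted one-way hash that replaces each institution's local student ID with a stable PAR Student ID. The salt value, function names, and code formats here are illustrative assumptions, not the project's actual pipeline.

```python
import hashlib

# Assumption: each institution holds a private salt so local IDs cannot
# be reversed from the shared PAR Student ID.
INSTITUTION_SALT = "example-salt"

def par_student_id(institution_code: str, local_id: str) -> str:
    """Derive a stable, deidentified PAR Student ID (illustrative only)."""
    raw = f"{INSTITUTION_SALT}:{institution_code}:{local_id}"
    return hashlib.sha256(raw.encode()).hexdigest()[:16]
```

Because the hash is deterministic, the same local ID always maps to the same PAR ID, so a student's records can be joined across terms and courses without exposing the original identifier.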
PAR FRAMEWORK OBJECTIVES Identify common variables likely to influence student retention and progression, and measure the degree of influence; determine whether measures and definitions of retention, progression, and completion differ materially among various types of postsecondary institutions; discover advantages and/or disadvantages of particular statistical and methodological approaches to identifying profiles of students considered to be at risk; begin to identify the highest-impact interventions with targeted pools of at-risk students.
PAR Framework POC (May 2011-Jan 2012): Construct federated database; normalize variables between and among institutions; determine essential processes needed for full project execution. Completed on time and on budget, per terms of grant, January 2012.
PAR Framework Bridge (Feb-July 2012): First opportunity for institutional researcher access to the DB; deep evaluation of 33 common and 9 constructed variables normalized in the POC; ongoing quality assurance of the DB. Proposed December 2011; approved February 27, 2012; completed July 2012.
PAR Framework Implementation (August 2012-Jan 2014): Extend the federated data set by adding student records from 10 new partner institutions; validate data model, data definitions, taxonomy for remediation, and institutional benchmarking framework; define infrastructure for scaling; provide roadmap to self-sufficiency (analytics products, licensing, and go-to-market plans).
FAST FACTS FROM THE PROOF OF CONCEPT Funded by the Bill & Melinda Gates Foundation. Managed by the WICHE Cooperative for Educational Technologies and operated by the WCET core project team. 6 institutional partners: 2 four-year schools (U. of Hawaii System, U. of Illinois - Springfield), 2 community colleges (CCC Online, Rio Salado), 2 for-profit institutions (APUS, U. of Phoenix). 3,200,000 course-level records; 640,000 student-level records. 2 in-kind donations: IBM, Tableau.
University of Hawaii System Perspectives DR. PEARL IMADA IBOSHI
INTRODUCTION Role on project team: one of two members from the University of Hawaii team; one of the few members from IR. UH system unique attributes: 3 four-year, 1 mixed, and 6 two-year institutions, all on one system (Banner); the system encourages taking courses at any two-year campus without going through a separate admissions process.
INTRODUCTION TO THE DATA The data selected for the POC were purposefully very narrow in scope: students taking distance classes during the 2010 calendar year; data for online courses only, plus all Dev Ed courses; success was defined as graduated or still enrolled on or before June 1, 2011. The initial specifications for the project explicitly avoided setting out null hypotheses for testing, intending only to show that we could federate and anonymize data from 6 institutions, 640,000 students, and 3.1 million course-level records. The choice of variables did, however, determine what factors could be tested as having an influence on success.
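The POC's success definition above can be sketched as a labeling rule. This is a minimal interpretation, assuming hypothetical per-student date fields; the slide does not define the actual field names or how enrollment-as-of-date was determined.

```python
from datetime import date

# Cutoff from the POC definition: success = graduated, or still
# enrolled, on or before June 1, 2011.
CENSUS_DATE = date(2011, 6, 1)

def poc_success(graduated_on, last_enrolled_on):
    """Label a student per the POC definition (field names are assumptions).

    graduated_on: date of graduation, or None
    last_enrolled_on: most recent enrollment date, or None
    """
    if graduated_on is not None and graduated_on <= CENSUS_DATE:
        return True
    if last_enrolled_on is not None and last_enrolled_on >= CENSUS_DATE:
        return True
    return False
```

A student who graduated in May 2011 or was still enrolled past the cutoff would be labeled a success; one who last enrolled in 2010 and never graduated would not.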
LIST OF PAR POC VARIABLES Institution Identifier; PAR Student ID; Degree Type; Academic Level; CIP Code; Multiple Major; Academic Status; Institution Student Course Completes; Total Course Extensions; Total Degree Extensions; Previous Term Mean GPA; Prior Term Withdrawals; Degree Hours Attempted; Degree Hours Completed; Course Size; Concurrent Courses; Gender; Non Res Alien Status; Race/Ethnicity; Course Start Date; Course End Date; Month and Year of Birth; Military Classification/Veteran; Transfer Credits; Program Changes; Prior Degree Completions; Course Grade; Dev Ed Courses Attempted; Dev Ed Courses Completed; Degree Start Date; Dev Ed Course Indicator
BIGGEST PROBLEMS WITH COMBINING THE DATA FROM SIX INSTITUTIONS Types of problems: Hard to define: prior term mean GPA (what is a term if you only allow one class per term? what is the prior term for students who take a break?). Hard to make comparable: academic status (was the person here at a set date? some institutions allow leaves or have their own definition of persistence); transfer credit comparability and definition (a dynamic variable, but collected at only one time point; differing availability of accepted vs. all transfer credits). Hard to gather: multiple majors and program changes (counting the number of program changes). Hard to translate a complex situation into a simple measure when every institution handles things differently: concurrent vs. sequential degrees.
BIGGEST AHA ABOUT THE DATA We need to spend more time up front precisely defining the data and making sure that everything is comparable across institutions; it is too difficult to try to fix things after the fact. But if you spend the time, it is possible to do. The factors most affecting success, such as course completion ratios, were consistent across institutions.
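The course completion ratio mentioned above can be computed directly from course-level records. A minimal sketch, assuming each record reduces to a (student ID, completed?) pair; the real PAR records carry many more fields, and grade-to-completion rules vary by institution.

```python
from collections import defaultdict

def completion_ratios(course_records):
    """Per-student fraction of attempted courses that were completed.

    course_records: iterable of (student_id, completed) pairs, a
    simplified stand-in for PAR course-level records (an assumption).
    """
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for sid, done in course_records:
        attempted[sid] += 1
        completed[sid] += int(done)
    return {sid: completed[sid] / attempted[sid] for sid in attempted}
```

A ratio like this is attractive as a predictor precisely because, unlike academic status or transfer credits, it needs little institution-specific interpretation once course completion is defined consistently.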
BIGGEST AHA ABOUT THE PROCESS At the start of the project, everyone was cautious about how much data to share; the initial model gave only the project team access to the data. At the end of the project, everyone wanted access to the data to run their own analyses, and everyone was willing to share their data. We have the same goals and problems, and hopefully will be able to use the data to find similar solutions.
University of Phoenix Perspectives MIKE SHARKEY
EVOLVING THOUGHTS ON DATA Gather the data Turn the data into information Use the information to help students
Colorado Community Colleges Online Perspectives JONATHAN SHERRILL
OUR GOALS To better understand retention challenges in online classes: lower pass rates than regular classes (why?); withdrawals waste student money and COF credit (can we help students know ahead of time?). To understand data trends affecting developmental education students: pass rates are lower online than on campus; the graduation rate is low; there is a small chance of graduation if someone assesses into the lowest levels; it consumes too much COF credit.
BECAUSE OF PAR We hired a data analyst dedicated to student retention. We were able to collect much of our data into one place. Planted the seed that data is to be used, not just protected. Made contacts across the field in private and public organizations. Learned from our peers.
POST-PAR STUDIES Developmental placement: whether Dev Ed students do well in future courses once the progression is done; effect of taking classes early vs. late in the degree; students who take the courses they place into vs. those who skip them. Signals Risk Expression.
WE LOOK FORWARD TO Seeing the extended analysis that will take place on the richer dataset provided in PAR. Making continued use of our expanding dataset.
WHAT'S NEW? American Public University System* Ashford University Broward College Capella University Colorado Community College System* Lone Star College System Penn State World Campus Rio Salado College* Sinclair Community College Troy University University of Central Florida University of Hawaii System* University of Illinois Springfield* University of Maryland University College University of Phoenix* Western Governors University
WHAT'S NEW 10 new institutional partners; 1,000,000 students; 6,000,000 courses; course-level detail; financial aid detail; prepping for LMS, non-cognitive, satisfaction, and engagement data.
THE OPPORTUNITIES: Extending the PAR Framework data model, adding new variables; Proactively using data to optimize learning quality as a function of student success; Establishing a model for guiding student success in (online) learning settings; Providing common benchmarks to guide institutional evaluation and decision-making affecting student success in online programs.
WE CAN DO A BETTER JOB OF REMOVING BARRIERS TO STUDENT SUCCESS
THANKS FOR YOUR INTEREST http://wcet.wiche.edu http://wcet.wiche.edu/advance/par-framework