Business Analytics Research and Teaching Perspectives




Ramesh Sharda, Daniel A. Asamoah and Natraj Ponna
Institute for Research in Information Systems, Spears School of Business, Oklahoma State University, Stillwater, OK 74078

Outline
- Three Types of Analytics
- Research Perspectives
  - Athletic Injury Analysis
  - Big Data Research
- Teaching Perspectives
  - Analytics Ecosystem

The Value of RFID Tracking Systems for Inventory Management in Healthcare

Types of Analytics
- Reporting: visualization; periodic and ad hoc reporting; trend analysis
- Predictive: statistical analysis and data mining
- Prescriptive: management science models and solutions

Research Perspectives
Significant growth in all three types of analytics:
- Prescriptive analytics: application of OR/MS techniques in building revenue optimization models
- Predictive analytics: development of algorithms and applying them in new and innovative settings
- Big Data analytics: volume, variety, and velocity of data
Current opportunities: sensors, large-scale observations and analyses
Examples: entertainment industry, healthcare, sports

Predicting Injury Recovery Time in College Football
Daniel Asamoah and Ramesh Sharda

Research Background
- The NCAA was formed to help curb the occurrence of college football injuries and deaths.
- 66,313 college students participated in the NCAA's Division I, II, and III football leagues in the 2009-2010 season.
- Football is still one of the most injury-prone sports.
- The aim was to investigate injury occurrences and predict how long an injury would take to heal.

Introduction: Why Injury Recovery Time?
- Despite efforts by the NCAA, injuries still occur frequently.
- It is uncertain how long an injury might take to heal, which affects a coach's decisions on player management and tactics.
- Knowledge of injury recovery time could be used in a decision support system by coaches and medical staff to manage injuries and plan future game strategies, taking into account how long players would take to recover.

Sports Injury Studies
- Most frequently occurring injuries (Powell et al., 1999; Vastag, 2002)
- Ways of managing football injuries for overall team performance (Fukada et al., 2012; Dick et al., 2007)
- Injury analysis (Feeley et al., 2008; Comin et al., 2012)
- Re-injury prevention (Heiderscheit et al., 2010)
- Post-injury analysis of certain conditions such as concussion (McCrea et al., 2003)
- Variables that affect injury recovery (Lau et al., 2011; Tuberville et al., 2003)

Research Gap
- Studies that have investigated injury recovery have mostly used only traditional regression or descriptive statistical methods.
- The focus has been on factors that affect injury recovery time rather than on making quantifiable predictions of recovery time.

Research Questions
- How accurately can college football injury recovery time be predicted using data mining methods?
- What are the most important factors that affect the occurrence of injuries?

Methodology: Data Mining
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information.
Approaches: visual analytics and predictive analytics.
- Decision tree: a data classification and prediction method commonly used for its intuitive ability to explain relationships among several variables.
- Neural network (NN): a mathematical and computational model for pattern recognition and data classification through a learning process.
- Sequence analysis: a method for understanding the relationships among a group of variables based on their order of occurrence.

Methodology: Data Mining

Methodology: Data Collection and Preparation
- Data was collected from a healthcare IT company.
- Off-field data was deleted.
- Variables were categorized as intrinsic or extrinsic.
- The target variable (recovery time) was generated from the injury date and resolved date variables.
- A total of 17 variables and 384 football records remained after data reduction.
- The data set was partitioned into two subsets: a training set (70%) and a testing set (30%).
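The two preparation steps described above (deriving recovery time from the two dates, then a 70/30 partition) might look roughly like the following sketch. The record layout, the field names, the use of days as the unit, and the random seed are all assumptions made for illustration; the slides do not say how the original partition was drawn.

```python
import random
from datetime import date, timedelta


def recovery_days(injury_date: date, closed_date: date) -> int:
    """Recovery time as the number of days between injury and resolution."""
    return (closed_date - injury_date).days


def split_train_test(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it into training and testing sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records: 384 injuries with made-up dates (not the study's data).
start = date(2010, 9, 1)
records = [
    {"recid": i, "injury_date": start, "closed_date": start + timedelta(days=i % 30)}
    for i in range(384)
]
for rec in records:
    rec["recovery_days"] = recovery_days(rec["injury_date"], rec["closed_date"])

train, test = split_train_test(records)
print(len(train), len(test))
```

With 384 records and a 0.7 fraction, this split yields 269 training and 115 testing records.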

Variables Used
No. | Variable Name | Variable Description
1 | RECID | Unique player identification
2 | Position | Position athlete was playing when injury occurred
3 | Injury Date | Date injury occurred
4 | Closed Date | Date injury was resolved
5 | Recovery Time | Period injury took to heal
6 | Action Taken | Who handled injury when it occurred, e.g. team physician
7 | Body Part | Part of body where injury occurred, e.g. knee
8 | Laterality | Lateral position of injury on body, e.g. left knee
9 | Injury | Type of injury that occurred, e.g. strain
10 | Severity | Degree of seriousness of injury
11 | Current Status | Seriousness of injury on a progressive basis
12 | Occurred During | Type of event when injury occurred, e.g. practice session
13 | Event Location | Location of field when injury occurred
14 | Field Type | Type of field surface when injury occurred, e.g. grass
15 | Opponent | Opposing team when injury occurred
16 | Activity | Motion athlete was in when injury occurred, e.g. running
17 | Onset | How injury happened, e.g. gradual, acute contact

Results: Visualization

Injuries During Different Activities
- More injuries occurred during practice sessions than during actual games.
- Adequate safety mechanisms were probably not enforced during training sessions, as compared to actual games.

Major Injuries at Various Playing Positions
- The most frequently occurring injuries were strain, sprain, contusion, and concussion, in decreasing order of frequency.
- Players in defensive positions are more prone to injuries.

Results: Prediction Models

Prediction Models
- An NN model was first built with recovery time categorized in months.
- Subsequently, for a more refined (and informative) model, NN, CHAID and C&RT models were built with recovery time categorized in weeks.
- A "within 1 category away" measure was also determined: the accuracy when a prediction at most one category away from the correct one is counted as a hit.
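As a rough illustration of the "within 1 category away" measure, the sketch below compares it with exact accuracy on ordinal category indices. It assumes the measure counts exact hits as well as predictions one category off, which is an interpretation rather than something the slides state explicitly. The function names and the sample labels are hypothetical.

```python
def exact_accuracy(observed, predicted):
    """Share of cases where the predicted category equals the observed one."""
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def within_one_accuracy(observed, predicted):
    """Share of cases where the prediction is at most one ordinal category off."""
    hits = sum(abs(o - p) <= 1 for o, p in zip(observed, predicted))
    return hits / len(observed)


# Hypothetical ordinal category indices (0 = shortest recovery band).
observed = [0, 1, 2, 3, 4, 2]
predicted = [0, 2, 2, 1, 4, 3]

print(exact_accuracy(observed, predicted))       # 3 of 6 exact matches
print(within_one_accuracy(observed, predicted))  # 5 of 6 within one category
```

Under this reading the within-one figure can never fall below the exact accuracy for the same model.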

Model Evaluation
- Model 1 (Neural Network): generally better when the output variable (recovery time) had few categories; predicted early recovery time much better.
- Model 2 (CHAID): fared better when the number of categories in the output variable (recovery time) was increased; predicted early and mid-term recovery time much better.
- Model 3 (C&RT): predicted late-term recovery time much better; improved model accuracy in the "within 1 category away" measure.
- Model 4 (Sequence Analysis): portrayed interesting trends in the injury data.

Neural Network Model Evaluation
Classification table for the NN's performance (in months); rows are observed, columns are predicted.
Observed \ Predicted | 0-1 Month | 1-2 Months | 2-4 Months | 4-6 Months | 6-24 Months
0-1 Month | 100.0% | 0.0% | 0.0% | 0.0% | 0.0%
1-2 Months | 0.0% | 93.3% | 3.3% | 3.3% | 0.0%
2-4 Months | 4.5% | 0.0% | 86.4% | 9.1% | 0.0%
4-6 Months | 0.0% | 0.0% | 22.2% | 77.8% | 0.0%
6-24 Months | 0.0% | 20.0% | 0.0% | 10.0% | 70.0%

Neural Network Model Evaluation
Accuracy measures for NN (in months), predicted vs. observed
Months to recovery | 0-1 | 1-2 | 2-4 | 4-6 | 6-24
Accuracy (%) | 100 | 93.3 | 86.4 | 77.8 | 70
Within 1 category away (%) | 100 | 96.7 | 95.5 | 100 | 80
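The "within 1 category away" row can be cross-checked against the NN classification table: for each observed band, add the diagonal cell and its immediate neighbours. The sketch below does that with the row percentages from the classification table; the variable and function names are illustrative only.

```python
BANDS = ["0-1", "1-2", "2-4", "4-6", "6-24"]  # months to recovery

# Row percentages from the NN classification table (rows observed, columns predicted).
ROW_PCT = [
    [100.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 93.3, 3.3, 3.3, 0.0],
    [4.5, 0.0, 86.4, 9.1, 0.0],
    [0.0, 0.0, 22.2, 77.8, 0.0],
    [0.0, 20.0, 0.0, 10.0, 70.0],
]


def within_one_from_rows(rows):
    """For each observed band, sum the diagonal cell and its adjacent cells."""
    result = []
    for i, row in enumerate(rows):
        lo, hi = max(0, i - 1), min(len(row), i + 2)
        result.append(round(sum(row[lo:hi]), 1))
    return result


for band, pct in zip(BANDS, within_one_from_rows(ROW_PCT)):
    print(band, pct)
```

The sums come to 100.0, 96.6, 95.5, 100.0 and 80.0. The 1-2 month band is 96.6 rather than the reported 96.7, a gap consistent with rounding in the printed percentages.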

Model Evaluation
Accuracy measures for NN, C&RT and CHAID (in weeks), predicted vs. observed. Columns are weeks to recovery.
Model | Measure | 0-1 | 1-2 | 2-3 | 3-4 | 4-8 | 8-16 | 16-96
CHAID | Accuracy (%) | 80 | 65.2 | 80.1 | 58.8 | 38.1 | 17.7 | 45.5
CHAID | Within 1 category away (%) | 88.6 | 93.5 | 88.5 | 88.2 | 47.6 | 23.5 | 45.5
C&RT | Accuracy (%) | 46.0 | 60.4 | 78.3 | 68.4 | 31.8 | 64.7 | 90.9
C&RT | Within 1 category away (%) | 67.6 | 81.2 | 95.7 | 84.2 | 50.0 | 76.5 | 90.9
NN | Accuracy (%) | 18.8 | 65.5 | 25.7 | 4.8 | 15.4 | 0.0 | 6.2
NN | Within 1 category away (%) | 87.5 | 93.1 | 74.3 | 42.9 | 19.2 | 0.0 | 6.2

Sequence Analysis (Both Body Parts and Injuries)
- Injuries in lower extremities seem to result in upper-extremity injuries. Those who have had a strain on the femur are likely to suffer injuries to the shoulder and knee.
- Injuries seem to repeat. Those who have had ankle sprains are more likely to suffer another injury to the ankle.
- Precautionary measures could be used to minimize recurrence.

Antecedent | Consequent | Support (%) | Confidence (%)
Femur and Strain | Shoulder | 25.5 | 22.5
Femur and Strain | Knee | 25.5 | 20.0
Ankle and Sprain | Ankle | 22.3 | 25.7
Ankle and Sprain | Hand | 22.3 | 20.0
Femur and Strain | Femur | 25.5 | 30.0
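To make the support and confidence columns concrete, here is a minimal sketch of how such rule statistics could be computed from per-player injury histories. It assumes one common convention: support is the share of players whose history contains the antecedent, and confidence is the share of those players who later suffer an injury to the consequent body part. The slides do not define either term, and the histories below are invented for illustration.

```python
def rule_stats(histories, antecedent, consequent_part):
    """Return (support, confidence) for 'antecedent, then injury to consequent_part'.

    histories: one time-ordered list of (body_part, injury) events per player.
    """
    with_antecedent = 0
    with_both = 0
    for events in histories:
        if antecedent in events:
            with_antecedent += 1
            first = events.index(antecedent)
            # Only injuries after the first antecedent event count as consequents.
            if any(part == consequent_part for part, _ in events[first + 1:]):
                with_both += 1
    support = with_antecedent / len(histories)
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical histories for four players (not the study's data).
histories = [
    [("Femur", "Strain"), ("Shoulder", "Sprain")],
    [("Femur", "Strain"), ("Femur", "Strain")],
    [("Ankle", "Sprain"), ("Ankle", "Sprain")],
    [("Knee", "Contusion")],
]

support, confidence = rule_stats(histories, ("Femur", "Strain"), "Shoulder")
print(support, confidence)
```

Under this convention the support value depends only on the antecedent, which matches the way the table repeats 25.5 for every "Femur and Strain" rule and 22.3 for every "Ankle and Sprain" rule.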

Recommendations
- Most injuries are treated by training room staff. To shorten recovery time, some of these injuries might be referred to a specialist right from injury onset.
- A significant share of injuries occurred during practice. Adequate safety mechanisms have to be in place during training too.
- When a player is prone to a type of injury given past injury occurrences, precautionary measures should be put in place to minimize future injuries (based on sequence analysis).
- Players in defensive positions are more prone to injuries.

Big Data Research Opportunities
- Social media + healthcare
- Marketing: Twitter analysis
- Sensor data applications (RFID, etc.)

Big Data Research Opportunities
Large-Scale Mining of Social Media Healthcare Data: A Patient-Centered Perspective of Chronic Disease Management

Big Data Research Opportunities
Research areas: Social Media; Big Data Analytics; Disease Management (MDD, COPD)
- General studies on the role of social media in healthcare (physician/provider-driven information dissemination)
- Targeted studies on disease management
- Many BI and analytics studies, but few with Big Data, in advertisement and product adoption
- General healthcare studies in genetics
- Fewer studies in disease management
- Insufficient study using Big Data analytics
- The intent is to perform in-depth comparative analysis on multistructured data for efficient disease management strategies

Modeling Brand Post Popularity in Online Social Networks

Research Motivation
- 60% of social customers recommend a brand to a friend after following the brand on Twitter or Facebook.
- 50% of them are more likely to buy from that brand as well (Constant Contact, 2011).

Research Questions
- How does information move on online social network (OSN) platforms?
- How do users respond to various stimuli?
- How does content become popular on OSNs?

Research Objective
Use the Hawkes process, one of the most widely used point processes in the literature, to shed light on how content popularity on OSNs can be described as a function of time and the number of followers.
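For readers unfamiliar with the Hawkes process, the sketch below evaluates its conditional intensity with an exponential kernel: a baseline rate plus a decaying boost from each past event. The kernel choice and the parameter names (mu, alpha, beta) are assumptions for illustration, since the slides do not give the model specification or say how the number of followers enters it.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: baseline mu plus exponentially decaying excitation.

    Each past event at time ti < t contributes alpha * exp(-beta * (t - ti)).
    """
    excitation = sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )
    return mu + excitation


# Hypothetical times (in hours) at which a brand post was shared.
shares = [0.5, 1.0, 1.2]

for t in (1.5, 3.0, 6.0):
    print(t, round(hawkes_intensity(t, shares, mu=0.2, alpha=0.8, beta=1.0), 4))
```

The intensity is highest shortly after a burst of shares and decays toward the baseline as time passes, which is the self-exciting behaviour that makes this kind of process a natural candidate for describing content popularity.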

Managerial Implications
- Influential users in amplifying brand post popularity
- Role of influential users in engagement with a brand post
- Engaging more influential users during the early life of the brand post
- Incentives to influential users
- Release time for products on the market

Teaching Perspectives Analytics Ecosystem

Analytics Ecosystem
- Data Infrastructure Providers
- Data Warehouse Industry
- Middleware Industry
- Data Aggregators/Distributors
- Analytics-Focused Software Developers
- Analytics Industry Analysts & Influencers
- Academic Providers & Certification Agencies
- Analytics User Organizations
- Application Developers: Industry Specific or General

Data Infrastructure
Category | Examples
Major hardware and storage solutions | IBM, Dell, HP, Oracle, EMC, NetApp
Indigenous hardware and software platform | IBM, Oracle, Teradata
Hardware-independent data solutions | Microsoft SQL Server, MySQL
Specialized integrated software | SAP
Network infrastructure/cloud computing | Amazon, Salesforce.com
Big Data infrastructure services and training | Cloudera, Hortonworks, Hadoop, MapReduce, NoSQL

Data Warehouse
Category | Examples
Data integration services | EMC, IBM, Microsoft, Oracle, SAP, Teradata

Middleware: Business Intelligence Platforms
Category | Examples
Reporting/analytics solutions | IBM Cognos, MicroStrategy, Hyperion (Oracle), Plum, SAP Business Objects

Data Aggregators / Distributors
Category | Examples
Data collection | Comscore, Experian, Google, Nielsen, Omniture, Acxiom

Analytics Application Developers
Category | Examples
Analytics consulting | IBM, SAS, Teradata
Specific solutions/industry specific | Cerner (healthcare), IBM Watson (healthcare and insurance), Sabre (travel industry)
Domain-specific solutions | Acxiom (demographic clustering), FICO & Experian (credit score classification), Demandtec (pricing optimization)
Web/social media/location analytics | Sense Networks (location-based user profiling), X+1 & Rapleaf (email-based profiling), Bluecava (user identification through device usage), Simulmedia (targeted advertising based on user's TV watching)
Specialized analytical solutions | Shazam, Siri (iPhone), Google Now (Android)

Analytics User Organizations
Category | Examples
Private sector, government, education, military, etc. | Too numerous to classify

Analytics Industry Analysts and Influencers
Category | Examples
Private sector, government, education, military, etc. | Too numerous to classify
Professional organizations | Gartner Group, Data Warehousing Institute, Forrester, McKinsey
Professional membership-based societies | INFORMS, Special Interest Group on Decision Support Systems (SIGDSS), Teradata, SAS
Analytics ambassadors | Individual contributors of analytics through seminars, books and other publications

Academic Providers / Certification Agencies
Category | Examples
Analytics academic programs | Business school offerings including courses on information systems, marketing, management sciences, etc.; computer science, statistics, mathematics, and industrial engineering departmental courses
Analytics certification programs | IBM, Microsoft, MicroStrategy, Oracle, SAS, Teradata

Teaching Perspectives
Role of universities and academic programs
- Business school offerings in the areas of information systems, marketing, and management sciences
- Courses from computer science, statistics, mathematics, and industrial engineering departments
- Academic alliance programs: Teradata University Network (teradatauniversitynetwork.com)
- Certificates offered by major vendors

Conclusion
- Exciting time for research and teaching in analytics!
- Questions/Comments?