MERGING BUSINESS KPIs WITH PREDICTIVE MODEL KPIs FOR BINARY CLASSIFICATION MODEL SELECTION
Matthew A. Lanham & Ralph D. Badinelli
Virginia Polytechnic Institute and State University
Department of Business Information Technology (0235)
1007 Pamplin Hall, Blacksburg, VA

Abstract

This study provides an example from a national retailer using binary classification techniques to model the propensity that a product will sell within a certain time horizon. We posit that a firm performing predictive analytics should consider both the statistical performance of a set of candidate models and their performance with respect to the business indicator(s) those models support. Model assessment statistics (e.g. AUC, overall accuracy, etc.) are important metrics that gauge how well a model will predict future observations, but we have found that using them in isolation is insufficient when deciding which model performs optimally with regard to the business. Modeling the propensity that a product will sell in a particular store using several binary classification techniques, we capture their traditional assessment statistics and point out which model would likely be chosen. We show that a better solution is to build a decision model that selects the best forecasts using both traditional assessment statistics and business performance.

Keywords: Analytics, Model Assessment, Model Selection

Introduction

Model assessment statistics (e.g. AUC, overall accuracy, etc.) are important and commonly used to gauge how well a model will predict future observations. We posit that using these statistics in isolation is insufficient when deciding which model performs best with regard to the actual business problem. The analytics movement continues to gain traction among firm executives, so it is more important than ever that practitioners link the complicated algorithms they use to the actual business outcomes those algorithms support.
Whether practitioners are doing this correctly is unclear, but it is becoming apparent that executives are reviewing the results of previously made decisions through their Business Intelligence (BI) platforms. BI is an umbrella term describing analytical concepts and methods that improve managerial decision-making through fact-based support and reporting systems [1]. Recently the focus has shifted slightly from BI to Business Analytics (BA). BA and BI are often used interchangeably, but BA is really a component of BI that provides the value from data analyses and modeling techniques [2]. BA is the scientific process of transforming data into insight for making better decisions [3]. The effective use of BI reporting systems gives upper management insight into the BA solutions provided to, and used by, decision-makers further down the decision-making hierarchy. In turn, BI provides a useful feedback loop to BA practitioners and the decision-support solutions they develop.

We structure this paper as follows: we turn the business problem into an analytics problem, describe the modeling methodologies we employed, detail our decision model selection procedure, and provide some results. Lastly, we discuss the positive impacts and insights of our solution and how we are working to validate our decision model using different constraints and parameters.

Business Problem to Analytics Problem

A retailer's assortment decision asks which products are optimal to offer in a particular location and how much inventory to carry for those products [4]. Determining an assortment plan can happen at various points in a year; the assortment decision involves listing and delisting products over time as consumer demand changes [5]. Thus, decision-makers (e.g. category managers) require parameters that gauge a product's future selling propensity to support their assortment decisions effectively. In cases such as this, refining the business requirements with decision-makers can lead to important discoveries that lead to better decision support. We discovered that our predictive forecasts require certain characteristics to be used effectively.

First, it is important to the assortment decision that the probabilities are discriminatory in nature. We do not mean discriminatory in the classical binary classification sense (e.g. true positives/true negatives), but rather that the probabilities are spread over the entire [0,1] probability space. This allows the decision-maker to identify one SKU that is better than a competing substitutable SKU. For example, if a grocery store had three ketchup products to choose from (e.g. Heinz, Hunts, store brand) but a constraint to put only two into the assortment, probabilities that are all the same value give no basis for discriminating among them.

The second requirement, which we discovered while formulating potential analytical solutions, was that the category managers were not the only stakeholders using such measures for decision-making purposes. Aggregated propensity-to-purchase forecasts are also used by executives higher in the decision-making hierarchy to support strategic planning initiatives (e.g. grow category A by 5%). These executives were not only using these measures, but were able to identify from their BI suites which probability forecasts were performing as expected and which were inconsistent with actual business performance (e.g. percentage sold).

Methodology Selection

Since our response is categorical (i.e. sell or not sell), the analytical problem is a binary classification problem under the predictive modeling umbrella. Binary classification algorithms try to classify an object (e.g. a SKU) into one of only two possible groups. One interesting aspect is that the decision cut-off need not be fifty percent, although that is common practice. Any threshold could be employed for a variety of reasons (e.g. unbalanced data set, problem-specific, etc.). An estimated probability greater than the specified threshold results in a product being classified as a seller, while one less than the threshold results in a non-seller.

To generate our probabilities we chose several commonly used binary classification algorithms: logistic regression, classification tree, C5.0 decision tree, Quest decision tree, CHAID, decision list, linear discriminant analysis (LDA), artificial neural networks using a multi-layer perceptron and a radial basis function, and a support vector machine (SVM) using a radial basis function. We also trained boosted and bagged versions of these models, as well as a heterogeneous ensemble model that used weighted predictions from the other models based on their corresponding model confidence/accuracy.

Data & Model Building

The data set investigated consists of 49,656 records covering one product category from a national retailer. Each observation is a store-SKU combination. For example, store i has j records, where each record is a unique SKU that has sold or not sold over some time horizon. The attributes used and the time horizon employed could not be disclosed due to confidentiality concerns, but the attributes include measures with respect to the store, SKU, and demographic profiles. All records were products that were stocked and sold, stocked and did not sell, or were purchased in some other fashion (e.g. online, in-store, telephone) and delivered to a store for customer pickup. The general predictive model employed by all algorithms was as follows:

\[
Y_{ij} = f(\text{attribute list}), \qquad
Y_{ij} =
\begin{cases}
1 & \text{if one or more units of SKU } i \text{ sold in store } j \\
0 & \text{if no units of SKU } i \text{ sold in store } j
\end{cases}
\]

All algorithms were trained on a 50/50 balanced training data set because the original data set was slightly unbalanced.
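As an illustration of the two ideas above, the following minimal Python sketch balances a training set 50/50 and applies a configurable decision cut-off. It is not the authors' implementation: the paper does not say how balancing was performed, so random undersampling of the majority class is an assumption here, and the function names `balance_50_50` and `classify` are hypothetical.

```python
import numpy as np


def balance_50_50(X, y, seed=0):
    """Randomly undersample the majority class so both classes are equal in size.

    Assumption: undersampling is used; the paper does not state the method.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    idx_pos = np.flatnonzero(y == 1)  # rows where the SKU sold
    idx_neg = np.flatnonzero(y == 0)  # rows where the SKU did not sell
    n = min(len(idx_pos), len(idx_neg))
    keep = np.concatenate([
        rng.choice(idx_pos, size=n, replace=False),
        rng.choice(idx_neg, size=n, replace=False),
    ])
    rng.shuffle(keep)  # avoid leaving the rows ordered by class
    return X[keep], y[keep]


def classify(probabilities, threshold=0.5):
    """Label a SKU a seller (1) when its probability exceeds the threshold."""
    return (np.asarray(probabilities) > threshold).astype(int)


# Toy data: seven sellers and three non-sellers.
X = np.arange(20).reshape(10, 2)
y = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])
X_bal, y_bal = balance_50_50(X, y)
labels = classify([0.2, 0.5, 0.8], threshold=0.5)
```

Lowering `threshold` classifies more SKUs as sellers, which is one way a problem-specific cut-off could be explored.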
Balancing is common practice when data sets are unbalanced, as the algorithms otherwise tend to build a model that classifies the majority class better than the minority class. The models were trained and assessed using a 70/30 percent training/testing partition with 10-fold cross-validation. To assess the models we report common statistical performance measures, such as area under the curve (AUC), overall classification accuracy (using a 50% cutoff), as well as lift and profit measures. In practice many competing models are generated and the retailer will often choose one among the set.

Results

Plotting the percentage of products sold against the forecasted probabilities, grouped into five percent bins, compares actual business performance with expected business performance over that specific time horizon. Ideally, if the models produce correct probabilities over this horizon, the points would follow a 45 degree line, shown as the black dotted line in the Figure 1 plots. The Linear Discriminant Analysis (LDA) model is very consistent for each probability bin, but consistently underperforms compared to the actual percentage that sold. The boosted neural network radial basis function (BANN_RBF) model revealed that approximately 35 percent of SKUs within every probability bin previously sold. Since the number of SKUs can vary within each probability bin, such an occurrence can happen, but may not be obvious based on the plots alone.

Figure 1: Percentage sold of products versus estimated probability to sell for each model (Note that not all models are shown due to proceedings length restrictions).

Interestingly, most of the models perform well statistically (e.g. ROC/AUC, etc.) as shown in Table 1, but some perform better than others with regard to actual business performance.

Traditional Assessment & Model Selection Measures: Max Profit, Max Profit Occurs in (%), Lift, Overall Accuracy, Area Under Curve
Customized Business Assessment & Selection Measures: B0-Intercept, B1-Slope, R-squared

Model
Logit / / / / /
CART / / / / /
C5.0 / / / / /
Quest / / / / /
CHAID / / / / /
Decision list / / / / /
LDA / / / / /
ANN (MLP) / / / / /
ANN (RBF) / / / / /
SVM (RBF) 1340 / / 1 1 / / /
Boosted CART (30) / / / / /
Boosted C5.0 (30) 7330 / / / / /
Boosted Quest (30) / / / / /
Boosted CHAID (30) / / / / /
Boosted ANN MLP (10) / / / / /
Boosted ANN RBF (10) / / / / /
Bagged CART (30) / / / / /
Bagged Quest (30) / / / / /
Bagged CHAID (30) / / / / /
Bagged ANN MLP (10) / / / / /
Bagged ANN RBF (10) / / / / /
Heterogeneous Ensemble / /

Table 1: Statistical and business assessment and selection measures

In practice, the model with the best predictive model assessment statistic would often be chosen. In this case, the Boosted C5.0 decision tree is best (i.e. AUC = 81.8%). From a business perspective, this model performs well, as can be seen above. The values follow a linear trend line (i.e. R-squared = 0.995) and the probabilities are nicely dispersed across the entire probability space. However, the probabilities generated are pessimistic (i.e. slope = 0.828) compared to actual sales performance. Next, we show how we address this problem to create a solution that is more robust with regard to the business and also leads to increased overall classification accuracy.

Deployment - Decision Model

Our solution takes advantage of all of the intelligent models available. We accomplish this by selecting, for each five percent bin, the model whose probabilities are closest to the actual percentage sold. We incorporate constraints so that poor models (i.e. those that do not generate probabilities in at least half the bins, have an AUC less than 60%, etc.) are excluded from being used even if they have probabilities that are perfectly aligned with actual business performance.
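The bin-wise selection just described can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the authors' decision model: the 45 degree line value of a bin is taken to be the bin midpoint, only the two exclusion rules named in the paragraph above (AUC of at least 60% and probabilities in at least half the bins) are applied, and the names `bin_errors`, `is_eligible`, `select_models`, and the penalty value are hypothetical.

```python
import numpy as np

K = 1e6  # large penalty for bins where a model produced no probabilities


def bin_errors(probs, sold, n_bins=20, penalty=K):
    """Squared error between % sold and the 45 degree line, per probability bin.

    Assumption: the 45 degree line value of a bin is its midpoint.
    """
    probs = np.asarray(probs, dtype=float)
    sold = np.asarray(sold, dtype=float)
    # Bin index 0..n_bins-1; a probability of exactly 1.0 falls in the last bin.
    idx = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    line = (np.arange(n_bins) + 0.5) / n_bins
    theta = np.full(n_bins, float(penalty))
    for j in range(n_bins):
        in_bin = idx == j
        if in_bin.any():
            theta[j] = (sold[in_bin].mean() - line[j]) ** 2
    return theta


def is_eligible(theta, auc, penalty=K):
    """Keep models with AUC >= 0.60 and probabilities in at least half the bins."""
    covered = np.mean(np.asarray(theta) < penalty)
    return bool(auc >= 0.60 and covered >= 0.5)


def select_models(theta, eligible, penalty=K):
    """Pick, for each bin, the eligible model with the smallest squared error.

    theta has shape (n_models, n_bins). A bin that no eligible model covers
    is reported as -1.
    """
    theta = np.asarray(theta, dtype=float)
    mask = np.asarray(eligible, dtype=bool)[:, None]
    masked = np.where(mask, theta, np.inf)
    choice = masked.argmin(axis=0)
    choice[masked.min(axis=0) >= penalty] = -1
    return choice


# Toy example with four bins instead of twenty, purely for readability.
theta_a = bin_errors([0.1, 0.1, 0.6, 0.6], [0, 0, 1, 1], n_bins=4)
theta_b = bin_errors([0.1, 0.1, 0.6, 0.6, 0.9], [0, 1, 1, 0, 1], n_bins=4)
eligible = [is_eligible(theta_a, auc=0.75), is_eligible(theta_b, auc=0.70)]
chosen = select_models([theta_a, theta_b], eligible)
```

Because each bin is chosen independently once ineligible models are masked out, a simple per-bin argmin is enough for this sketch; a formulation with constraints that couple bins would need an integer programming solver instead.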
Terms and definitions:

$x_{ij}$ = 1 if the probabilities from model $i$ are chosen for bin $j$, 0 otherwise; $x_{ij} \in \{0,1\}$; $i = 1, \dots, N$; $j = 1, \dots, M$
$\beta_{0,i}$ = the estimated intercept parameter corresponding to linear regression model $i$; $i = 1, \dots, N$
$\beta_{1,i}$ = the estimated slope parameter corresponding to linear regression model $i$; $i = 1, \dots, N$
$\rho_i$ = the R-squared statistic corresponding to linear regression model $i$; $i = 1, \dots, N$
$\alpha_i$ = the area under the curve for model $i$ based on the testing set; $i = 1, \dots, N$
$\tau_i$ = the overall training accuracy of model $i$ based on a 50% decision cutoff threshold; $i = 1, \dots, N$
$\varphi_i$ = the out-of-sample testing accuracy of model $i$; $i = 1, \dots, N$
$\hat{y}_{ij}$ = the 45 degree line value for model $i$ and bin $j$; $i = 1, \dots, N$; $j = 1, \dots, M$
$y_{ij}$ = the percentage sold for model $i$ within bin $j$; $i = 1, \dots, N$; $j = 1, \dots, M$
$K$ = a large value used to penalize the objective function
$\theta_{ij}$ = the squared error of the percentage sold to the 45 degree line for model $i$ and bin $j$, such that

\[
\theta_{ij} =
\begin{cases}
(\hat{y}_{ij} - y_{ij})^2 & \text{if } y_{ij} \text{ exists} \\
K & \text{otherwise}
\end{cases}
\]

Objective function:

\[
\min \sum_{i=1}^{N} \sum_{j=1}^{M} \theta_{ij}\, x_{ij}
\]

Constraints:

\[
\begin{aligned}
&(1)\quad x_{ij}(\varphi_i - \tau_i) \le 0.05 \quad \forall j, i && \text{(use only valid models)} \\
&(2)\quad x_{ij}(\tau_i - \varphi_i) \le 0.10 \quad \forall j, i && \text{(ignore overfit models)} \\
&(3)\quad -0.45 \le x_{ij}\,\beta_{0,i} \le 0.45 \quad \forall j, i && \text{(reasonable intercept)} \\
&(4)\quad 0.70 \le x_{ij}\,\beta_{1,i} \le 1.30 \quad \forall j, i && \text{(reasonable slope)} \\
&(5)\quad 0.50 \le x_{ij}\,\rho_i \le 1 \quad \forall j, i && \text{(linear fit model points close to line)} \\
&(6)\quad x_{ij}\,\alpha_i \ge 0.60 \quad \forall j, i && \text{(probabilities can only come from intelligent models)} \\
&(7)\quad \frac{1}{M}\sum_{j=1}^{M} \delta(\theta_{ij} - K) \le 0.5 && \text{(only use models having probabilities that exist in more than half the [0,1] space)} \\
&(8)\quad x_{ij} \in \{0,1\} && \text{(binary decision)}
\end{aligned}
\]

where

\[
\delta(\theta_{ij} - K) =
\begin{cases}
1 & \text{if } \theta_{ij} - K = 0 \\
0 & \text{if } \theta_{ij} - K \ne 0
\end{cases}
\]

Our decision model led to using forecasts from ten different predictive models. Among the 22 models we generated, nine were excluded from consideration because they did not meet the inclusion criteria (e.g. constraint (6)). Models excluded came from the basic set, boosted set, bagged set, as well as the heterogeneous ensemble model. The final set of models selected for each bin, as shown in Figure 2, achieved a classification accuracy of 76.6%, which is 3.54% better than the best single model, but more importantly leads to propensity-to-purchase estimates that follow business performance more closely (e.g. R-squared of 99.1%).

Figure 2: Final Selection Results

Conclusions & Future Research

By modeling the propensity that a product will sell in a particular store using several binary classification techniques and reviewing their traditional assessment statistics, we identify the model that would likely have been chosen and put into production in practice. Interestingly, the statistically optimal model does not perform optimally with regard to the business. If many competing models are generated, which is common, a retailer could choose to strategically use all intelligent forecasts that are most likely to match business performance. This not only reduces uncertainty for planning purposes, but also helps achieve buy-in from the direct decision-makers who rely on the decision-support solutions that analytics professionals generate. We are currently working on validating different decision models having different constraints and parameters to identify a model that performs optimally for many product categories.

References

[1] Chen, H., R.H. Chiang, and V.C. Storey, Business intelligence research. MISQ Special Issue (forthcoming),
[2] Wixom, B., et al., The current state of business intelligence in academia. Communications of the Association for Information Systems, (1): p. 16.
[3] INFORMS, INFORMS Certified Analytics Professional (CAP) Examination Study Guide. 2014:
[4] Kök, A.G., M.L. Fisher, and R. Vaidyanathan, Assortment planning: Review of literature and industry practice, in Retail Supply Chain Management. 2015, Springer. p
[5] Hübner, A.H. and H. Kuhn, Retail category management: State-of-the-art review of quantitative research and software applications in assortment and shelf space management. Omega, (2): p
A supply chain analytics approach to product assortment optimization
A supply chain analytics approach to product assortment optimization Rajesh Kumar ([email protected]), Vishwanathan Rajagopalan, Jayesh Baldania, Nidhi Sagar, Santanu Sinha, Priyanka Dahiya, Jyotirmay
Azure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris
Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines
Ensemble Modeling with R
Doctoral Candidate/Merchandise Data Scientist MatthewALanham.com Virginia Tech Department of Business Information Technology Advance Auto Parts, Inc. Outline Outline My Background and Research Pros and
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
Retail Category Management
Alexander Hiibner Retail Category Management Decision Support Systems for Assortment, Shelf Space, Inventory and Price Planning fyj. Springer Contents 1 Outline 1 1.1 Background and Motivation 1 1.2 Objectives
THE THREE "Rs" OF PREDICTIVE ANALYTICS
THE THREE "Rs" OF PREDICTIVE As companies commit to big data and data-driven decision making, the demand for predictive analytics has never been greater. While each day seems to bring another story of
Data Mining. Nonlinear Classification
Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Nonlinear Classification Classes may not be separable by a linear boundary Suppose we randomly generate a data set as follows: X has range between 0 to 15
MAXIMIZING RETURN ON DIRECT MARKETING CAMPAIGNS
MAXIMIZING RETURN ON DIRET MARKETING AMPAIGNS IN OMMERIAL BANKING S 229 Project: Final Report Oleksandra Onosova INTRODUTION Recent innovations in cloud computing and unified communications have made a
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
Business Analytics and Credit Scoring
Study Unit 5 Business Analytics and Credit Scoring ANL 309 Business Analytics Applications Introduction Process of credit scoring The role of business analytics in credit scoring Methods of logistic regression
Beating the NCAA Football Point Spread
Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 11 Sajjad Haider Fall 2013 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
Accenture Perfect CPG Analytics. End-to-end analytics services for fact-based business decisions and high-performing execution
Accenture Perfect CPG Analytics End-to-end analytics services for fact-based business decisions and high-performing execution Moving from insights to action at speed Consumer Packaged Goods (CPG) companies
COPYRIGHTED MATERIAL. Contents. List of Figures. Acknowledgments
Contents List of Figures Foreword Preface xxv xxiii xv Acknowledgments xxix Chapter 1 Fraud: Detection, Prevention, and Analytics! 1 Introduction 2 Fraud! 2 Fraud Detection and Prevention 10 Big Data for
Affinity Insight Retail Basket Analysis
Affinity Insight Retail Basket Analysis Shantanu Goswami. SAP Data Science. 2014 Legal disclaimer The information in this presentation is confidential and proprietary to SAP and may not be disclosed without
Gerry Hobbs, Department of Statistics, West Virginia University
Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit
APPLICATION OF DATA MINING TECHNIQUES FOR DIRECT MARKETING. Anatoli Nachev
86 ITHEA APPLICATION OF DATA MINING TECHNIQUES FOR DIRECT MARKETING Anatoli Nachev Abstract: This paper presents a case study of data mining modeling techniques for direct marketing. It focuses to three
Planning Demand For Profit-Driven Supply Chains
Demand Planning for Profit-Driven Supply Chains epaper / Adexa Common epaper Series Pitfalls in Supply Chain System Implementations Author: William H. Green Planning Demand For Profit-Driven Supply Chains
Make Better Decisions Through Predictive Intelligence
IBM SPSS Modeler Professional Make Better Decisions Through Predictive Intelligence Highlights Easily access, prepare and model structured data with this intuitive, visual data mining workbench Rapidly
Nine Common Types of Data Mining Techniques Used in Predictive Analytics
1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
Knowledge Discovery and Data Mining
Knowledge Discovery and Data Mining Unit # 10 Sajjad Haider Fall 2012 1 Supervised Learning Process Data Collection/Preparation Data Cleaning Discretization Supervised/Unuspervised Identification of right
A Decision-Support System for New Product Sales Forecasting
A Decision-Support System for New Product Sales Forecasting Ching-Chin Chern, Ka Ieng Ao Ieong, Ling-Ling Wu, and Ling-Chieh Kung Department of Information Management, NTU, Taipei, Taiwan [email protected],
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Data Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
Discovering, Not Finding. Practical Data Mining for Practitioners: Level II. Advanced Data Mining for Researchers : Level III
www.cognitro.com/training Predicitve DATA EMPOWERING DECISIONS Data Mining & Predicitve Training (DMPA) is a set of multi-level intensive courses and workshops developed by Cognitro team. it is designed
Data Mining. Dr. Saed Sayad. University of Toronto 2010 [email protected]. http://chem-eng.utoronto.ca/~datamining/
Data Mining Dr. Saed Sayad University of Toronto 2010 [email protected] http://chem-eng.utoronto.ca/~datamining/ 1 Data Mining Data mining is about explaining the past and predicting the future by
Behavioral Segmentation
Behavioral Segmentation TM Contents 1. The Importance of Segmentation in Contemporary Marketing... 2 2. Traditional Methods of Segmentation and their Limitations... 2 2.1 Lack of Homogeneity... 3 2.2 Determining
SOLUTION OVERVIEW SAS MERCHANDISE INTELLIGENCE. Make the right decisions through every stage of the merchandise life cycle
SOLUTION OVERVIEW SAS MERCHANDISE INTELLIGENCE Make the right decisions through every stage of the merchandise life cycle Deliver profitable returns and rewarding customer experiences Challenges Critical
Data-Driven Decisions: Role of Operations Research in Business Analytics
Data-Driven Decisions: Role of Operations Research in Business Analytics Dr. Radhika Kulkarni Vice President, Advanced Analytics R&D SAS Institute April 11, 2011 Welcome to the World of Analytics! Lessons
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Classification of Bad Accounts in Credit Card Industry
Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition
A Property & Casualty Insurance Predictive Modeling Process in SAS
Paper AA-02-2015 A Property & Casualty Insurance Predictive Modeling Process in SAS 1.0 ABSTRACT Mei Najim, Sedgwick Claim Management Services, Chicago, Illinois Predictive analytics has been developing
Advanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets
Applied Data Mining Analysis: A Step-by-Step Introduction Using Real-World Data Sets http://info.salford-systems.com/jsm-2015-ctw August 2015 Salford Systems Course Outline Demonstration of two classification
KnowledgeSTUDIO HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES
HIGH-PERFORMANCE PREDICTIVE ANALYTICS USING ADVANCED MODELING TECHNIQUES Translating data into business value requires the right data mining and modeling techniques which uncover important patterns within
A Study on the Comparison of Electricity Forecasting Models: Korea and China
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 675 683 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.675 Print ISSN 2287-7843 / Online ISSN 2383-4757 A Study on the Comparison
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
A Simple Introduction to Support Vector Machines
A Simple Introduction to Support Vector Machines Martin Law Lecture for CSE 802 Department of Computer Science and Engineering Michigan State University Outline A brief history of SVM Large-margin linear
Improving Demand Forecasting
Improving Demand Forecasting 2 nd July 2013 John Tansley - CACI Overview The ideal forecasting process: Efficiency, transparency, accuracy Managing and understanding uncertainty: Limits to forecast accuracy,
Course Syllabus. Purposes of Course:
Course Syllabus Eco 5385.701 Predictive Analytics for Economists Summer 2014 TTh 6:00 8:50 pm and Sat. 12:00 2:50 pm First Day of Class: Tuesday, June 3 Last Day of Class: Tuesday, July 1 251 Maguire Building
Towards applying Data Mining Techniques for Talent Mangement
2009 International Conference on Computer Engineering and Applications IPCSIT vol.2 (2011) (2011) IACSIT Press, Singapore Towards applying Data Mining Techniques for Talent Mangement Hamidah Jantan 1,
Equity forecast: Predicting long term stock price movement using machine learning
Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK [email protected] Abstract Long
Data Mining Methods: Applications for Institutional Research
Data Mining Methods: Applications for Institutional Research Nora Galambos, PhD Office of Institutional Research, Planning & Effectiveness Stony Brook University NEAIR Annual Conference Philadelphia 2014
Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016
Event driven trading new studies on innovative way of trading in Forex market Michał Osmoła INIME live 23 February 2016 Forex market From Wikipedia: The foreign exchange market (Forex, FX, or currency
IBM SPSS Modeler Professional
IBM SPSS Modeler Professional Make better decisions through predictive intelligence Highlights Create more effective strategies by evaluating trends and likely outcomes. Easily access, prepare and model
Analysis of Bayesian Dynamic Linear Models
Analysis of Bayesian Dynamic Linear Models Emily M. Casleton December 17, 2010 1 Introduction The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). The main
Welcome. Data Mining: Updates in Technologies. Xindong Wu. Colorado School of Mines Golden, Colorado 80401, USA
Welcome Xindong Wu Data Mining: Updates in Technologies Dept of Math and Computer Science Colorado School of Mines Golden, Colorado 80401, USA Email: xwu@ mines.edu Home Page: http://kais.mines.edu/~xwu/
Strengthening Diverse Retail Business Processes with Forecasting: Practical Application of Forecasting Across the Retail Enterprise
Paper SAS1833-2015 Strengthening Diverse Retail Business Processes with Forecasting: Practical Application of Forecasting Across the Retail Enterprise Alex Chien, Beth Cubbage, Wanda Shive, SAS Institute
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century
An Overview of Data Mining: Predictive Modeling for IR in the 21 st Century Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIRPO
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
Machine Learning Logistic Regression
Machine Learning Logistic Regression Jeff Howbert Introduction to Machine Learning Winter 2012 1 Logistic regression Name is somewhat misleading. Really a technique for classification, not regression.
Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research [email protected]
Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research [email protected] Introduction Logistics Prerequisites: basics concepts needed in probability and statistics
Get to Know the IBM SPSS Product Portfolio
IBM Software Business Analytics Product portfolio Get to Know the IBM SPSS Product Portfolio Offering integrated analytical capabilities that help organizations use data to drive improved outcomes 123
Data Mining + Business Intelligence. Integration, Design and Implementation
Data Mining + Business Intelligence Integration, Design and Implementation ABOUT ME Vijay Kotu Data, Business, Technology, Statistics BUSINESS INTELLIGENCE - Result Making data accessible Wider distribution
ElegantJ BI. White Paper. The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis
ElegantJ BI White Paper The Competitive Advantage of Business Intelligence (BI) Forecasting and Predictive Analysis Integrated Business Intelligence and Reporting for Performance Management, Operational
HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION
HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan
Executive Master's in Business Administration Program
Executive Master's in Business Administration Program College of Business Administration 1. Introduction Program Mission: The UOS EMBA program has been designed to deliver high quality management education
Evaluating the Effectiveness of Dynamic Pricing Strategies on MLB Single-Game Ticket Revenue
Evaluating the Effectiveness of Dynamic Pricing Strategies on MLB Single-Game Ticket Revenue Joseph Xu, Peter Fader, Senthil Veeraraghavan The Wharton School, University of Pennsylvania Philadelphia, PA,
WebFOCUS RStat. RStat. Predict the Future and Make Effective Decisions Today. WebFOCUS RStat
Information Builders enables agile information solutions with business intelligence (BI) and integration technologies. WebFOCUS, the most widely utilized business intelligence platform, connects to any enterprise
How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK
How Organisations Are Using Data Mining Techniques To Gain a Competitive Advantage John Spooner SAS UK Agenda Analytics why now? The process around data and text mining Case Studies The Value of Information
An Overview of Predictive Analytics for Practitioners. Dean Abbott, Abbott Analytics
An Overview of Predictive Analytics for Practitioners Dean Abbott, Abbott Analytics Thank You Sponsors Empower users with new insights through familiar tools while balancing the need for IT to monitor
Chapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
Why Ensembles Win Data Mining Competitions
Why Ensembles Win Data Mining Competitions A Predictive Analytics Center of Excellence (PACE) Tech Talk November 14, 2012 Dean Abbott Abbott Analytics, Inc. Blog: http://abbottanalytics.blogspot.com URL:
Ensembles and PMML in KNIME
Ensembles and PMML in KNIME Alexander Fillbrunn 1, Iris Adä 1, Thomas R. Gabriel 2 and Michael R. Berthold 1,2 1 Department of Computer and Information Science Universität Konstanz Konstanz, Germany [email protected]
Supply & Demand Management
Supply & Demand Management Planning and Executing Across the Entire Supply Chain Strategic Planning Demand Management Replenishment/Order Optimization Collaboration/ Reporting & Analytics Network Optimization
Principles of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
Business Analytics Syllabus
B6101 Business Analytics Fall 2014 Business Analytics Syllabus Course Description Business analytics refers to the ways in which enterprises such as businesses, non-profits, and governments can use data
Increasing Demand Insight and Forecast Accuracy with Demand Sensing and Shaping. Ganesh Wadawadigi, Ph.D. VP, Supply Chain Solutions, SAP
Increasing Demand Insight and Forecast Accuracy with Demand Sensing and Shaping Ganesh Wadawadigi, Ph.D. VP, Supply Chain Solutions, SAP Legal disclaimer The information in this presentation is confidential
Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
What is Data Mining? MS4424 Data Mining & Modelling
MS4424 Data Mining & Modelling Lecturer: Dr Iris Yeung Room No: P7509 Tel No: 2788 8566 Email: [email protected] 1 Aims To introduce the basic concepts of data mining
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
Predictive Dynamix Inc
Predictive Modeling Technology Predictive modeling is concerned with analyzing patterns and trends in historical and operational data in order to transform data into actionable decisions. This is accomplished
Predictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath [email protected] National Institute of Industrial Engineering (NITIE) Vihar
Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com
SPSS-SA Silvermine House Steenberg Office Park, Tokai 7945 Cape Town, South Africa Telephone: +27 21 702 4666 www.spss-sa.com SPSS-SA Training Brochure 2009 TABLE OF CONTENTS 1 SPSS TRAINING COURSES FOCUSING
Example: In credit card default, we may be more interested in predicting the probability of a default than in classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Knowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes
Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk [email protected] Tom Kelsey ID5059-19-B &
Enhancing Compliance with Predictive Analytics
Enhancing Compliance with Predictive Analytics FTA 2007 Revenue Estimation and Research Conference Reid Linn Tennessee Department of Revenue [email protected] Sifting through a Gold Mine of Tax Data
Forecasting Stock Prices using a Weightless Neural Network. Nontokozo Mpofu
Forecasting Stock Prices using a Weightless Neural Network Nontokozo Mpofu Abstract In this research work, we propose forecasting stock prices in the stock market industry in Zimbabwe using a Weightless
A Comparison of Leading Data Mining Tools
A Comparison of Leading Data Mining Tools John F. Elder IV & Dean W. Abbott Elder Research Fourth International Conference on Knowledge Discovery & Data Mining Friday, August 28, 1998 New York, New York
E-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee [email protected] Seunghee Ham [email protected] Qiyi Jiang [email protected] I. INTRODUCTION Due to the increasing popularity of e-commerce
The Value of Connecting Supply Data to Demand
The Value of Connecting Supply Data to Demand Companies achieve over 10% sales lift by connecting supply chain data to demand when launching new products. Changing the Product Launch Conversation Companies
TDWI Best Practice BI & DW Predictive Analytics & Data Mining
TDWI Best Practice BI & DW Predictive Analytics & Data Mining Course Length : 9am to 5pm, 2 consecutive days 2012 Dates : Sydney: July 30 & 31 Melbourne: August 2 & 3 Canberra: August 6 & 7 Venue & Cost
Benchmarking of different classes of models used for credit scoring
Benchmarking of different classes of models used for credit scoring We use this competition as an opportunity to compare the performance of different classes of predictive models. In particular we want
Scalable Developments for Big Data Analytics in Remote Sensing
Scalable Developments for Big Data Analytics in Remote Sensing Federated Systems and Data Division Research Group High Productivity Data Processing Dr.-Ing. Morris Riedel et al. Research Group Leader,
Start-up Companies Predictive Models Analysis. Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov
Start-up Companies Predictive Models Analysis Boyan Yankov, Kaloyan Haralampiev, Petko Ruskov Abstract: Quantitative research is performed to derive a model for predicting the success of Bulgarian start-up
New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Introduction
Introduction New Work Item for ISO 3534-5 Predictive Analytics (Initial Notes and Thoughts) Predictive analytics encompasses the body of statistical knowledge supporting the analysis of massive data sets.
15 : Demand Forecasting
15 : Demand Forecasting 1 Session Outline Demand Forecasting Why Forecast Demand? Business environment is uncertain, volatile, dynamic and risky. Better business decisions can be taken if uncertainty can
