The Operational Value of Social Media Information
Dennis J. Zhang (Kellogg School of Management), Ruomeng Cui (Kelley School of Business), Santiago Gallino (Tuck School of Business), Antonio Moreno-Garcia (Kellogg School of Management)

Social Media and Customer Interaction
- Social media are computer-mediated tools that allow people to create, share, or exchange information in virtual communities and networks.
- 83% of Forbes 500 companies use at least one social media site to interact with customers.
- The top 500 retailers in the U.S. earned $3.3 billion from social shopping in 2014 (26% higher than in 2013).
11/17/15

Facebook
- Facebook accounts for 65% of total social revenue.

Research Question
- In practice, with limited data, can we improve sales forecast accuracy, and in turn inventory management, by incorporating publicly available social media information? If so, by how much?
- How should we select forecasting models when incorporating social media information?
- How should we extract features from social media information to improve forecast accuracy?

Roadmap
- Data
- Features
- Algorithms
- Framework
- Results and Implications
Sales and Social Data (2013/1 to 2013/8)
Online apparel company.

Sales and operational data:
- Daily sales and revenue data (2 years)
- The company's promotion and marketing data (1 year)
- The company's own daily sales forecast

Social media data (from the Facebook API):
- The company's 1.5k posts, 24k comments, and 22k likes from more than 7k Facebook users (2 years)

[Figure: normalized daily comments, posts, and sales, January to July (data with forecast)]

Forecasting Framework: Features
Basic features (per day):
- Past sales, advertisement, promotion data, seasonality
Social features (per day):
- Non-textual features: number of posts, number of comments from users, number of unique users who visited, etc.
- Textual features: number of unique words in comments, average sentiment of comments, etc. (Socher et al. 2013)

For each day i, sales are forecast from the features of the preceding days:
$\text{Sales}_i = f(\text{Features}_{i-1}, \text{Features}_{i-2}, \text{Features}_{i-3}, \ldots)$
In our algorithm, we use the past 7 days of data, and each day has 4 features. Therefore, our forecasting function takes 7 * 4 = 28 features in total.
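The lagged-feature construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `build_lagged_rows`, the ordering of lags within a row, and the toy numbers are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def build_lagged_rows(
    daily_features: Sequence[Sequence[float]],
    sales: Sequence[float],
    n_lags: int = 7,
) -> Tuple[List[List[float]], List[float]]:
    """Return (X, y). Row k of X concatenates the feature vectors of the
    n_lags days before the target day (most recent day first); y holds the
    target day's sales. Hypothetical helper, for illustration only."""
    if len(daily_features) != len(sales):
        raise ValueError("need one feature vector per sales day")
    X: List[List[float]] = []
    y: List[float] = []
    for i in range(n_lags, len(sales)):
        row: List[float] = []
        for lag in range(1, n_lags + 1):
            row.extend(daily_features[i - lag])  # Features_{i-lag}
        X.append(row)
        y.append(sales[i])  # Sales_i
    return X, y


# Toy data (made up): 10 days, 4 features per day.
features = [[float(d), d + 0.1, d + 0.2, d + 0.3] for d in range(10)]
sales = [100.0 + d for d in range(10)]
X, y = build_lagged_rows(features, sales)
print(len(X), len(X[0]))  # number of usable days, features per row (7 * 4)
```

With 10 days of toy data and 7 lags, only the last 3 days have a full history, so the first 7 days serve purely as inputs.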
Forecasting Framework: Forecast Algorithm
Forecasting models (linear and nonlinear, with variable selection):
- Linear models: linear regression, Lasso, and forward selection
- Support vector machines: linear and radial kernels
- Ensemble models: gradient boosting model and random forest (Leo Breiman (2001); Andy Liaw and Matthew Wiener (2015))

[Figure: random forest schematic. Trees 1 to N split on features such as sales at t-1 and sentiment at t-1, each tree predicts sales at t, and the trees' predictions are combined by voting.]

Forecasting Framework: Forecast Framework
Training period: 90 days. Testing period: 45 days.

Training period: 10-fold cross-validation with 3 repeats (industry standard) to tune hyper-parameters.
Performance metrics, where $F_i$ is the forecast and $S_i$ the actual sales on day $i$:
- Mean Absolute Percentage Error: $\text{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|F_i - S_i|}{S_i}$
- Root Mean Squared Error: $\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (F_i - S_i)^2}$

Testing period: we update our forecast models every 5 days, so each 45-day testing period is covered by a sequence of refitted models.
Prediction horizon (1 to 7): the number of days ahead we predict (i.e., prediction horizon = 2 means we predict the sales of the day after tomorrow).
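The two performance metrics can be written directly from their definitions. The sketch below assumes forecasts and actual sales arrive as equal-length lists and that no actual sales value is zero; the function names and the sample numbers are illustrative only.

```python
import math
from typing import Sequence


def mape(forecasts: Sequence[float], sales: Sequence[float]) -> float:
    """Mean absolute percentage error, returned as a fraction (0.1 = 10%)."""
    n = len(sales)
    return sum(abs(f - s) / s for f, s in zip(forecasts, sales)) / n


def rmse(forecasts: Sequence[float], sales: Sequence[float]) -> float:
    """Root mean squared error, in the same units as sales."""
    n = len(sales)
    return math.sqrt(sum((f - s) ** 2 for f, s in zip(forecasts, sales)) / n)


# Made-up numbers: two forecasts are off by 10%, one is exact.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(mape(forecast, actual))
print(rmse(forecast, actual))
```

MAPE is scale-free, which makes it convenient for comparing forecasts across periods with different sales volumes, while RMSE penalizes large misses more heavily.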
Forecasting Framework: Information Value
- Baseline forecast: basic features + ML model + framework
- Social media forecast: basic features + social features + ML model + framework
- Company forecast: basic features + company proprietary information + unknown model + framework
Value of social media information: social media forecast compared with the baseline forecast.
Value of machine learning models: baseline forecast compared with the company forecast.

Forecasting: Main Results
Model: random forest (with number of trees and tree depth as tunable variables)
Training: 2013/1/1 to 2013/4/1 (90 days)
Testing: 2013/4/2 to 2013/5/6 (45 days)

[Figure: MAPE of sales forecasts over different prediction horizons. x-axis: prediction horizon (days); y-axis: MAPE (%); series: base forecast, social forecast, company forecast; differences annotated with significance stars.]
Different Starting Times
[Figure: MAPE of sales forecasts over different starting dates (prediction horizon = 1). x-axis: starting date (Jan 1, Jan 15, Feb 1, Feb 15, Mar 1); y-axis: MAPE (%); series: base forecast, social forecast, company forecast.]
The value of social information is significantly positive throughout different training and testing periods.

Different Machine Learning Models
[Figure: MAPE of sales forecasts over different models (prediction horizon = 1). Linear models: linear regression, forward selection, Lasso; SVM: linear, radial; ensemble models: GBM, random forest. Series: base data, social data; y-axis: MAPE (%).]
Social media information is valuable if the statistical model:
- is nonlinear in features
- has variable selection
Model Inspection
How important is social media information in forecasting models?
Random forest: weighted-average Gini index.
[Figure: top 20 features with the highest Gini importance, colored by type (base, non-textual, textual). Labeled features include past sales, sentiment, and past comments.]
- Social media information is important in forecasting.
- Textual processing of social media information is helpful.

Take-Away
- Social media information helps! In practice, with limited data, we can reduce MAPE by 15% when incorporating social media information.
- Forecasting models that are nonlinear and can actively select variables benefit more from incorporating social media information.
- Counting social media activities alone is not enough. Textual analysis of comments and posts is as important, if not more so.
Model Inspection
How important is social media information in forecasting models?
Random forest: weighted-average Gini (impurity) index.
For example, consider a box of balls, half of them red and half of them green. We want to predict the color of a ball based on two binary features, A and B. All balls with A = 1 are red. Half of the balls with B = 1 are red. Which feature is more important?
- Gini impurity index: $G = \sum_{i=1}^{n_c} p_i (1 - p_i)$, where $p_i$ is the fraction of items in class $i$ and $n_c$ is the number of classes.
- Gini index of a node: $I_n = G_{\text{parent}} - \sum_i P_i \, G_{\text{split}_i}$, where $P_i$ is the fraction of the parent's items that fall into child $i$.

Model Inspection: Forecast Algorithm
How important is social media information in forecasting models?
Lasso: how many variables are left from the basic features and the social features?
$\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ \frac{1}{2} \sum_{i=1}^{N} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$
We use a simple and efficient Lasso implementation: Least Angle Regression (Efron et al. 2003).
We can now inspect the model to see which variables are more correlated with our forecasted sales.
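The red and green ball example can be worked through numerically. The sketch below fills in details the slide leaves open, and these are assumptions: the box holds 8 balls, every ball with A = 0 is green, and half of the balls with B = 0 are red. Under those assumptions, splitting on A removes all impurity and splitting on B removes none.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """G = sum over classes of p_i * (1 - p_i)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())


def impurity_decrease(labels: Sequence[Hashable], feature: Sequence[int]) -> float:
    """G_parent minus the size-weighted Gini impurity of each child node
    obtained by splitting on the given feature."""
    n = len(labels)
    decrease = gini_impurity(labels)
    for value in set(feature):
        child = [lab for lab, f in zip(labels, feature) if f == value]
        decrease -= (len(child) / n) * gini_impurity(child)
    return decrease


# Assumed toy box: 4 red and 4 green balls.
colors = ["red"] * 4 + ["green"] * 4
A = [1, 1, 1, 1, 0, 0, 0, 0]  # A = 1 exactly for the red balls (assumption for A = 0)
B = [1, 1, 0, 0, 1, 1, 0, 0]  # half of each B group is red (assumption for B = 0)

print(gini_impurity(colors))         # impurity of the unsplit box
print(impurity_decrease(colors, A))  # A separates the colors perfectly
print(impurity_decrease(colors, B))  # B carries no information about color
```

A random forest's Gini importance aggregates such decreases over every node and tree in which a feature is used, which is why informative features accumulate high scores.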
Model Inspection
How important is social media information in forecasting models?
Lasso: how many variables are left from the basic features and the social features?
[Figure: number of features in Lasso regression by feature type (base, non-textual, textual), split by whether the feature is retained in the Lasso (InLasso: FALSE or TRUE), with percentage labels.]
- Social media information is important in forecasting.
- Textual processing of social media information is helpful.

Least Angle Regression
$\hat{\beta}^{\text{lasso}} = \arg\min_{\beta} \left\{ \frac{1}{2} \sum_{i=1}^{N} \Big( y_i - \sum_{j=1}^{p} x_{ij} \beta_j \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j| \right\}$
1. Find the predictor $x_i$ most correlated with the response.
2. Move $\beta_i$ toward its least-squares coefficient until some $x_j$ becomes as correlated with the residuals as $x_i$.
3. Move $(\beta_i, \beta_j)$ toward their joint least-squares coefficients on the residuals until some $x_k$ becomes as correlated with the residuals as $x_i$ and $x_j$.
4. Repeat steps 2 and 3 until $k$ variables have been selected.
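To make the variable-selection effect of the Lasso objective concrete, here is a small pure-Python sketch. Note that it uses cyclic coordinate descent with soft-thresholding rather than Least Angle Regression, which is the method the slides name; the two target the same objective, but this is a swapped-in technique chosen for brevity. The data, the penalty `lam = 2.0`, and the feature-type labels are invented for illustration.

```python
from typing import List, Sequence


def soft_threshold(a: float, lam: float) -> float:
    """Shrink a toward zero by lam; values within [-lam, lam] become 0."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0


def lasso_coordinate_descent(
    X: Sequence[Sequence[float]], y: Sequence[float], lam: float, n_iter: int = 200
) -> List[float]:
    """Minimize 1/2 * sum_i (y_i - sum_j x_ij b_j)^2 + lam * sum_j |b_j|
    by cycling through the coordinates (no intercept term)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Invented toy design with three mutually orthogonal columns.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, 1, 1, 1, -1, -1, -1, -1]
X = [[float(a), float(b), float(c)] for a, b, c in zip(x0, x1, x2)]
y = [3.0 * a + 0.1 * c for a, c in zip(x0, x2)]  # strong x0 effect, weak x2 effect

beta = lasso_coordinate_descent(X, y, lam=2.0)
feature_types = ["base", "non-textual", "textual"]  # hypothetical labels
kept = [t for t, b in zip(feature_types, beta) if b != 0.0]
print(beta, kept)
```

Counting the nonzero coefficients per feature type, as in the last lines, mirrors the kind of summary reported in the Lasso figure: the penalty drives weak or redundant predictors exactly to zero, so the surviving set shows which feature groups the model relies on.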
More informationThe multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of
More informationComparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
More informationData Analysis of Trends in iphone 5 Sales on ebay
Data Analysis of Trends in iphone 5 Sales on ebay By Wenyu Zhang Mentor: Professor David Aldous Contents Pg 1. Introduction 3 2. Data and Analysis 4 2.1 Description of Data 4 2.2 Retrieval of Data 5 2.3
More informationCYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION
CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION MATIJA STEVANOVIC PhD Student JENS MYRUP PEDERSEN Associate Professor Department of Electronic Systems Aalborg University,
More informationProblem set on Cross Product
1 Calculate the vector product of a and b given that a= 2i + j + k and b = i j k (Ans 3 j - 3 k ) 2 Calculate the vector product of i - j and i + j (Ans ) 3 Find the unit vectors that are perpendicular
More informationIDENTIFICATION OF DEMAND FORECASTING MODEL CONSIDERING KEY FACTORS IN THE CONTEXT OF HEALTHCARE PRODUCTS
IDENTIFICATION OF DEMAND FORECASTING MODEL CONSIDERING KEY FACTORS IN THE CONTEXT OF HEALTHCARE PRODUCTS Sushanta Sengupta 1, Ruma Datta 2 1 Tata Consultancy Services Limited, Kolkata 2 Netaji Subhash
More informationMachine Learning Capacity and Performance Analysis and R
Machine Learning and R May 3, 11 30 25 15 10 5 25 15 10 5 30 25 15 10 5 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 0 2 4 6 8 101214161822 100 80 60 40 100 80 60 40 100 80 60 40 30 25 15 10 5 25 15 10
More informationPredictive Data modeling for health care: Comparative performance study of different prediction models
Predictive Data modeling for health care: Comparative performance study of different prediction models Shivanand Hiremath hiremat.nitie@gmail.com National Institute of Industrial Engineering (NITIE) Vihar
More informationEnhanced Vessel Traffic Management System Booking Slots Available and Vessels Booked per Day From 12-JAN-2016 To 30-JUN-2017
From -JAN- To -JUN- -JAN- VIRP Page Period Period Period -JAN- 8 -JAN- 8 9 -JAN- 8 8 -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- -JAN- 8-JAN- 9-JAN- -JAN- -JAN- -FEB- : days
More informationKnowledge Discovery and Data Mining. Bootstrap review. Bagging Important Concepts. Notes. Lecture 19 - Bagging. Tom Kelsey. Notes
Knowledge Discovery and Data Mining Lecture 19 - Bagging Tom Kelsey School of Computer Science University of St Andrews http://tom.host.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-19-B &
More information