Forecast Model for Box-office Revenue of Motion Pictures

Jae-Mook Lee, Tae-Hyung Pyo
December 7, 2009

1 Introduction

The main objective of this paper is to develop an econometric model to forecast the box-office revenue of motion pictures. Given the importance of demand for new products, marketing researchers have developed various demand forecasting models. However, these models forecast future demand based either on several months of initial sales data after a new product is introduced or on survey data on customer purchase intention. Unlike these models, we forecast future demand for new products by analyzing the historical sales patterns of similar products. Even though we apply our model to revenue forecasting for motion pictures, it can easily be applied to forecast future demand in several other industries, such as books, music albums, video, and pharmaceutical products.

2 Development of Forecast Model for Box-Office Revenue

Econometric models for forecasting box-office revenue before a film's release can be divided into two categories. The first type is the more traditional method, which estimates the total revenue of each film directly. This model can be written as follows:

$$Q_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_K X_{Ki} + \epsilon_i \qquad (1)$$

where $Q_i$ is the revenue of film $i$, $X_{Ki}$ is the value of the $K$th independent variable for film $i$, $\beta$ denotes the parameters to be estimated, and $\epsilon_i$ is the error term. Equation (1) is a typical regression model, and it often uses total sales or box-office revenue as the dependent variable $Q_i$. We first fit Equation (1) to data on previous films whose box-office revenue is known and estimate the parameters $\beta$. We then substitute the values of the independent variables $X$ for the film to be forecast into the fitted equation, which yields an estimate of that film's box-office revenue.

The second method forecasts the weekly box-office revenue/sales pattern of each film and sums the weekly forecasts from the first to the last week of the run to estimate total revenue. This first requires a step that estimates the box-office patterns from existing data on the weekly revenue/sales of each film and then summarizes those patterns in a small number of parameters.
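The direct approach of Equation (1) can be sketched in R as follows. This is only an illustration on simulated data: the revenue values, the coefficients used to generate them, and the attributes of the new film are made up, and only two of the paper's predictors (tv_fre and news_fre) are used.

```r
# Hypothetical data: 50 past films with two attributes (values are simulated).
set.seed(1)
films <- data.frame(
  tv_fre   = rpois(50, 20),   # number of TV advertisements
  news_fre = rpois(50, 30)    # number of newspaper advertisements
)
# Simulated revenue; the generating coefficients are assumptions for this sketch.
films$revenue <- 1000 + 40 * films$tv_fre + 25 * films$news_fre +
  rnorm(50, sd = 100)

# Estimate the parameters beta of Equation (1) by ordinary least squares.
fit <- lm(revenue ~ tv_fre + news_fre, data = films)

# Substitute the attributes X of a film that has not been released yet.
new_film <- data.frame(tv_fre = 25, news_fre = 35)
forecast <- predict(fit, newdata = new_film)
print(forecast)
```

The same two calls, `lm()` and `predict()`, carry over to any number of film attributes.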

Based on the patterns observed in weekly sales across films, this study suggests the following model:

$$\log Q_{it} = \gamma_{0i} + \gamma_{1i} t + \epsilon_{it} \qquad (2)$$

where $Q_{it}$ is the box-office revenue of film $i$ in period $t$ after release, $t$ is the data collection interval (here, the number of weeks elapsed since release), $\gamma_{0i}$ and $\gamma_{1i}$ are the parameters to be estimated, and $\epsilon_{it}$ is the error term. As used by previous marketing researchers, Equation (2) assumes that the weekly box-office revenue of a film follows an exponential decay function, i.e. it declines gradually after release. In Equation (2), $\gamma_{0i}$ summarizes the information on sales in the first week after release, while $\gamma_{1i}$ summarizes the rate at which the number of spectators decays after release.

To apply Equation (2), we need weekly box-office revenue data for each film from its release to the end of its run. Because $\gamma_{0i}$ and $\gamma_{1i}$ are estimated separately for each film, estimating this regression amounts to summarizing each film's weekly revenue/sales data in two parameters. First, estimate $\gamma_{0i}$ and $\gamma_{1i}$ for each film. Then fit a regression model that uses each of these parameters as the dependent variable and the film characteristics from Equation (1) as independent variables. In other words, only the dependent variable in Equation (1) is replaced, first by $\gamma_{0i}$ and then by $\gamma_{1i}$.
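As a minimal sketch of the first step, Equation (2) can be fitted to the weekly sales of a single film with `lm()`. The weekly figures below are simulated, and the "true" values of the two parameters are assumptions chosen only for illustration.

```r
set.seed(2)
weeks <- 0:9                 # weeks elapsed since release, starting at 0

# Hypothetical film: log sales start near 11 and fall by 0.5 per week.
gamma0_true <- 11
gamma1_true <- -0.5
sales <- exp(gamma0_true + gamma1_true * weeks +
               rnorm(length(weeks), sd = 0.1))

# Equation (2): regress log weekly sales on the week index.
one_film <- data.frame(Week = weeks, lnsales = log(sales))
fit <- lm(lnsales ~ Week, data = one_film)

# The intercept estimates gamma_0i and the slope estimates gamma_1i.
gamma_hat <- coef(fit)
names(gamma_hat) <- c("gamma0", "gamma1")
print(gamma_hat)
```

Repeating this fit for every film gives one pair of estimates per film, which then serve as dependent variables in the second step.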

After estimating this second regression, substitute the values of the independent variables $X$ for the film to be forecast into it to obtain estimates of $\gamma_{0i}$ and $\gamma_{1i}$ for that film. Substituting these two parameters into Equation (2) then gives a forecast of the whole trend of box-office revenue from the film's release to the end of its run. The total revenue of the film is obtained by summing the weekly box-office revenues over that period. We also applied Bayesian regression to these two parameters using the R2OpenBUGS package and compared the two methods in terms of MSE (mean squared error) and MAE (mean absolute error).

3 The Data

The data cover 266 films released nationwide in the year 2000 in Seoul, Korea. The total sales of each movie can be replaced by the total number of tickets sold, since the ticket price is the same for all movies. Advertising expenditure data were obtained from the movie advertising agent, Dave, in Seoul. Dave monitored four major media (TV, radio, newspaper, and magazine) and estimated the expenditure from the frequency and timing of each movie's advertisements. Other data were collected from the Korean Film Archive (http://www.koreafilm.or.kr).
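The forecasting step described in Section 2 can be illustrated with a short R sketch. The function name `forecast_weekly`, the parameter values, and the 12-week run length are all assumptions made for this example; the paper does not state how the length of a run is chosen. The sketch also plugs the predicted parameters straight into Equation (2) and ignores the error term.

```r
# Weekly forecasts implied by Equation (2) for weeks t = 0, 1, ..., n_weeks - 1.
forecast_weekly <- function(gamma0, gamma1, n_weeks) {
  t <- 0:(n_weeks - 1)
  exp(gamma0 + gamma1 * t)
}

# Hypothetical predicted parameters for a new film.
gamma0_hat <- 10
gamma1_hat <- -0.4

weekly <- forecast_weekly(gamma0_hat, gamma1_hat, n_weeks = 12)
total  <- sum(weekly)   # total revenue = sum of the weekly forecasts

# Because the weekly forecasts form a geometric series, the sum also has
# a closed form, which serves as a check on the calculation.
closed <- exp(gamma0_hat) * (1 - exp(gamma1_hat * 12)) / (1 - exp(gamma1_hat))
print(c(total = total, closed = closed))
```

With a negative slope the weekly forecasts shrink every week, so most of the total comes from the first few weeks after release.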

Table 1: Summary statistics of variables

Variables    Attributes of variables
lnsales      Natural log of the total number of admissions for each movie
tv_fre       Number of TV advertisements
news_fre     Number of newspaper advertisements
dir_mean     Average admissions of the director's previous movies
star_mean    Average admissions of the main actor's previous movies
ACAD         Genre: dummy variable for action and adventure movies
SF           Genre: dummy variable for science fiction movies
season       Dummy variable for the high-demand season for movies
revi         Review of the movie

4 Estimation and Comparison of Models

4.1 How to Forecast the Weekly Revenue of Movies

As described above, this approach forecasts movie revenue in two steps. The first step extracts sales patterns from the weekly data for each film by applying Equation (2) to each film's sales data. In other words, this study estimates 200 regression equations, because the estimation sample contains 200 films. The frequency distributions of the 200 estimated values of $\gamma_{0i}$ and $\gamma_{1i}$ are shown in Figure 1. Because $\gamma_{0i}$ summarizes the information on sales in the first week after release, it must be positive. All 266 estimated values of $\gamma_{0i}$ were positive. On the other hand, $\gamma_{1i}$ summarizes the slope of weekly sales revenue after the release of a film, so for the exponential decay assumed in Equation (2) to be meaningful, $\gamma_{1i}$ must be negative. All 266 estimated values of $\gamma_{1i}$ were negative. This supports the validity of the suggested model.

Figure 1: Frequency of intercepts ($\gamma_{0i}$) and slopes ($\gamma_{1i}$). [Two histograms: frequency against estimates of intercept, and frequency against estimates of slope.]

The next step models the relationships between the two parameters ($\gamma_{0i}$ and $\gamma_{1i}$) that summarize the weekly revenue information and the attribute variables of each film. The results of the regression with $\gamma_{0i}$ as the dependent variable are listed in Table 2, and the results of the regression with $\gamma_{1i}$ as the dependent variable are listed in Table 3.
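The second step can be sketched in R as two ordinary regressions, one per parameter. The data frame below is simulated: the three predictors reuse names from Table 1, but their values and the generating coefficients are assumptions for this illustration and are not the paper's estimates.

```r
set.seed(3)
n <- 200
# Hypothetical film attributes.
movie <- data.frame(
  tv_fre   = rpois(n, 20),
  news_fre = rpois(n, 30),
  revi     = runif(n, 1, 5)
)
# Simulated per-film intercepts (int) and slopes (coef) from Equation (2).
movie$int  <- 6.5 + 0.02 * movie$tv_fre + 0.04 * movie$news_fre +
  0.6 * movie$revi + rnorm(n, sd = 0.5)
movie$coef <- -5 + 0.01 * movie$tv_fre + 0.03 * movie$news_fre +
  0.4 * movie$revi + rnorm(n, sd = 0.5)

# Same right-hand side as Equation (1); only the dependent variable changes.
reg.int  <- lm(int  ~ tv_fre + news_fre + revi, data = movie)
reg.coef <- lm(coef ~ tv_fre + news_fre + revi, data = movie)

# Predicted parameters for a film that has not been released yet.
new_film   <- data.frame(tv_fre = 25, news_fre = 35, revi = 4)
gamma0_new <- predict(reg.int,  newdata = new_film)
gamma1_new <- predict(reg.coef, newdata = new_film)
print(c(gamma0 = unname(gamma0_new), gamma1 = unname(gamma1_new)))
```

Fitting with `data = movie` rather than `movie$...` terms in the formula is what allows `predict()` to accept a new film through `newdata`.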

First, Table 2 shows that the adjusted $R^2$ reaches 0.621 and that only 3 of the 8 variables (tv_fre, news_fre, and revi) were significant at p = 0.05. It is very interesting that newspaper advertisement had the most significant impact on revenue in the first week after the release of a film. The other two variables, tv_fre and revi, also had significant effects on first-week revenue.

Table 2: Relation between the intercept ($\gamma_{0i}$) and variables

Variables       OLS                     Bayesian
Intercept       6.639 (0.24)**          10.066 (0.090)**
tv_fre          0.018 (0.005)**         0.018 (0.005)**
news_fre        0.043 (0.005)**         0.043 (0.005)**
dir_mean        7.38e-07 (7.560e-07)    6.96e-07 (7.6e-07)
ACAD            -0.179 (0.213)          -0.181 (0.215)
SF              0.133 (0.363)           0.144 (0.374)
revi            0.650 (0.073)**         0.648 (0.072)**
season          0.350 (0.238)           0.343 (0.246)
star            2.75e-07 (6.60e-07)     2.73e-07 (6.70e-07)
Adjusted R^2    0.621
** significant at p = 0.05

Additionally, Table 2 shows the result of applying Bayesian regression analysis to the 8 independent variables. The estimates converged quickly. The Gelman-Rubin diagnostics and the intervals for the estimates are summarized in Figure 2 in the appendix. The result is almost identical to the OLS model, except that the Bayesian regression gave a larger value for the intercept. This difference is expected, because the OpenBUGS model in Appendix B centers each independent variable at its sample mean, so its intercept refers to a film with average attributes rather than to one with all attributes equal to zero. The significant coefficients and their estimates are not much different.

Likewise, Table 3 lists the results of the regression with $\gamma_{1i}$ as the dependent variable. The adjusted $R^2$ reached 0.367, and 4 variables (tv_fre, news_fre, SF, and revi) are significant at p = 0.05. Here, it is noteworthy that SF became significant in estimating weekly sales and is negatively related to them, which implies that admissions for SF movies drop more quickly than those for other genres. Moreover, TV and newspaper advertisements and positive reviews from professional film reviewers also helped keep a film running over a long period.

Table 3: Relation between the slope ($\gamma_{1i}$) and variables

Variables       OLS                     Bayesian
Intercept       -7.080 (0.328)**        -4.430 (0.125)**
tv_fre          0.013 (0.006)**         0.013 (0.006)**
news_fre        0.040 (0.007)**         0.040 (0.007)**
dir_mean        5.2e-07 (1.05e-06)      4.6e-07 (1.1e-06)
ACAD            -0.196 (0.297)          -0.200 (0.300)
SF              -1.266 (0.506)**        -1.251 (0.522)**
revi            0.453 (0.102)**         0.449 (0.101)**
season          0.646 (0.332)           0.636 (0.344)
star            1.8e-07 (9.2e-07)       1.8e-07 (9.4e-07)
Adjusted R^2    0.367
** significant at p = 0.05

Table 3 also shows the results of applying Bayesian regression analysis to the 8 independent variables. As shown in Figure 3 in the appendix, the model shows no sign of divergence. As with the estimates for first-week sales (the intercept), the parameter estimates are almost identical except for the intercepts. We next compare the accuracy of the two methods using MSE and MAE.

4.2 Comparison of Two Forecast Models in the Level of Precision

A formal assessment measure is required to determine which of the two models suggested here gives the more precise forecasts. The typical measures of model fit are AIC and BIC for classical regression and DIC for Bayesian regression. However, AIC and BIC cannot be obtained from the Bayesian regression, and DIC cannot be calculated from the classical regression. Therefore, the mean squared error (MSE) and the mean absolute error (MAE) were used to evaluate the forecast accuracy of each model:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \mathrm{Actual}_i - \mathrm{Predicted}_i \right| \qquad (3)$$

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{Actual}_i - \mathrm{Predicted}_i \right)^2 \qquad (4)$$

In the equations above, $N$ is the sample size, which is 266 here. The results of evaluating forecast accuracy are listed in Table 4.
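Equations (3) and (4) translate directly into R. The helper names `mae` and `mse` and the four-element vectors below are illustrative only and are not taken from the paper's data.

```r
# Mean absolute error, Equation (3).
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Mean squared error, Equation (4).
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Toy example: the errors are -1, 2, 0, -1.
actual    <- c(10, 12, 9, 11)
predicted <- c(11, 10, 9, 12)

mae(actual, predicted)  # (1 + 2 + 0 + 1) / 4 = 1
mse(actual, predicted)  # (1 + 4 + 0 + 1) / 4 = 1.5
```

Because MSE squares each error, the single error of 2 contributes four times as much as each error of 1, whereas in MAE it contributes only twice as much.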

Table 4: Comparison between the forecasting models

Parameters                    Method    OLS         Bayesian
Intercept ($\gamma_{0i}$)     MSE       2.048069    2.048113
Intercept ($\gamma_{0i}$)     MAE       1.144300    1.144390
Slope ($\gamma_{1i}$)         MSE       3.986199    3.986285
Slope ($\gamma_{1i}$)         MAE       1.604464    1.605046

As shown in this table, OLS is slightly better than the Bayesian regression in terms of both mean absolute error and mean squared error. However, the difference is negligible. In short, the two models yield essentially the same accuracy in forecasting the patterns of weekly box-office revenue. Furthermore, the parameter estimates are almost identical, with the exception of the intercepts, for both first-week sales and the decay of sales revenue over time.

5 Conclusions

We forecast the box-office revenue of a motion picture given its characteristics, such as genre, reviews by movie critics, the star power of the main actor or actress and of the director, advertising expenditure, and so on. Our results show that advertisements (TV and newspaper), as well as positive reviews, have a positive effect on both opening-week revenue and the life cycle of a movie. The analysis shows that SF movies are at a disadvantage relative to other genres, because their weekly sales decay more quickly.

We tried Bayesian regression in addition to the classical analysis. The two methods have equivalent accuracy in estimating total admissions in the first week and the sales patterns of movies over time. For first-week revenue we obtain fairly good estimates (adjusted $R^2$ = 0.621). Sales over time, however, are harder to predict (adjusted $R^2$ = 0.367). This might be because we did not account for the effect of competition after a movie opens. A movie's weekly sales would drop quickly if a movie of the same genre or a blockbuster opens while it is on screen. Introducing new measures of competition into the model should improve the accuracy of the forecast.

APPENDIX

A R Code and R2OpenBUGS code

setwd("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/analysis")

# Print the first k rows of a data frame.
view <- function(dat, k) {
  message <- paste("First", k, "rows")
  krows <- dat[1:k, ]
  cat(message, "\n", "\n")
  print(krows)
}

## data import
weeksales <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/weeksales.csv", sep = ",")
colnames(weeksales) <- c("movieid", 1:(ncol(weeksales) - 1))
#view(weeksales, 10)
ad <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/ad.csv", sep = ",", header = TRUE)
#view(ad, 10)
director <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/director.csv", sep = ",", header = TRUE)
#view(director, 10)
genre <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/genre.csv", sep = ",", header = TRUE)
#view(genre, 10)
review <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/review.csv", sep = ",", header = TRUE)
#view(review, 10)
season <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/season.csv", sep = ",", header = TRUE)
#view(season, 10)
star <- read.csv("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data/star.csv", sep = ",", header = TRUE)
#view(star, 10)

### Data Transformation ####
# data rearrangement: wide (one row per film) to long (one row per film-week)
dat.func <- function(x) {
  dat <- rep(0, ncol(x) * nrow(x) * 3)
  dim(dat) <- c(ncol(x), 3, nrow(x))
  comb <- NULL
  a <- c(0:(ncol(x) - 1))  ## change data interval
  for (j in 1:nrow(x)) {
    # The count of non-missing cells includes the movieid column, so the loop
    # runs one index past the last observed week; that NA cell is recorded
    # as sales = 1 (log sales = 0) by the branch below.
    for (i in 1:(sum(!is.na(x[j, ]) == TRUE))) {
      dat[i, 1, j] <- j
      dat[i, 2, j] <- a[i]
      if (is.na(x[j, i + 1])) {
        dat[i, 3, j] <- 1
      } else {
        dat[i, 3, j] <- x[j, i + 1]
      }
      comb <- rbind(comb, dat[i, , j])
    }
  }
  comb <- data.frame(comb)
  colnames(comb) <- c("movieid", "Week", "sales")  ## change for new variable names
  comb
}
rearranged <- dat.func(weeksales)

# log transformation
lnsales <- log(rearranged$sales)
newdata <- cbind(rearranged[, c(1, 2)], lnsales)
#view(newdata, 30)

### regression for each obs: one fit of Equation (2) per film
reg.out <- by(newdata, newdata$movieid, function(x) lm(lnsales ~ Week, data = x))
out <- sapply(reg.out, coef)
out <- matrix(out, ncol = nrow(out), byrow = TRUE)  # one row per film: intercept, slope
out <- cbind(weeksales$movieid, out)
colnames(out) <- c("movieid", "int", "coef")
estimate <- data.frame(out)
#view(estimate, 10)

# data combine
revi <- (review$rev1 + review$rev2 + review$rev12) / 3
movie <- data.frame(estimate$movieid, ad$tv_fre, ad$news_fre, director$dir_mean,
                    genre$acad, genre$sf, revi, season$dum_sea, star$star1_mean,
                    estimate$int, estimate$coef)
# Lower-case "acad" and "sf" so that the names match the formulas below and
# the OpenBUGS model in Appendix B.
colnames(movie) <- c("ID", "tv_fre", "news_fre", "dir_mean", "acad", "sf",
                     "revi", "season", "star", "int", "coef")
#view(movie, 10)

#### regression analysis
summary(reg.int <- lm(int ~ tv_fre + news_fre + dir_mean + acad + sf + revi + season + star, data = movie))
summary(reg.coef <- lm(coef ~ tv_fre + news_fre + dir_mean + acad + sf + revi + season + star, data = movie))

#### Run Bayesian Regression
setwd("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/analysis")
library(R2WinBUGS)
N <- nrow(movie)
v <- ncol(movie) - 3  # number of independent variables
# Sample means used to center the independent variables in the BUGS model
mtv <- mean(movie$tv_fre)
mnews <- mean(movie$news_fre)
mdir_mean <- mean(movie$dir_mean)
macad <- mean(movie$acad)
msf <- mean(movie$sf)
mrevi <- mean(movie$revi)
mseason <- mean(movie$season)
mstar <- mean(movie$star)
tv_fre <- movie$tv_fre
news_fre <- movie$news_fre
dir_mean <- movie$dir_mean
acad <- movie$acad
sf <- movie$sf
revi <- movie$revi
season <- movie$season
star <- movie$star
int <- movie$int
coef <- movie$coef

## Bayesian Reg on Intercept (R2OpenBugs)
intercept <- list("N", "v", "mtv", "mnews", "mdir_mean", "macad", "msf", "mrevi",
                  "mseason", "mstar", "tv_fre", "news_fre", "dir_mean", "acad",
                  "sf", "revi", "season", "star", "int")
inits <- function() {
  list(a = rnorm(1, 0, 100), b = rnorm(v, 0, 100), prec = rgamma(1, 0.001, 0.001))
}
parameters <- c("a", "b", "mean", "prec")
# delete program = "openbugs" to use WinBUGS instead
intercept.sim <- bugs(intercept, inits, parameters, "inter_r.txt", n.chains = 3,
                      n.iter = 1000, debug = TRUE, program = "openbugs")
print(intercept.sim, digits = 10)
plot(intercept.sim)

## Bayesian Reg on Coefficients (R2OpenBugs)
coefficient <- list("N", "v", "mtv", "mnews", "mdir_mean", "macad", "msf", "mrevi",
                    "mseason", "mstar", "tv_fre", "news_fre", "dir_mean", "acad",
                    "sf", "revi", "season", "star", "coef")
coefficient.sim <- bugs(coefficient, inits, parameters, "coef_r.txt", n.chains = 3,
                        n.iter = 1000, debug = TRUE, program = "openbugs")
print(coefficient.sim, digits = 8)
plot(coefficient.sim)

#### Model Comparison Using MSE
# Note: var() of the prediction errors divides by N - 1 and subtracts the mean
# error, so it matches Equation (4) only approximately.
# calculating MSE for Intercepts
pred.intercept.reg <- predict(reg.int)
pred.int <- cbind(movie$int, pred.intercept.reg, intercept.sim$mean$mean)
colnames(pred.int) <- c("orig", "reg", "sim")
Int.MSE.reg <- var(pred.int[, 1] - pred.int[, 2])
Int.MSE.sim <- var(pred.int[, 1] - pred.int[, 3])
compare.int <- c(Int.MSE.reg, Int.MSE.sim)
names(compare.int) <- c("MSE for Reg on Intercept", "MSE for Sim on Intercept")
compare.int

# calculating MSE for Coefficients
pred.coef.reg <- predict(reg.coef)
pred.coef <- cbind(movie$coef, pred.coef.reg, coefficient.sim$mean$mean)
colnames(pred.coef) <- c("orig", "reg", "sim")
coef.mse.reg <- var(pred.coef[, 1] - pred.coef[, 2])
coef.mse.sim <- var(pred.coef[, 1] - pred.coef[, 3])
compare.coef <- c(coef.mse.reg, coef.mse.sim)
names(compare.coef) <- c("MSE for Reg on Coefficient", "MSE for Sim on Coefficient")
compare.coef

# calculating MAE for Intercepts
# The original script loaded mae() from the MSBVAR package; a one-line
# definition following Equation (3) is used here so that the script does not
# depend on that package.
mae <- function(actual, predicted) mean(abs(actual - predicted))
Int.MAE.reg <- mae(pred.int[, 1], pred.int[, 2])
Int.MAE.sim <- mae(pred.int[, 1], pred.int[, 3])
MAE.compare.int <- c(Int.MAE.reg, Int.MAE.sim)
names(MAE.compare.int) <- c("MAE for Reg on Intercept", "MAE for Sim on Intercept")
MAE.compare.int

# calculating MAE for Coefficients
coef.mae.reg <- mae(pred.coef[, 1], pred.coef[, 2])
coef.mae.sim <- mae(pred.coef[, 1], pred.coef[, 3])
MAE.compare.coef <- c(coef.mae.reg, coef.mae.sim)
names(MAE.compare.coef) <- c("MAE for Reg on Coefficient", "MAE for Sim on Coefficient")
MAE.compare.coef
################### END OF CODE ###########################

B OpenBUGS Code for The Opening Week

model {
  for (i in 1:N) {
    int[i] ~ dnorm(mean[i], prec)
    mean[i] <- a + b[1]*(tv_fre[i]-mtv) + b[2]*(news_fre[i]-mnews)
                 + b[3]*(dir_mean[i]-mdir_mean) + b[4]*(acad[i]-macad)
                 + b[5]*(sf[i]-msf) + b[6]*(revi[i]-mrevi)
                 + b[7]*(season[i]-mseason) + b[8]*(star[i]-mstar)
  }
  for (i in 1:v) {
    b[i] ~ dnorm(0, 1.0E-6)
  }
  a ~ dnorm(0, 1.0E-6)
  prec ~ dgamma(0.001, 0.001)
}

C Graphs from R2OpenBUGS

Figure 2: R2OpenBUGS output for the intercept ($\gamma_{0i}$). Bugs model at "inter_r.txt", fit using OpenBUGS, 3 chains, each with 1000 iterations (first 500 discarded). [Panels: 80% interval for each chain and R-hat for a, b[1] to b[8], mean (array truncated for lack of space), and prec; medians and 80% intervals for a, b, mean, prec, and deviance.]

Figure 3: R2OpenBUGS output for the slopes ($\gamma_{1i}$). Bugs model at "coef_r.txt", fit using OpenBUGS, 3 chains, each with 1000 iterations (first 500 discarded). [Panels: 80% interval for each chain and R-hat for a, b[1] to b[8], mean (array truncated for lack of space), and prec; medians and 80% intervals for a, b, mean, prec, and deviance.]