Forecast Model for Box-office Revenue of Motion Pictures
Jae-Mook Lee, Tae-Hyung Pyo
December 7

1 Introduction

The main objective of this paper is to develop an econometric model to forecast the box-office revenue of motion pictures. Given the importance of demand for new products, marketing researchers have developed various demand forecasting models. However, these models forecast future demand based on either several months of initial sales data after the product is introduced or survey data on customer purchase intentions. Unlike these models, we forecast future demand for new products by analyzing the historical sales patterns of similar products. Even though we apply our model to revenue forecasting for motion pictures, it can easily be applied to forecast demand in several other industries, such as books, music albums, videos, and pharmaceutical products.
2 Development of Forecast Model for Box-Office Revenue

Econometric models for forecasting box-office revenue before release fall into two categories. The first is the more traditional approach of estimating the total revenue of each film directly. It can be written as

$$Q_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_K X_{Ki} + \epsilon_i \tag{1}$$

where $Q_i$ is the revenue of film $i$, $X_{Ki}$ is the value of the $K$th independent variable for film $i$, the $\beta$ are the parameters to be estimated, and $\epsilon_i$ is the error term. Equation (1) is a typical regression model, and it usually takes total box-office sales or revenue as the dependent variable $Q_i$. We fit Equation (1) to data on previously released films whose box-office revenue is known and estimate $\beta$. We then substitute the values of the independent variables $X$ for the film to be forecast, which yields an estimate of its box-office revenue.

The second method forecasts the weekly box-office revenue (sales) pattern of each film and sums the weekly forecasts from the first to the last week of the run to obtain total revenue. This requires a first step that extracts box-office patterns from existing weekly revenue data for each film and summarizes each pattern in a small number of parameters.
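As a rough illustration of the first approach, the following R sketch fits Equation (1) by ordinary least squares to simulated data and then predicts revenue for a hypothetical unreleased film. The variable names (tv_fre, news_fre, revi), the coefficients used to simulate the data, and the use of log revenue as the dependent variable are all assumptions made for this example; they are not the authors' data or results.

```r
# Illustrative sketch only: simulated data with made-up coefficients.
set.seed(1)
n <- 200
films <- data.frame(
  tv_fre   = rpois(n, 10),    # hypothetical number of TV advertisements
  news_fre = rpois(n, 15),    # hypothetical number of newspaper advertisements
  revi     = runif(n, 1, 5)   # hypothetical review score
)

# Simulate log revenue from a linear model of the form of Equation (1)
films$lnsales <- 6 + 0.02 * films$tv_fre + 0.04 * films$news_fre +
  0.6 * films$revi + rnorm(n, sd = 0.5)

# Estimate the beta parameters by ordinary least squares
fit <- lm(lnsales ~ tv_fre + news_fre + revi, data = films)

# Substitute the characteristics of a film that has not been released yet
new_film <- data.frame(tv_fre = 12, news_fre = 20, revi = 4)
pred_ln  <- predict(fit, newdata = new_film)

# Naive back-transformation from the log scale to the revenue scale
pred_rev <- exp(pred_ln)
print(coef(fit))
print(pred_rev)
```

Exponentiating the predicted log revenue is a naive back-transformation; it is used here only to keep the sketch short.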
Based on the weekly sales patterns observed across films, this study proposes the following model:

$$\log Q_{it} = \gamma_{0i} + \gamma_{1i} t + \epsilon_{it} \tag{2}$$

where $Q_{it}$ is the box-office revenue of film $i$ in period $t$ after release, $t$ is the data collection interval (here, the number of weeks elapsed since release), $\gamma_{0i}$ and $\gamma_{1i}$ are the parameters to be estimated, and $\epsilon_{it}$ is the error term. As used by previous marketing researchers, Equation (2) assumes that the weekly box-office revenue of a film follows an exponential decay function, that is, it declines gradually after release. Given the form of Equation (2), $\gamma_{0i}$ summarizes the information on sales in the first week after release, while $\gamma_{1i}$ summarizes the decay rate of audience numbers after release.

To apply Equation (2), we need weekly box-office revenue data from release to the end of each film's run. Moreover, $\gamma$ is estimated separately for each film, so estimating this regression amounts to summarizing each film's weekly revenue data in two parameters, $\gamma_{0i}$ and $\gamma_{1i}$. First, estimate $\gamma_{0i}$ and $\gamma_{1i}$ for each film. Then fit regression models that use each of these parameters as the dependent variable and the film characteristics of Equation (1) as independent variables. In other words, replace only the dependent variable in Equation (1) with $\gamma_{0i}$ and $\gamma_{1i}$ in turn.
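The two estimation steps described above can be sketched in R as follows. This is a minimal illustration on simulated data, assuming a single hypothetical film characteristic (revi) and made-up parameter values; it is not the authors' code, which appears in the appendix.

```r
# Illustrative sketch only: simulated films with made-up parameters.
set.seed(2)
n_films <- 50
revi    <- runif(n_films, 1, 5)   # hypothetical review score per film

# "True" film-level parameters used to simulate the data
true_g0 <- 8 + 0.5 * revi + rnorm(n_films, sd = 0.3)      # log first-week sales
true_g1 <- -0.8 + 0.1 * revi + rnorm(n_films, sd = 0.05)  # weekly decay rate

# Long-format weekly data: weeks 0 to 7 after release for each film
weekly <- expand.grid(week = 0:7, film = seq_len(n_films))
weekly$lnsales <- true_g0[weekly$film] + true_g1[weekly$film] * weekly$week +
  rnorm(nrow(weekly), sd = 0.1)

# Step 1: one regression of the form of Equation (2) per film
fits <- lapply(split(weekly, weekly$film),
               function(d) coef(lm(lnsales ~ week, data = d)))
est <- data.frame(do.call(rbind, fits), revi = revi)
names(est) <- c("g0", "g1", "revi")

# Step 2: regress each estimated parameter on the film characteristics
reg_g0 <- lm(g0 ~ revi, data = est)
reg_g1 <- lm(g1 ~ revi, data = est)
print(coef(reg_g0))
print(coef(reg_g1))
```

With the week index starting at 0, the fitted intercept g0 is the log of first-week sales and exp(g1) is the share of sales retained from one week to the next.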
After estimating the second regression equation, substitute the values of the independent variables $X$ for the film to be forecast, which gives estimates of $\gamma_{0i}$ and $\gamma_{1i}$ for that film. Substituting these two parameters into Equation (2) then gives a forecast of the whole trend of box-office revenue from the film's release to the end of its run. Total revenue of the film is obtained by summing the weekly box-office revenues over that period. We also applied Bayesian regression to these two parameters using the R2OpenBUGS package and compared the two methods in terms of MSE (mean squared error) and MAE (mean absolute error).

3 The Data

The data under analysis cover 266 films released nationwide in the year 2000 in Seoul, Korea. Because ticket prices are the same for all movies, the total sales of each movie can be measured by the total number of tickets sold. Advertising expenditure data were obtained from Dave, a movie advertising agency in Seoul. Dave monitored four major media (TV, radio, newspaper, and magazine) and estimated expenditure from the frequency and timing of each movie's advertisements. Other data were collected from the Korean Film Archive.
Table 1: Summary statistics of variables

Variables    Attributes of variables
lnsales      Natural log of the total number of admissions for each movie
tv_fre       Number of TV advertisements
news_fre     Number of newspaper advertisements
dir_mean     Average admissions of the director's previous movies
star_mean    Average admissions of the main actor's previous movies
ACAD         Genre: dummy variable for action and adventure movies
SF           Genre: dummy variable for science fiction movies
season       Dummy variable for the high-demand season for movies
revi         Review of the movie

4 Estimation and Comparison of Models

4.1 How to Forecast the Weekly Revenue of Movies

As described above, this approach forecasts the revenue of movies in two steps. The first step is to extract sales patterns from the weekly data for each film. Here, Equation (2) is applied to the sales data for each film. In other words, this study estimates 200 regression equations, because the estimation sample contains 200 films. The frequency distributions of the 200 values of $\gamma_{0i}$ and $\gamma_{1i}$ obtained in this step are shown in Figure 1. Given the form of Equation (2), $\gamma_{0i}$ must be positive, because it summarizes the information on sales in the first week after release. All 266
estimated values of $\gamma_{0i}$ were positive. On the other hand, $\gamma_{1i}$ summarizes the slope of weekly sales revenue after the release of a film. For the exponential decay assumed in Equation (2) to be meaningful, the value of $\gamma_{1i}$ must be negative. All 266 estimated values of $\gamma_{1i}$ were negative. This suggests that the proposed model is reasonably valid.

Figure 1: Frequency of intercepts ($\gamma_{0i}$) and slopes ($\gamma_{1i}$). Two panels: frequency against estimates of intercept, and frequency against estimates of slope.

The next step models the relationship between the two parameters ($\gamma_{0i}$ and $\gamma_{1i}$), which summarize the weekly revenue information, and the attribute variables of each film. The results of the regression with $\gamma_{0i}$ as the dependent variable are listed in Table 2, and the results of the regression with $\gamma_{1i}$ as the dependent variable are listed in Table 3.
First, Table 2 shows that the adjusted $R^2$ reaches 0.621 and that only 3 of the 8 variables (tv_fre, news_fre, and revi) were significant at p = 0.05. It is very interesting that newspaper advertising had the most significant impact on revenue in the first week after release. The other two variables, tv_fre and revi, also had significant effects on first-week revenue.

Table 2: Relation between the intercept ($\gamma_{0i}$) and variables

Variables      OLS                      Bayesian
Intercept      6.639 (0.24)**           (0.090)**
tv_fre         0.018 (0.005)**          0.018 (0.005)**
news_fre       0.043 (0.005)**          0.043 (0.005)**
dir_mean       7.38e-07 (7.560e-07)     6.96e-07 (7.6e-07)
ACAD           (0.213)                  (0.215)
SF             0.133 (0.363)            0.144 (0.374)
revi           0.650 (0.073)**          0.648 (0.072)**
season         0.350 (0.238)            0.343 (0.246)
star           (6.60e-07)               2.73e-07 (6.70e-07)
Adjusted R^2
**significant at p = 0.05

Additionally, Table 2 shows the result of applying Bayesian regression to the 8 independent variables. The estimates converged quickly. The Gelman-Rubin diagnostics and the intervals for the estimates are summarized in Figure 2 of the appendix. The result is almost identical to the OLS model, except that the Bayesian regression gave a larger value for the intercept. The significant coefficients and their estimates are not much different.

Likewise, Table 3 lists the results of the regression with $\gamma_{1i}$ as the dependent variable. The adjusted $R^2$ reached 0.367, and 4 variables (tv_fre, news_fre, SF, and revi) are significant at p = 0.05. It is noteworthy that SF became significant in explaining the slope of weekly sales and is negatively related to it, which implies that admissions for SF movies drop more quickly than for other genres. Moreover, TV and newspaper advertising and positive reviews from professional film reviewers also played an effective role in keeping a film running over a long period.

Table 3: Relation between the slope ($\gamma_{1i}$) and variables

Variables      OLS                      Bayesian
Intercept      (0.328)**                (0.125)**
tv_fre         0.013 (0.006)**          0.013 (0.006)**
news_fre       0.040 (0.007)**          0.040 (0.007)**
dir_mean       5.2e-07 (1.05e-06)       4.6e-07 (1.1e-06)
ACAD           (0.297)                  (0.300)
SF             (0.506)**                (0.522)**
revi           0.453 (0.102)**          0.449 (0.101)**
season         0.646 (0.332)            0.636 (0.344)
star           1.8e-07 (9.2e-07)        1.8e-07 (9.4e-07)
Adjusted R^2
**significant at p = 0.05

Table 3 also shows the results of applying Bayesian regression analysis to the 8
independent variables. As shown in Figure 3 of the appendix, the model shows no sign of divergence. As with the estimates for first-week sales (the intercept), the parameter estimates are almost identical except for the intercepts. We next compare the accuracy of the two methods using MSE and MAE.

4.2 Comparison of Two Forecast Models in the Level of Precision

A formal assessment measure is required to determine which of the two models is superior in forecast precision. The typical measures of model fit are AIC and BIC for classical regression and DIC for Bayesian regression. However, AIC and BIC cannot be obtained from Bayesian regression, and DIC cannot be calculated from classical regression. Therefore mean squared error (MSE) and mean absolute error (MAE) were used to evaluate the forecast accuracy of each model:

$$\text{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \text{Actual}_i - \text{Predicted}_i \right| \tag{3}$$

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( \text{Actual}_i - \text{Predicted}_i \right)^2 \tag{4}$$

In the equations above, $N$ is the sample size, which is 266 here. The results of evaluating forecast accuracy are listed in Table 4.
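For concreteness, Equations (3) and (4) can be computed in R as in the short sketch below. The function names mae and mse and the four actual and predicted values are made up for illustration and are not taken from the study.

```r
# Mean absolute error, Equation (3)
mae <- function(actual, predicted) mean(abs(actual - predicted))

# Mean squared error, Equation (4)
mse <- function(actual, predicted) mean((actual - predicted)^2)

# Made-up example values (not from the study)
actual    <- c(7.1, 6.4, 8.0, 5.9)
predicted <- c(6.8, 6.9, 7.6, 6.0)

mae_val <- mae(actual, predicted)  # (0.3 + 0.5 + 0.4 + 0.1) / 4 = 0.325
mse_val <- mse(actual, predicted)  # (0.09 + 0.25 + 0.16 + 0.01) / 4 = 0.1275
print(c(MAE = mae_val, MSE = mse_val))
```

Both measures are computed on the same pairs of actual and predicted values, so they can be compared across the OLS and Bayesian fits without reference to either model's likelihood.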
Table 4: Comparison between the forecasting models

Parameters                   Method    OLS    Bayesian
Intercept ($\gamma_{0i}$)    MSE
Intercept ($\gamma_{0i}$)    MAE
Slope ($\gamma_{1i}$)        MSE
Slope ($\gamma_{1i}$)        MAE

As shown in this table, OLS is superior to the Bayesian regression in terms of both mean absolute error and mean squared error. However, the difference is negligible. In short, for forecasting the patterns of weekly box-office revenue, the two models yield essentially the same accuracy. Furthermore, the parameter estimates are almost identical, with the exception of the intercepts, for both first-week sales and the decay of sales revenue over time.

5 Conclusions

We forecast the box-office revenue of a motion picture given its characteristics, such as genre, reviews by movie critics, the star power of the main actor or actress and of the director, advertising expenditure, and so on. Our results show that advertising (TV and newspaper), as well as positive reviews, has a positive effect on both opening-week revenue and the life cycle of a movie. The analysis shows that SF movies are at a disadvantage relative to other genres because their weekly sales decay more quickly. We tried Bayesian regression in addition to classical analysis. The two
methods have equivalent accuracy in estimating total admissions in the first week and the sales pattern of a movie over time. For the forecast of first-week revenue we obtain fairly good estimates (adjusted $R^2$ = 0.621). The forecast of sales over time, however, is less predictable (adjusted $R^2$ = 0.367). This may be because we did not account for the effect of competition after a movie opens. A movie's weekly sales would drop quickly if a movie of the same genre or a blockbuster opens while it is on screen. Introducing new measures of competition into the model should improve the accuracy of the forecast.
APPENDIX

A R Code and R2OpenBUGS code

setwd("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/analysis")

# Print the first k rows of a data frame
view <- function(dat, k) {
  msg <- paste("First", k, "rows")
  krows <- dat[1:k, ]
  cat(msg, "\n", "\n")
  print(krows)
}

## data import
data.dir <- "C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/data"
weeksales <- read.csv(file.path(data.dir, "weeksales.csv"), sep = ",")
colnames(weeksales) <- c("movieid", 1:(ncol(weeksales) - 1))
#view(weeksales, 10)
ad <- read.csv(file.path(data.dir, "ad.csv"), sep = ",", header = TRUE)
#view(ad, 10)
director <- read.csv(file.path(data.dir, "director.csv"), sep = ",", header = TRUE)
#view(director, 10)
genre <- read.csv(file.path(data.dir, "genre.csv"), sep = ",", header = TRUE)
#view(genre, 10)
review <- read.csv(file.path(data.dir, "review.csv"), sep = ",", header = TRUE)
#view(review, 10)
season <- read.csv(file.path(data.dir, "season.csv"), sep = ",", header = TRUE)
#view(season, 10)
star <- read.csv(file.path(data.dir, "star.csv"), sep = ",", header = TRUE)
#view(star, 10)

### Data Transformation ####
# data rearrangement: wide weekly sales to long format (movieid, Week, sales)
dat.func <- function(x) {
  dat <- array(0, dim = c(ncol(x), 3, nrow(x)))
  comb <- NULL
  a <- 0:(ncol(x) - 1)  ## change data interval
  for (j in 1:nrow(x)) {
    # number of non-missing cells in row j, including the movieid column
    for (i in 1:sum(!is.na(x[j, ]))) {
      dat[i, 1, j] <- j
      dat[i, 2, j] <- a[i]
      # a missing or out-of-range cell is coded as 1, so that log(sales) = 0
      if (i + 1 > ncol(x) || is.na(x[j, i + 1])) {
        dat[i, 3, j] <- 1
      } else {
        dat[i, 3, j] <- x[j, i + 1]
      }
      comb <- rbind(comb, dat[i, , j])
    }
  }
  comb <- data.frame(comb)
  colnames(comb) <- c("movieid", "Week", "sales")  ## change for new variable names
  comb
}
rearranged <- dat.func(weeksales)

# log transformation
lnsales <- log(rearranged$sales)
newdata <- cbind(rearranged[, c(1, 2)], lnsales)
#view(newdata, 30)

### regression for each obs
reg.out <- by(newdata, newdata$movieid, function(x) lm(lnsales ~ Week, data = x))
out <- sapply(reg.out, coef)
out <- t(out)  # one row per film: intercept, slope
out <- cbind(weeksales$movieid, out)
colnames(out) <- c("movieid", "int", "coef")
estimate <- data.frame(out)
#view(estimate, 10)

# data combine
revi <- (review$rev1 + review$rev2 + review$rev12) / 3
movie <- data.frame(estimate$movieid, ad$tv_fre, ad$news_fre, director$dir_mean,
                    genre$acad, genre$sf, revi, season$dum_sea, star$star1_mean,
                    estimate$int, estimate$coef)
colnames(movie) <- c("ID", "tv_fre", "news_fre", "dir_mean", "ACAD", "SF",
                     "revi", "season", "star", "int", "coef")
#view(movie, 10)

#### regression analysis
# column names are case sensitive: the genre dummies are ACAD and SF
summary(reg.int <- lm(int ~ tv_fre + news_fre + dir_mean + ACAD + SF +
                        revi + season + star, data = movie))
summary(reg.coef <- lm(coef ~ tv_fre + news_fre + dir_mean + ACAD + SF +
                         revi + season + star, data = movie))

#### Run Bayesian Regression
setwd("C:/Users/THPyo/Documents/1Class/Compt in Stat/Project/analysis")
library(R2WinBUGS)
N <- nrow(movie)
v <- ncol(movie) - 3

# covariate means, used to center the covariates in the BUGS model
mtv <- mean(movie$tv_fre)
mnews <- mean(movie$news_fre)
mdir_mean <- mean(movie$dir_mean)
macad <- mean(movie$ACAD)
msf <- mean(movie$SF)
mrevi <- mean(movie$revi)
mseason <- mean(movie$season)
mstar <- mean(movie$star)

# data vectors under the names used in the BUGS model
tv_fre <- movie$tv_fre
news_fre <- movie$news_fre
dir_mean <- movie$dir_mean
acad <- movie$ACAD
sf <- movie$SF
revi <- movie$revi
season <- movie$season
star <- movie$star
int <- movie$int
coef <- movie$coef

## Bayesian Reg on Intercept (R2OpenBugs)
intercept <- list("N", "v", "mtv", "mnews", "mdir_mean", "macad", "msf",
                  "mrevi", "mseason", "mstar", "tv_fre", "news_fre", "dir_mean",
                  "acad", "sf", "revi", "season", "star", "int")
# The last rgamma() argument is missing in the source; 0.001 is an assumption
# chosen to match the dgamma(0.001, 0.001) prior in the model file.
inits <- function() {
  list(a = rnorm(1, 0, 100), b = rnorm(v, 0, 100), prec = rgamma(1, 0.001, 0.001))
}
parameters <- c("a", "b", "mean", "prec")
# delete program = "openbugs" to use WinBUGS instead
intercept.sim <- bugs(intercept, inits, parameters, "inter_r.txt",
                      n.chains = 3, n.iter = 1000, debug = TRUE,
                      program = "openbugs")
print(intercept.sim, digits = 10)
plot(intercept.sim)

## Bayesian Reg on Coefficients (R2OpenBugs)
coefficient <- list("N", "v", "mtv", "mnews", "mdir_mean", "macad", "msf",
                    "mrevi", "mseason", "mstar", "tv_fre", "news_fre", "dir_mean",
                    "acad", "sf", "revi", "season", "star", "coef")
coefficient.sim <- bugs(coefficient, inits, parameters, "coef_r.txt",
                        n.chains = 3, n.iter = 1000, debug = TRUE,
                        program = "openbugs")
print(coefficient.sim, digits = 8)
plot(coefficient.sim)

#### Model Comparison Using MSE
# calculating MSE for Intercepts
# (mean of squared differences, as in Equation (4), in place of var())
pred.intercept.reg <- predict(reg.int)
pred.int <- cbind(movie$int, pred.intercept.reg, intercept.sim$mean$mean)
colnames(pred.int) <- c("orig", "reg", "sim")
Int.MSE.reg <- mean((pred.int[, 1] - pred.int[, 2])^2)
Int.MSE.sim <- mean((pred.int[, 1] - pred.int[, 3])^2)
compare.int <- c(Int.MSE.reg, Int.MSE.sim)
names(compare.int) <- c("MSE for Reg on Intercept", "MSE for Sim on Intercept")
compare.int

# calculating MSE for Coefficients
pred.coef.reg <- predict(reg.coef)
pred.coef <- cbind(movie$coef, pred.coef.reg, coefficient.sim$mean$mean)
colnames(pred.coef) <- c("orig", "reg", "sim")
coef.mse.reg <- mean((pred.coef[, 1] - pred.coef[, 2])^2)
coef.mse.sim <- mean((pred.coef[, 1] - pred.coef[, 3])^2)
compare.coef <- c(coef.mse.reg, coef.mse.sim)
names(compare.coef) <- c("MSE for Reg on Coefficient", "MSE for Sim on Coefficient")
compare.coef

# calculating MAE for Intercepts
library(MSBVAR)
Int.MAE.reg <- mae(pred.int[, 1], pred.int[, 2])
Int.MAE.sim <- mae(pred.int[, 1], pred.int[, 3])
MAE.compare.int <- c(Int.MAE.reg, Int.MAE.sim)
names(MAE.compare.int) <- c("MAE for Reg on Intercept", "MAE for Sim on Intercept")
MAE.compare.int
# calculating MAE for Coefficients
coef.mae.reg <- mae(pred.coef[, 1], pred.coef[, 2])
coef.mae.sim <- mae(pred.coef[, 1], pred.coef[, 3])
MAE.compare.coef <- c(coef.mae.reg, coef.mae.sim)
names(MAE.compare.coef) <- c("MAE for Reg on Coefficient", "MAE for Sim on Coefficient")
MAE.compare.coef
################### END OF CODE ###########################

B OpenBUGS Code for The Opening Week

model {
  for (i in 1:N) {
    int[i] ~ dnorm(mean[i], prec)
    mean[i] <- a + b[1]*(tv_fre[i]-mtv) + b[2]*(news_fre[i]-mnews)
                 + b[3]*(dir_mean[i]-mdir_mean) + b[4]*(acad[i]-macad)
                 + b[5]*(sf[i]-msf) + b[6]*(revi[i]-mrevi)
                 + b[7]*(season[i]-mseason) + b[8]*(star[i]-mstar)
  }
  for (i in 1:v) {
    b[i] ~ dnorm(0, 1.0E-6)
  }
  a ~ dnorm(0, 1.0E-6)
  prec ~ dgamma(0.001, 0.001)
}

C Graphs from R2OpenBUGS
Figure 2: R2OpenBUGS output for the intercept ($\gamma_{0i}$). Bugs model at "inter_r.txt", fit using OpenBUGS, 3 chains, each with 1000 iterations (first 500 discarded). Panels: 80% interval for each chain and R-hat for a, b[1] to b[8], mean (array truncated for lack of space), and prec; medians and 80% intervals for a, b, mean, prec, and deviance.
Figure 3: R2OpenBUGS output for the slopes ($\gamma_{1i}$). Bugs model at "coef_r.txt", fit using OpenBUGS, 3 chains, each with 1000 iterations (first 500 discarded). Panels: 80% interval for each chain and R-hat for a, b[1] to b[8], mean (array truncated for lack of space), and prec; medians and 80% intervals for a, b, mean, prec, and deviance.
More informationR2MLwiN Using the multilevel modelling software package MLwiN from R
Using the multilevel modelling software package MLwiN from R Richard Parker Zhengzheng Zhang Chris Charlton George Leckie Bill Browne Centre for Multilevel Modelling (CMM) University of Bristol Using the
More informationDemand Forecasting LEARNING OBJECTIVES IEEM 517. 1. Understand commonly used forecasting techniques. 2. Learn to evaluate forecasts
IEEM 57 Demand Forecasting LEARNING OBJECTIVES. Understand commonly used forecasting techniques. Learn to evaluate forecasts 3. Learn to choose appropriate forecasting techniques CONTENTS Motivation Forecast
More informationTime Series Analysis with R - Part I. Walter Zucchini, Oleg Nenadić
Time Series Analysis with R - Part I Walter Zucchini, Oleg Nenadić Contents 1 Getting started 2 1.1 Downloading and Installing R.................... 2 1.2 Data Preparation and Import in R.................
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationExamples. David Ruppert. April 25, 2009. Cornell University. Statistics for Financial Engineering: Some R. Examples. David Ruppert.
Cornell University April 25, 2009 Outline 1 2 3 4 A little about myself BA and MA in mathematics PhD in statistics in 1977 taught in the statistics department at North Carolina for 10 years have been in
More informationChapter 13 Introduction to Linear Regression and Correlation Analysis
Chapter 3 Student Lecture Notes 3- Chapter 3 Introduction to Linear Regression and Correlation Analsis Fall 2006 Fundamentals of Business Statistics Chapter Goals To understand the methods for displaing
More informationUsing simulation to calculate the NPV of a project
Using simulation to calculate the NPV of a project Marius Holtan Onward Inc. 5/31/2002 Monte Carlo simulation is fast becoming the technology of choice for evaluating and analyzing assets, be it pure financial
More informationIAPRI Quantitative Analysis Capacity Building Series. Multiple regression analysis & interpreting results
IAPRI Quantitative Analysis Capacity Building Series Multiple regression analysis & interpreting results How important is R-squared? R-squared Published in Agricultural Economics 0.45 Best article of the
More information2. Linear regression with multiple regressors