Bayesian Forecasting of an Inhomogeneous Poisson Process with Applications to Call Center Data
Jonathan Weinberg, Lawrence D. Brown, and Jonathan R. Stroud

June 26, 2006

Abstract

A call center is a centralized hub where customer and other telephone calls are handled by an organization. In today's economy, call centers have become the primary point of contact between customers and businesses. Accurate prediction of the call arrival rate is therefore indispensable for call center practitioners to staff their call centers efficiently and cost-effectively. This article proposes a multiplicative model for modeling and forecasting within-day arrival rates to a US commercial bank's call center. Markov chain Monte Carlo sampling methods are used to estimate both latent states and model parameters. One-day-ahead density forecasts for the rates and counts are provided, and the calibration of these predictive distributions is evaluated through probability integral transforms. Furthermore, we provide one-day-ahead forecast comparisons with classical statistical models. Our predictions show significant improvements of up to 25% over these standards. A sequential Monte Carlo algorithm is also proposed for sequential estimation and forecasting of the model parameters and rates.

Keywords: Autoregressive models; Bayesian forecasting; call centers; cubic smoothing spline; inhomogeneous Poisson process; Markov chain Monte Carlo; multiplicative models; sequential Monte Carlo; state space models.

Jonathan Weinberg is a Doctoral Student of Statistics, Lawrence D. Brown is the Miers Busch Professor of Statistics, and Jonathan R. Stroud is Assistant Professor of Statistics, The Wharton School, University of Pennsylvania, 3730 Walnut Street, 400 Huntsman Hall, Philadelphia, PA 19104, [email protected], [email protected], [email protected]
1 Introduction

A call center is a centralized hub that exists solely for the purpose of making or receiving calls to or from customers or prospective customers. In today's economy, call centers have not only become the primary point of contact between customers and businesses but also a major investment for many organizations. According to some recent industry estimates, there are approximately 2.86 million operator positions in over 50,000 call centers in the US, with some locations employing over 1,000 agents. Due to the magnitude of these operations, call center supervisors need to staff their organization efficiently in order to provide a satisfactory level of service at reasonable cost (see Gans et al. 2003). Proper management of such a center requires estimation of several operational primitives which, combined with queueing theory considerations, determine appropriate staffing levels. Accurate prediction of the level of customer arrivals is the most difficult of the primitives to assess.

In the past 15 years, major advances have been made in modeling and predicting arrivals to a telephone call center. Early models describing the arrival process include ARIMA processes and transfer functions (see Bianchi et al. 1993 and Andrews and Cunningham 1995). Recent empirical studies have revealed stylized dynamics of the arrival process. Jongbloed and Koole (2001) remark that the process follows an inhomogeneous Poisson process where the arrival rate is stochastic. Furthermore, Avramidis et al. (2004) remark that the arrival rate varies considerably throughout the day and, within their model, call volumes within short periods exhibit strong serial autocorrelation. Finally, Brown et al. (2005) observe persistence in the day-to-day dynamics of the arrival rate.

In this paper, we extend the work of Avramidis and co-authors (2004) and Brown et al.
(2005), and consider an inhomogeneous Poisson model where the arrival rate incorporates both strong within-day patterns and day-to-day dynamics in order to forecast future arrival rates. The temporal structure of the arrival rate therefore has a two-way character: there is day-to-day variation and also intraday variation. We model these two types of variation separately before combining them in a multiplicative model. Furthermore, we provide a fast and efficient Bayesian Markov chain Monte Carlo (MCMC) estimation algorithm. Recent MCMC methods proposed for inhomogeneous Poisson processes include Shephard and Pitt (1997) and Gamerman (1998), who consider the class of exponential families and generalized linear models, respectively. Soyer and Tarimcilar (2004) provide a Bayesian analysis of an inhomogeneous Cox process to model call center data but focus on the impact of advertising campaigns on arrival rates. A rather different approach is considered here, which relies on taking a slightly modified square root of the binned point-process counts and then treating these via variations of multiplicative Gaussian time series models.

The outline of this paper is as follows. In Section 2, we describe the call center analyzed in the
study and the data provided to us. In Section 3, we propose a model for predicting the call arrival rate, which is essential in predicting the workload and consequently the staffing of the call center. In addition, we discuss the choice of priors and present the Markov chain Monte Carlo algorithm used to estimate the parameters and latent states in the model. Section 4 gives results from fitting the model to the call center data. A comparative study of competing models based on one-day-ahead forecasting performance is also presented. A sequential Monte Carlo algorithm, proposed to sequentially estimate the current rate, is presented in Section 4.3. In Section 5, we discuss the advantages of the method developed in this paper and outline possible extensions to our work.

2 The Data

In this paper we consider data provided by a large North American commercial bank. A detailed summary of each call handled by the call center is provided from March to October 2003. The data was extracted using DATA-MOCCA (see Trofimov et al. 2005) and can be found on the authors' website at www-stat.wharton.upenn.edu/~weinber2/data.txt.

The path followed by a call through the call center is as follows. A customer places a call to the bank's call center, dialing a specific phone number according to the service he desires. Once the call reaches the call center, the customer is greeted by a voice response unit (VRU), a computer-automated system which offers various types of information such as the bank's opening hours and the user's account information. The user is prompted by the machine to select one of a variety of options. Approximately 80% of customers receive the required information through the VRU and hang up. The remaining 20% of the customers advance to the service queue, where they wait until an agent becomes available.
The service queue does not adhere to the usual first in, first out (FIFO) principle, as there are premium customers who are prioritized in the queue. If more than one agent is available, the call is routed to the appropriate agent who has been idle for the longest time. While waiting in the queue, a small fraction of customers run out of patience and hang up before actually speaking to an agent.

The bank has various branches of operations such as retail banking, consumer lending and private banking for premium clients, to mention just a few. The call arrival pattern is distinctly different for each type of service. For this reason, we restrict our attention to calls handled by the retail banking division, which accounts for approximately 68% of the calls. Although the call center is open 24 hours a day, we concentrate on calls received between 7am and 9:05pm, as this is when the call center is most active. Not surprisingly, the pattern of call arrivals is very different for weekdays and for weekends. Furthermore, the weekend opening hours are very different from the weekday hours of operation. For these reasons, and to simplify notation, we focus our attention on weekdays in this paper. We do, however, give results of an analysis that includes weekends in Section 4.2.

The salary of the telephone service operators accounts for almost all the operational costs of a
call center (about 70% of its budget). For this reason, accurate predictions of the arrival rate to the service queue are necessary for supervisors to staff their centers effectively. These predictions, coupled with a precise estimate of service times, lead to a good forecast of the load using Little's Law (Little 1961).

Figure 1: Time series of daily retail banking call volumes handled on non-holiday weekdays between 7am and 9:05pm, from March 3 to October 24, 2003.

The data used in the study are summarized in Figures 1 and 2. Figure 1 shows a time series of the daily volumes of calls that reach the call center's service queue from March 3 to October 24, 2003. A number of patterns emerge from this plot: Monday is the busiest day of the week; there is then a gradual decrease in calls from Tuesday to Thursday; the volume increases again on Friday. Furthermore, the plot reveals one significant outlier which lands, quite strangely, on a Tuesday. Upon closer inspection, we realized that this corresponds to the day after Labor Day: the closure of the call center on that holiday explains the unusually high volume on this particular day. Since this is in effect the first day of the week, we proceed by modeling this day as a Monday. This is the only such day in our data set, so we do not have enough data to handle this issue in a completely satisfactory manner. Our best guess is that extensive additional data would either support handling days after Monday holidays as Mondays, as we have decided to do, or (with experience from enough such days) suggest making this a new, additional category of day. This is thus an issue on which our data cannot supply a definitive answer.
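The load forecast via Little's Law mentioned above can be illustrated with a toy calculation; the arrival rate and mean service time below are hypothetical numbers, not estimates from the paper's data.

```python
# Toy illustration of Little's Law, L = lambda * W: the long-run average
# number of calls in service equals the arrival rate times the mean
# service time. All numbers below are hypothetical.

arrival_rate_per_min = 36.0   # hypothetical: 36 calls arriving per minute
mean_service_min = 4.0        # hypothetical: 4-minute average handling time

# Offered load: average number of agents busy at any instant.
offered_load = arrival_rate_per_min * mean_service_min
print(offered_load)  # 144.0
```

Combined with queueing formulas, an offered load of this size translates directly into a minimum staffing level for the interval in question.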
Figure 2: A smoothing spline fit of the retail banking call arrivals to the service queue, normalized by the daily volumes: March 3 - October 24, 2003.

Figure 2 displays the average within-day call arrival patterns for each weekday from March 3 through October 24, 2003. The five curves are obtained by fitting the smooth.spline function in S-Plus to the volumes of calls that reach the service queue every 5 minutes, normalized by the daily volumes. From this plot, we can infer three distinct within-day features. First, the call arrivals increase sharply between 7am and 10am, at which time the call center is at its busiest. The volume then decreases linearly as the day progresses. At around 5pm, as the work day finishes, we notice a sudden downward slope in the number of calls. Although the within-day patterns share these common trends, there are sufficient differences across weekdays that we model the within-day pattern for each day separately. In particular, we observe that Monday morning starts off more slowly than the other four weekdays. In addition, there is an unusually quick dropoff in the number of calls on Friday afternoon.

3 The Model

Classical queueing theory assumes that the arrivals to the service queue follow a Poisson process with constant rate. Figure 2 clearly illustrates that the standard theory cannot be applied in this context, as there are distinct differences in call arrival rates within a day. To overcome this modeling issue, call center practitioners assume a constant rate for short time intervals and consequently apply the
standard queueing methodology on these shorter time intervals. Jongbloed and Koole (2001) argue that the rate is not constant over these short time frames and proceed by modeling the rate of arrivals in these periods using a Poisson mixture to account for the overdispersion in the data. In their recent work, Brown et al. (2005) remark that the arrival pattern follows a time-inhomogeneous Poisson process where the rate evolves smoothly through the day. Furthermore, they state that the Poisson arrival rates, which we will refer to as $\lambda(t)$, should be modeled as a stochastic process due to the overdispersed data, rather than as a deterministic function of time of day and day of week. Extending this work, we construct a model to estimate and predict future $\lambda(t)$'s.

Let $N_{jk}$ denote the number of arrivals to the queue on day $j = 1,\ldots,J$ during the time interval $[t_{k-1}, t_k)$, where $k = 1,\ldots,K$ indexes the periods in the day. In our study, we have $J = 164$ days and $K = 169$ time periods, since we are using five-minute intervals and $t_k = k/169$. In addition, let the weekday corresponding to day $j$ be denoted by $d_j$ (for example, $d_j = 1$ signifies that day $j$ is a Monday). We first consider the following model:

$$N_{jk} \sim \mathrm{Poisson}(\lambda_{jk}), \qquad \lambda_{jk} = R_{d_j}(t_k)\, v_j + \epsilon_{jk} \tag{1a}$$

where $\lambda_{jk}$ is the arrival rate for day $j$ and period $k$, $R_{d_j}(t_k)$ is the proportion of daily calls on day $j$ that are handled during the time interval $[t_{k-1}, t_k)$, $v_j$ is a proxy for the daily volume on day $j$, and $\epsilon_{jk}$ is a random error. The assumption here is that every weekday has a different within-day pattern. Furthermore, each $R_i$ is a density and therefore

$$\sum_{k=1}^{K} R_{d_j}(t_k) = 1 \quad \text{for } d_j = 1,\ldots,5. \tag{1b}$$

Next, we follow the work of Brown et al. (2005) and use the variance-stabilizing transformation for Poisson data, which is based on the following result (Brown et al., 2001): if $N \sim \mathrm{Poisson}(\lambda)$, then $Y = \sqrt{N}$ has approximately mean $\sqrt{\lambda}$ and variance $\frac{1}{4}$. In addition, as $\lambda \to \infty$, $Y$ is approximately normal.
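This variance-stabilizing result is easy to check by simulation; a minimal sketch, assuming only that the counts are Poisson and using the adjusted root $Y = \sqrt{N + 1/4}$:

```python
import numpy as np

rng = np.random.default_rng(0)

# If N ~ Poisson(lam), then Y = sqrt(N + 1/4) should have mean close to
# sqrt(lam) and variance close to 1/4, roughly independently of lam.
for lam in [30.0, 100.0, 300.0]:
    N = rng.poisson(lam, size=200_000)
    Y = np.sqrt(N + 0.25)
    print(f"lam={lam}: mean bias={Y.mean() - np.sqrt(lam):+.4f}, var={Y.var():.4f}")
```

Already at rates of around 30 calls per period the simulated variance is close to 1/4, which is why the normal approximation works well at the count levels seen in this dataset.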
This approximation is very accurate for the dataset used in our analysis, since the arrival counts are fairly large. Using the approximation discussed above and defining $y_{jk} = \sqrt{N_{jk} + \frac{1}{4}}$, we obtain the model

$$y_{jk} = g_{d_j}(t_k)\, x_j + \epsilon_{jk}, \qquad \epsilon_{jk} \overset{iid}{\sim} N(0, \sigma^2). \tag{2}$$

The connection between (1a) and (2) is established by setting $g_{d_j}(t_k) = \sqrt{R_{d_j}(t_k)}$ and $x_j = \sqrt{v_j}$. We take into account the correlation between daily volumes by assuming an autoregressive structure for the daily random effect $x_j$, adjusting for the type of day. The model for this can be
written as

$$x_j - \alpha_{d_j} = \beta\,(x_{j-1} - \alpha_{d_{j-1}}) + \eta_j, \qquad \eta_j \overset{iid}{\sim} N(0, \psi^2) \tag{3}$$

where $\alpha_{d_j}$ denotes the intercept for day type $d_j$. It should be noted that other models incorporating linear and quadratic trends or higher-order autocorrelation were considered but did not substantially improve the model fit. Following the remarks made on Figure 2, a different daily pattern of call arrivals for each day of the week is assumed and is denoted by $g_{d_j}$. In addition, we incorporate smoothness in the within-day pattern through the following model:

$$\frac{d^2 g_{d_j}(t_k)}{dt_k^2} = \tau_{d_j} \frac{dW_{d_j}(t_k)}{dt_k} \tag{4a}$$

where

$$\sum_{k=1}^{K} g_{d_j}(t_k)^2 = 1, \quad \text{for } d_j = 1,\ldots,5. \tag{4b}$$

Here the $W_{d_j}(t)$ are independent Wiener processes with $W_{d_j}(0) = 0$ and $\mathrm{var}\{W_{d_j}(t)\} = t$. Since $g$ refers to the within-day pattern, the restriction (4b) is equivalent to the constraint (1b) on the $R$'s. This assumption is also needed for identifiability purposes. The unconstrained prior on the $g$'s, given by (4a), has close connections to cubic smoothing splines. For more information on the equivalence between this prior distribution and the cubic smoothing spline, see Wahba (1983). Following the work of Kohn and Ansley (1987), we can define $z_{d_j}(t_k) = \{g_{d_j}(t_k),\, dg_{d_j}(t_k)/dt_k\}'$ and rewrite the prior on the $g$'s as a vector autoregressive process on the $z_{d_j}$'s. This leads to the following model:

$$y_{jk} = h' z_{d_j}(t_k)\, x_j + \epsilon_{jk}, \qquad \epsilon_{jk} \sim N(0, \sigma^2) \tag{5}$$

$$x_j - \alpha_{d_j} = \beta\,(x_{j-1} - \alpha_{d_{j-1}}) + \eta_j, \qquad \eta_j \sim N(0, \psi^2) \tag{6}$$

$$z_{d_j}(t_k) = F(\delta)\, z_{d_j}(t_{k-1}) + u_k, \qquad u_k \sim N(0, \tau^2_{d_j} U(\delta)) \tag{7}$$

where $\epsilon_{jk}$, $\eta_j$ and $u_k$ are mutually independent, $\delta = t_k - t_{k-1}$ and $h = [1, 0]'$. The matrices $F(\delta)$ and $U(\delta)$ are defined as

$$F(\delta) = \begin{pmatrix} 1 & \delta \\ 0 & 1 \end{pmatrix}, \qquad U(\delta) = \begin{pmatrix} \delta^3/3 & \delta^2/2 \\ \delta^2/2 & \delta \end{pmatrix}.$$

We assume diffuse distributions for the initial states $x_1$ and $z_{d_j}(t_1)$ for $d_j = 1,\ldots,5$. Equations (5)-(7) correspond to a multiplicative model with two latent states which evolve on different time
scales. The formulation above also facilitates computation since, conditional on each latent state variable, the model can be cast into linear state-space form. The efficient forward-filtering backward-sampling (FFBS) algorithm proposed by Carter and Kohn (1994) and Frühwirth-Schnatter (1994) can consequently be implemented to carry out Bayesian inference on the model.

It should be noted that the Bayesian methodology developed by Shephard and Pitt (1997) and Gamerman (1998) for dynamic generalized linear models could have been applied to the original Poisson model. However, we believe that the root-unroot methodology has several distinct advantages here. First, the normal approximation is very accurate in this study. Furthermore, a conjugate multivariate normal prior with a wide variety of covariance structures, such as a moving average or autoregressive process, can be imposed on $x$ and $g$. An equivalent conjugate prior on the Poisson rates is much harder to derive and is the subject of current research. The methodology developed by Chen and Fomby (1999) for stable seasonal pattern models could also have been used here; incorporating the within-day dependencies through smoothness would, however, be quite complicated with their methods. To complete the Bayesian specification of the model, we need to specify the prior distributions on the parameters.

3.1 Prior Selection

For notational convenience, let $\alpha = (\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5)$, $\bar\alpha = \frac{1}{5}\sum_{i=1}^{5} \alpha_i$, $\tau^2 = (\tau_1^2, \tau_2^2, \tau_3^2, \tau_4^2, \tau_5^2)$ and $\theta = (\alpha, \beta, \psi^2, \tau^2, \sigma^2)$. After some careful sensitivity analysis, the following priors were selected for the parameters of the model described above. Independent priors are imposed on all parameters except on the individual $\alpha_i$. Hence, $p(\theta)$ can be written as the product of the following priors:

$$p(\alpha) \propto \left( \sum_{i=1}^{5} (\alpha_i - \bar\alpha)^2 \right)^{-1}$$
$$p(\beta) = U[0, 1]$$
$$p(\tau_i^2) = IG(a_{\tau_i}, b_{\tau_i}), \quad i = 1,\ldots,5$$
$$p(\sigma^2) = IG(a_\sigma, b_\sigma)$$
$$p(\psi^2) = IG(a_\psi, b_\psi).$$

Several remarks should be made on the choice of priors.
The traditional multivariate normal prior is not used on $\alpha$. Instead, a flat prior on the overall mean $\bar\alpha$ (Lindley and Smith, 1972), coupled with the harmonic prior on the differences $\alpha_i - \bar\alpha$, is our preferred choice due to their good minimax and shrinkage properties (Stein, 1981). This construction of the prior leads to shrinkage towards the overall mean rather than towards the origin. For modeling purposes, we also assume that $x_j - \alpha_{d_j}$ follows a stationary process with positive correlation and therefore impose a uniform
$U(0, 1)$ prior on the autoregressive coefficient. Finally, conjugate inverse gamma priors are used on the variance components. The hyperparameters are specific to the data analyzed and will therefore be presented in the case study section.

3.2 Posterior Inference

The goal of our analysis is to carry out exact inference on the joint posterior distribution

$$p(x, z, \theta \mid Y_{1:J}) \propto p(Y_{1:J} \mid x, z, \theta)\, p(x \mid \theta) \prod_{i=1}^{5} p(z_i \mid \theta)\, p(\theta)$$

where $Y_{1:J} = (y_{11},\ldots,y_{1K},\ldots,y_{J1},\ldots,y_{JK})$, $x = (x_1,\ldots,x_J)$, $z = (z_1, z_2, z_3, z_4, z_5)$ and $z_i = (z_i(t_1),\ldots,z_i(t_K))$ for $i = 1,\ldots,5$. We sample the parameters and latent states using a hybrid MCMC algorithm, developed in the Appendix, that utilizes both Gibbs sampling steps and random walk Metropolis steps (see Robert and Casella 2004 and Metropolis et al. 1953). The algorithm recursively cycles between sampling from $p(x \mid z, \theta, Y_{1:J})$, $p(z \mid x, \theta, Y_{1:J})$ and $p(\theta \mid x, z, Y_{1:J})$. We use the forward-filtering backward-sampling algorithm to draw the whole path of $x$'s directly from $p(x \mid z, \theta, Y_{1:J})$. The same algorithm is also used to draw the whole path of $z_{d_j}$'s from $p(z_{d_j} \mid x, \theta, Y_{1:J})$ for each day of the week. This enables fast mixing of the Markov chain.

We should point out one slight complication in the model. The Gaussian prior selected for the $g_{d_j}$ ($d_j = 1,\ldots,5$) corresponds to a cubic smoothing spline constrained to lie on the sphere $\sum_k g_{d_j}^2(t_k) = 1$. This quadratic constraint complicates posterior simulation, since the posterior is also a constrained Gaussian. A Metropolis algorithm would be infeasible for simulating from this distribution, as we would either have to evaluate the prior on the sphere, which would involve a 169-dimensional integral, or use the uninformative constrained prior as the proposal distribution, which would lead to rejecting essentially all proposed moves. To avoid this problem, we sample from the unconstrained posterior using FFBS and then renormalize the draws.
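The renormalization step can be sketched as follows; the vector below is a synthetic stand-in for an unconstrained FFBS draw of one day's pattern, not output of the actual sampler.

```python
import numpy as np

rng = np.random.default_rng(1)

# Project an unconstrained draw of the within-day pattern g (length K)
# onto the unit sphere sum_k g_k^2 = 1 by dividing by its Euclidean norm.
# The draw below is a hypothetical stand-in for an FFBS sample.
K = 169
g_unconstrained = rng.normal(1.0 / np.sqrt(K), 0.001, size=K)

g = g_unconstrained / np.linalg.norm(g_unconstrained)
print(np.sum(g**2))  # 1.0 up to floating-point error
```

Because the unconstrained draws already lie very close to the sphere, this projection moves each sample only slightly.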
In our application, the unconstrained samples of $\sum_k g_{d_j}^2(t_k)$ fall between 0.99 and 1.01, indicating that the normalized draws are approximately from the true conditional posterior distribution. If the constraint set were affine, the argument would be exact, because in that case conditioning is the same as projecting onto a line. Since the posterior is highly concentrated around a point on the sphere, and the sphere is locally linear, the approximation is very accurate.

4 Case Study

Our case study consists of three steps. First, the MCMC algorithm is applied to the whole data set to estimate latent states and parameters and to check model adequacy. A careful assessment of convergence is also conducted. We then perform an out-of-sample forecasting exercise using
the same model. We compare our forecasting performance to that of seasonal regression models; the competing models and the results are discussed in Section 4.2. Finally, a sequential Monte Carlo algorithm for within-day learning and forecasting is proposed, and we measure whether incorporating morning data greatly improves the afternoon and evening forecasts.

4.1 Posterior Estimation

As stated above, the dataset used in this study consists of 164 days. Within each day, the number of retail banking call arrivals per 5-minute interval is recorded between 7am and 9:05pm, leaving us with 169 time periods. The algorithm was run according to the following specifications. The chain is run for 49,000 iterations after a burn-in period of 1,000. Analysis of the autocorrelation function of each parameter and latent state led us to save every 10th iteration, leaving 4,900 approximately independent samples for posterior inference. In what follows, we will refer to the MCMC sample size as $M$. The prior on $\theta$ described in Section 3.1 is used with the following hyperparameters: $a_\sigma = 0.5$, $b_\sigma = 0.5$, $a_\psi = 0.5$, $b_\psi = 0.5$, and $a_{\tau_i} = 0.5$, $b_{\tau_i} = 0.5$ for $i = 1,\ldots,5$, which corresponds to very diffuse priors. Furthermore, $x_1$ and the $z_i(t_1)$ are drawn from the following diffuse priors: $x_1 \sim N(0, 10^5)$ and $z_i(t_1) \sim N(0, 10^5 I)$.

An interesting feature of the model arises when performing a prior sensitivity analysis. After some careful analysis on simulated data, we observe that for small values of $\beta$, the variance components $\sigma^2$ and $\psi^2$ become non-identifiable and more peaked priors are needed. However, due to the high persistence of the autoregressive process in our application, we are able to place diffuse priors on the two variances. The marginal posterior distributions of $x$ and $z$ are not sensitive to the choice of hyperparameters, due to the vast quantities of data available in the analysis.
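The autocorrelation-based thinning decision described above can be sketched on a synthetic chain; the AR(1) series below is a hypothetical stand-in for an MCMC trace of a single parameter.

```python
import numpy as np

rng = np.random.default_rng(2)

# Estimate the lag-h autocorrelation of a chain; once the ACF at the
# chosen lag is negligible, keeping every m-th draw yields approximately
# independent samples.
def acf(x, lag):
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

chain = np.empty(50_000)
chain[0] = 0.0
for t in range(1, chain.size):       # AR(1) trace with autocorrelation 0.6
    chain[t] = 0.6 * chain[t - 1] + rng.normal()

print(acf(chain, 1))                 # close to 0.6
print(acf(chain, 10))                # close to 0.6**10, i.e. near zero
thinned = chain[::10]                # keep every 10th draw
```

With the ACF essentially zero at lag 10, the thinned draws can be treated as approximately independent for posterior summaries.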
We ran the algorithm from multiple starting values and concluded that the choice of initial values did not affect the convergence of the Markov chain. The results described below are based on the following initial parameter values: $\alpha^{(0)} = (190, 180, 175, 175, 180)$, $\beta^{(0)} = 0.65$, $(\sigma^2)^{(0)} = 0.25$, $(\psi^2)^{(0)} = $ and $(\tau^2)^{(0)} = (25, 25, 25, 25, 25)$. Careful analysis of the trace plots and autocorrelation function (ACF) of each parameter was performed to check the convergence of the Markov chain to its stationary distribution.

Figures 3 and 4 display the marginal posterior distributions $p(\theta \mid Y_{1:J})$. Several conclusions can be drawn from these plots. First, we should point out that the posterior mean of $\sigma^2$ is 0.347. This is a very encouraging result: if the model accounted for all the variance in the data, $\sigma^2$ would be approximately 0.25, but no smaller, according to the root-unroot theory. We initially considered a model with the same within-day pattern $g$ for all days; the estimated $\sigma^2$ was then 0.432, which is substantially larger than for the model considered here. The autoregressive coefficient $\beta$ is highly significant, with a posterior mean of 0.68. The analysis also confirms that there is an obvious difference in daily random effects
Figure 3: Posterior histograms of $\beta$, $\psi^2$ and $\sigma^2$.

Figure 4: Boxplots displaying the posterior distributions of the weekday effects $\alpha_1,\ldots,\alpha_5$ and the smoothing parameters $\tau_1^2,\ldots,\tau_5^2$ for the cubic splines for each day of the week.
across weekdays, ranging from a posterior mean of 175 for Thursdays to 190 for Mondays. The posterior mean of $\tau^2$ ranges from 0.66 for Wednesday to 1.8 for Friday. This indicates that the within-day patterns are very smooth, as the cubic spline penalty factors for each of the five days ($i = 1,\ldots,5$), defined as $\sigma^2 \big( \tau_i^2 \sum_{\{j : d_j = i\}} x_j^2 \big)^{-1}$, are very small.

4.2 One-Day-Ahead Forecasting

The previous section concentrated on the in-sample performance of the model and estimation procedure. This exercise is indispensable for checking model adequacy, but it is not of primary interest to call center managers, who need good forecasts of call arrival rates in order to plan ahead and accurately staff their centers. A one-day-ahead prediction exercise was therefore conducted to compare the out-of-sample performance of the model with industry standards. Ideally, call center managers also require forecasts at longer time horizons. Unfortunately, based on our estimated AR(1) coefficient, we are unable to accurately predict call volumes at horizons greater than a week. Additional covariates, not provided in this dataset, would therefore be needed to enhance such predictions.

4.2.1 Forecast Densities for Rates and Volumes

To simplify notation, let $\lambda_{j\cdot} = (\lambda_{j1},\ldots,\lambda_{jK})$ and $N_{j\cdot} = (N_{j1},\ldots,N_{jK})$ denote the rates and counts on day $j$, and define $\Omega = (x_{j-1}, z, \theta)$. The one-day-ahead predictive density for the rates is given by

$$p(\lambda_{j\cdot} \mid Y_{1:j-1}) = \int p(\lambda_{j\cdot} \mid x_j, z_{d_j})\, p(x_j \mid x_{j-1}, \theta)\, p(x_{j-1}, z_{d_j}, \theta \mid Y_{1:j-1})\, dx_{j-1}\, d\theta.$$

Note that the third term in the integrand is the joint posterior distribution of the model parameters and states on day $j-1$, the second term is the AR(1) Gaussian transition density, and the first term is a degenerate distribution, since $\lambda$ is a deterministic function of $x$ and $z$. The one-day-ahead predictive density for the counts on day $j$ is given by

$$p(N_{j\cdot} \mid Y_{1:j-1}) = \int p(N_{j\cdot} \mid \lambda_{j\cdot}, \theta)\, p(\lambda_{j\cdot}, \theta \mid Y_{1:j-1})\, d\lambda_{j\cdot}\, d\theta.$$
These predictive densities are not available in closed form, as the rates and counts are nonlinear functions of $x_j$ and $\Omega$. We therefore approximate these distributions by a large sample drawn using the algorithm outlined below.
Algorithm 1: One-Day-Ahead Forecasts of Rates and Counts

Step 0: Start with an MCMC sample, $\Omega^{(1)},\ldots,\Omega^{(M)}$, drawn from $p(\Omega \mid Y_{1:j-1})$.

Step 1: Draw $x_j^{(i)} \sim N\big(\alpha_{d_j}^{(i)} + \beta^{(i)}(x_{j-1}^{(i)} - \alpha_{d_{j-1}}^{(i)}),\, (\psi^2)^{(i)}\big)$ for each $i = 1,\ldots,M$.

Step 2: For each period $k = 1,\ldots,K$ and each $i = 1,\ldots,M$:

Step 2a: Set $\lambda_{jk}^{(i)} = \big(x_j^{(i)}\, g_{d_j}(t_k)^{(i)}\big)^2$.

Step 2b: Draw $y_{jk}^{(i)} \sim N\big(\sqrt{\lambda_{jk}^{(i)}},\, (\sigma^2)^{(i)}\big)$.

Step 2c: Set $N_{jk}^{(i)} = \big(y_{jk}^{(i)}\big)^2 - \frac{1}{4}$.

Steps 2a and 2c of the algorithm above provide samples from $p(\lambda_{j\cdot} \mid Y_{1:j-1})$ and $p(N_{j\cdot} \mid Y_{1:j-1})$, respectively. Since the arrival rate is an unobservable quantity, we focus our attention on call volumes in order to compare the out-of-sample performance of the various models.

4.2.2 Competing Forecasts

In what follows, we will refer to the approach described above as Model 1. For comparative purposes, we consider two simple alternatives, which we refer to as Model 2 and Model 3. The first is a linear additive model on the transformed data with day of the week and time of day as covariates. In the second, we also add an interaction between the day-of-the-week and time-of-day effects. The two seasonal linear models are as follows:

Model 2: $y_{jk} = \mu + \alpha_{d_j} + \beta_k + \epsilon_{jk}$, $\epsilon_{jk} \sim N(0, \sigma^2)$

Model 3: $y_{jk} = \mu + \alpha_{d_j} + \beta_k + \gamma_{d_j k} + \epsilon_{jk}$, $\epsilon_{jk} \sim N(0, \sigma^2)$

where $y_{jk} = \sqrt{N_{jk} + \frac{1}{4}}$. We fit these models to historical data using least squares. The normality assumption enables us also to obtain prediction intervals for future observations. The model proposed by Brown et al. (2005) was originally considered as a competing model. Unfortunately, the model in their analysis does not incorporate all the dynamics of the data analyzed here, such as day-of-the-week effects and different within-day patterns for each weekday.
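Algorithm 1 can be sketched in a few lines of vectorized code; every "posterior draw" below (the $\alpha$'s, $\beta$, $\psi^2$, $\sigma^2$, $x_{j-1}$, and the within-day pattern $g$) is a synthetic stand-in for actual MCMC output from $p(\Omega \mid Y_{1:j-1})$.

```python
import numpy as np

rng = np.random.default_rng(3)

# Sketch of Algorithm 1 with M = 1000 hypothetical posterior draws and
# K = 169 five-minute periods.
M, K = 1000, 169

alpha_prev = rng.normal(180.0, 1.0, M)   # alpha_{d_{j-1}} draws (stand-in)
alpha_next = rng.normal(190.0, 1.0, M)   # alpha_{d_j} draws (stand-in)
beta = rng.uniform(0.5, 0.8, M)          # AR(1) coefficient draws (stand-in)
psi2 = np.full(M, 4.0)                   # psi^2 draws (stand-in)
sigma2 = np.full(M, 0.35)                # sigma^2 draws (stand-in)
x_prev = rng.normal(185.0, 2.0, M)       # x_{j-1} draws (stand-in)
g = np.full((M, K), 1.0 / np.sqrt(K))    # within-day pattern draws (stand-in)

# Step 1: propagate the daily effect through the AR(1) transition.
x_j = alpha_next + beta * (x_prev - alpha_prev) + rng.normal(0.0, np.sqrt(psi2))

# Steps 2a-2c: rates, root-scale observations, back-transformed counts.
lam = (x_j[:, None] * g) ** 2
y = rng.normal(np.sqrt(lam), np.sqrt(sigma2)[:, None])
N = y ** 2 - 0.25

print(lam.shape, N.shape)
```

The rows of `lam` and `N` are then forecast samples of the rate and count paths for day $j$, from which point forecasts and interval endpoints can be read off as means and quantiles.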
A reformulation of their model adjusting for all these new features would result in a representation similar to the one presented in this paper and therefore would not permit a fair comparison.

4.2.3 One-day-ahead forecast analysis

The one-day-ahead forecasting exercise is performed in the following manner. For each of the 64 days ranging from July 25 to October 24, 2003:
- Consider the 100 preceding days as the historical dataset.
- Estimate the parameters and latent states for the model described in this paper using the same recipe as in Section 4.1. For the competing models, compute the maximum likelihood estimator of each parameter.
- For all three models, perform a one-day-ahead forecast of call volumes for each period $k$.
- Compute the forecast root mean squared error (RMSE) and average percent error (APE), defined for each day $j$ as follows:

$$\mathrm{RMSE}_j = \sqrt{\frac{1}{K} \sum_{k=1}^{K} (N_{jk} - \hat N_{jk})^2}, \qquad \mathrm{APE}_j = \frac{1}{K} \sum_{k=1}^{K} \frac{|N_{jk} - \hat N_{jk}|}{N_{jk}}.$$

- Compute the 95% coverage probability (COVER) and the average 95% forecast interval width (WIDTH), defined for each day $j$ as follows:

$$\mathrm{COVER}_j = \frac{1}{K} \sum_{k=1}^{K} I\big(\hat N_{jk}^{2.5} < N_{jk} < \hat N_{jk}^{97.5}\big), \qquad \mathrm{WIDTH}_j = \frac{1}{K} \sum_{k=1}^{K} \big(\hat N_{jk}^{97.5} - \hat N_{jk}^{2.5}\big).$$

Here $I(\cdot)$ is the indicator function, and $\hat N_{jk}$ and $\hat N_{jk}^{Q}$ are the mean and $Q$th quantile of the forecast distribution. For Model 1, these quantities are computed using the Monte Carlo sample; for Models 2 and 3, they are based on the maximum likelihood estimates and the assumption of normality.

Table 1: Summary (minimum, quartiles, mean and maximum of the daily values) of one-day-ahead forecasting performance of the three competing models: Model 1 (M1), Model 2 (M2), Model 3 (M3). Summaries are based on data between 7am and 9:05pm.

Table 1 summarizes the distribution of the RMSE and APE for all three models over the 64-day forecast period. Several remarks can be made based on these results. The model developed in this paper clearly outperforms the two competing models: the median forecast RMSE is 2.8%
and 13.5% lower than those of Models 2 and 3, respectively. We can conclude that the autoregressive feature of the model presented in this paper greatly improves the accuracy of our forecasts.

Table 2: Summary (minimum, quartiles, mean and maximum of the daily values) of 95% one-day-ahead forecast interval coverage probability and average width for all three competing models. Summaries are based on calls handled between 7am and 9:05pm.

Table 2 summarizes the distribution of the 64 coverage probabilities and average interval widths for all three models. We note that the empirical coverages of the 95% prediction intervals are extremely accurate, with mean coverages of 94.7%, 93.7% and 94.1%, respectively. We should also note that the prediction intervals for Model 1 are on average 16.2% and 9.4% narrower than those of Models 2 and 3. While still attaining coverage probabilities close to the nominal value, Model 1 produces more precise prediction intervals. The rather wide prediction intervals are mainly due to the inherent Poisson variation, which is proportional to the arrival rate.

4.2.4 One-day-ahead forecast analysis with weekends

In this section, we discuss the results of the one-step-ahead forecasting exercise when we include both weekdays and weekends. The number of days $J$ in our case study increases to 231 days, ranging from July 25 to October 24, 2003. In this organization, the weekend opening hours (9am to 5pm) are shorter than the weekday operating hours, leaving us with only 18 periods within a day for both Saturday and Sunday and 169 periods for each weekday. We perform the same analysis as discussed in the previous section using a 100-day moving window. The main difference when including the weekends is that the day-to-day autocorrelation $\beta$ is smaller, with a posterior mean of 0.62. Table 3 summarizes the distribution of the RMSE and APE for all three models over the 131-day forecast period.
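The per-day criteria defined in Section 4.2 (RMSE, APE, coverage, and interval width) are straightforward to compute from forecast samples; in the sketch below the Poisson draws are hypothetical stand-ins for actual forecast output and observed counts.

```python
import numpy as np

# Per-day forecast criteria computed from an M x K matrix of forecast
# samples of the counts and the K observed counts for that day.
def daily_scores(actual, samples):
    mean = samples.mean(axis=0)
    lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)
    rmse = np.sqrt(np.mean((actual - mean) ** 2))
    ape = np.mean(np.abs(actual - mean) / actual)
    cover = np.mean((lo < actual) & (actual < hi))
    width = np.mean(hi - lo)
    return rmse, ape, cover, width

rng = np.random.default_rng(4)
actual = rng.poisson(150.0, size=169).astype(float)           # stand-in data
samples = rng.poisson(150.0, size=(1000, 169)).astype(float)  # stand-in draws
print(daily_scores(actual, samples))
```

Applying this function to each forecast day and tabulating the quantiles of the resulting scores reproduces the layout of the summary tables above.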
Again, we see that Model 1 outperforms the other two models, with a median forecast RMSE approximately 4% and 8% lower than those of Models 2 and 3, respectively.
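The evaluation metrics defined above are straightforward to compute from Monte Carlo forecast samples. The sketch below is a minimal illustration, not the paper's code; the within-day rate profile and sample sizes are synthetic, chosen only so the calibration of the intervals can be checked against the true model.

```python
import numpy as np

def forecast_metrics(actual, samples):
    """Daily forecast summaries from Monte Carlo draws.

    actual  : shape (K,), observed counts N_jk for day j.
    samples : shape (M, K), draws N_jk^(i) from the forecast distribution.
    Returns (RMSE_j, APE_j, COVER_j, WIDTH_j) as defined in the text.
    """
    mean = samples.mean(axis=0)                            # forecast means
    lo, hi = np.percentile(samples, [2.5, 97.5], axis=0)   # 95% interval
    rmse = np.sqrt(np.mean((actual - mean) ** 2))
    ape = np.mean(np.abs(actual - mean) / actual)
    cover = np.mean((actual > lo) & (actual < hi))
    width = np.mean(hi - lo)
    return rmse, ape, cover, width

# Toy check: when the samples come from the true model, coverage is near 95%.
rng = np.random.default_rng(0)
rate = np.linspace(100.0, 300.0, 169)          # hypothetical within-day profile
actual = rng.poisson(rate)
samples = rng.poisson(rate, size=(2000, 169))
rmse, ape, cover, width = forecast_metrics(actual, samples)
print(rmse, ape, cover, width)
```

Because the counts are Poisson, the interval width grows with the rate, which is why even a well-calibrated forecast produces fairly wide intervals.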
Table 3: Summary of one-day-ahead forecasting performance of the three competing models: Model 1 (M1), Model 2 (M2), Model 3 (M3). Rows give the minimum, 25th percentile, median, mean, 75th percentile and maximum over the forecast days; columns give RMSE and APE for each model. Summaries are based on calls handled between 7am and 9:05pm Monday through Friday and between 9am and 5pm on Saturday and Sunday.

The empirical coverage of the 95% prediction intervals, not shown here, is also very accurate, with mean coverages within 1% of the nominal value for all three models. The prediction intervals for Model 1 are on average approximately 53% and 50% narrower than those of Models 2 and 3. These results confirm that our model remains robust when further within-day patterns are included in the analysis.

4.2.3 Forecast Calibration

One advantage of the Bayesian framework is that it provides the entire forecast density for the rates and counts, fully accounting for uncertainty in the latent states and model parameters. These densities can be used to provide an alternative measure of forecast performance, the probability integral transform (PIT) (Rosenblatt 1952), defined for each day j and period k by

PIT_{jk} = \frac{1}{M} \sum_{i=1}^{M} I\big( N_{jk} < N^{(i)}_{jk} \big),

where N_{jk} is the actual observation and the N^{(i)}_{jk} are Monte Carlo samples from the forecast density. This measure has been used extensively in econometrics (see Shephard 1994 and Diebold et al. 1998) and in weather forecasting (see Gel et al. 2004). If the predictive densities of the counts are properly calibrated, the marginal distribution of the PIT should be uniform on [0, 1] across all j and k. However, we expect the PIT to be correlated across periods k because the within-day forecasts share the same estimate of x_j. As a result, the within-day distribution of the PIT will tend to show higher variability than an iid sample. On the other hand, when combining all 64 days of data, we would expect the distribution to be closer to uniform.
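The PIT is a one-line computation given the forecast draws. The sketch below is a hedged illustration on synthetic Poisson data (a single hypothetical rate, not the call center data): when the draws come from the true forecast distribution, the PIT values look approximately uniform on [0, 1].

```python
import numpy as np

def pit(actual, samples):
    """PIT_jk = (1/M) * sum_i I(N_jk < N_jk^(i)), vectorized over periods.

    actual  : shape (n,), observed counts.
    samples : shape (M, n), Monte Carlo draws from the forecast density.
    """
    return (samples > actual).mean(axis=0)

# Toy check with a calibrated forecast (draws from the true model).
rng = np.random.default_rng(1)
n_obs, M = 2000, 1000
actual = rng.poisson(200.0, size=n_obs)
samples = rng.poisson(200.0, size=(M, n_obs))
u = pit(actual, samples)
print(u.mean(), u.std())   # roughly 0.5 and 1/sqrt(12) when calibrated
```

With discrete counts the PIT is only approximately uniform (ties between draws and the observation introduce a small bias), which is negligible at the arrival volumes considered here.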
Figure 5 summarizes the forecast performance of our model for the week of August 18, 2003. The first column displays the 95% predictive intervals for the rates and counts for each day. Columns
Figure 5: Forecast performance for the week of August 18, 2003. Left: One-day-ahead forecast means and 95% intervals for the rates and counts; points denote the observed counts. Center: Forecast residuals (observed counts minus forecast mean). Right: Probability integral transform for the observed counts based on the Monte Carlo samples.
2 and 3 display the forecast residuals (defined as the observed counts minus the forecast mean) and the histogram of the PIT for each day. The results are very encouraging, although the coverage probabilities for that week are slightly high. For the most part, the forecast residual plots show no obvious bias. The PIT histograms confirm that our forecasts are well calibrated, since they are fairly symmetric and close to uniform. The distribution of the combined PIT over all 64 days is shown in Figure 6. The histogram and the QQ-plot exhibit a near-perfect uniform distribution, confirming that the forecasts are very well calibrated.

Figure 6: Distribution and QQ-plot of the probability integral transform for the observed counts based on the Monte Carlo samples, combining the 64 predicted days.

4.3 Within-Day Learning and Forecasting

The previous section focused on predictions of call volume based on information available at the end of the previous day. However, as new data arrive throughout the day, a manager may want to reevaluate his or her forecast for the remainder of the day. For example, in the bank that provided the data, telephone agents can be called up with as little as three hours' notice. Prediction of the afternoon volume based on the morning information would therefore be of use to this organization. Re-estimating the latent states and parameters every few minutes using the algorithm proposed in Section 3.2 would be infeasible, as the full MCMC simulation took approximately 3 minutes to run, even when efficiently implemented in C. A fast algorithm is therefore needed to recursively update the posterior density p(x_j, z, θ | Y_{1:j-1}, y_{j1}, ..., y_{jk}) at each period k. At the end of day j−1, we have available an MCMC sample from the historical posterior distribution p(x_{j-1}, z, θ | Y_{1:j-1}).
One possible way to proceed is to simulate x_j from the transition density p(x_j | x_{j-1}, θ) and reweight the draws {x_j, z, θ} according to the likelihood p(y_{jk} | x_j, z, θ)
as new data arrive. One problem with this procedure is that, in the presence of outliers, a small number of draws would receive most of the weight, leading to degeneracy of the algorithm. In what follows, we propose a slightly different approach which considerably alleviates this problem. We notice that we can analytically integrate out x_j and use the reweighting scheme described above on {z, θ} alone. Then, conditioning on {z, θ}, we can draw x_j directly from its conditional posterior distribution. In what follows, we assume dependence on the historical data Y_{1:(j-1)} but suppress it from the notation, and we define Y_{jk} = (y_{j1}, ..., y_{jk}) as the data on day j up to period k. Recalling that Ω = {x_{j-1}, z, θ}, all relevant inference can be obtained from the joint posterior distribution p(Ω, x_j | Y_{jk}), which can be decomposed as

p(Ω, x_j | Y_{jk}) = p(Ω | Y_{jk}) \, p(x_j | Ω, Y_{jk}).   (8)

The marginal posterior density for Ω is approximated by the discrete distribution

p(Ω | Y_{jk}) ≈ \sum_{i=1}^{M} I\big( Ω = Ω^{(i)} \big) \, w^{(i)}_k,   (9)

where Ω^{(1)}, ..., Ω^{(M)} are the samples from the historical posterior p(Ω | Y_{1:(j-1)}) and w^{(1)}_k, ..., w^{(M)}_k are the normalized weights. The conditional posterior for x_j given Ω is normal,

p(x_j | Ω, Y_{jk}) = N(m_k, v_k).   (10)

The recursive formulas for the weights, means and variances {w_k, m_k, v_k} are defined below. The normalized weights in Equation (9) are initialized to w^{(i)}_0 = 1/M and updated according to

w^{(i)}_k ∝ w^{(i)}_{k-1} \, p(y_{jk} | Ω^{(i)}, Y_{j(k-1)}),

where \sum_{i=1}^{M} w^{(i)}_k = 1. This follows from the fact that p(Ω | Y_{jk}) ∝ p(Ω | Y_{j(k-1)}) \, p(y_{jk} | Ω, Y_{j(k-1)}), where

p(y_{jk} | Ω, Y_{j(k-1)}) = \int p(y_{jk} | Ω, x_j, Y_{j(k-1)}) \, p(x_j | Ω, Y_{j(k-1)}) \, dx_j

is Gaussian with moments given in Step 2a of Algorithm 2. The mean and variance in Equation (10) are initialized from the Gaussian AR(1) transition density and are updated recursively according to Step 2b of Algorithm 2.
The posterior normality and the moment recursions follow from the fact that p(x_j | Ω, Y_{jk}) ∝ p(x_j | Ω, Y_{j(k-1)}) \, p(y_{jk} | x_j, Ω), where both densities on the right-hand side are normal. We are therefore able to express the joint posterior density p(Ω, x_j | Y_{jk}) as a mixture of normals.
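As a concrete illustration, the weight and moment recursions just derived can be implemented in a few lines. The sketch below is a minimal, self-contained version: the day's data and the M mixture components (which in the paper come from the MCMC sample of Ω) are all synthetic, and the components differ only in their prior mean for x_j.

```python
import numpy as np

def normal_pdf(x, mean, var):
    """Gaussian density, used for the one-step predictive weight update."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

rng = np.random.default_rng(2)

# Simulate one day from the root-transformed model y_k = g_k * x + eps_k,
# with the identifiability constraint sum_k g_k^2 = 1 (all values synthetic).
K = 169
g = np.abs(rng.normal(1.0, 0.1, size=K))
g /= np.sqrt((g ** 2).sum())
sigma2 = 0.25 ** 2
x_true = 45.0
y = g * x_true + rng.normal(0.0, np.sqrt(sigma2), size=K)

# M mixture components standing in for the historical MCMC draws.
M = 500
m = rng.normal(44.0, 2.0, size=M)   # prior means m_0^(i)
v = np.full(M, 4.0)                 # prior variances v_0^(i)
w = np.full(M, 1.0 / M)             # initial weights w_0^(i)

for k in range(K):
    # Step 2a: reweight by the one-step predictive density of y_k.
    w *= normal_pdf(y[k], g[k] * m, g[k] ** 2 * v + sigma2)
    w /= w.sum()
    # Step 2b: conjugate update of the component means and variances.
    v_new = 1.0 / (1.0 / v + g[k] ** 2 / sigma2)
    m = v_new * (m / v + y[k] * g[k] / sigma2)
    v = v_new

post_mean = float(np.sum(w * m))    # mixture mean of p(x_j | data so far)
print(post_mean)
```

Because the static block Ω is only reweighted, never moved, the whole update is a handful of vectorized operations per period, which is what makes the within-day revision feasible between five-minute arrivals.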
Hence, our within-day learning algorithm sequentially updates the weights, means and variances of the normal mixture as new data are observed. The algorithm, described below, is equivalent to the mixture Kalman filter proposed by Chen and Liu (2000) with a static state variable.

Algorithm 2: Within-Day Learning

Step 0: Start with an MCMC sample, Ω^{(1)}, ..., Ω^{(M)}, drawn from p(Ω | Y_{1:j-1}).

Step 1: Initialize the mixture weights, means and variances for each i = 1, ..., M:

w^{(i)}_0 = M^{-1}, \quad m^{(i)}_0 = α^{(i)}_{d_j} + β^{(i)} \big( x^{(i)}_{j-1} − α^{(i)}_{d_{j-1}} \big), \quad v^{(i)}_0 = (ψ^2)^{(i)}.

Step 2: For each period k = 1, ..., K and each i = 1, ..., M:

Step 2a: Update the mixture weights:

w^{(i)}_k ∝ w^{(i)}_{k-1} \, φ\Big( y_{jk} \,\Big|\, g_{d_j}(t_k)^{(i)} m^{(i)}_{k-1}, \; \big( g_{d_j}(t_k)^{(i)} \big)^2 v^{(i)}_{k-1} + (σ^2)^{(i)} \Big).

Step 2b: Update the mixture means and variances:

m^{(i)}_k = v^{(i)}_k \left( \frac{m^{(i)}_{k-1}}{v^{(i)}_{k-1}} + \frac{y_{jk} \, g_{d_j}(t_k)^{(i)}}{(σ^2)^{(i)}} \right), \qquad v^{(i)}_k = \left( \frac{1}{v^{(i)}_{k-1}} + \frac{\big( g_{d_j}(t_k)^{(i)} \big)^2}{(σ^2)^{(i)}} \right)^{-1}.

In Step 2a, φ(x | m, v) is the normal density with mean m and variance v evaluated at x, and the weights are normalized so that \sum_{i=1}^{M} w^{(i)}_k = 1. The algorithm provides a closed-form representation of the posterior density at each period k. Given this information, the prediction densities for a future period k' > k, conditioning on the historical data Y_{1:(j-1)}, are

p(λ_{jk'} | Y_{jk}) = \int p\big( λ_{jk'} \,|\, x_j, z_{d_j}(t_{k'}) \big) \, p(x_j | Ω, Y_{jk}) \, p(Ω | Y_{jk}) \, dx_j \, dΩ,

p(N_{jk'} | Y_{jk}) = \int p(N_{jk'} | λ_{jk'}, θ) \, p(λ_{jk'}, θ | Y_{jk}) \, dλ_{jk'} \, dθ.

As noted previously, these densities are not available in closed form, as the rates and counts are nonlinear functions of x_j and Ω. We therefore propose a resampling algorithm, given below, to generate draws from these distributions.
Algorithm 3: Within-Day Forecasting at Period k

Step 0: Start with {Ω^{(1)}, w^{(1)}_k, m^{(1)}_k, v^{(1)}_k}, ..., {Ω^{(M)}, w^{(M)}_k, m^{(M)}_k, v^{(M)}_k} from Algorithm 2.

Step 1: Draw the indices l_1, ..., l_M ∼ Mult(M; w^{(1)}_k, ..., w^{(M)}_k).

Step 2: For each index l_i, draw x^{(i)}_j ∼ N\big( m^{(l_i)}_k, v^{(l_i)}_k \big).

Step 3: For each period k' = k+1, ..., K and each i = 1, ..., M:

Step 3a: Set λ^{(i)}_{jk'} = \big( x^{(i)}_j \, g_{d_j}(t_{k'})^{(l_i)} \big)^2.

Step 3b: Draw y^{(i)}_{jk'} ∼ N\big( x^{(i)}_j \, g_{d_j}(t_{k'})^{(l_i)}, (σ^2)^{(l_i)} \big).

Step 3c: Set N^{(i)}_{jk'} = \big( y^{(i)}_{jk'} \big)^2 − 0.25.

Steps 3a and 3c provide samples from p(λ_{jk'} | Y_{1:j-1}, Y_{jk}) and p(N_{jk'} | Y_{1:j-1}, Y_{jk}), respectively. Note also that this algorithm is equivalent to Algorithm 1 at period k = 0 if we set l_i = i for all i. We now investigate the impact of incorporating morning information on within-day predictions.

4.3.1 The Importance of Within-Day Information for Forecasting

The comparison is performed as follows. For each day j from July 25 to October 24 (64 days), we run the full MCMC algorithm described in Section 4.1 using data from the previous 100 days. Based on this sample, one-day-ahead forecasts of the arrival rates and call volumes are computed using the procedure of Section 4.2. Starting at the end of day j−1, we then run the within-day learning algorithm described in Section 4.3 through noon of day j. At 10am and 12pm, we produce forecasts for the rest of the day. In order to provide a true out-of-sample comparison, we evaluate the forecasts using only data after 12pm (108 time periods). The analysis presented below compares the predictive densities of the rates and counts on day j conditioning on three different information sets:

- Information at the end of day j−1: p(· | Y_{1:(j-1)}).
- Information up to 10am on day j: p(· | Y_{1:(j-1)}, Y_{j(37)}).
- Information up to 12pm on day j: p(· | Y_{1:(j-1)}, Y_{j(61)}).

Table 4 summarizes the results of the forecasting exercise, from which several encouraging points emerge.
First, the empirical coverage probabilities of the prediction intervals are very close to the nominal value for all three information sets: the mean coverage across all 64 days is 92.6%, 93.8% and 95.3%, respectively. As expected, the average width of the forecast interval decreases as
more data are observed, going from 67.3 at the end of the previous day to 61.1 and 60.8 for the 10am and 12pm forecasts, respectively. The same improvement is not observed in the RMSE. This is partly due to three days on which unexpected peaks in the data between 9:30am and 12pm threw off the forecasts, resulting in an overprediction of the call volume for the whole afternoon and evening.

Table 4: Summary of predicted call volumes handled by the call center between 12:05pm and 9:05pm, forecast at three different times: the end of the previous day (PD), 10am on the same day, and 12pm on the same day. Rows give the minimum, 25th percentile, median, mean, 75th percentile and maximum over the forecast days; columns give RMSE, coverage probability and average width for each forecast time.

Figure 7 provides a graphical summary of the predictive distribution on September 2. The top two plots display the mean and 95% equal-tailed intervals for the rates and counts after 12pm. As expected, the same-day forecasts of the rates have much narrower intervals than those of the previous day. The tightening of the predictive intervals for the counts is not as striking, because the inherent Poisson variability dominates the uncertainty about the rates. We observe a dramatic shift in the forecasts after incorporating same-day observations. The reason is apparent: September 2 is the day after Labor Day, so the call center experienced an unusually high call volume that morning, leading to a significant upward shift in the 10am and 12pm estimates. An even larger shift would have been observed had we not modeled that day as a Monday. Looking closer at the forecasts, the credible intervals are nearly parallel across the different prediction times. The shift suggests that the estimate of x_j is more accurate having observed the early-morning data, but that the added information has not modified the estimates of g or θ significantly.
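The forecast-revision step behind these results — resample the mixture components by their weights, redraw x_j, then propagate to future periods — can be sketched in a few lines. Everything below is synthetic: the mixture weights, means and variances stand in for the Algorithm 2 output, and g_future is a hypothetical value of the daily pattern at one future period.

```python
import numpy as np

rng = np.random.default_rng(3)

# Synthetic mixture representation of p(x_j | data up to period k).
M = 1000
w = rng.dirichlet(np.ones(M))            # normalized weights w_k^(i)
m = rng.normal(45.0, 0.5, size=M)        # means m_k^(i)
v = np.full(M, 0.06)                     # variances v_k^(i)
sigma2 = 0.25 ** 2                       # (sigma^2)^(l_i), common here
g_future = 0.08                          # g_{d_j}(t_{k'}) at a future period

# Algorithm-3-style draws of the forecast distribution at period k'.
idx = rng.choice(M, size=M, p=w)                  # Step 1: resample components
x = rng.normal(m[idx], np.sqrt(v[idx]))           # Step 2: draw x_j
lam = (x * g_future) ** 2                         # Step 3a: Poisson rates
y = rng.normal(x * g_future, np.sqrt(sigma2))     # Step 3b: root-scale data
N = y ** 2 - 0.25                                 # Step 3c: back to counts
print(lam.mean(), N.mean())
```

The final line undoes the root transform y = sqrt(N + 1/4), so the draws of N can be summarized directly into forecast means and quantiles for the counts.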
The predictive densities of the rates and counts at 2pm are displayed in the bottom two plots of Figure 7. These were obtained by applying the density function in S-Plus to the Monte Carlo samples. The upward shift and narrowing of the distribution for the rates are clearly seen. In addition, we note that the predictive densities for the rates and counts are both quite well approximated by a normal distribution.
Figure 7: Forecasts of the Poisson rates and call volumes on September 2 using three different information sets (one day ahead, at 10am, and at 12pm). Top left: forecast means and 95% intervals for the Poisson rates between 12:05pm and 9:05pm. Top right: forecast means and 95% intervals for the call volumes between 12:05pm and 9:05pm. Bottom left: forecast densities for the Poisson rate at 2:00pm. Bottom right: forecast densities for the call volume at 2:00pm. The arrow indicates the actual observation.
5 Conclusion

In this article, we provide a multiplicative Gaussian model to measure and predict the arrival rates of an inhomogeneous Poisson process, and we fit this model to call center data provided by a North American commercial bank. Based on the results of this study, we find that the fitted model explains most of the variance in the data, and the empirical results closely match the theory on which the model is based. In addition, when used to predict one-day-ahead arrival rates, the model proposed in this paper clearly outperforms two existing models used by practitioners in the industry. With the Bayesian procedures used to fit this model, we are able to provide not only point estimates of current and future arrival rates but also entire distributions for the parameters, state variables and observables. We believe this is a considerable contribution, as call center practitioners need confidence levels on their estimates in order to staff their centers appropriately. Finally, we provide a within-day learning algorithm that enables sequential estimation of the rate as new data reach the call center. This is particularly useful for managers, who can update their prediction of the rates for the afternoon based on the observed morning pattern of calls and restaff their center accordingly.

Our findings extend the work of Brown et al. (2005), who provided a model to predict arrival rates of an inhomogeneous Poisson process; we believe we provide a more statistically sound estimation procedure for that model. The Bayesian procedure also offers several improvements over the iterative least squares procedure. First, as stated above, Bayesian methods automatically provide measures of uncertainty for parameter and latent state estimates. Furthermore, prior knowledge based on the past experience of the manager can be incorporated in the model.
In addition, further unobserved components and covariates can be incorporated in the model with little or no complication. Finally, the model presented in this paper can provide k-day-ahead predictions of future arrival rates, which are unavailable in the work of Brown et al. (2005).

Although the method proposed in this paper has numerous advantages, we are still left with a small complication mentioned previously. In the model, we impose a constraint on the cubic spline. We do not incorporate this constraint in the prior, as the prior would then no longer be Gaussian or available in closed form. We therefore draw from the unconstrained prior and renormalize the posterior draws so that the constraint is satisfied. Our belief is that the resulting posterior distribution is very close to the posterior distribution we would have obtained had we incorporated the constraint in the prior.

A more general issue arising from the analysis is the problem of distinguishing the two variance components in an AR(1)-plus-noise model when the autoregressive coefficient is close to zero. To our knowledge, an objective prior for this model has not yet been derived. This is a very interesting problem, as this is one of the most widely used models in time series analysis due to its relation with
the state space representation. Investigation of this problem is ongoing but still at a very early stage.

Finally, the root-unroot methodology holds several advantages due to its relation with a Gaussian model. As we have seen in this paper, conjugate Gaussian priors with the required covariance structure can be used to model the time series dynamics. Conjugate gamma priors with equivalent dynamics are currently being investigated in order to model the original counts directly rather than performing a transformation.

References

[1] Andrews, B. H. and Cunningham, S. M. (1995), "L.L. Bean improves call-center forecasting," Interfaces, 25.
[2] Avramidis, A. N., Deslauriers, A., and L'Ecuyer, P. (2004), "Modeling daily arrivals to a telephone call center," Management Science, 50.
[3] Bianchi, L., Jarrett, J., and Choudary Hanumara, R. (1993), "Forecasting incoming calls to telemarketing centers," The Journal of Business Forecasting Methods and Systems, 12.
[4] Brown, L. D., Zhang, R., and Zhao, L. (2001), "Root un-root methodology for nonparametric density estimation," Technical Report, University of Pennsylvania.
[5] Brown, L. D., Gans, N., Mandelbaum, A., Sakov, A., Shen, H., Zeltyn, S., and Zhao, L. H. (2005), "Statistical analysis of a telephone call center: a queueing-science perspective," Journal of the American Statistical Association, 100.
[6] Carter, C. K. and Kohn, R. (1994), "On Gibbs sampling for state space models," Biometrika, 81.
[7] Chen, R. and Liu, J. S. (2000), "Mixture Kalman filters," Journal of the Royal Statistical Society, Series B, 62.
[8] Chen, R. and Fomby, T. (1999), "Forecasting with stable seasonal pattern models with an application to Hawaiian tourist data," Journal of Business and Economic Statistics, 17.
[9] Diebold, F. X., Gunther, T., and Tay, A. (1998), "Evaluating density forecasts, with applications to financial risk management," International Economic Review, 39.
[10] Frühwirth-Schnatter, S. (1994), "Data augmentation and dynamic linear models," Journal of Time Series Analysis, 15.
[11] Gamerman, D. (1998), "Markov chain Monte Carlo for dynamic generalized linear models," Biometrika, 85.
[12] Gans, N., Koole, G., and Mandelbaum, A. (2003), "Telephone call centers: tutorial, review and research prospects," Manufacturing and Service Operations Management, 5.
[13] Gel, Y., Raftery, A. E., and Gneiting, T. (2004), "Calibrated probabilistic mesoscale weather field forecasting: The geostatistical output perturbation (GOP) method," Journal of the American Statistical Association, 99.
[14] Jongbloed, G. and Koole, G. (2001), "Managing uncertainty in call centers using Poisson mixtures," Applied Stochastic Models in Business and Industry, 17.
[15] Kohn, R. and Ansley, C. F. (1987), "A new algorithm for spline smoothing based on smoothing a stochastic process," SIAM Journal on Scientific and Statistical Computing, 8.
[16] Lindley, D. V. and Smith, A. F. M. (1972), "Bayes estimates for the linear model," Journal of the Royal Statistical Society, Series B, 34.
[17] Little, J. (1961), "A proof of the queuing formula L = λW," Operations Research, 9.
[18] Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953), "Equation of state calculations by fast computing machines," Journal of Chemical Physics, 21.
[19] Robert, C. P. and Casella, G. (2004), Monte Carlo Statistical Methods, Springer-Verlag, New York.
[20] Rosenblatt, M. (1952), "Remarks on a multivariate transformation," Annals of Mathematical Statistics, 23.
[21] Shephard, N. (1994), "Partial non-Gaussian state space," Biometrika, 81.
[22] Shephard, N. and Pitt, M. K. (1997), "Likelihood analysis of non-Gaussian measurement time series," Biometrika, 84.
[23] Soyer, R. and Tarimcilar, M. (2004), "Modeling and analysis of call center arrival data: a Bayesian approach," submitted to Management Science.
[24] Stein, C. M. (1981), "Estimating the mean of a multivariate normal distribution," The Annals of Statistics, 9.
[25] Trofimov, V., Feigin, P., Mandelbaum, A., and Ishay, E. (2005), "DATA-MOCCA: Data Model for Call Center Analysis."
[26] Wahba, G. (1983), "Bayesian confidence intervals for the cross-validated smoothing spline," Journal of the Royal Statistical Society, Series B, 45.
APPENDIX: MCMC algorithm

A combination of the Gibbs sampler and the Metropolis algorithm is used to draw from the posterior distribution p(x, z, α, β, τ², σ², ψ² | Y_{1:J}). The algorithm outline is given below, followed by a description of the full conditional posterior distributions.

1: Initialize x, α, β, ψ², τ², σ².
2: Sample z_i from z_i | x, τ²_i, σ², Y_{1:J}, for i = 1, ..., 5.
3: Sample x from x | z, α, β, ψ², σ², Y_{1:J}.
4: Sample α from α | x, β, ψ².
5: Sample β from β | x, α, ψ².
6: Sample ψ² from ψ² | x, α, β.
7: Sample τ²_i from τ²_i | z_i, for i = 1, ..., 5.
8: Sample σ² from σ² | x, z, Y_{1:J}.
9: Go to Step 2.

Step 2: Sampling z_i, i = 1, ..., 5

Conditional on the x's, the model can be cast into state space form:

w_i(t_k) = h' z_i(t_k) + e_k,  e_k ∼ N(0, ν²_k),
z_i(t_k) = F(δ) z_i(t_{k-1}) + u_k,  u_k ∼ N(0, τ²_i U(δ)),  k = 1, ..., K,

where ν²_k = σ² \big( \sum_{j: d_j = i} x²_j \big)^{-1} and w_i(t_k) = (ν²_k / σ²) \sum_{j: d_j = i} y_{jk} x_j. The FFBS algorithm can be used to draw from the required posterior density p(z_i | τ²_i, σ², Y_{1:J}). Since z_i(t_k) = {g_i(t_k), dg_i(t_k)/dt}, we thereby obtain a draw of g_i(t_k). We then normalize the posterior draws by setting

g_i(t_k) ← g_i(t_k) \big( \sum_{k=1}^{K} g_i(t_k)² \big)^{-1/2},

so that the restriction \sum_{k=1}^{K} g_i(t_k)² = 1 is satisfied.

Step 3: Sampling x

Conditional on z, the model can be cast into state space form:

v_j = x_j + ζ_j,  ζ_j ∼ N(0, γ²_j),
x_j = α_{d_j} + β (x_{j-1} − α_{d_{j-1}}) + η_j,  η_j ∼ N(0, ψ²),

where γ²_j = σ² \big( \sum_{k=1}^{K} g_{d_j}(t_k)² \big)^{-1} and v_j = (γ²_j / σ²) \sum_{k=1}^{K} y_{jk} g_{d_j}(t_k). The FFBS algorithm is then used to draw from the required posterior density.
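The FFBS draw for Step 3 is a standard forward-filtering, backward-sampling pass on the AR(1)-plus-noise form above. The sketch below is an illustrative implementation on simulated data, with a single α rather than the day-of-week indexed α_{d_j} of the paper; all parameter values are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)

# Simulate from the AR(1)-plus-noise model (one alpha for simplicity):
#   v_j = x_j + zeta_j,                            zeta_j ~ N(0, gamma2)
#   x_j = alpha + beta*(x_{j-1} - alpha) + eta_j,  eta_j  ~ N(0, psi2)
J, alpha, beta, psi2, gamma2 = 400, 45.0, 0.6, 1.0, 0.5
x = np.empty(J)
x[0] = alpha + rng.normal(0.0, np.sqrt(psi2 / (1.0 - beta ** 2)))
for j in range(1, J):
    x[j] = alpha + beta * (x[j - 1] - alpha) + rng.normal(0.0, np.sqrt(psi2))
v_obs = x + rng.normal(0.0, np.sqrt(gamma2), size=J)

def ffbs(v_obs, alpha, beta, psi2, gamma2, rng):
    """Forward-filtering backward-sampling draw of x_{1:J} given v_{1:J}."""
    J = len(v_obs)
    mf, vf = np.empty(J), np.empty(J)
    a, R = alpha, psi2 / (1.0 - beta ** 2)      # stationary prior for x_1
    for j in range(J):
        gain = R / (R + gamma2)                 # Kalman measurement update
        mf[j] = a + gain * (v_obs[j] - a)
        vf[j] = (1.0 - gain) * R
        a = alpha + beta * (mf[j] - alpha)      # time update to j+1
        R = beta ** 2 * vf[j] + psi2
    draw = np.empty(J)                          # backward sampling pass
    draw[-1] = rng.normal(mf[-1], np.sqrt(vf[-1]))
    for j in range(J - 2, -1, -1):
        B = beta * vf[j] / (beta ** 2 * vf[j] + psi2)
        mean = mf[j] + B * (draw[j + 1] - alpha - beta * (mf[j] - alpha))
        var = vf[j] - B * beta * vf[j]
        draw[j] = rng.normal(mean, np.sqrt(var))
    return draw

sample = ffbs(v_obs, alpha, beta, psi2, gamma2, rng)
print(np.corrcoef(sample, x)[0, 1])   # the draw tracks the latent path
```

The same two-pass structure, with the 2-dimensional spline state z_i(t_k) in place of the scalar x_j, handles Step 2.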
Step 4: Sampling α

The full conditional posterior distribution for α is

p(α | x, β, ψ²) ∝ p(x | α, β, ψ²) \, p(α) ∝ N\big( (X'X)^{-1} X'W, \; (X'X)^{-1} ψ² \big) \, p(α),

where W = (x_2 − β x_1, x_3 − β x_2, ..., x_J − β x_{J-1})' and X is the design matrix whose (j−1)th row has a 1 in column d_j and −β in column d_{j-1}; the prior p(α), specified in Section 3.2, involves the squared deviations (α_i − ᾱ)² and is not conjugate. The random walk Metropolis algorithm is therefore applied to draw α: given the current value α^{(q-1)}, we draw α^{(*)} from the proposal density N(α^{(q-1)}, 0.5 I) and accept α^{(*)} with probability

min\left\{ 1, \; \frac{p(α^{(*)} | x, β, ψ²)}{p(α^{(q-1)} | x, β, ψ²)} \right\}.

The proposal variance was chosen to provide an acceptance rate of approximately 30%.

Step 5: Sampling β

Under the prior specified in Section 3.2, the full conditional density of β is

p(β | x, α, ψ²) ∝ p(x | α, β, ψ²) \, p(β) ∝ N\left( \frac{\sum_{j=1}^{J-1} (x_j − α_{d_j})(x_{j+1} − α_{d_{j+1}})}{\sum_{j=1}^{J-1} (x_j − α_{d_j})²}, \; \frac{ψ²}{\sum_{j=1}^{J-1} (x_j − α_{d_j})²} \right) U[0, 1].

The random walk Metropolis algorithm is again employed, since the prior is not conjugate. Given the current value β^{(q-1)}, draw β^{(*)} from the proposal density N(β^{(q-1)}, 0.1) and accept with probability

min\left\{ 1, \; \frac{p(β^{(*)} | x, α, ψ²)}{p(β^{(q-1)} | x, α, ψ²)} \right\}.
29 The variance.1 of the proposal density was chosen to ensure an acceptance rate of approximately 3%. Step 6: Sampling ψ 2 Assuming the standard conjugate inverse gamma prior IG(a ψ,b ψ ), ψ 2 is sampled from p(ψ 2 x,α,β) = IG Step 7: Sampling τ 2 i : i = 1,...,5 a ψ + J 1, b ψ J ( xj α dj β(x j 1 α dj 1 ) ) 2. Following Ansley and Kohn (1985), if we assume a conjugate inverse gamma prior IG(a τi,b τi ), then the full conditional posterior for τ 2 i is p(τ 2 i z i) = IG where Γ i (t k ) = z i (t k ) F(δ)z i (t k 1 ). ( j=2 a τi + K 2, b τi ) K Γ i (t k ) U(δ) 1 Γ i (t k ) k=2 Step 8: Sampling σ 2 Assuming the standard conjugate inverse gamma prior IG(a σ,b σ ), σ 2 is sampled from p(σ 2 z,x,y 1:J ) = IG a σ + JK 2, b σ J K ( ) 2 yjk g dj (t k )x j. j=1 k=1 29
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
How To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
State Space Time Series Analysis
State Space Time Series Analysis p. 1 State Space Time Series Analysis Siem Jan Koopman http://staff.feweb.vu.nl/koopman Department of Econometrics VU University Amsterdam Tinbergen Institute 2011 State
CHAPTER 3 CALL CENTER QUEUING MODEL WITH LOGNORMAL SERVICE TIME DISTRIBUTION
31 CHAPTER 3 CALL CENTER QUEUING MODEL WITH LOGNORMAL SERVICE TIME DISTRIBUTION 3.1 INTRODUCTION In this chapter, construction of queuing model with non-exponential service time distribution, performance
A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking
174 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 50, NO. 2, FEBRUARY 2002 A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking M. Sanjeev Arulampalam, Simon Maskell, Neil
STATISTICAL ANALYSIS OF CALL-CENTER OPERATIONAL DATA: FORECASTING CALL ARRIVALS, AND ANALYZING CUSTOMER PATIENCE AND AGENT SERVICE
STATISTICAL ANALYSIS OF CALL-CENTER OPERATIONAL DATA: FORECASTING CALL ARRIVALS, AND ANALYZING CUSTOMER PATIENCE AND AGENT SERVICE HAIPENG SHEN Department of Statistics and Operations Research, University
Marketing Mix Modelling and Big Data P. M Cain
1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics
South Carolina College- and Career-Ready (SCCCR) Probability and Statistics South Carolina College- and Career-Ready Mathematical Process Standards The South Carolina College- and Career-Ready (SCCCR)
Geostatistics Exploratory Analysis
Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras [email protected]
Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
Linear Models and Conjoint Analysis with Nonlinear Spline Transformations
Linear Models and Conjoint Analysis with Nonlinear Spline Transformations Warren F. Kuhfeld Mark Garratt Abstract Many common data analysis models are based on the general linear univariate model, including
Penalized regression: Introduction
Penalized regression: Introduction Patrick Breheny August 30 Patrick Breheny BST 764: Applied Statistical Modeling 1/19 Maximum likelihood Much of 20th-century statistics dealt with maximum likelihood
Studying Achievement
Journal of Business and Economics, ISSN 2155-7950, USA November 2014, Volume 5, No. 11, pp. 2052-2056 DOI: 10.15341/jbe(2155-7950)/11.05.2014/009 Academic Star Publishing Company, 2014 http://www.academicstar.us
Java Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
Probabilistic Methods for Time-Series Analysis
Probabilistic Methods for Time-Series Analysis 2 Contents 1 Analysis of Changepoint Models 1 1.1 Introduction................................ 1 1.1.1 Model and Notation....................... 2 1.1.2 Example:
PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE
PREDICTIVE DISTRIBUTIONS OF OUTSTANDING LIABILITIES IN GENERAL INSURANCE BY P.D. ENGLAND AND R.J. VERRALL ABSTRACT This paper extends the methods introduced in England & Verrall (00), and shows how predictive
Recent Developments of Statistical Application in. Finance. Ruey S. Tsay. Graduate School of Business. The University of Chicago
Recent Developments of Statistical Application in Finance Ruey S. Tsay Graduate School of Business The University of Chicago Guanghua Conference, June 2004 Summary Focus on two parts: Applications in Finance:
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
Advanced Signal Processing and Digital Noise Reduction
Advanced Signal Processing and Digital Noise Reduction Saeed V. Vaseghi Queen's University of Belfast UK WILEY HTEUBNER A Partnership between John Wiley & Sons and B. G. Teubner Publishers Chichester New
Tutorial on Markov Chain Monte Carlo
Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,
Section A. Index. Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1. Page 1 of 11. EduPristine CMA - Part I
Index Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting techniques... 1 EduPristine CMA - Part I Page 1 of 11 Section A. Planning, Budgeting and Forecasting Section A.2 Forecasting
Discrete Frobenius-Perron Tracking
Discrete Frobenius-Perron Tracing Barend J. van Wy and Michaël A. van Wy French South-African Technical Institute in Electronics at the Tshwane University of Technology Staatsartillerie Road, Pretoria,
CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS
Examples: Regression And Path Analysis CHAPTER 3 EXAMPLES: REGRESSION AND PATH ANALYSIS Regression analysis with univariate or multivariate dependent variables is a standard procedure for modeling relationships
3. Regression & Exponential Smoothing
3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a
Analysis of a Production/Inventory System with Multiple Retailers
Analysis of a Production/Inventory System with Multiple Retailers Ann M. Noblesse 1, Robert N. Boute 1,2, Marc R. Lambrecht 1, Benny Van Houdt 3 1 Research Center for Operations Management, University
Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary
Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:
Introduction to Time Series Analysis. Lecture 1.
Introduction to Time Series Analysis. Lecture 1. Peter Bartlett 1. Organizational issues. 2. Objectives of time series analysis. Examples. 3. Overview of the course. 4. Time series models. 5. Time series
> plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) quantile diff (error)
> plot(exp.btgpllm, main = "treed GP LLM,", proj = c(1)) > plot(exp.btgpllm, main = "treed GP LLM,", proj = c(2)) 0.4 0.2 0.0 0.2 0.4 treed GP LLM, mean treed GP LLM, 0.00 0.05 0.10 0.15 0.20 x1 x1 0.4
Forecasting Methods. What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes?
Forecasting Methods What is forecasting? Why is forecasting important? How can we evaluate a future demand? How do we make mistakes? Prod - Forecasting Methods Contents. FRAMEWORK OF PLANNING DECISIONS....
Maximum likelihood estimation of mean reverting processes
Maximum likelihood estimation of mean reverting processes José Carlos García Franco Onward, Inc. [email protected] Abstract Mean reverting processes are frequently used models in real options. For
Load Balancing and Switch Scheduling
EE384Y Project Final Report Load Balancing and Switch Scheduling Xiangheng Liu Department of Electrical Engineering Stanford University, Stanford CA 94305 Email: [email protected] Abstract Load
BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
Sales forecasting # 2
Sales forecasting # 2 Arthur Charpentier [email protected] 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting
The Value of Weekend Leads Unveiled:
EXECUTIVE SUMMARY This study, performed in collaboration with QuinStreet, one of Velocify s key lead provider partners, examines the value of weekend-generated mortgage leads and helps draw tactical conclusions
Financial TIme Series Analysis: Part II
Department of Mathematics and Statistics, University of Vaasa, Finland January 29 February 13, 2015 Feb 14, 2015 1 Univariate linear stochastic models: further topics Unobserved component model Signal
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS
NEW YORK STATE TEACHER CERTIFICATION EXAMINATIONS TEST DESIGN AND FRAMEWORK September 2014 Authorized for Distribution by the New York State Education Department This test design and framework document
Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts
Page 1 of 20 ISF 2008 Models for Product Demand Forecasting with the Use of Judgmental Adjustments to Statistical Forecasts Andrey Davydenko, Professor Robert Fildes [email protected] Lancaster
Chapter 6: Multivariate Cointegration Analysis
Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Time Series Analysis: Basic Forecasting.
Time Series Analysis: Basic Forecasting. As published in Benchmarks RSS Matters, April 2015 http://web3.unt.edu/benchmarks/issues/2015/04/rss-matters Jon Starkweather, PhD 1 Jon Starkweather, PhD [email protected]
MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group
MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could
Linear Threshold Units
Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear
Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs
Why High-Order Polynomials Should Not be Used in Regression Discontinuity Designs Andrew Gelman Guido Imbens 2 Aug 2014 Abstract It is common in regression discontinuity analysis to control for high order
Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization
Non Linear Dependence Structures: a Copula Opinion Approach in Portfolio Optimization Jean- Damien Villiers ESSEC Business School Master of Sciences in Management Grande Ecole September 2013 1 Non Linear
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
Markov Chain Monte Carlo Simulation Made Simple
Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical
From the help desk: Bootstrapped standard errors
The Stata Journal (2003) 3, Number 1, pp. 71 80 From the help desk: Bootstrapped standard errors Weihua Guan Stata Corporation Abstract. Bootstrapping is a nonparametric approach for evaluating the distribution
16 : Demand Forecasting
16 : Demand Forecasting 1 Session Outline Demand Forecasting Subjective methods can be used only when past data is not available. When past data is available, it is advisable that firms should use statistical
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012
Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts
Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement
Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive
Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.
Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are
Master s Theory Exam Spring 2006
Spring 2006 This exam contains 7 questions. You should attempt them all. Each question is divided into parts to help lead you through the material. You should attempt to complete as much of each problem
Bayesian Statistics in One Hour. Patrick Lam
Bayesian Statistics in One Hour Patrick Lam Outline Introduction Bayesian Models Applications Missing Data Hierarchical Models Outline Introduction Bayesian Models Applications Missing Data Hierarchical
On the mathematical theory of splitting and Russian roulette
On the mathematical theory of splitting and Russian roulette techniques St.Petersburg State University, Russia 1. Introduction Splitting is an universal and potentially very powerful technique for increasing
CHAPTER 2 Estimating Probabilities
CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a
Integrating Financial Statement Modeling and Sales Forecasting
Integrating Financial Statement Modeling and Sales Forecasting John T. Cuddington, Colorado School of Mines Irina Khindanova, University of Denver ABSTRACT This paper shows how to integrate financial statement
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model
Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written
Master s thesis tutorial: part III
for the Autonomous Compliant Research group Tinne De Laet, Wilm Decré, Diederik Verscheure Katholieke Universiteit Leuven, Department of Mechanical Engineering, PMA Division 30 oktober 2006 Outline General
