Analysis of Bayesian Dynamic Linear Models


Emily M. Casleton

December 17

1 Introduction

The main purpose of this project is to explore the Bayesian analysis of Dynamic Linear Models (DLMs). DLMs are used primarily to model observations collected over time, for the purposes of forecasting or detecting shifts in the model. The model utilized by a DLM is actually a sequence of models that is updated at each time step, which justifies the word dynamic. The characteristic of interest, or the unknown parameters of the time series, is modeled as θ, possibly a vector of parameters, and the analysis of the evolution of θ is most commonly the main goal. Forecasting is also a common goal, but it too depends on how θ behaves over time. A model is fit to the parameter(s) of interest at time t-1, and from this model a value is forecasted for time t. An observation is then received at time t, compared to the forecast, and the model is updated given this new information as well as any newly obtained and relevant outside information. A Bayesian analysis is the most natural way to allow for this updating of information at each time t.

The DLM is specified by the following observation equation and system equation:

y_t = F_t θ_t + ν_t,   ν_t ~ (0, V_t)   (1)
θ_t = G_t θ_{t-1} + ω_t,   ω_t ~ (0, W_t)   (2)

The observation equation (1) models the observation vector at time t. The observations are modeled through F_t, a design matrix of known values of the independent variable(s), which is multiplied by the state (or system) vector θ_t and added to ν_t, the observation error, assumed to have mean 0. The system equation (2) models the state vector as the sum of the zero-mean system/evolution error, ω_t, and the product of the state vector at the previous time, θ_{t-1}, and the matrix G_t, known as the evolution, system, transfer, or state matrix. The observation and evolution errors are assumed to be independent of each other and internally independent. More concisely, a DLM can be characterized by the quadruple {F_t, G_t, V_t, W_t}, each element of which may or may not depend on time. For example, the quadruple {1, 1, V, W} represents a pure random walk when the error distributions are assumed to be normal.

In the textbook problems, all four values of this quadruple are considered known. Clearly, F_t and G_t are chosen by the modeler in accordance with the design of the model. In practice, the evolution variance W_t is typically a chosen value, and it is only V_t that is often unknown and sometimes must be estimated from the data.

Because a DLM is dynamic, it is only locally appropriate in time. The model in Equation (2) is appropriate at time t until an observation y_t is received and the model is updated according to the new information. The amount of information available at time t will be denoted D_t. This new information can consist of more than just the observation. For example, if the observations represent the sales of a company during month t, and it is known that a rival company is going out of business at month t+1, this expected increase in sales can also be included. In this project, however, a model will be fit to simulated data, so the information set is closed to external information, i.e. D_t = {y_t, D_{t-1}}. In addition, the initial information available before the process is observed, i.e. at time t = 0, will be denoted D_0.

According to West, Harrison, and Migon [3], a key feature of the Bayesian analysis of DLMs is the use of conjugate prior and posterior distributions for the parameters. Typically this distribution is taken to be the normal distribution, and it is the one considered for all models examined in this project. This leads to the following restatement of the observation and system equations (1) and (2):

(y_t | θ_t) ~ N[F_t θ_t, V_t]   (3)
(θ_t | θ_{t-1}) ~ N[G_t θ_{t-1}, W_t]   (4)

In addition, the initial information, expressed through the prior distribution of the parameter, is assumed normal, with initial estimates of the mean, m_0, and variance, C_0:

(θ_0 | D_0) ~ N[m_0, C_0]   (5)

It is also assumed that the observation and system errors are independent of the initial information. Another key to the analysis, argued by West and Harrison [2], is that given the present, the future is independent of the past. At each time increment, the following distributions describe how each quantity is updated with respect to the new information:

Posterior at t-1:  (θ_{t-1} | D_{t-1}) ~ N(m_{t-1}, C_{t-1})
Prior at t:        (θ_t | D_{t-1}) ~ N(a_t, R_t)
One-step forecast: (y_t | D_{t-1}) ~ N(f_t, Q_t)
Posterior at t:    (θ_t | D_t) ~ N(m_t, C_t)

The mean of the prior for θ_t is updated from the mean of the posterior for θ_{t-1} by a_t = G_t m_{t-1}, and its variance follows as R_t = G_t C_{t-1} G_t' + W_t.

The one-step forecast mean is then computed as f_t = F_t a_t, and the forecast variance again follows as Q_t = F_t R_t F_t' + V_t. Finally, the mean of the posterior is updated through the equation m_t = a_t + A_t e_t, where a_t is the prior mean at time t, A_t = R_t F_t' Q_t^{-1} is known as the adaptive vector, and e_t = y_t - f_t is the one-step forecast error; the variance of the posterior is updated as C_t = R_t - A_t Q_t A_t'. This posterior for θ_t then becomes the prior for θ_{t+1}, and the process is repeated at time t+1.

The observation and system equations above will be used to simulate data sets from three different types of models: a random walk, a dynamic straight line with intercept located at the origin, and a dynamic linear regression. Using the forecasting and recurrence relations above, a dynamic linear model will then be fit to each simulated data set. The one-step forecast distribution will be plotted against the simulated values in order to judge the accuracy of the forecasts. Other interesting features of the analysis will also be explored.

2 Random Walk

The first data set to which a Bayesian DLM analysis will be applied is the simplest: a random walk. This constant model takes F_t = 1 and G_t = 1, with observation and system error variances independent of time. The particular model simulated here takes these values to be V = 4 and W = 0.25. The quadruple that describes this model is therefore {1, 1, 4, 0.25}, which leads to the following observation and system equations:

y_t = θ_t + ν_t,   ν_t ~ (0, 4)
θ_t = θ_{t-1} + ω_t,   ω_t ~ (0, 0.25)

The simulation process used to obtain a random walk data set is as follows (a minimal code sketch of this simulation is given below):

1. Start with an initial system value θ_0, taken here to be a fixed constant.
2. Simulate the first system error, ω_1, from N(0, 0.25).
3. Calculate θ_1 = θ_0 + ω_1.
4. Simulate the first observation error, ν_1, from N(0, 4).
5. Calculate the first observation y_1 = θ_1 + ν_1.
6. Repeat steps 2-5 for the remaining (n-1) time points.

The above algorithm was run to produce a random walk with 50 values, which are plotted against time in Figure 1.
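The following is a minimal code sketch of the simulation steps above, assuming numpy. The function name, seed, and default starting value theta0 are illustrative choices; in particular, the starting value actually used in the report is not specified here.

```python
import numpy as np

def simulate_random_walk(n=50, theta0=0.0, V=4.0, W=0.25, seed=17):
    """Simulate n observations from the constant DLM {1, 1, V, W}."""
    rng = np.random.default_rng(seed)
    theta = theta0
    y = np.empty(n)
    for t in range(n):
        theta = theta + rng.normal(0.0, np.sqrt(W))  # system equation: theta_t = theta_{t-1} + omega_t
        y[t] = theta + rng.normal(0.0, np.sqrt(V))   # observation equation: y_t = theta_t + nu_t
    return y

y = simulate_random_walk()
```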

Figure 1: Simulated random walk.

Next, it is desired to forecast this time series using the forecasting and recurrence relations of the Bayesian DLM analysis. The values that will be examined in further detail are the mean and variance of the forecast distribution, f_t and Q_t; the adaptive coefficient, A_t; the error between the mean of the forecast distribution and the actual value, e_t; and lastly, the posterior distribution of θ_t at each time step, characterized by its mean and variance, m_t and C_t. These values are computed at each step using the following algorithm (a code sketch of this recursion appears after the discussion below):

1. Start with initial values for the distribution of θ, with mean m_0 and variance C_0, and estimates of the observation error variance, V, and system error variance, W.
2. Compute the forecast mean f_1 = m_0.
3. Compute the forecast variance Q_1 = R_1 + V, where R_1 = C_0 + W.
4. Compute the adaptive coefficient A_1 = R_1/Q_1 = (C_0 + W)/Q_1.

5. Compute the forecast error e_1 = y_1 - f_1, where y_1 is the first value in the random walk sequence.
6. Compute the posterior mean m_1 = m_0 + A_1 e_1.
7. Compute the posterior variance C_1 = A_1 V.
8. Repeat steps 2-7 (n-1) times.

This algorithm will first be used to forecast the random walk data set shown in Figure 1 by cheating, somewhat: the assumed known values of m_0, C_0, V, and W are taken to be the values used to simulate the data. The results of this DLM are shown plotted against the original values in Figure 2. The solid red line represents the forecast mean, f_t, while the dashed red lines represent 95% confidence bands, computed under the usual normality assumption as f_t ± 1.96√Q_t. For the most part, the forecasted values react to the peaks of the observations, typically one time step behind the peak. The forecast also does not show as many random fluctuations as the original observations; as a sequence, the forecasted values appear to be a one-step-delayed, smoother version of the original data. Although the forecasted values did not show as much variability as the original values, at most time steps the original value was contained within the 95% confidence band.

For the first nine time steps, the values of the components of interest are shown in Table 1, in a format similar to that used by West and Harrison [2].

Table 1: Various components of the one-step forecasting and updating recurrence relations for the random walk data (columns: month t; forecast distribution Q_t and f_t; adaptive coefficient A_t; datum y_t; error e_t; posterior information m_t and C_t).

It can be seen that the adaptive coefficient converges rapidly, to approximately 0.22. One interpretation of the adaptive coefficient is as the prior regression coefficient of θ_t upon y_t. This rapid convergence implies that the prior information from the previous step is given about the same weight in forecasting the next point for all times t.
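The recursion in steps 1-8 can be written compactly in code. Below is a minimal sketch for the constant model {1, 1, V, W}, assuming numpy; the function and variable names are illustrative rather than taken from the report, and the commented-out call shows one hypothetical choice of prior values.

```python
import numpy as np

def filter_constant_dlm(y, m0, C0, V, W):
    """One-step forecasting and updating for the constant DLM (F_t = G_t = 1)."""
    n = len(y)
    f, Q, A, e, m, C = (np.empty(n) for _ in range(6))
    m_prev, C_prev = m0, C0
    for t in range(n):
        R = C_prev + W               # prior variance:       R_t = C_{t-1} + W
        f[t] = m_prev                # forecast mean:        f_t = m_{t-1}
        Q[t] = R + V                 # forecast variance:    Q_t = R_t + V
        A[t] = R / Q[t]              # adaptive coefficient: A_t = R_t / Q_t
        e[t] = y[t] - f[t]           # one-step forecast error
        m[t] = m_prev + A[t] * e[t]  # posterior mean:       m_t = m_{t-1} + A_t e_t
        C[t] = A[t] * V              # posterior variance:   C_t = R_t V / Q_t = A_t V
        m_prev, C_prev = m[t], C[t]
    return f, Q, A, e, m, C

# Hypothetical usage (m0 would be the modeler's chosen prior mean):
# f, Q, A, e, m, C = filter_constant_dlm(y, m0=0.0, C0=1.0, V=4.0, W=0.25)
```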

Figure 2: Simulated random walk with forecasted values and 95% confidence bands.

The above analysis relied on the prior information being the true values used to simulate the data, so it is not surprising that the forecast was relatively good and the confidence bands relatively narrow. A natural question is how these initial values affect the subsequent forecast, particularly the errors. A second analysis of the same random walk data was therefore performed, but instead of using the true simulation values as the prior information, the known values in the model were estimated from the data. The initial forecast mean, m_0, was taken to be the first observed value, and the value of C_0 was again taken to be 1. The sample variance of all 50 values of the random walk was computed to be s^2 ≈ 19, and the observation error variance, V, and system error variance, W, were each taken to be half of this estimated variance; so V = 9.5 and W = 9.5 were used for the second analysis.

Of first interest in comparing the two DLMs with differing initial values are the forecast errors, e_t, that is, how well each does at predicting new values. The errors for each model are plotted against time in Figure 3.

Here only the first 25 values are plotted, both to distinguish the differences and because the pattern continues for the remaining 25 time points. At time t = 1, the forecast error of the model with estimated prior information is 0, since the initial value of the realized sequence was taken to be the prior mean. The errors from the two methods are dissimilar for the first five time points, and then converge and are similar for the remaining 20. This can possibly be explained by the values of the adaptive coefficient. At time t = 8, the coefficient of m_0 in the computation of m_8 is (1 - A_8)(1 - A_7) ... (1 - A_1). Therefore, the initial value contributes only about 17% to the estimate at step 8, and this contribution decreases as time increases.

The noticeable difference between using the known values for the prior and using values estimated from the data can be seen by comparing Figure 4 to Figure 2. When the data are used to estimate the initial values, the forecast means react more to the fluctuations of the observations, and the uncertainty associated with the forecast distribution is much larger. This could be due to the fact that, in reality, the observation error variance was 16 times larger than the system error variance, whereas the naive estimation procedure took these two variances to be equal.

3 Dynamic straight line through (0,0)

The model in the previous section was a time-invariant random walk: the observation and system error variances were constant throughout the process, and G_t and F_t were both identically 1. In this section a slightly more complex model is explored, one which models the local relationship as a straight line through the origin with a slope that varies with time. Again, a constant error variance model is used. The quadruple that describes this model is therefore {F_t, 1, V, W}, which results in the following observation and system equations:

y_t = F_t β_t + ν_t,   ν_t ~ (0, V)
β_t = β_{t-1} + ω_t,   ω_t ~ (0, W)

The covariate used here is time, so F_t = t for t = 1, 2, ..., n. To simulate from this model, an algorithm similar to the one used for the random walk is used; the only difference occurs in Step 5, which becomes y_t = t β_t + ν_t. The initial slope used to simulate the data was β_0 = 3.2, the observation error variance was taken to be V = 4, and the system variance W = 0.5. The result of simulating 50 values in this way is shown in Figure 5. The increasing trend is explained by the use of time as the covariate.

The algorithm used to fit the DLM for a straight line through the origin is similar to that used for the random walk, in that the same quantities must be computed. However, for the random walk the value F_t was not time dependent, so some of the formulas must be modified to accommodate this changing value. The updated algorithm is shown below, followed by a short code sketch.

Figure 3: Comparison of forecast errors using the known values for the prior information versus using values estimated from the data.

1. Start with initial values for the distribution of the slope, with mean m_0 = β_0 and variance C_0, and estimates of the observation error variance, V, and system error variance, W.
2. Compute the forecast mean f_1 = F_1 m_0.
3. Compute the forecast variance Q_1 = F_1^2 R_1 + V = F_1^2 (C_0 + W) + V.
4. Compute the adaptive coefficient A_1 = R_1 F_1 / Q_1 = (C_0 + W) F_1 / Q_1.
5. Compute the forecast error e_1 = y_1 - f_1, where y_1 is the first value in the simulated sequence.
6. Compute the posterior mean m_1 = m_0 + A_1 e_1.
7. Compute the posterior variance C_1 = R_1 V / Q_1.
8. Repeat steps 2-7 (n-1) times.
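As mentioned above, a short standalone sketch of this modified recursion follows, assuming numpy. The only substantive change from the random-walk filter is the time-varying regressor F_t = t; the function and variable names are illustrative rather than taken from the report.

```python
import numpy as np

def filter_line_through_origin(y, m0, C0, V, W):
    """DLM {F_t, 1, V, W} with F_t = t: the slope beta_t follows a random walk."""
    n = len(y)
    f, Q, A, e, m, C = (np.empty(n) for _ in range(6))
    m_prev, C_prev = m0, C0
    for i in range(n):
        F = i + 1                    # covariate is time, F_t = t
        R = C_prev + W               # prior variance of beta_t
        f[i] = F * m_prev            # forecast mean:     f_t = F_t m_{t-1}
        Q[i] = F * F * R + V         # forecast variance: Q_t = F_t^2 R_t + V
        A[i] = R * F / Q[i]          # adaptive coefficient
        e[i] = y[i] - f[i]           # one-step forecast error
        m[i] = m_prev + A[i] * e[i]  # posterior mean of beta_t
        C[i] = R * V / Q[i]          # posterior variance: C_t = R_t V / Q_t
        m_prev, C_prev = m[i], C[i]
    return f, Q, A, e, m, C
```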

Figure 4: Random walk with predictions and 95% error bands, using values estimated from the data as the prior information.

The algorithm was used to compute the forecasting and recurrence distributions for the simulated data in Figure 5. Cheating was again employed, in that the prior information was taken to be the values used to simulate the data; therefore β_0 = 3.2, C_0 = 1, V = 4, and W = 0.5 were used in the algorithm above. The resulting predictions and 95% confidence bands are shown in Figure 6. The prediction means are similar to those seen for the random walk, in that the fluctuations of the forecast means lag the observed values by one point in time. A big difference from the confidence bands of the random walk is that the width of the bands seen here increases with time. This is because the variance of the one-step forecast distribution, Q_t, depends on the value of the covariate, as seen in Step 3 of the algorithm above. Since time is the covariate, this pattern makes sense.

Another difference between the predictions from the first analysis of the random walk and those of the dynamic straight line is that the forecast mean responds more to the apparently random fluctuations of the observations.

Figure 5: Dynamic straight line through (0,0).

Whereas the DLM analysis for the random walk appeared to produce a smoothed version of the observations, the dynamic straight line analysis does not appear to smooth the original data at all. This can be at least partially explained by comparing the adaptive coefficient, A_t, and the posterior variance, C_t, in Table 1 for the random walk with the first five columns of Table 2 for the dynamic straight line. The A_t values for the random walk analysis range between 0.22 and 0.23, while for the dynamic straight line analysis they range from 0.27 down to about 0.02. This implies that, for the analysis of this section, much less weight is eventually placed on the forecast error when computing the posterior mean. Also, the posterior variance for the random walk analysis ranged only from 0.88 to 1, while for the dynamic straight line it fell to much smaller values. This implies that for the straight line analysis the forecast variance becomes dominated by the fixed system variance, with little contribution from the posterior variance of β_t as time increases.

Instead of examining what the results would have been if the prior information had been estimated from the data, this analysis will instead be compared to that of a static model.

Figure 6: Simulated dynamic straight line through (0,0) with forecasted values and 95% confidence bands.

Table 2 compares the posterior distribution of β_t, through its mean m_t and variance C_t, for the dynamic and static analyses; also included in the table is the adaptive coefficient, A_t, for each analysis. These values are shown for the first five and the last five observations.

Table 2: Analysis of the dynamic straight line through (0,0) using both a dynamic and a static model (columns: F_t, y_t, followed by m_t, C_t, and A_t for each model).

The most obvious difference between the two models is in the values of the posterior mean, m_t, at large values of t. Because time was used as the covariate, the observations increase as time increases. The dynamic model accounts for this by incorporating the increasing covariate into the estimation. The static model, however, holds F_t fixed at 1, so to account for the increasing trend the value of β_t itself must increase. In the static model, the posterior variance of β holds roughly steady throughout, while in the dynamic model it decreases steadily as time increases. This difference can be seen in the formulas for the posterior variance in each model, with the dynamic model taking the increasing covariate into account. Lastly, the adaptive coefficient for the dynamic model converges to 0.02, while that of the static model converges to a value an order of magnitude larger. Because the adaptive coefficient in the dynamic model is closer to 0, the prior distribution is more concentrated than the likelihood.

So the static model is more sensitive to the latest value of the observation, while with A_t decreasing for the dynamic model, the dynamic model responds less and less to the most recent data point.

4 Dynamic Linear Regression

The last DLM analyzed in this report has a slope and an intercept that are both time dependent. The state vector θ_t = (α_t, β_t)' is therefore two-dimensional, and the observation and state equations become

y_t = α_t + t β_t + ν_t,   ν_t ~ (0, V)
α_t = α_{t-1} + ω_{α,t},   ω_{α,t} ~ (0, W_{α,t})
β_t = β_{t-1} + ω_{β,t},   ω_{β,t} ~ (0, W_{β,t})

These equations imply that the quadruple characterizing the model is {F_t, I, V, W_t}, where F_t = (1, t) and W_t = diag(W_{α,t}, W_{β,t}). Here the observation error variance is again assumed constant, but the system error variances now vary with time.

An algorithm similar to those used to simulate the random walk and the dynamic line through the origin is used to simulate a dynamic linear regression data set; the main difference is that two independent system errors must be simulated to obtain each observation. Here again, time is used as the only covariate.

Fifty observations were simulated using α_0 = 12, β_0 = -1.6, V = 3.8, W_{α,t} = 0.35t, and W_{β,t} = 0.03/t. The resulting values are shown in Figure 7. As with the values simulated in the previous section, the trend is driven by using time as the covariate; here the trend is decreasing because the slope is negative.

Figure 7: Simulated dynamic linear regression.

The algorithm used to forecast this time series is again similar to those above, with additional computations because two parameters must be estimated. The necessary calculations are shown below (a matrix-form code sketch of these steps is given at the end of this section).

1. Start with initial values for the intercept, α_0, and slope, β_0; the initial covariance matrix of their joint distribution, C_0, a 2x2 matrix with 0s on the off-diagonal; an initial estimate of the observation error variance, V; and the system error covariance, a 2x2 matrix, again with 0s on the off-diagonal.
2. Compute the forecast mean f_1 = F_1 a_1 = α_0 + (1)β_0.
3. Compute the prior variance R_1 = C_0 + W_1.

4. Compute the forecast variance Q_1 = F_1 R_1 F_1' + V.
5. Compute the adaptive vector A_1 = R_1 F_1' Q_1^{-1}.
6. Compute the forecast error e_1 = y_1 - f_1.
7. Compute the posterior means α_1 = α_0 + A_{1,(1)} e_1 and β_1 = β_0 + A_{1,(2)} e_1, where A_{1,(i)} denotes the i-th element of the adaptive vector.
8. Compute the posterior variance C_1 = R_1 - A_1 Q_1 A_1'.

The algorithm was applied to the previously simulated data using the known values as the prior information; the results can be seen in Figure 8.

Figure 8: Simulated dynamic linear regression with forecasted values and 95% confidence bands.

For the forecast means, displayed by the solid red line, a pattern similar to that of the dynamic straight line with no intercept is seen: the predicted means appear very similar to the original data, but shifted to the right by one time step. The predicted lines seen here and in the previous section are so similar to the original data that it raises the question of whether this is a result of using the known variances and initial estimates in the algorithms, or whether something is wrong with the prediction calculations. Although the latter cannot be ruled out, the former is certainly at work in the analysis of the dynamic linear regression model, since the true structure of the system variances, i.e. W_{α,t} = 0.35t and W_{β,t} = 0.03/t, was used in the calculations. Not surprisingly, the confidence bands around the forecasted means are very narrow; and they should be, since the true state of nature was known and used in the estimation procedure, so there is little doubt about the estimates. With a model as complicated as the one used in this section, one may wonder how these initial values would be determined with access only to the observed sequence. Currently, this is an area in which the author does not have any insight.
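As noted above, the following is a minimal matrix-form sketch of steps 1-8, assuming numpy. The system variance functions mirror those stated in the text, read here as W_{α,t} = 0.35t and W_{β,t} = 0.03/t; the function name, the commented-out prior values, and other details are illustrative rather than taken from the report.

```python
import numpy as np

def filter_dynamic_regression(y, m0, C0, V):
    """DLM with theta_t = (alpha_t, beta_t), F_t = (1, t), G_t = I, diagonal W_t."""
    n = len(y)
    m = np.asarray(m0, dtype=float)          # 2-vector of prior means (alpha_0, beta_0)
    C = np.asarray(C0, dtype=float)          # 2x2 prior covariance matrix
    forecasts = np.empty(n)
    post_means = np.empty((n, 2))
    for i in range(n):
        t = i + 1
        F = np.array([1.0, t])               # regression vector F_t = (1, t)
        W = np.diag([0.35 * t, 0.03 / t])    # system error covariance W_t
        R = C + W                            # prior covariance:     R_t = C_{t-1} + W_t
        f = F @ m                            # forecast mean:        f_t = F_t a_t
        Q = F @ R @ F + V                    # forecast variance (a scalar)
        A = R @ F / Q                        # adaptive vector:      A_t = R_t F_t' / Q_t
        e = y[i] - f                         # one-step forecast error
        m = m + A * e                        # posterior mean:       m_t = a_t + A_t e_t
        C = R - np.outer(A, A) * Q           # posterior covariance: C_t = R_t - A_t Q_t A_t'
        forecasts[i] = f
        post_means[i] = m
    return forecasts, post_means

# Hypothetical usage, with illustrative prior values:
# forecasts, post_means = filter_dynamic_regression(y, m0=[12.0, -1.6], C0=np.eye(2), V=3.8)
```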

5 Conclusions

Three different types of models were simulated, and a Bayesian analysis of each resulting time series was carried out using dynamic linear models. The three models were a random walk, a dynamic straight line with intercept at the origin, and a dynamic linear regression in which the slope and intercept were allowed to vary with the covariate, taken here to be time. The main problem encountered in the analysis was that prior information about various parameters was needed in order to perform the necessary calculations. Since the data were simulated and did not represent any scientific process, the prior information available to the modeler consisted of the actual values used to simulate the data. Naturally, this led to very accurate results, but this information is typically unknown, and so this procedure would not translate well to non-simulated data. The issue of not using the known simulation values for estimation was explored in the random walk example, where the initial information was estimated from the data values themselves. It was seen that there was not a loss in the prediction errors; however, the variance of the one-step forecast distribution was significantly larger.

In addition to the process of producing initial estimates, there are many other issues that could be addressed with respect to the Bayesian analysis of dynamic linear models. For all simulations in this report the observation variance, V, was assumed known, although this is the value that typically must be estimated; an analysis that also models this parameter is one extension of the work in this report. A model that utilizes a covariate other than time may also be of interest. Lastly, analyzing non-simulated data with this procedure would be the true test of a modeler's ability.

6 Bibliography

1. Petris, Giovanni (2010). "An R Package for Dynamic Linear Models." Journal of Statistical Software, 36(12).
2. West, Mike, and Jeff Harrison. Bayesian Forecasting and Dynamic Models. New York: Springer-Verlag.
3. West, Mike, Jeff Harrison, and Helio Migon (1985). "Dynamic Generalized Linear Models and Bayesian Forecasting." Journal of the American Statistical Association, 80.
