# AUTOCORRELATED RESIDUALS OF ROBUST REGRESSION

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 AUTOCORRELATED RESIDUALS OF ROBUST REGRESSION Jan Kalina Abstract The work is devoted to the Durbin-Watson test for robust linear regression methods. First we explain consequences of the autocorrelation of residuals on estimating regression parameters. We propose an asymptotic version of the Durbin-Watson test for regression quantiles and trimmed least squares and derive an asymptotic approximation to the exact null distribution of the test statistic, exploiting the asymptotic representation for both regression estimators. Further, we consider the least weighted squares estimator, which is a highly robust estimator based on the idea to down-weight less reliable observations. We compare various versions of the Durbin-Watson test for the least weighted squares estimator. The asymptotic test is derived using two versions of the asymptotic representation. Finally, we investigate a weighted Durbin-Watson test using the weights determined by the least weighted squares estimator. The exact test is described and also an asymptotic approximation to the distribution of the weighted statistic under the null hypothesis is obtained. Key words: linear regression, robust statistics, diagnostics, autocorrelation JEL Code: C52, C14, C12 1 Model In the whole paper, we consider the linear regression model = , = 1,,, (1) where,, are values of a continuous response variable and,, are random errors (disturbances). The task is to estimate the regression parameters = (,,, ). We assume the data in the model (1) to be observed in equidistant time intervals. This work 551

2 discusses the assumption of independence of the errors and its violation. In econometrics, this violation is most commonly formulated as the autocorrelation of the errors. The violation of the independence of the errors may have severe negative consequences. Regression parameters are not estimated efficiently. Denoting the least squares estimator of by = (,,, ), the classical estimator of var is biased. This disqualifies using classical hypothesis tests and confidence intervals for as well as the value of the coefficient of determination.diagnostic tools checking the assumption of equality of variances of the disturbances can be based on residuals =, = 1,,. (2) This work is devoted to robust statistical procedures for estimating in the model (1). Robust statistical methods have been developed as an alternative paradigm to data modeling tailormade to suppress the effect of outliers. Modern robust methods can be used not only as a diagnostic tool for classical methods, but can also fully exploited as reliable self-standing procedures. M-estimators are the most widely used robust statistical methods (Jurečková and Picek, 2005). However, M-estimators in linear regression do not possess a high breakdown point (only 1/ ), which is a statistical measure of sensitivity against noise or outliers in the data (Rousseeuw and Leroy, 1987). Highly robust methods are defined as methods with a high breakdown point, which has become a crucial concept in robust statistics. Moreover, original M-estimators of regression parameters require a simultaneous estimation of the error variance, which entails losing computational simplicity. 2 Durbin-Watson test for least squares The Durbin-Watson test is a classical test of the null hypothesis of homoscedasticity for the least squares estimator in the model (1). Econometricians most commonly consider the test against a one-sided alternative of a positive autocorrelation : = +, 0 < < 1, = 2,,, (3) where ρ is the autocorrelation coefficient of the first order and,, are independent random errors. Based on this notation, the null hypothesis of uncorrelated errors in the model (1) can be expressed as : = 0. Especially if is close to 1, the autocorrelation of the errors (3) may have severe negative consequences. 552

3 The test proposed by Durbin and Watson (1950) and later corrected by Durbin and Watson (1951) is based on the test statistic = ( ), (4) which can be expressed as =. (5) The distribution of d depends not only on p and n, but also on the design matrix X. However, lower and upper bounds for critical values have been tabulated. The notation ( ) and ( ) is used for the lower and upper bound of the test with level, respectively. The decision rule of the test can be described in the following way: Reject, if < ( ). Do not reject, if > ( ). No conclusion, if ( ) < < ( ). The values of ( ) and ( ) have been tabulated; see Kmenta (1986) or other econometric textbooks. There exist several approximations for the exact p-value for a large number of observations. However, every such approximation depends on the design matrix.durbin and Watson (1951) proposed to transform d to the interval (0,1); let us denote this transformed by. Then the distribution of can be approximated by a beta distribution. The authors warned that this approximation is very rough and can be recommended only if > 40.In a later paper, Durbin and Watson (1971) admitted that their approximation is more accurate that they expected and their original caution was excessive. Anyway they recommend first to use tables with critical values and only for an inconclusive result to use their approximation. There exists also an approximative Durbin-Watson test based on a large-sample normal approximation. This test can be used when the number of observations is so large that tables of ( ) and ( ) are not available. The idea is based on properties of the AR(1) process, which is an autoregressive process of the first order. The (theoretical) first-order autocorrelation coefficient in AR(1) model is equal to the coefficient. Thus can be estimated by the empirical first-order autocorrelation coefficient 553

4 =. (6) The test rejects if >. The Durbin-Watson test is a general test of misspecification of the model. Although it is sensitive to the autocorrelation of the disturbances, it can give a significant result also for a model with a missing variable. This can be for example the square of one of the independent variables in the model (1). A misspecification of the model can be revealed by a plot of residuals (1, ),,(, ),which should be examined carefully before running the Durbin- Watson test. If it is aimed to test against negative autocorrelation, the quantity 4 is usually used as a test statistic and then the decision rule is the same. A test against a two-sided alternative : 0 is straightforward to be carried out, but not much used in econometric practice. Cochrane and Orcutt (1949) proposed a method for estimating parameters in the model (1), which adjusts for the autocorrelation of the errors and can be recommended to be carried out if the Durbin-Watson test yields a significant result. 3 Durbin-Watson test for regression quantiles Regression quantiles represent a natural generalization of samples quantiles to the linear regression model (1). Their theory was studied by Koenker (1995) and their asymptotic representation was derived by Jurečková and Sen (1996). The estimator depends on a fixed value of the parameter α in the interval (0,1),which corresponds to dividing the disturbances to 100 % values below the regression quantile and the remaining (1 ) 100 % values above the regression quantile. Here we describe the asymptotic Durbin-Watson test for regression quantiles, which is derived from the asymptotic representation. The proof of the theorems follows from the asymptotic considerations of Kalina (2007). Theorem 1. Let the statistic of the Durbin-Watson test be computed using the residuals of the regression quantile with a parameter.then, assuming normal errors ~ (0, ) in (1), the statistic is asymptotically equivalent to / (7) 554

5 under,where = ( ) and is a unit matrix of size. The test statistic (7) is exactly the Durbin-Watson statistic (5) for the least squares regression. 4 Durbin-Watson test for trimmed least squares The trimmed least squares estimator (TLS) is a robust estimator in the linear regression model (1) based on regression quantiles. Its asymptotic representation derived by Jurečková and Sen (1996) allows to derive an asymptotic Durbin-Watson test for the TLS estimator. The estimator depends on two parameters. These are fixed values and between 0 and 1. Let us denote by ( ) and ( ) the regression quantiles corresponding to the arguments and.let us assume 0 < < < < 1.To define the TLS estimator, let us introduce weights = (,, ),where is equal to 1, if both conditions are fulfilled: 1. The fitted value of the -th observation by the quantile regression with argument is smaller than ; 2. The fitted value of the -th observation by the quantile regression with argument is larger than. Let denote the diagonal matrix with diagonal elements,,,where is an indicator of the random event that the i-th observation lies between the values ( ) and ( ).The TLS estimator (, ) in the model (1) is defined as (, ) = ( ). (8) The asymptotic representation requires the assumption that the errors,, come from a continuous distribution with a density function symmetric around 0. Theorem 2. Let the statistic of the Durbin-Watson test be computed using the residuals of the TLS estimator with parameters and.then, assuming normal errors ~ (0, ) in (1), the statistic is asymptotically equivalent to (7) under. 555

6 5 Least weighted squares A promising highly robust estimator in the linear regression context seems to be the least weighted squares estimator (Víšek, 2002). It does not remove outliers, but potential outliers are only down-weighted and not trimmed away completely. We will now recall the least weighted squares (LWS) estimator. First, the user needs to specify a sequence of magnitudes of weights, (9) which are assigned to individual observations only after some permutation. Let us denote the squared residuals corresponding to a given estimator of as ( ) ( ) ( ) ( ) ( )( ). (10) The LWS estimator is defined as arg min R ( )( ). (11) The computation of the LWS estimator is intensive and an approximative algorithm can be obtained as a weighted version of the FAST-LTS algorithm proposed for the least trimmed squares (LTS) regression. The latter represents a special case of least weighted squares with weights equal to zero or one only and its ability to detect outliers was investigated by Hekimoglu et al. (2009). Heteroscedasticity tests for the LWS were studied by Kalina (2011). The LWS estimator has a small local sensitivity compared to the LTS, which suffers from high sensitivity to small deviations near the center of the data. Properties of the LWS estimator were investigated by Víšek (2011) or Kalina (2012). 6 Least weighted squares: asymptotic Durbin-Watson statistic We recall the asymptotic Durbin-Watson test for the LWS estimator derived by Kalina (2007). Later in Sections 7 and 8, we will derive alternative approaches based on a weighted version of the Durbin-Watson test statistic. We use the notation = for the residuals of the LWS estimator. Remark 1. The test statistic (4) is invariant with respect to. 556

7 We assume independent normal errors ~ (0, ), = 1,,. (12) Under the null hypothesis, the value of is arbitrary in the spirit of Remark 1. We will use the asymptotic representation of the LWS estimator presented by Víšek (2011) and assume his technical Assumptions A. The weights,, are assumed to fulfil (9). We introduce the notation = ( ) +, (13) where coordinates of the remainder term = (,, ) are negligible in probability, denotes an indicator, δ is a constant and is the quantile function of the standard normal distribution. We introduce the notation = and =,where is known up to the value of the constant δ, while Kalina (2007) proposed its consistent estimator. Theorem 3. Under,the test statistic (4) is asymptotically equivalent in probability with (14) under Assumptions A. The proof is analogous with Kalina (2007). Theorem 4. The test statistic (4) is asymptotically equivalent in probability with where = ( ). /, (15) The proof is based on Theorem 3 using analogous steps as Kalina (2007). Finally we explain how to compute the p-value or critical value for the asymptotic test, based on simulating from the null distribution of the test statistics (14) or (15). Theorem 5. The approximative p-values of the Durbin-Watson test for the least weighted squares based on (10), or (11) respectively, assuming (12), defined as the probability >, (16) 557

8 respectively >, (17) converge to the p-value of the exact test for the LWS residuals for under Assumptions A, where,, are independent random variables with (0,1) distribution and,, are computed using the values,,. Remark 2. Without loss of generality, the unit variance of the random errors can be considered in Theorem 5, because the value of is scale-invariant, the vectors is scale-equivariant and the expressions (14) and (15) also scale-invariant. 7 Least weighted squares: exact weighted Durbin-Watson test We study the null distribution of the weighted Durbin-Watson statistic for the least weighted squares conditioning on fixed weights. Let,, denote the resulting weights corresponding to individual observations obtained as the output of the LWS estimator, i.e. the weight is assigned to the i-th observation for = 1,,.Let denote the diagonal matrix, which contains the weights,, on the main diagonal. The idea of the exact test is to express the LWS residuals = in the form =, where the matrix is defined as = ( ). (18) Theorem 6. The p-value of the Durbin-Watson test for the LWS estimator against the onesided alternative of positive autocorrelation based on the weighted Durbin-Watson statistic = ( ) = / / (19) assuming ~ (0, ) (20) is equal to the probability / /, (21) 558

9 where,, are independent random variables with (0,1) distribution,,, are positive eigenvalues of,,, are positive eigenvalues of. 8 Least weighted squares: asymptotic weighted Durbin-Watson test We use the asymptotic results described in Section 6 to obtain the asymptotic null distribution of the weighted Durbin-Watson statistic (19) computed from the LWS residuals. Again we use the notation =. Theorem 7. The approximative p-values of the Durbin-Watson test for the least weighted squares, assuming (20) and Assumptions A, defined as the probability >, (22) or using an additional approximation as >, (23) converge for to the p-value of the exact weighted Durbin-Watson test for the LWS residuals under, where,, are independent random variables with (0,1) distribution and,, are computed from the values,,,using = and (13). Acknowledgement The research was supported by the grant S of the Granty Agency of the Czech Republic (Robust procedures for non-standard procedures, their diagnostics and implementation). References 1. Cipra T.: Ekonometrie. Praha: SPN, In Czech. 2. Cochrane D., Orcutt G.H.: Application of least squares regression to relationships containing autocorrelated error terms. Journal of the American Statistical Association 44, 32-61, Durbin J., Watson G.S.: Testing for serial correlation in least squares regression I. Biometrika 37, ,

10 4. Durbin J., Watson G.S.: Testing for serial correlation in least squares regression II. Biometrika 38, , Durbin J., Watson G.S.: Testing for serial correlation in least squares regression III. Biometrika 58, 1-19, Hekimoglu S., Erenoglu R.C., Kalina J.: Outlier detection by means of robust regression estimators for use in engineering science. Journal of Zhejiang University Science A 10 (6), , Jurečková J., Picek J.: Robust statistical methods with R. Boca Raton: Chapman & Hall/CRC, Jurečková J., Sen P.K.: Robust statistical procedures: Asymptotics and interrelations. New York: Wiley, Kalina J.: On multivariate methods in robust econometrics. Prague Economic Papers 21 (1), 69-82, Kalina J.: On heteroscedasticity in robust regression. International Days of Statistics and Economics Conference Proceedings, September 22-23,2011, Prague: University of Economics, , ISBN Kalina J.: Asymptotic Durbin-Watson test for robust regression. Bulletin of the International Statistical Institute 62, , Koenker R.: Quantile regression. Cambridge: Cambridge University Press, Rousseeuw P.J., Leroy A.M.: Robust regression and outlier detection. New York: Wiley, Víšek J.Á.: Consistency of the least weighted squares under heteroscedasticity. Kybernetika 47, , Víšek J.Á.: The least weighted squares I. Bulletin of the Czech Econometric Society 9 (15), 31-56, Contact RNDr. Jan Kalina, Ph.D. Institute of Computer Science of the Academy of Sciences of the Czech Republic Pod Vodárenskou věží 2, , Praha 8, Czech Republic 560

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### 2. What are the theoretical and practical consequences of autocorrelation?

Lecture 10 Serial Correlation In this lecture, you will learn the following: 1. What is the nature of autocorrelation? 2. What are the theoretical and practical consequences of autocorrelation? 3. Since

### Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions

Wooldridge, Introductory Econometrics, 3d ed. Chapter 12: Serial correlation and heteroskedasticity in time series regressions What will happen if we violate the assumption that the errors are not serially

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

### Testing for serial correlation in linear panel-data models

The Stata Journal (2003) 3, Number 2, pp. 168 177 Testing for serial correlation in linear panel-data models David M. Drukker Stata Corporation Abstract. Because serial correlation in linear panel-data

### MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level of Significance

### Chapter 4: Vector Autoregressive Models

Chapter 4: Vector Autoregressive Models 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie IV.1 Vector Autoregressive Models (VAR)...

### Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

### Multiple Regression: What Is It?

Multiple Regression Multiple Regression: What Is It? Multiple regression is a collection of techniques in which there are multiple predictors of varying kinds and a single outcome We are interested in

### Department of Economics

Department of Economics On Testing for Diagonality of Large Dimensional Covariance Matrices George Kapetanios Working Paper No. 526 October 2004 ISSN 1473-0278 On Testing for Diagonality of Large Dimensional

### Robustness of the Ljung-Box Test and its Rank Equivalent

Robustness of the Test and its Rank Equivalent Patrick Burns patrick@burns-stat.com http://www.burns-stat.com This Draft: 6 October 2002 Abstract: The test is known to be robust. This paper reports on

### Note 2 to Computer class: Standard mis-specification tests

Note 2 to Computer class: Standard mis-specification tests Ragnar Nymoen September 2, 2013 1 Why mis-specification testing of econometric models? As econometricians we must relate to the fact that the

### Regression analysis in practice with GRETL

Regression analysis in practice with GRETL Prerequisites You will need the GNU econometrics software GRETL installed on your computer (http://gretl.sourceforge.net/), together with the sample files that

### Simple linear regression

Simple linear regression Introduction Simple linear regression is a statistical method for obtaining a formula to predict values of one variable from another where there is a causal relationship between

### 5. Linear Regression

5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

### Estimation and Inference in Cointegration Models Economics 582

Estimation and Inference in Cointegration Models Economics 582 Eric Zivot May 17, 2012 Tests for Cointegration Let the ( 1) vector Y be (1). Recall, Y is cointegrated with 0 cointegrating vectors if there

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Econ 424/Amath 462 Hypothesis Testing in the CER Model

Econ 424/Amath 462 Hypothesis Testing in the CER Model Eric Zivot July 23, 2013 Hypothesis Testing 1. Specify hypothesis to be tested 0 : null hypothesis versus. 1 : alternative hypothesis 2. Specify significance

### Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

### 9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

### DEPARTMENT OF ECONOMICS. Unit ECON 12122 Introduction to Econometrics. Notes 4 2. R and F tests

DEPARTMENT OF ECONOMICS Unit ECON 11 Introduction to Econometrics Notes 4 R and F tests These notes provide a summary of the lectures. They are not a complete account of the unit material. You should also

### Testing The Quantity Theory of Money in Greece: A Note

ERC Working Paper in Economic 03/10 November 2003 Testing The Quantity Theory of Money in Greece: A Note Erdal Özmen Department of Economics Middle East Technical University Ankara 06531, Turkey ozmen@metu.edu.tr

### Simple Linear Regression in SPSS STAT 314

Simple Linear Regression in SPSS STAT 314 1. Ten Corvettes between 1 and 6 years old were randomly selected from last year s sales records in Virginia Beach, Virginia. The following data were obtained,

### Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

### Module 5: Multiple Regression Analysis

Using Statistical Data Using to Make Statistical Decisions: Data Multiple to Make Regression Decisions Analysis Page 1 Module 5: Multiple Regression Analysis Tom Ilvento, University of Delaware, College

### Heteroskedasticity and Weighted Least Squares

Econ 507. Econometric Analysis. Spring 2009 April 14, 2009 The Classical Linear Model: 1 Linearity: Y = Xβ + u. 2 Strict exogeneity: E(u) = 0 3 No Multicollinearity: ρ(x) = K. 4 No heteroskedasticity/

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

### An Alternative Route to Performance Hypothesis Testing

EDHEC-Risk Institute 393-400 promenade des Anglais 06202 Nice Cedex 3 Tel.: +33 (0)4 93 18 32 53 E-mail: research@edhec-risk.com Web: www.edhec-risk.com An Alternative Route to Performance Hypothesis Testing

### Chapter 6: Multivariate Cointegration Analysis

Chapter 6: Multivariate Cointegration Analysis 1 Contents: Lehrstuhl für Department Empirische of Wirtschaftsforschung Empirical Research and und Econometrics Ökonometrie VI. Multivariate Cointegration

### 2. Linear regression with multiple regressors

2. Linear regression with multiple regressors Aim of this section: Introduction of the multiple regression model OLS estimation in multiple regression Measures-of-fit in multiple regression Assumptions

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

### 16 : Demand Forecasting

16 : Demand Forecasting 1 Session Outline Demand Forecasting Subjective methods can be used only when past data is not available. When past data is available, it is advisable that firms should use statistical

### The Delta Method and Applications

Chapter 5 The Delta Method and Applications 5.1 Linear approximations of functions In the simplest form of the central limit theorem, Theorem 4.18, we consider a sequence X 1, X,... of independent and

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### FACULTY WORKING PAPER NO. 1021

330 3385 1021 COPY 2 STX '«««.2 3 J FACULTY WORKING PAPER NO. 1021 The Use of Linear Approximation to Nonlinear Regression Analysis Anil K. Bera an*** 6? r i ce anc Bus nes; Bureau of Economic and Business

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### The following postestimation commands for time series are available for regress:

Title stata.com regress postestimation time series Postestimation tools for regress with time series Description Syntax for estat archlm Options for estat archlm Syntax for estat bgodfrey Options for estat

### Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium December 10, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### Vector Time Series Model Representations and Analysis with XploRe

0-1 Vector Time Series Model Representations and Analysis with plore Julius Mungo CASE - Center for Applied Statistics and Economics Humboldt-Universität zu Berlin mungo@wiwi.hu-berlin.de plore MulTi Motivation

### Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

: Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Instrumental Variables & 2SLS

Instrumental Variables & 2SLS y 1 = β 0 + β 1 y 2 + β 2 z 1 +... β k z k + u y 2 = π 0 + π 1 z k+1 + π 2 z 1 +... π k z k + v Economics 20 - Prof. Schuetze 1 Why Use Instrumental Variables? Instrumental

### Online Appendices to the Corporate Propensity to Save

Online Appendices to the Corporate Propensity to Save Appendix A: Monte Carlo Experiments In order to allay skepticism of empirical results that have been produced by unusual estimators on fairly small

### Pearson's Correlation Tests

Chapter 800 Pearson's Correlation Tests Introduction The correlation coefficient, ρ (rho), is a popular statistic for describing the strength of the relationship between two variables. The correlation

### Univariate Regression

Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

### Non Parametric Inference

Maura Department of Economics and Finance Università Tor Vergata Outline 1 2 3 Inverse distribution function Theorem: Let U be a uniform random variable on (0, 1). Let X be a continuous random variable

### What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling

What s New in Econometrics? Lecture 8 Cluster and Stratified Sampling Jeff Wooldridge NBER Summer Institute, 2007 1. The Linear Model with Cluster Effects 2. Estimation with a Small Number of Groups and

### Centre for Central Banking Studies

Centre for Central Banking Studies Technical Handbook No. 4 Applied Bayesian econometrics for central bankers Andrew Blake and Haroon Mumtaz CCBS Technical Handbook No. 4 Applied Bayesian econometrics

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

### Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares Many economic models involve endogeneity: that is, a theoretical relationship does not fit

### MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST

MODIFIED PARAMETRIC BOOTSTRAP: A ROBUST ALTERNATIVE TO CLASSICAL TEST Zahayu Md Yusof, Nurul Hanis Harun, Sharipah Sooad Syed Yahaya & Suhaida Abdullah School of Quantitative Sciences College of Arts and

### 2. What is the general linear model to be used to model linear trend? (Write out the model) = + + + or

Simple and Multiple Regression Analysis Example: Explore the relationships among Month, Adv.\$ and Sales \$: 1. Prepare a scatter plot of these data. The scatter plots for Adv.\$ versus Sales, and Month versus

### Regression. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.

Class: Date: Regression Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Given the least squares regression line y8 = 5 2x: a. the relationship between

### Sales forecasting # 2

Sales forecasting # 2 Arthur Charpentier arthur.charpentier@univ-rennes1.fr 1 Agenda Qualitative and quantitative methods, a very general introduction Series decomposition Short versus long term forecasting

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Continued Fractions and the Euclidean Algorithm

Continued Fractions and the Euclidean Algorithm Lecture notes prepared for MATH 326, Spring 997 Department of Mathematics and Statistics University at Albany William F Hammond Table of Contents Introduction

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Hypothesis testing - Steps

Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

### Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. John Nerbonne 1/35

Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://wwwletrugnl/nerbonne/teach/statistiek-i/ John Nerbonne 1/35 t-tests To test an average or pair of averages when σ is known, we

### The Variability of P-Values. Summary

The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

### Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

### Multivariate Analysis of Ecological Data

Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology

### Basic concepts and introduction to statistical inference

Basic concepts and introduction to statistical inference Anna Helga Jonsdottir Gunnar Stefansson Sigrun Helga Lund University of Iceland (UI) Basic concepts 1 / 19 A review of concepts Basic concepts Confidence

### Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

### Convex Hull Probability Depth: first results

Conve Hull Probability Depth: first results Giovanni C. Porzio and Giancarlo Ragozini Abstract In this work, we present a new depth function, the conve hull probability depth, that is based on the conve

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Part 2: Analysis of Relationship Between Two Variables

Part 2: Analysis of Relationship Between Two Variables Linear Regression Linear correlation Significance Tests Multiple regression Linear Regression Y = a X + b Dependent Variable Independent Variable

### MISSING DATA TECHNIQUES WITH SAS. IDRE Statistical Consulting Group

MISSING DATA TECHNIQUES WITH SAS IDRE Statistical Consulting Group ROAD MAP FOR TODAY To discuss: 1. Commonly used techniques for handling missing data, focusing on multiple imputation 2. Issues that could

### 1 Example of Time Series Analysis by SSA 1

1 Example of Time Series Analysis by SSA 1 Let us illustrate the 'Caterpillar'-SSA technique [1] by the example of time series analysis. Consider the time series FORT (monthly volumes of fortied wine sales

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation

SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 19, 2015 Outline

### Robust Estimation of the Correlation Coefficient: An Attempt of Survey

AUSTRIAN JOURNAL OF STATISTICS Volume 40 (2011), Number 1 & 2, 147 156 Robust Estimation of the Correlation Coefficient: An Attempt of Survey Georgy Shevlyakov and Pavel Smirnov St. Petersburg State Polytechnic

### PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA

PARTIAL LEAST SQUARES IS TO LISREL AS PRINCIPAL COMPONENTS ANALYSIS IS TO COMMON FACTOR ANALYSIS. Wynne W. Chin University of Calgary, CANADA ABSTRACT The decision of whether to use PLS instead of a covariance

### MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL. by Michael L. Orlov Chemistry Department, Oregon State University (1996)

MULTIPLE LINEAR REGRESSION ANALYSIS USING MICROSOFT EXCEL by Michael L. Orlov Chemistry Department, Oregon State University (1996) INTRODUCTION In modern science, regression analysis is a necessary part

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus

Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives

### Random Vectors and the Variance Covariance Matrix

Random Vectors and the Variance Covariance Matrix Definition 1. A random vector X is a vector (X 1, X 2,..., X p ) of jointly distributed random variables. As is customary in linear algebra, we will write

### Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Course Catalog In order to be assured that all prerequisites are met, students must acquire a permission number from the education coordinator prior to enrolling in any Biostatistics course. Courses are

### Models for Count Data With Overdispersion

Models for Count Data With Overdispersion Germán Rodríguez November 6, 2013 Abstract This addendum to the WWS 509 notes covers extra-poisson variation and the negative binomial model, with brief appearances

### Multivariate Normal Distribution

Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

### Mantel Permutation Tests

PERMUTATION TESTS Mantel Permutation Tests Basic Idea: In some experiments a test of treatment effects may be of interest where the null hypothesis is that the different populations are actually from the

### Cointegration. Basic Ideas and Key results. Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board

Cointegration Basic Ideas and Key results Egon Zakrajšek Division of Monetary Affairs Federal Reserve Board Summer School in Financial Mathematics Faculty of Mathematics & Physics University of Ljubljana

### INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

### PITFALLS IN TIME SERIES ANALYSIS. Cliff Hurvich Stern School, NYU

PITFALLS IN TIME SERIES ANALYSIS Cliff Hurvich Stern School, NYU The t -Test If x 1,..., x n are independent and identically distributed with mean 0, and n is not too small, then t = x 0 s n has a standard

### Chapter G08 Nonparametric Statistics

G08 Nonparametric Statistics Chapter G08 Nonparametric Statistics Contents 1 Scope of the Chapter 2 2 Background to the Problems 2 2.1 Parametric and Nonparametric Hypothesis Testing......................

### TESTING THE ONE-PART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWO-PART MODEL

TESTING THE ONE-PART FRACTIONAL RESPONSE MODEL AGAINST AN ALTERNATIVE TWO-PART MODEL HARALD OBERHOFER AND MICHAEL PFAFFERMAYR WORKING PAPER NO. 2011-01 Testing the One-Part Fractional Response Model against