Analisi Statistica per le Imprese


Dip. Economia Politica e Statistica
4. The internal efficiency perspective

Objective

Pursuing internal efficiency: the task is to determine the correct use of resources, given an expected production volume.

According to the BS

The interventions to be carried out must link internal efficiency, growth in production/revenues, investments, and the pursuit of customer satisfaction. Hence, in the view represented by the system described, internal efficiency is the first step of a simultaneous estimation of the entire system.

Formally

Consider the first of the relations in the system, where the variable $X$ can represent quantities such as:
- personnel expenses
- raw material expenses
- investment (capital) expenses

that is, everything concerning the production factors needed to reach a predetermined production level. Let us therefore see which statistical methods can be applied to reach our objectives.

Deterministic relationship

A deterministic relationship $f$ between a dependent variable $Y$ and an independent variable $X$ is expressed as:

$$Y = f(X) \tag{1}$$

In the simplest case, when the relationship is linear and $k = 1$ (a single independent variable), it is commonly expressed as follows:

$$Y = \beta_0 + \beta_1 X \tag{2}$$

This relationship can be represented graphically in a Cartesian coordinate system as a line with intercept $\beta_0$ and slope $\beta_1$.

Stochastic relationship

However, deterministic relationships are very rare because of randomness. The true relationship between a dependent variable $Y$ and an independent variable $X$ is approximated by the stochastic regression model:

$$Y = f(X) + \varepsilon \tag{3}$$

The term $\varepsilon$ is meant to capture the size of the approximation error. Introducing this element is what identifies the relationship as stochastic. If $f$ is a linear function, the regression model is linear and is expressed as:

$$Y = \beta_0 + \beta_1 X + \varepsilon \tag{4}$$

The terms $(\beta_0, \beta_1)$ are the regression parameters, or model coefficients.

Regression analysis is a method to study functional relationships between variables. The relationship is expressed as an equation or model that links the dependent (fitted or predicted) variable to one or more independent (predictor) variables.

Example

Suppose the customers of a bank have been classified according to two characteristics: average disposable income and monthly frequency of use of the ATM (Bancomat) service. The relationship to be verified is whether the monthly frequency of ATM use $Y$ (dependent variable) varies with income $X$. Observing a sample of size $n$, we will have $n$ observations for each variable.

Simple Linear Regression

The relationship between the dependent variable and the independent variable is expressed by the following linear model:

$$Y = \beta_0 + \beta_1 X + \varepsilon \tag{5}$$

The regression parameters, or coefficients, are $(\beta_0, \beta_1)$, and $\varepsilon$ is the random component of the model. For each observation the model can be written as:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad i = 1, \dots, n \tag{6}$$

Scatter plot

[Figure: scatter plot of six observations $(x_1, y_1), \dots, (x_6, y_6)$ in the $(x, y)$ plane.]

Scatter plot

[Figure: scatter plot with three candidate lines, $\hat{Y} = \hat\beta_0 + \hat\beta_1 X$, $\hat{Y} = a + bx$ and $\hat{Y} = c + dx$.]

Optimal Parameters

One must then find the equation that best approximates the observations with coordinates $(X, Y)$. This equation is specified by the following linear model:

$$\hat{Y} = \hat\beta_0 + \hat\beta_1 X \tag{7}$$

Hence one must find the optimal parameters $(\hat\beta_0, \hat\beta_1)$.

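Ordinary Least Squares (OLS)

The method of Ordinary Least Squares (OLS) minimizes the following auxiliary function:

$$\sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 = \sum_{i=1}^{n} \left( Y_i - \hat\beta_0 - \hat\beta_1 X_i \right)^2 \tag{8}$$

The function is minimized for the following parameter values:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \tag{9}$$

$$\hat\beta_0 = \bar{Y} - \hat\beta_1 \bar{X} \tag{10}$$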

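Ordinary Least Squares (OLS)

Here lowercase letters denote deviations from the sample means:

$$x_i = X_i - \bar{X} \tag{11}$$

$$y_i = Y_i - \bar{Y} \tag{12}$$

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \tag{13}$$

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \tag{14}$$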
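The closed-form solution in (9) and (10) can be sketched in a few lines of plain Python. This is only an illustration: the function name `ols_fit` and the income/ATM figures are hypothetical and do not come from the slides.

```python
def ols_fit(X, Y):
    """Return (beta0_hat, beta1_hat) for the simple linear regression of Y on X."""
    n = len(X)
    if n != len(Y) or n < 2:
        raise ValueError("X and Y must have the same length (at least 2)")
    x_bar = sum(X) / n                      # eq. (13)
    y_bar = sum(Y) / n                      # eq. (14)
    x = [Xi - x_bar for Xi in X]            # eq. (11): deviations of X
    y = [Yi - y_bar for Yi in Y]            # eq. (12): deviations of Y
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("X must not be constant")
    beta1_hat = sum(xi * yi for xi, yi in zip(x, y)) / sxx   # eq. (9)
    beta0_hat = y_bar - beta1_hat * x_bar                    # eq. (10)
    return beta0_hat, beta1_hat


# Hypothetical data: income X (thousands of euros), monthly ATM uses Y
X = [10, 15, 20, 25, 30, 35]
Y = [2, 4, 3, 6, 7, 8]
beta0_hat, beta1_hat = ols_fit(X, Y)
print(f"beta0_hat = {beta0_hat:.4f}, beta1_hat = {beta1_hat:.4f}")
```

The guard against a constant `X` reflects the fact that the denominator of (9) vanishes when all observations share the same value of the regressor.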

Ordinary Least Squares (OLS)

A stochastic model is much more realistic than a deterministic model when the observations are not perfectly aligned. Introducing $\varepsilon$ brings difficulties, but it yields results that are more useful and deeper in meaning.

First consideration: Justification

How can we justify the introduction of the stochastic component?
1. There may be errors in the model
2. The number of covariates is limited
3. Randomness from empirical sampling
4. Measurement errors

Ordinary Least Squares (OLS)

Second consideration: Random Variable

The introduction of the error term identifies $Y$ as a random variable (r.v.). This implies that every quantity expressed as a function of $Y$ is also a r.v.

Ordinary Least Squares (OLS)

Third consideration: Assumptions

1. The functional relationship must be linear
2. The independent variable is deterministic
3. The error term has zero expected value: $E[\varepsilon_i] = 0$
4. The error term is homoskedastic: $\mathrm{Var}[\varepsilon_i] = \sigma^2$
5. The error terms are not autocorrelated: $\mathrm{Cov}[\varepsilon_i, \varepsilon_j] = 0$ for $i \neq j$

Credible assumptions

1. seems restrictive, but many nonlinear relationships can be expressed as relationships that are linear in the parameters through appropriate transformations.
2. is possibly the most unrealistic in socio-economic analysis, but it guarantees important properties through $E[X_i \varepsilon_i] = X_i E[\varepsilon_i] = 0$.
3. guarantees that the errors average out to zero. When the error distribution is also symmetric and unimodal, zero is the most likely error value, because the mean then coincides with the mode.
4. can be unlikely in cross-section observations, and its failure can cause serious difficulties. We will focus on this later.
5. can be unlikely in time-series observations, and its failure can cause serious difficulties.
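As an illustration of point 1 that is not taken from the slides, a multiplicative power relationship becomes linear in the parameters after taking logarithms. The symbols $A$ and $\gamma$ are introduced only for this sketch, and the transformation assumes $X > 0$, $Y > 0$ and $A > 0$.

```latex
\begin{align*}
Y &= A\,X^{\gamma}\,e^{\varepsilon} \\
\ln Y &= \ln A + \gamma \ln X + \varepsilon \\
Y^{*} &= \beta_0 + \beta_1 X^{*} + \varepsilon,
\qquad Y^{*} = \ln Y,\quad X^{*} = \ln X,\quad \beta_0 = \ln A,\quad \beta_1 = \gamma
\end{align*}
```

The transformed model has the same form as the simple linear model, so it can be estimated with the same tools on the transformed data.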

We will now examine the properties of the OLS estimators of the unknown parameters of the regression function. The estimates are obtained from $n$ observations of the pair $(Y, X)$, extracted with probabilistic sampling from a target population. If we drew a new sample from the same population we would most likely obtain different results, hence the parameter estimates would be different. For this reason we can say that the estimates are tied to a r.v. When one writes $\hat\beta$ one means either:
1. the slope of the regression line, estimated from a set of observations
2. the estimator, which has a specific probability distribution

Properties

Given assumptions 1-5, OLS estimators are the Best Linear Unbiased Estimators (BLUE). This means that they are the most efficient (minimum-variance) estimators among those that are linear and not systematically distorted (unbiased). Here we only prove linearity and unbiasedness.

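Linear

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} \tag{15}$$

$$\hat\beta_1 = \frac{\sum x_i y_i}{\sum x_i^2} = \sum w_i y_i, \qquad w_i = \frac{x_i}{\sum x_i^2} \tag{16}$$

$$\sum w_i = 0, \qquad \sum w_i x_i = 1 \tag{17}$$

Similarly it can be proven for $\hat\beta_0$.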
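The weight properties in (17) are easy to check numerically. The following sketch uses hypothetical data (not from the slides) and confirms that the ratio form and the weighted-sum form of the slope coincide.

```python
X = [10, 15, 20, 25, 30, 35]   # hypothetical regressor values
Y = [2, 4, 3, 6, 7, 8]         # hypothetical responses

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n
x = [Xi - x_bar for Xi in X]   # deviations of X from its mean
y = [Yi - y_bar for Yi in Y]   # deviations of Y from its mean
sxx = sum(xi ** 2 for xi in x)

w = [xi / sxx for xi in x]     # weights w_i from eq. (16)

sum_w = sum(w)                                            # should be 0, eq. (17)
sum_wx = sum(wi * xi for wi, xi in zip(w, x))             # should be 1, eq. (17)
beta1_ratio = sum(xi * yi for xi, yi in zip(x, y)) / sxx  # ratio form, eq. (15)
beta1_weighted = sum(wi * yi for wi, yi in zip(w, y))     # weighted form, eq. (16)
beta1_raw = sum(wi * Yi for wi, Yi in zip(w, Y))          # raw Y_i also work, since sum(w) = 0

print(sum_w, sum_wx, beta1_ratio, beta1_weighted, beta1_raw)
```

Because the weights depend only on the $X_i$, which are treated as deterministic, the slope is a linear function of the observed $Y_i$.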

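Unbiased

$$\hat\beta_1 = \sum w_i y_i = \sum w_i (Y_i - \bar{Y}) = \sum w_i Y_i - \bar{Y} \underbrace{\sum w_i}_{0} \tag{18}$$

$$\hat\beta_1 = \sum w_i \left[ \beta_0 + \beta_1 X_i + \varepsilon_i \right] + 0 = \beta_0 \underbrace{\sum w_i}_{0} + \beta_1 \underbrace{\sum w_i X_i}_{1} + \sum w_i \varepsilon_i \tag{19}$$

Here $\sum w_i X_i = \sum w_i x_i + \bar{X} \sum w_i = 1$ by (17).

$$\hat\beta_1 = 0 + \beta_1 + \sum w_i \varepsilon_i \tag{20}$$

$$E\left[\hat\beta_1\right] = E\left[\beta_1 + \sum w_i \varepsilon_i\right] = \beta_1 + \sum w_i \underbrace{E[\varepsilon_i]}_{0} = \beta_1 \tag{21}$$

Similarly it can be proven for $\hat\beta_0$.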
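Unbiasedness can also be illustrated by simulation. The sketch below is not part of the original slides: the "true" parameter values, the error standard deviation, the fixed design and the number of replications are all assumptions chosen for the example. It keeps $X$ fixed, redraws the errors many times, and averages the resulting slope estimates.

```python
import random


def slope(X, Y):
    """OLS slope: sum of x_i * y_i over sum of x_i^2, in deviation form."""
    n = len(X)
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sxx = sum((Xi - x_bar) ** 2 for Xi in X)
    return sum((Xi - x_bar) * (Yi - y_bar) for Xi, Yi in zip(X, Y)) / sxx


random.seed(0)                            # fixed seed, for reproducibility only
beta0, beta1, sigma = 1.0, 0.5, 2.0       # assumed "true" values
X = [float(i) for i in range(1, 21)]      # fixed (deterministic) regressor
n_samples = 5000                          # number of simulated samples

estimates = []
for _ in range(n_samples):
    # New errors each time, with zero mean and constant variance
    Y = [beta0 + beta1 * Xi + random.gauss(0.0, sigma) for Xi in X]
    estimates.append(slope(X, Y))

avg_slope = sum(estimates) / n_samples
print(f"average slope over {n_samples} samples: {avg_slope:.4f} (true value {beta1})")
```

Individual estimates scatter around the true slope, while their average settles close to it, which is what $E[\hat\beta_1] = \beta_1$ states.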

Let us add to hypotheses 1-5 the following: the error term has a Normal distribution, so $\varepsilon_i \sim N(0, \sigma^2)$ and hence $Y_i \sim N(\beta_0 + \beta_1 X_i, \sigma^2)$.

OLS distribution

As $\hat\beta$ is a weighted sum of the $y_i$ and the $y_i$ are normally distributed, $\hat\beta$ is also normally distributed.

$$\hat\beta_1 \sim N\left( \beta_1, \frac{\sigma_\varepsilon^2}{\sum x_i^2} \right) \tag{22}$$

$$\hat\beta_0 \sim N\left( \beta_0, \sigma_\varepsilon^2 \frac{\sum X_i^2}{n \sum x_i^2} \right) \tag{23}$$

OLS estimators coincide with Maximum Likelihood (ML) estimators. The Central Limit Theorem ensures that, even if the $y_i$ were not normally distributed, these distributions would still hold asymptotically under very mild conditions.

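Residual Variance Estimator

This estimator, called the residual variance, can be obtained from Maximum Likelihood computations (here omitted), with the denominator adjusted to $n - 2$ to remove the bias of the ML estimator, which divides by $n$.

$$s^2 = \hat\sigma^2 = \frac{\sum \hat\varepsilon_i^2}{n - 2} = \frac{\sum \left( Y_i - \hat\beta_0 - \hat\beta_1 X_i \right)^2}{n - 2} \tag{24}$$

$$\hat\varepsilon_i = Y_i - \hat{Y}_i \quad \text{(residual)} \tag{25}$$

The residual variance is an unbiased and consistent estimator of the variance of the error term. The denominator is $n - 2$ because that is the number of degrees of freedom, calculated as the number of observations minus the number of parameters required in the computation. In this case, computing the estimator requires both $(\hat\beta_0, \hat\beta_1)$.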
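A minimal sketch of (24) and (25) in plain Python follows. The function name `residual_variance` and the data are hypothetical, introduced only for illustration.

```python
def residual_variance(X, Y):
    """Return (s2, residuals): the residual variance of eq. (24) and the residuals of eq. (25)."""
    n = len(X)
    if n != len(Y) or n <= 2:
        raise ValueError("need more than 2 paired observations")
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sxx = sum((Xi - x_bar) ** 2 for Xi in X)
    beta1_hat = sum((Xi - x_bar) * (Yi - y_bar) for Xi, Yi in zip(X, Y)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    # Residuals: observed value minus fitted value, eq. (25)
    residuals = [Yi - (beta0_hat + beta1_hat * Xi) for Xi, Yi in zip(X, Y)]
    # Two parameters were estimated, so n - 2 degrees of freedom remain
    s2 = sum(e * e for e in residuals) / (n - 2)
    return s2, residuals


# Hypothetical data, for illustration only
X = [10, 15, 20, 25, 30, 35]
Y = [2, 4, 3, 6, 7, 8]
s2, residuals = residual_variance(X, Y)
print(f"s2 = {s2:.4f}")
```

With an intercept in the model the residuals sum to zero, which gives a quick sanity check on the fitted line.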

Further observations

- $\mathrm{Var}(\hat\beta_1)$ is a direct function of $\mathrm{Var}(\varepsilon_i)$: strongly variable errors imply lower precision and reliability.
- $\mathrm{Var}(\hat\beta_1)$ is an inverse function of the variability of $X_i$: covariate values concentrated in a narrow interval worsen the quality of $\hat\beta_1$.
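Both observations follow from the variance in (22), $\sigma_\varepsilon^2 / \sum x_i^2$. The sketch below compares two hypothetical designs with the same number of observations; the design values and the error variance are assumptions made for the example.

```python
def slope_variance(X, sigma2):
    """Variance of the OLS slope, sigma^2 / sum of x_i^2, as in eq. (22)."""
    x_bar = sum(X) / len(X)
    sxx = sum((Xi - x_bar) ** 2 for Xi in X)
    return sigma2 / sxx


sigma2 = 4.0                          # assumed error variance
X_narrow = [19, 20, 21, 19, 20, 21]   # covariate concentrated in a narrow interval
X_wide = [5, 20, 35, 5, 20, 35]       # same sample size, much more spread out

var_narrow = slope_variance(X_narrow, sigma2)
var_wide = slope_variance(X_wide, sigma2)
var_wide_noisy = slope_variance(X_wide, 4 * sigma2)   # errors four times as variable

print(f"narrow X: {var_narrow:.4f}")
print(f"wide X:   {var_wide:.4f}")
print(f"wide X, noisier errors: {var_wide_noisy:.4f}")
```

Spreading out the covariate shrinks the slope variance, while multiplying the error variance by a constant multiplies the slope variance by the same constant.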

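OLS estimators' standard errors

Once we have estimated the variance of the stochastic term in the regression model, we can substitute it into the variances of the OLS estimators. The standard errors (s.e.) of the estimators are the square roots of the resulting estimated variances:

$$s^2_{\hat\beta_1} = \frac{s^2}{\sum x_i^2} \tag{26}$$

$$s^2_{\hat\beta_0} = s^2 \, \frac{\sum X_i^2}{n \sum x_i^2} \tag{27}$$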
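The following sketch computes the standard errors from (26) and (27) by taking square roots of the estimated variances. The function name `ols_standard_errors` and the data are hypothetical.

```python
import math


def ols_standard_errors(X, Y):
    """Return (se_beta0, se_beta1), the square roots of eqs. (27) and (26)."""
    n = len(X)
    if n != len(Y) or n <= 2:
        raise ValueError("need more than 2 paired observations")
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    sxx = sum((Xi - x_bar) ** 2 for Xi in X)
    beta1_hat = sum((Xi - x_bar) * (Yi - y_bar) for Xi, Yi in zip(X, Y)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    ssr = sum((Yi - beta0_hat - beta1_hat * Xi) ** 2 for Xi, Yi in zip(X, Y))
    s2 = ssr / (n - 2)                                # residual variance
    var_beta1 = s2 / sxx                              # eq. (26)
    var_beta0 = s2 * sum(Xi ** 2 for Xi in X) / (n * sxx)   # eq. (27)
    return math.sqrt(var_beta0), math.sqrt(var_beta1)


# Hypothetical data, for illustration only
X = [10, 15, 20, 25, 30, 35]
Y = [2, 4, 3, 6, 7, 8]
se_beta0, se_beta1 = ols_standard_errors(X, Y)
print(f"s.e.(beta0_hat) = {se_beta0:.4f}, s.e.(beta1_hat) = {se_beta1:.4f}")
```

Reporting the square roots keeps the standard errors in the same units as the coefficients they refer to.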

Pindyck R. and Rubinfeld D. (1998), Econometric Models and Economic Forecasts, 4th ed.
Bracalente, Cossignani, Mulas (2009), Statistica Aziendale, sec. 4.1