4 Hypothesis testing in the multiple regression model




Ezequiel Uriel
Universidad de Valencia
Version: 09-2013

4.1 Hypothesis testing: an overview
4.1.1 Formulation of the null hypothesis and the alternative hypothesis
4.1.2 Test statistic
4.1.3 Decision rule
4.2 Testing hypotheses using the t test
4.2.1 Test of a single parameter
4.2.2 Confidence intervals
4.2.3 Testing hypothesis about a single linear combination of the parameters
4.2.4 Economic importance versus statistical significance
4.3 Testing multiple linear restrictions using the F test
4.3.1 Exclusion restrictions
4.3.2 Model significance
4.3.3 Testing other linear restrictions
4.3.4 Relation between F and t statistics
4.4 Testing without normality
4.5 Prediction
4.5.1 Point prediction
4.5.2 Interval prediction
4.5.3 Predicting y in a ln(y) model
4.5.4 Forecast evaluation and dynamic prediction
Exercises

4.1 Hypothesis testing: an overview

Before testing hypotheses in the multiple regression model, we offer a general overview of hypothesis testing. Hypothesis testing allows us to draw inferences about population parameters using data from a sample. To test a hypothesis in statistics, we must perform the following steps:

1) Formulate a null hypothesis and an alternative hypothesis on population parameters.
2) Build a statistic to test the hypothesis made.
3) Define a decision rule to reject or not to reject the null hypothesis.

Next, we will examine each one of these steps.

4.1.1 Formulation of the null hypothesis and the alternative hypothesis

Before establishing how to formulate the null and alternative hypotheses, let us distinguish between simple and composite hypotheses. Hypotheses stated through one or more equalities are called simple hypotheses. Hypotheses are called composite when they are formulated using the operators "inequality", "greater than" and "smaller than".

It is important to note that hypothesis testing is always about population parameters. Hypothesis testing implies making a decision, on the basis of sample data, on whether to reject that certain restrictions are satisfied by the basic assumed model. The restrictions we are going to test are known as the null hypothesis, denoted by $H_0$. Thus, the null hypothesis is a statement on population parameters.

Although it is possible to make composite null hypotheses, in the context of the regression model the null hypothesis is always a simple hypothesis. That is to say, to formulate a null hypothesis, which shall be called $H_0$, we will always use the equality operator. Each equality implies a restriction on the parameters of the model. Let us look at a few examples of null hypotheses concerning the regression model:

a) $H_0: \beta_1 = 0$
b) $H_0: \beta_1 + \beta_2 = 0$
c) $H_0: \beta_1 = \beta_2 = 0$
d) $H_0: \beta_2 + \beta_3 = 1$

We will also define an alternative hypothesis, denoted by $H_1$, which will be our conclusion if the test indicates that $H_0$ is false. Although alternative hypotheses can be simple or composite, in the regression model we will always take a composite hypothesis as the alternative. This hypothesis is formulated using the inequality operator in most cases. Thus, for example, given the $H_0$:

$H_0: \beta_j = 1$ (4-1)

we can formulate the following $H_1$:

$H_1: \beta_j \neq 1$ (4-2)

which is a two-sided alternative hypothesis. The following hypotheses are called one-sided alternative hypotheses:

$H_1: \beta_j < 1$ (4-3)

$H_1: \beta_j > 1$ (4-4)

4.1.2 Test statistic

A test statistic is a function of a random sample, and is therefore a random variable. When we compute the statistic for a given sample, we obtain an outcome of the test statistic. To perform a statistical test we need to know the distribution of the test statistic under the null hypothesis. This distribution depends largely on the assumptions made in the model. If the specification of the model includes the assumption of normality, then the appropriate statistical distribution is the normal distribution or any of the distributions associated with it, such as the Chi-square, Student's t, or Snedecor's F.

Table 4.1 shows some distributions that are appropriate in different situations, under the assumption of normality of the disturbances.

TABLE 4.1. Some distributions used in hypothesis testing.

                      1 restriction    1 or more restrictions
$\sigma^2$ known      N                Chi-square
$\sigma^2$ unknown    Student's t      Snedecor's F

The statistic used for the test is built taking into account $H_0$ and the sample data. In practice, as $\sigma^2$ is always unknown, we will use the t and F distributions.

4.1.3 Decision rule

We are going to look at two approaches to hypothesis testing: the classical approach and an alternative one based on p-values. But before seeing how to apply the decision rule, we shall examine the types of mistakes that can be made in testing hypotheses.

Types of errors in hypothesis testing

In hypothesis testing, we can make two kinds of errors: Type I error and Type II error.

Type I error

We can reject $H_0$ when it is in fact true. This is called a Type I error. Generally, we define the significance level ($\alpha$) of a test as the probability of making a Type I error. Symbolically,

$\alpha = \Pr(\text{Reject } H_0 \mid H_0)$ (4-5)

In other words, the significance level is the probability of rejecting $H_0$ given that $H_0$ is true. Hypothesis testing rules are constructed so that the probability of a Type I error is fairly small. Common values for $\alpha$ are 0.10, 0.05 and 0.01, although sometimes 0.001 is also used.

After we have made the decision of whether or not to reject $H_0$, we have either decided correctly or we have made an error. We shall never know with certainty whether an error was made. However, we can compute the probability of making either a Type I error or a Type II error.

Type II error

We can fail to reject $H_0$ when it is actually false. This is called a Type II error.

$\beta = \Pr(\text{Not reject } H_0 \mid H_1)$ (4-6)

In words, $\beta$ is the probability of not rejecting $H_0$ given that $H_1$ is true.

It is not possible to minimize both types of error simultaneously. In practice, what we do is select a low significance level.

Classical approach: Implementation of the decision rule

The classical approach implies the following steps:

a) Choosing $\alpha$. Classical hypothesis testing requires that we initially specify a significance level for the test. When we specify a value for $\alpha$, we are essentially quantifying our tolerance for a Type I error. If $\alpha$=0.05, then the researcher is willing to falsely reject $H_0$ 5% of the time.

b) Obtaining c, the critical value, using statistical tables. The value c is determined by $\alpha$.

The critical value (c) for a hypothesis test is a threshold to which the value of the test statistic in a sample is compared to determine whether or not the null hypothesis is rejected.

c) Comparing the outcome of the test statistic, s, with c, $H_0$ is either rejected or not for a given $\alpha$.

The rejection region (RR), delimited by the critical value(s), is the set of values of the test statistic for which the null hypothesis is rejected (see figure 4.1). That is, the sample space for the test statistic is partitioned into two regions: one region (the rejection region) will lead us to reject the null hypothesis $H_0$, while the other will lead us not to reject it. Therefore, if the observed value of the test statistic is in the rejection region, we conclude by rejecting $H_0$; if it is not in the rejection region, then we conclude by not rejecting $H_0$, or failing to reject $H_0$. Symbolically,

If $s \geq c$, reject $H_0$
If $s < c$, do not reject $H_0$ (4-7)

If the null hypothesis is rejected with the evidence of the sample, this is a strong conclusion. However, the acceptance of the null hypothesis is a weak conclusion, because we do not know the probability of not rejecting the null hypothesis when it should be rejected. That is to say, we do not know the probability of making a Type II error. Therefore, instead of speaking of accepting the null hypothesis, it is more correct to say that we fail to reject the null hypothesis, or do not reject it, since what really happens is that we do not have enough empirical evidence to reject it.

In the process of hypothesis testing, the most subjective part is the a priori determination of the significance level. What criteria can be used to determine it? In general, this is an arbitrary decision, though, as we have said, the 10%, 5% and 1% levels for $\alpha$ are the most used in practice. Sometimes the testing is made conditional on several significance levels.

[Figure 4.1. Hypothesis testing: classical approach. The critical value c separates the non-rejection region (NRR) from the rejection region (RR).]

An alternative approach: p-value

With the use of computers, hypothesis testing can be contemplated from a more rational perspective.
Computer programs typically offer, together with the test statistic, a probability. This probability, which is called the p-value (i.e., probability value), is also known as the critical or exact level of significance, or the exact probability of making a Type I error. More technically, the p-value is defined as the lowest significance level at which a null hypothesis can be rejected.

Once the p-value has been determined, we know that the null hypothesis is rejected for any $\alpha \geq$ p-value, while the null hypothesis is not rejected when $\alpha <$ p-value. Therefore, the p-value is an indicator of the level of admissibility of the null hypothesis: the higher the p-value, the more confidence we can have in the null hypothesis. The use of the p-value turns hypothesis testing around. Instead of fixing the significance level a priori, the p-value is calculated, which allows us to determine the significance levels at which the null hypothesis is rejected.

In the following sections, we will see the use of the p-value in hypothesis testing put into practice.

4.2 Testing hypotheses using the t test

4.2.1 Test of a single parameter

The t test

Under the CLM assumptions 1 through 9,

$\hat{\beta}_j \sim N\left[\beta_j, var(\hat{\beta}_j)\right], \quad j = 1, 2, 3, \ldots, k$ (4-8)

If we standardize,

$\dfrac{\hat{\beta}_j - \beta_j}{\sqrt{var(\hat{\beta}_j)}} = \dfrac{\hat{\beta}_j - \beta_j}{sd(\hat{\beta}_j)} \sim N[0, 1], \quad j = 1, 2, 3, \ldots, k$ (4-9)

The claim for normality is usually made on the basis of the Central Limit Theorem (CLT), but this is restrictive in some cases. That is to say, normality cannot always be assumed. In any application, whether normality of u can be assumed is really an empirical matter. It is often the case that a transformation, e.g. taking logs, yields a distribution that is closer to normality, which is easy to handle from a mathematical point of view. Large samples will allow us to drop normality without affecting the results too much.

Under the CLM assumptions 1 through 9, we obtain a Student's t distribution

$\dfrac{\hat{\beta}_j - \beta_j}{se(\hat{\beta}_j)} \sim t_{n-k}$ (4-10)

where k is the number of unknown parameters in the population model (k-1 slope parameters and the intercept, $\beta_1$). The expression (4-10) is important because it allows us to test a hypothesis on $\beta_j$.

If we compare (4-10) with (4-9), we see that the Student's t distribution derives from the fact that the parameter $\sigma$ in $sd(\hat{\beta}_j)$ has been replaced by its estimator $\hat{\sigma}$, which is a random variable.
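The step from (4-9) to (4-10) can be sketched in one line. The sketch assumes a standard result under the CLM assumptions that is not derived here: $(n-k)\hat{\sigma}^2/\sigma^2$ follows a Chi-square distribution with $n-k$ degrees of freedom and is independent of $\hat{\beta}_j$. The symbol $Z$ for the standardized estimator in (4-9) is introduced only for this illustration.

```latex
\[
\frac{\hat{\beta}_j-\beta_j}{se(\hat{\beta}_j)}
  = \frac{(\hat{\beta}_j-\beta_j)/sd(\hat{\beta}_j)}{\sqrt{\hat{\sigma}^2/\sigma^2}}
  = \frac{Z}{\sqrt{\chi^2_{n-k}/(n-k)}} \sim t_{n-k},
\qquad Z \sim N[0,1].
\]
```

The first equality uses $se(\hat{\beta}_j) = sd(\hat{\beta}_j)\,\hat{\sigma}/\sigma$; the ratio of a standard normal variable to the square root of an independent Chi-square variable divided by its degrees of freedom is, by definition, a Student's t variable.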
Thus, the degrees of freedom of t are $n-k$, corresponding to the degrees of freedom used in the estimation of $\hat{\sigma}^2$.

When the degrees of freedom (df) in the t distribution are large, the t distribution approaches the standard normal distribution. In figure 4.2, the density functions of the normal distribution and of t distributions with different df are represented. As can be seen, the t density functions are lower at the centre and have heavier tails than the normal density, but as df increases they move closer to the normal density. In fact, what happens is that the t distribution takes into account that $\sigma^2$ is estimated because it is unknown. Given this uncertainty, the t distribution is more spread out than the normal one. However, as the df grow, the t distribution gets nearer to the normal distribution because the uncertainty from not knowing $\sigma^2$ decreases.

Therefore, the following convergence in distribution should be kept in mind:

$t_n \xrightarrow[n \to \infty]{} N(0, 1)$ (4-11)

Thus, when the number of degrees of freedom of a Student's t tends to infinity, the t distribution converges towards a N(0,1) distribution. In the context of testing a hypothesis, if the sample size grows, so will the degrees of freedom. This means that for large samples the normal distribution can be used to test hypotheses with a single restriction, even when the population variance is unknown. As a practical rule, when the df are larger than 120, we can take the critical values from the normal distribution.

[Figure 4.2. Density functions: normal and t for different degrees of freedom.]

Consider the null hypothesis

$H_0: \beta_j = 0$

Since $\beta_j$ measures the partial effect of $x_j$ on y after controlling for all other independent variables, $H_0: \beta_j = 0$ means that, once $x_2, x_3, \ldots, x_{j-1}, x_{j+1}, \ldots, x_k$ have been accounted for, $x_j$ has no effect on y. This is called a significance test. The statistic we use to test $H_0: \beta_j = 0$, against any alternative, is called the t statistic or the t ratio of $\hat{\beta}_j$ and is expressed as

$t_{\hat{\beta}_j} = \dfrac{\hat{\beta}_j}{se(\hat{\beta}_j)}$

To test $H_0: \beta_j = 0$, it is natural to look at our unbiased estimator of $\beta_j$, $\hat{\beta}_j$. In a given sample $\hat{\beta}_j$ will never be exactly zero, but a small value will indicate that the null hypothesis could be true, whereas a large value will indicate a false null hypothesis. The question is: how far is $\hat{\beta}_j$ from zero?

We must recognize that there is a sampling error in our estimate $\hat{\beta}_j$, and thus the size of $\hat{\beta}_j$ must be weighed against its sampling error. This is precisely what we do when we use $t_{\hat{\beta}_j}$, since this statistic measures how many standard errors $\hat{\beta}_j$ is away from zero. To determine a rule for rejecting $H_0$, we need to decide on the relevant alternative hypothesis. There are three possibilities: one-tail alternative hypotheses (right and left tail), and the two-tail alternative hypothesis.

One-tail alternative hypothesis: right

First, let us consider the null hypothesis

$H_0: \beta_j = 0$

against the alternative hypothesis

$H_1: \beta_j > 0$

This is a positive significance test. In this case, the decision rule is the following:

Decision rule

If $t_{\hat{\beta}_j} \geq t^{\alpha}_{n-k}$, reject $H_0$
If $t_{\hat{\beta}_j} < t^{\alpha}_{n-k}$, do not reject $H_0$ (4-12)

Therefore, we reject $H_0: \beta_j = 0$ in favor of $H_1: \beta_j > 0$ at $\alpha$ when $t_{\hat{\beta}_j} \geq t^{\alpha}_{n-k}$, as can be seen in figure 4.3. It is very clear that to reject $H_0$ against $H_1: \beta_j > 0$, we must get a positive $t_{\hat{\beta}_j}$. A negative $t_{\hat{\beta}_j}$, no matter how large in absolute value, provides no evidence in favor of $H_1: \beta_j > 0$. On the other hand, to obtain $t^{\alpha}_{n-k}$ in the t statistical table, we only need the significance level $\alpha$ and the degrees of freedom.

It is important to remark that as $\alpha$ decreases, $t^{\alpha}_{n-k}$ increases.

To a certain extent, the classical approach is somewhat arbitrary, since we need to choose $\alpha$ in advance, and eventually $H_0$ is either rejected or not.

In figure 4.4, the alternative approach is represented. As can be seen in the figure, determining the p-value is the inverse operation of finding the value in the statistical tables for a given significance level. Once the p-value has been determined, we know that $H_0$ is rejected for any level of significance $\alpha >$ p-value, while the null hypothesis is not rejected when $\alpha <$ p-value.
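Two of the remarks above, that the right-tail critical value $t^{\alpha}_{n-k}$ grows as $\alpha$ falls and that it approaches the normal critical value as the degrees of freedom grow, can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`t_pdf`, `t_upper_tail`, `t_critical`) are illustrative, and the tail probability is obtained by simple numerical integration rather than by any particular statistical package.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=400):
    """Pr(T > x) for x >= 0, integrating the density on [0, x] by Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(alpha, df):
    """Right-tail critical value c with Pr(T > c) = alpha, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid   # tail probability still above alpha: move right
        else:
            hi = mid
    return (lo + hi) / 2

# One row per significance level: t critical values for 10, 40 and 120 df,
# followed by the corresponding standard normal critical value.
for alpha in (0.10, 0.05, 0.01):
    normal_c = NormalDist().inv_cdf(1 - alpha)
    t_cs = [t_critical(alpha, df) for df in (10, 40, 120)]
    print(alpha, [round(c, 3) for c in t_cs], round(normal_c, 3))
```

Reading across a row shows the t critical values shrinking towards the normal one as df increases; reading down a column shows them rising as $\alpha$ decreases.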

[Figure 4.3. Rejection region using t: right-tail alternative hypothesis.]

[Figure 4.4. p-value using t: right-tail alternative hypothesis.]

EXAMPLE 4.1 Is the marginal propensity to consume smaller than the average propensity to consume?

As seen in example 1.1, testing the 3rd proposition of the Keynesian consumption function in a linear model is equivalent to testing whether the intercept is significantly greater than 0. That is to say, in the model

$cons = \beta_1 + \beta_2 inc + u$

we must test whether

$\beta_1 > 0$

With a random sample of 42 observations, the following results have been obtained:

$\widehat{cons}_i = 0.41 + 0.843\, inc_i$
           (0.35)  (0.062)

The numbers in parentheses, below the estimates, are standard errors (se) of the estimators.

The question we pose is the following: is the third proposition of the Keynesian theory admissible? Next, we answer this question.

1) In this case, the null and alternative hypotheses are the following:

$H_0: \beta_1 = 0$
$H_1: \beta_1 > 0$

2) The test statistic is:

$t = \dfrac{\hat{\beta}_1 - \beta_1^0}{se(\hat{\beta}_1)} = \dfrac{\hat{\beta}_1 - 0}{se(\hat{\beta}_1)} = \dfrac{0.41}{0.35} = 1.171$

3) Decision rule

It is useful to use several significance levels. Let us begin with a significance level of 0.10, because the value of t is relatively small (smaller than 1.5). In this case, the degrees of freedom are 40 (42 observations minus 2 estimated parameters). If we look at the t statistical table (row 40 and column 0.10 or 0.20, in statistical tables with one tail or two tails, respectively), we find $t^{0.10}_{40} = 1.303$.

As t<1.303, we do not reject $H_0$ for $\alpha$=0.10, and therefore we cannot reject it for $\alpha$=0.05 ($t^{0.05}_{40} = 1.684$) or $\alpha$=0.01 ($t^{0.01}_{40} = 2.423$), as can be seen in figure 4.5. In this figure, the rejection region corresponds to $\alpha$=0.10. Therefore, we cannot reject $H_0$ in favor of $H_1$. In other words, the sample data are not consistent with Keynes's proposition 3.

In the alternative approach, as can be seen in figure 4.6, the p-value corresponding to $t_{\hat{\beta}_1}$=1.171 for a t with 40 df is equal to 0.124. For $\alpha$<0.124 (for example, 0.10, 0.05 and 0.01), $H_0$ is not rejected.
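The arithmetic of example 4.1 can be reproduced with a short sketch. It takes the estimates as read from the example (intercept 0.41 with standard error 0.35, 42 observations, 2 estimated parameters) and the three critical values quoted from the t table; the helper functions are illustrative and compute the right-tail probability by numerical integration, so the p-value is an approximation.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """Pr(T > x) for x >= 0, integrating the density on [0, x] by Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Figures taken from example 4.1 (assumed readings of the reported estimates).
beta1_hat = 0.41   # estimated intercept
se_beta1 = 0.35    # its standard error
n, k = 42, 2       # observations and estimated parameters
df = n - k

t_stat = (beta1_hat - 0.0) / se_beta1   # H0: beta_1 = 0
p_value = t_upper_tail(t_stat, df)      # right-tail alternative H1: beta_1 > 0

# Critical values as quoted in the example from the t table with 40 df.
table_critical = {0.10: 1.303, 0.05: 1.684, 0.01: 2.423}

decisions = {}
for alpha, c in table_critical.items():
    classical_reject = t_stat >= c      # classical rule (4-12)
    p_value_reject = p_value <= alpha   # p-value rule
    decisions[alpha] = (classical_reject, p_value_reject)
    print(f"alpha={alpha:.2f}: classical reject={classical_reject}, "
          f"p-value reject={p_value_reject}")

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```

Both rules give the same answer at every level, which is the point of the p-value: a single number summarizes the outcome of the classical test for all choices of $\alpha$.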

[Figure 4.5. Example 4.1: Rejection region using t with a right-tail alternative hypothesis.]

[Figure 4.6. Example 4.1: p-value using t with a right-tail alternative hypothesis.]

One-tail alternative hypothesis: left

Consider now the null hypothesis

$H_0: \beta_j = 0$

against the alternative hypothesis

$H_1: \beta_j < 0$

This is a negative significance test. In this case, the decision rule is the following:

Decision rule

If $t_{\hat{\beta}_j} \leq -t^{\alpha}_{n-k}$, reject $H_0$
If $t_{\hat{\beta}_j} > -t^{\alpha}_{n-k}$, do not reject $H_0$ (4-13)

Therefore, we reject $H_0: \beta_j = 0$ in favor of $H_1: \beta_j < 0$ at a given $\alpha$ when $t_{\hat{\beta}_j} \leq -t^{\alpha}_{n-k}$, as can be seen in figure 4.7. It is very clear that to reject $H_0$ against $H_1: \beta_j < 0$, we must get a negative $t_{\hat{\beta}_j}$. A positive $t_{\hat{\beta}_j}$, no matter how large it is, provides no evidence in favor of $H_1: \beta_j < 0$.

In figure 4.8 the alternative approach is represented. Once the p-value has been determined, we know that $H_0$ is rejected for any level of significance $\alpha >$ p-value, while the null hypothesis is not rejected when $\alpha <$ p-value.
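The remark that a positive t statistic gives no evidence for a left-tail alternative can be seen directly from the p-value, which in this case is the probability to the left of the observed statistic. The sketch below uses hypothetical inputs (t statistics of -2.1 and 2.1 with 30 degrees of freedom, $\alpha$ = 0.05) chosen only for illustration; the function names are likewise illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """Pr(T <= x), using Simpson's rule and the symmetry of the density around zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                      # Pr(0 < T < |x|)
    return 0.5 + area if x >= 0 else 0.5 - area

def left_tail_test(t_stat, df, alpha):
    """Return (p_value, reject) for H0: beta_j = 0 against H1: beta_j < 0."""
    p_value = t_cdf(t_stat, df)               # probability to the left of t_stat
    return p_value, p_value <= alpha

df = 30   # hypothetical degrees of freedom
for t_stat in (-2.1, 2.1):   # hypothetical t statistics of equal size, opposite sign
    p_value, reject = left_tail_test(t_stat, df, alpha=0.05)
    print(f"t = {t_stat:+.1f}: p-value = {p_value:.3f}, reject H0 = {reject}")
```

For the negative statistic the p-value is small and $H_0$ is rejected; for the positive statistic of the same size the p-value exceeds 0.5, so $H_0$ cannot be rejected at any conventional level.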

[Figure 4.7. Rejection region using t: left-tail alternative hypothesis.]

[Figure 4.8. p-value using t: left-tail alternative hypothesis.]

EXAMPLE 4.2 Has income a negative influence on infant mortality?

The following model has been used to explain the deaths of children under 5 years per 1000 live births (deathun5):

$deathun5 = \beta_1 + \beta_2 gnipc + \beta_3 ilitrate + u$

where gnipc is the gross national income per capita and ilitrate is the adult (% 15 and older) illiteracy rate in percentage. With a sample of 130 countries (workfile hdr2010), the following estimation has been obtained:

$\widehat{deathun5}_i = 27.91 - 0.000826\, gnipc_i + 2.043\, ilitrate_i$
              (5.93)  (0.00028)            (0.183)

The numbers in parentheses, below the estimates, are standard errors (se) of the estimators.

One of the questions posed by researchers is whether income has a negative influence on infant mortality. To answer this question, the following hypothesis test is carried out. The null and alternative hypotheses, and the test statistic, are the following:

$H_0: \beta_2 = 0$
$H_1: \beta_2 < 0$

$t = \dfrac{\hat{\beta}_2}{se(\hat{\beta}_2)} = -2.966$

Since the t value is relatively high in absolute value, let us start testing with a level of 1%. For $\alpha$=0.01, $t^{0.01}_{127} \approx t^{0.01}_{60} = 2.390$. Given that t<-2.390, as is shown in figure 4.9, we reject $H_0$ in favour of $H_1$. Therefore, the gross national income per capita has an influence that is significantly negative on the mortality of children under 5. That is to say, the higher the gross national income per capita, the lower the mortality of children under 5. As $H_0$ has been rejected for $\alpha$=0.01, it will also be rejected for levels of 5% and 10%.

In the alternative approach, as can be seen in figure 4.10, the p-value corresponding to $t_{\hat{\beta}_2}$=-2.966 for a t with 127 df is equal to 0.002. For all $\alpha$>0.002, such as 0.01, 0.05 and 0.10, $H_0$ is rejected.

[Figure 4.9. Example 4.2: Rejection region using t with a left-tail alternative hypothesis.]

[Figure 4.10. Example 4.2: p-value using t with a left-tail alternative hypothesis.]

Two-tail alternative hypothesis

Consider now the null hypothesis

$H_0: \beta_j = 0$

against the alternative hypothesis

$H_1: \beta_j \neq 0$

This is the relevant alternative when the sign of $\beta_j$ is not well determined by theory or common sense. When the alternative is two-sided, we are interested in the absolute value of the t statistic. This is a significance test. In this case, the decision rule is the following:

Decision rule

If $|t_{\hat{\beta}_j}| \geq t^{\alpha/2}_{n-k}$, reject $H_0$
If $|t_{\hat{\beta}_j}| < t^{\alpha/2}_{n-k}$, do not reject $H_0$ (4-14)

Therefore, we reject $H_0: \beta_j = 0$ in favor of $H_1: \beta_j \neq 0$ at $\alpha$ when $|t_{\hat{\beta}_j}| \geq t^{\alpha/2}_{n-k}$, as can be seen in figure 4.11. In this case, to reject $H_0$ against $H_1: \beta_j \neq 0$, we must obtain a large enough $t_{\hat{\beta}_j}$, which can be either positive or negative.

It is important to remark that as $\alpha$ decreases, $t^{\alpha/2}_{n-k}$ increases in absolute value.

In the alternative approach, once the p-value has been determined, we know that $H_0$ is rejected for any level of significance $\alpha >$ p-value, while the null hypothesis is not rejected when $\alpha <$ p-value. In this case, the p-value is distributed between both tails in a symmetrical way, as is shown in figure 4.12.

When a specific alternative hypothesis is not stated, the test is usually taken to be two-sided. If $H_0$ is rejected in favor of $H_1$ at a given $\alpha$, we usually say that $x_j$ is statistically significant at the level $\alpha$.

[Figure 4.11. Rejection region using t: two-tail alternative hypothesis.]

[Figure 4.12. p-value using t: two-tail alternative hypothesis.]

EXAMPLE 4.3 Does the crime rate play a role in the price of houses in an area?

To explain housing prices in an American town, the following model is estimated:

$price = \beta_1 + \beta_2 rooms + \beta_3 lowstat + \beta_4 crime + u$

where rooms is the number of rooms of the house, lowstat is the percentage of people of lower status in the area and crime is crimes committed per capita in the area.

The output for the fitted model, using the file hprice2 (first 55 observations), appears in table 4.2 and has been taken from E-views. The meaning of the first three columns is clear. The "t-Statistic" column is the outcome needed to perform a significance test, that is to say, it is the ratio between the "Coefficient" and the "Std. Error"; and "Prob." is the p-value to perform a two-tailed test.
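How the last two columns of such an output follow from the first two can be sketched as below, using the CRIME row of table 4.2 (coefficient -3853.564, standard error 959.5618, n = 55 and k = 4). The helper functions are illustrative and approximate the two-tail probability by numerical integration; this is not a description of how E-views computes it internally.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(x, df, steps=2000):
    """Pr(T > x) for x >= 0, integrating the density on [0, x] by Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# CRIME row of table 4.2.
coefficient = -3853.564
std_error = 959.5618
n, k = 55, 4
df = n - k

t_statistic = coefficient / std_error             # "t-Statistic" column
prob = 2 * t_upper_tail(abs(t_statistic), df)     # "Prob." column: both tails

print(f"t-Statistic = {t_statistic:.6f}, Prob. = {prob:.4f}")
for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if prob <= alpha else "do not reject H0"
    print(f"alpha = {alpha:.2f}: {decision}")
```

Doubling the one-tail probability reflects the symmetric split of the p-value between the two tails described above.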

In relation to this model, the researcher asks whether the crime rate in an area plays a role in the price of houses in that area. To answer this question, the following procedure has been carried out.

In this case, the null and alternative hypotheses and the test statistic are the following:

$H_0: \beta_4 = 0$
$H_1: \beta_4 \neq 0$

$t = \dfrac{\hat{\beta}_4}{se(\hat{\beta}_4)} = \dfrac{-3854}{960} = -4.016$

TABLE 4.2. Standard output in the regression explaining house price. n=55.

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -15693.61      8021.989      -1.956324      0.0559
ROOMS       6788.401       1210.72       5.606910       0.0000
LOWSTAT     -268.1636      80.70678      -3.322690      0.0017
CRIME       -3853.564      959.5618      -4.015962      0.0002

Since the t value is relatively high in absolute value, let us start by testing with a level of 1%. For $\alpha$=0.01, $t^{0.01/2}_{51} \approx t^{0.01/2}_{50} = 2.69$. (In the usual statistical tables for the t distribution, there is no information for each df above 20.) Given that $|t| > 2.69$, we reject $H_0$ in favour of $H_1$. Therefore, crime has a significant influence on housing prices for a significance level of 1% and, thus, of 5% and 10%.

In the alternative approach, we can perform the test with more precision. In table 4.2 we see that the p-value for the coefficient of crime is 0.0002. That means that the probability of the t statistic being greater than 4.016 is 0.0001 and the probability of t being smaller than -4.016 is 0.0001. That is to say, the p-value, as shown in figure 4.13, is distributed between the two tails. As can be seen in this figure, $H_0$ is rejected for all significance levels greater than 0.0002, such as 0.01, 0.05 and 0.10.

[Figure 4.13. Example 4.3: p-value using t with a two-tail alternative hypothesis.]

So far we have seen one-tail and two-tail significance tests, in which a parameter takes the value 0 in $H_0$. Now we are going to look at a more general case where the parameter in $H_0$ takes any value:

$H_0: \beta_j = \beta_j^0$

Thus, the appropriate t statistic is

$t_{\hat{\beta}_j} = \dfrac{\hat{\beta}_j - \beta_j^0}{se(\hat{\beta}_j)}$

As before, $t_{\hat{\beta}_j}$ measures how many estimated standard deviations $\hat{\beta}_j$ is away from the hypothesized value of $\beta_j$.

EXAMPLE 4.4 Is the elasticity expenditure in fruit/income equal to 1? Is fruit a luxury good?

To answer these questions, we are going to use the following model for the expenditure in fruit:

$\ln(fruit) = \beta_1 + \beta_2 \ln(inc) + \beta_3 househsize + \beta_4 punder5 + u$

where inc is disposable income of the household, househsize is the number of household members and punder5 is the proportion of children under five in the household. As the variables fruit and inc appear expressed in natural logarithms, $\beta_2$ is the expenditure in fruit/income elasticity. Using a sample of 40 households (workfile demand), the results of table 4.3 have been obtained.

TABLE 4.3. Standard output in a regression explaining expenditure in fruit. n=40.

Variable      Coefficient    Std. Error    t-Statistic    Prob.
C             -9.767654      3.701469      -2.638859      0.0122
LN(INC)       2.004539       0.512370      3.912286       0.0004
HOUSEHSIZE    -1.205348      0.178646      -6.747147      0.0000
PUNDER5       -0.017946      0.013022      -1.378128      0.1767

Is the expenditure in fruit/income elasticity equal to 1?

To answer this question, the following procedure has been carried out. In this case, the null and alternative hypotheses and the test statistic are the following:

$H_0: \beta_2 = 1$
$H_1: \beta_2 \neq 1$

$t = \dfrac{\hat{\beta}_2 - \beta_2^0}{se(\hat{\beta}_2)} = \dfrac{\hat{\beta}_2 - 1}{se(\hat{\beta}_2)} = \dfrac{2.005 - 1}{0.512} = 1.961$

For $\alpha$=0.10, we find that $t^{0.10/2}_{36} \approx t^{0.10/2}_{35} = 1.69$. As $|t|$>1.69, we reject $H_0$. For $\alpha$=0.05, $t^{0.05/2}_{36} \approx t^{0.05/2}_{35} = 2.03$. As $|t|$<2.03, we do not reject $H_0$ for $\alpha$=0.05, nor for $\alpha$=0.01. Therefore, we reject that the expenditure on fruit/income elasticity is equal to 1 for $\alpha$=0.10, but we cannot reject it for $\alpha$=0.05, nor for $\alpha$=0.01.

Is fruit a luxury good?

According to economic theory, a commodity is a luxury good when its expenditure elasticity with respect to income is higher than 1. Therefore, to answer the second question, and taking into account that the t statistic is the same, the following procedure has been carried out:

$H_0: \beta_2 = 1$
$H_1: \beta_2 > 1$

For $\alpha$=0.10, we find that $t^{0.10}_{36} \approx t^{0.10}_{35} = 1.31$. As t>1.31, we reject $H_0$ in favour of $H_1$. For $\alpha$=0.05, $t^{0.05}_{36} \approx t^{0.05}_{35} = 1.69$. As t>1.69, we reject $H_0$ in favour of $H_1$. For $\alpha$=0.01, $t^{0.01}_{36} \approx t^{0.01}_{35} = 2.44$. As t<2.44, we do not reject $H_0$. Therefore, fruit is a luxury good for $\alpha$=0.10 and $\alpha$=0.05, but we cannot reject $H_0$ in favour of $H_1$ for $\alpha$=0.01.

EXAMPLE 4.5 Is the Madrid stock exchange market efficient?

Before answering this question, we will examine some preliminary concepts. The rate of return of an asset over a period of time is defined as the percentage change in the value invested in the asset during that period of time.
Let us now consder a specfc asset: a share of an ndustral company acqured n a Spansh stock market at the end of one year and remans untl the end of next year. Those two moments of tme wll be denoted by t-1 and t respectvely. The rate of return of ths acton wthn that year can be expressed by the followng relatonshp: D Pt + Dt + At RA (4-15) t where Pt: s the share prce at the end of perod t, Dt: are the dvdends receved by the share durng the perod t, and At: s the value of the rghts that eventually corresponded to the share durng the perod t Thus, the numerator of (4-15) summarzes the three types of captal gans that have been receved for the mantenance of a share n year t; that s to say, an ncrease or decrease n quotaton, dvdends and rghts on captal ncrease. Dvdng by Pt-1, we obtan the rate of proft on share value at the end of the prevous perod. Of these three components, the most mportant one s the ncrease n quotaton. Consderng only that component, the yeld rate of the acton can be expressed by DP RA (4-16) Pt - 1 t 1t Pt - 1 13

or, alternatively, if we use a relative rate of variation, by

$$RA2_t = \Delta \ln P_t$$ (4-17)

In the same way as $RA_t$ represents the rate of return of a particular share in either of the two expressions, we can also calculate the rate of return of all shares listed in the stock exchange. The latter rate of return, which will be denoted by $RM_t$, is called the market rate of return.

So far we have considered the rate of return in a year, but we can also apply expressions such as (4-16), or (4-17), to obtain daily rates of return. It is interesting to know whether the rates of return in the past are useful for predicting rates of return in the future. This question is related to the concept of market efficiency. A market is efficient if prices incorporate all available information, so there is no possibility of making abnormal profits by using this information.

In order to test the efficiency of a market, we define the following model, using daily rates of return defined by (4-16):

$$rmad92_t = \beta_1 + \beta_2\, rmad92_{t-1} + u_t$$ (4-18)

If a market is efficient, then the parameter $\beta_2$ of the previous model must be 0. Let us now examine whether the Madrid Stock Exchange is efficient as a whole. The model (4-18) has been estimated with daily data from the Madrid Stock Exchange for 1992, using file bolmadef. The results obtained are the following (standard errors in parentheses):

$$\widehat{rmad92}_t = \underset{(0.0007)}{0.0004} + \underset{(0.0629)}{0.1267}\, rmad92_{t-1} \qquad R^2 = 0.0163 \quad n = 247$$

The results are paradoxical. On the one hand, the coefficient of determination is very low (0.0163), which means that only 1.63% of the total variance of the rate of return is explained by the previous day's rate of return. On the other hand, the coefficient corresponding to the rate of return of the previous day is statistically significant at a level of 5% but not at a level of 1%, given that the t statistic is equal to 0.1267/0.0629 ≈ 2.01, which is slightly larger in absolute value than $t_{245}^{0.05/2} \approx t_{60}^{0.05/2} = 2.00$. The reason for this apparent paradox is that the sample size is very high.
Thus, although the impact of the explanatory variable on the endogenous variable is relatively small (as indicated by the coefficient of determination), this finding is significant (as evidenced by the t statistic) because the sample is sufficiently large.

To answer the question as to whether the Madrid Stock Exchange is an efficient market, we can say that it is not entirely efficient. However, this response should be qualified. In financial economics there is a dependency relationship of the rate of return of one day with respect to the rate corresponding to the previous day. This relationship is not very strong, although it is statistically significant in many world stock markets due to market frictions. In any case, market players cannot exploit this phenomenon, and thus the market is not inefficient, according to the above definition of the concept of efficiency.

EXAMPLE 4.6 Is the rate of return of the Madrid Stock Exchange affected by the rate of return of the Tokyo Stock Exchange?
The study of the relationship between different stock markets (NYSE, Tokyo Stock Exchange, Madrid Stock Exchange, London Stock Exchange, etc.) has received much attention in recent years due to the greater freedom in the movement of capital and the use of foreign markets to reduce the risk in portfolio management. This is because the absence of perfect market integration allows diversification of risk. In any case, there is a world trend toward a greater global integration of financial markets in general and stock markets in particular.

If markets are efficient, and we have seen in example 4.5 that they are, the innovations (new information) will be reflected in the different markets within a period of 24 hours.

It is important to distinguish between two types of innovations:
a) global innovations, which are news generated around the world that influence stock prices in all markets;
b) specific innovations, which are information generated during a 24 hour period that only affects the price of a particular market.
Thus, information on the evolution of oil prices can be considered as a global innovation, while a new financial sector regulation in a country would be considered a specific innovation.

According to the above discussion, stock prices quoted at a session of a particular stock market are affected by the global innovations of a different market which had closed earlier. Thus, global innovations included in the Tokyo market will influence the market prices of Madrid on the same day.

The following model shows the transmission of effects between the Tokyo Stock Exchange and the Madrid Stock Exchange in 1992:

$$rmad92_t = \beta_1 + \beta_2\, rtok92_t + u_t$$ (4-19)

where $rmad92_t$ is the rate of return of the Madrid Stock Exchange in period t and $rtok92_t$ is the rate of return of the Tokyo Stock Exchange in period t. The rates of return have been calculated according to (4-16).

In the working file madtok you can find general indices of the Madrid Stock Exchange and the Tokyo Stock Exchange during the days both exchanges were open simultaneously in 1992. That is, we eliminated observations for those days when either of the two stock exchanges was closed. In total, the number of observations is 234, compared to the 247 and 246 days that the Madrid and Tokyo Stock Exchanges were open.

The estimation of the model (4-19) is as follows (standard errors in parentheses):

$$\widehat{rmad92}_t = \underset{(0.0007)}{0.0005} + \underset{(0.0375)}{0.1244}\, rtok92_t \qquad R^2 = 0.045 \quad n = 235$$

Note that the coefficient of determination is relatively low. However, for testing $H_0: \beta_2 = 0$, the statistic t = (0.1244/0.0375) = 3.32, which implies that we reject the hypothesis that the rate of return of the Tokyo Stock Exchange has no effect on the rate of return of the Madrid Stock Exchange, for a significance level of 0.01.

Once again we find the same apparent paradox which appeared when we analyzed the efficiency of the Madrid Stock Exchange in example 4.5, except for one difference. In the latter case, the rate of return from the previous day appeared as significant due to problems arising in the elaboration of the general index of the Madrid Stock Exchange.

Consequently, the fact that the null hypothesis is rejected implies that there is empirical evidence supporting the theory that global innovations from the Tokyo Stock Exchange are transmitted to the quotes of the Madrid Stock Exchange that day.

4.2.2 Confidence intervals
Under the CLM, we can easily construct a confidence interval (CI) for the population parameter, $\beta_j$. CI are also called interval estimates because they provide a range of likely values for $\beta_j$, and not just a point estimate.
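The CI is built in such a way that the unknown parameter is contained within the range of the CI with a previously specified probability.

By using the fact that

$$\frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \sim t_{n-k}$$

we have

$$\Pr\left[-t_{n-k}^{\alpha/2} \le \frac{\hat\beta_j - \beta_j}{se(\hat\beta_j)} \le t_{n-k}^{\alpha/2}\right] = 1-\alpha$$

Operating to put the unknown $\beta_j$ alone in the middle of the interval, we have

$$\Pr\left[\hat\beta_j - se(\hat\beta_j)\, t_{n-k}^{\alpha/2} \le \beta_j \le \hat\beta_j + se(\hat\beta_j)\, t_{n-k}^{\alpha/2}\right] = 1-\alpha$$

Therefore, the lower and upper bounds of a $(1-\alpha)$ CI respectively are given by

$$\underline{\beta}_j = \hat\beta_j - se(\hat\beta_j)\, t_{n-k}^{\alpha/2} \qquad \overline{\beta}_j = \hat\beta_j + se(\hat\beta_j)\, t_{n-k}^{\alpha/2}$$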
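The bounds above can be sketched numerically. The snippet below is only an illustration: it assumes the LN(INC) coefficient and standard error as read from Table 4.3 (2.004539 and 0.512370) and the tabulated critical value $t_{35}^{0.05/2} = 2.03$ used in example 4.4; the function name `confidence_interval` is hypothetical.

```python
def confidence_interval(beta_hat, se, t_crit):
    """Return (lower, upper) = beta_hat -/+ se * t_crit."""
    half_width = se * t_crit
    return beta_hat - half_width, beta_hat + half_width


# Assumed inputs: values read from Table 4.3 and a tabulated critical value.
beta_hat = 2.004539   # estimated fruit/income elasticity
se = 0.512370         # its standard error
t_crit = 2.03         # t_35^{0.05/2}, used as an approximation to t_36^{0.05/2}

lower, upper = confidence_interval(beta_hat, se, t_crit)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")

# A two-tailed test of H0: beta = 1 at the 5% level rejects only if 1 is outside the CI.
contains_one = lower <= 1.0 <= upper
print("1 inside the 95% CI:", contains_one)
```

Under these assumed inputs the value 1 falls inside the interval, which agrees with not rejecting a unit elasticity at the 5% level in example 4.4.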

If random samples were obtained over and over again with $\underline{\beta}_j$ and $\overline{\beta}_j$ computed each time, then the (unknown) population value $\beta_j$ would lie in the interval $(\underline{\beta}_j, \overline{\beta}_j)$ for $100(1-\alpha)\%$ of the samples. Unfortunately, for the single sample that we use to construct the CI, we do not know whether $\beta_j$ is actually contained in the interval.

Once a CI is constructed, it is easy to carry out two-tailed hypothesis tests. If the null hypothesis is $H_0: \beta_j = a_j$, then $H_0$ is rejected against $H_1: \beta_j \neq a_j$ at (say) the 5% significance level if, and only if, $a_j$ is not in the 95% CI.

To illustrate this matter, in figure 4.14 we constructed confidence intervals of 90%, 95% and 99% for the marginal propensity to consume ($\beta_2$) corresponding to example 4.1.

[Figure 4.14: intervals centred on the point estimate 0.843. 90%: (0.739, 0.947); 95%: (0.718, 0.968); 99%: (0.675, 1.011).]
FIGURE 4.14. Confidence intervals for marginal propensity to consume in example 4.1.

4.2.3 Testing hypotheses about a single linear combination of the parameters
In many applications we are interested in testing a hypothesis involving more than one of the population parameters. We can also use the t statistic to test a single linear combination of the parameters, where two or more parameters are involved.

There are two different procedures to perform the test with a single linear combination of parameters. In the first, the standard error of the linear combination of parameters corresponding to the null hypothesis is calculated using information on the covariance matrix of the estimators. In the second, the model is reparameterized by introducing a new parameter derived from the null hypothesis and the reparameterized model is then estimated; testing for the new parameter indicates whether the null hypothesis is rejected or not. The following example illustrates both procedures.

EXAMPLE 4.7 Are there constant returns to scale in the chemical industry?
To examine whether there are constant returns to scale in the chemical sector, we are going to use the Cobb-Douglas production function, given by

$$\ln(output) = \beta_1 + \beta_2 \ln(labor) + \beta_3 \ln(capital) + u$$ (4-20)

In the above model, parameters $\beta_2$ and $\beta_3$ are elasticities (output/labor and output/capital).
Before making inferences, remember that returns to scale refers to a technical property of the production function examining changes in output subsequent to a change of the same proportion in all inputs, which are labor and capital in this case. If output increases by that same proportional change then there are constant returns to scale. Constant returns to scale imply that if the factors labor and capital increase at a certain rate (say 1%), output will increase at the same rate (e.g., 1%). If output increases by more than that proportion, there are increasing returns to scale. If output increases by less than that proportional change, there are decreasing returns to scale. In the above model, the following occurs:
- if $\beta_2 + \beta_3 = 1$, there are constant returns to scale.
- if $\beta_2 + \beta_3 > 1$, there are increasing returns to scale.
- if $\beta_2 + \beta_3 < 1$, there are decreasing returns to scale.
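The three cases above can be checked with a short derivation. This is a sketch that assumes the Cobb-Douglas form (4-20) written in levels and a common scaling factor $\lambda > 0$ (a symbol introduced here only for illustration) applied to both inputs:

$$
\begin{aligned}
output &= e^{\beta_1}\, labor^{\beta_2}\, capital^{\beta_3}\, e^{u} \\
output_{\lambda} &= e^{\beta_1}\, (\lambda\, labor)^{\beta_2}\, (\lambda\, capital)^{\beta_3}\, e^{u}
 = \lambda^{\beta_2+\beta_3}\, output \\
\ln(output_{\lambda}) &= \ln(output) + (\beta_2+\beta_3)\ln\lambda
\end{aligned}
$$

Output is therefore multiplied by exactly $\lambda$ when $\beta_2+\beta_3=1$, by more than $\lambda$ when $\beta_2+\beta_3>1$ (for $\lambda>1$), and by less than $\lambda$ when $\beta_2+\beta_3<1$.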

Data used for this example are a sample of 27 companies of the primary metal sector (workfile prodmet), where output is gross value added, labor is a measure of labor input, and capital is the gross value of plant and equipment. Further details on construction of the data are given in Aigner, et al. (1977) and in Hildebrand and Liu (1957); these data were used by Greene in 1991. The results obtained in the estimation of model (4-20), using any econometric software available, appear in table 4.4.

TABLE 4.4. Standard output of the estimation of the production function: model (4-20).
Variable       Coefficient   Std. Error   t-statistic   Prob.
constant       1.170644      0.326782     3.582339      0.0015
ln(labor)      0.602999      0.125954     4.787457      0.0001
ln(capital)    0.375710      0.085346     4.402204      0.0002

To answer the question posed in this example, we must test

$$H_0: \beta_2 + \beta_3 = 1$$

against the following alternative hypothesis

$$H_1: \beta_2 + \beta_3 \neq 1$$

Two procedures will be used to test this hypothesis. In the first, the covariance matrix of the estimators is used. In the second, the model is reparameterized by introducing a new parameter.

Procedure: using covariance matrix of estimators
According to $H_0$, it is stated that $\beta_2 + \beta_3 - 1 = 0$. Therefore, the t statistic must now be based on whether the estimated sum $\hat\beta_2 + \hat\beta_3 - 1$ is sufficiently different from 0 to reject $H_0$ in favor of $H_1$. To account for the sampling error in our estimators, we standardize this sum by dividing by its standard error:

$$t_{\hat\beta_2+\hat\beta_3} = \frac{\hat\beta_2 + \hat\beta_3 - 1}{se(\hat\beta_2 + \hat\beta_3)}$$

Therefore, if $|t_{\hat\beta_2+\hat\beta_3}|$ is large enough, we will conclude, in a two-sided alternative test, that there are not constant returns to scale. On the other hand, if $t_{\hat\beta_2+\hat\beta_3}$ is positive and large enough, we will reject, in a one-sided alternative test (right), $H_0$ in favour of $H_1: \beta_2 + \beta_3 > 1$. Therefore, there are increasing returns to scale.

On the other hand, we have

$$se(\hat\beta_2 + \hat\beta_3) = \sqrt{\widehat{var}(\hat\beta_2 + \hat\beta_3)}$$

where

$$\widehat{var}(\hat\beta_2 + \hat\beta_3) = \widehat{var}(\hat\beta_2) + \widehat{var}(\hat\beta_3) + 2\,\widehat{covar}(\hat\beta_2, \hat\beta_3)$$

Hence, to compute $se(\hat\beta_2 + \hat\beta_3)$ you need information on the estimated covariance of the estimators.
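Many econometric software packages (such as E-views) have an option to display estimates of the covariance matrix of the estimator vector. In this case, the covariance matrix obtained appears in table 4.5. Using this information, we have

$$se(\hat\beta_2 + \hat\beta_3) = \sqrt{0.015864 + 0.007284 - 2 \times 0.009616} = 0.0626$$

$$t_{\hat\beta_2+\hat\beta_3} = \frac{\hat\beta_2 + \hat\beta_3 - 1}{se(\hat\beta_2 + \hat\beta_3)} = \frac{-0.02129}{0.0626} = -0.34$$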
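The calculation above can be reproduced with a few lines of code. This is a minimal sketch, assuming the coefficient estimates of table 4.4 and the variance and covariance entries of table 4.5 as read from the text; the helper name `se_of_sum` is hypothetical.

```python
import math


def se_of_sum(var_a, var_b, cov_ab):
    """Standard error of (a + b): sqrt(var(a) + var(b) + 2*cov(a, b))."""
    return math.sqrt(var_a + var_b + 2.0 * cov_ab)


# Assumed inputs, read from tables 4.4 and 4.5.
b_labor, b_capital = 0.602999, 0.375710
var_labor, var_capital = 0.015864, 0.007284
cov_labor_capital = -0.009616

se_sum = se_of_sum(var_labor, var_capital, cov_labor_capital)
t_stat = (b_labor + b_capital - 1.0) / se_sum   # tests H0: beta2 + beta3 = 1

print(f"se = {se_sum:.4f}, t = {t_stat:.2f}")
```

The negative covariance between the two estimators is what makes the standard error of the sum (about 0.063) smaller than either individual standard error.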

TABLE 4.5. Covariance matrix in the production function.
               constant     ln(labor)    ln(capital)
constant        0.106786    -0.019835     0.001189
ln(labor)      -0.019835     0.015864    -0.009616
ln(capital)     0.001189    -0.009616     0.007284

Given that t=-0.34, it is clear that we cannot reject the existence of constant returns to scale for the usual significance levels. Given that the t statistic is negative, it makes no sense to test whether there are increasing returns to scale.

Procedure: reparameterizing the model by introducing a new parameter
It is easier to perform the test if we apply the second procedure. A different model is estimated in this procedure, which directly provides the standard error of interest. Thus, let us define:

$$\theta = \beta_2 + \beta_3 - 1$$

thus, the null hypothesis that there are constant returns to scale is equivalent to saying that $H_0: \theta = 0$.

From the definition of $\theta$ we have $\beta_2 = \theta - \beta_3 + 1$. Substituting in the original equation:

$$\ln(output) = \beta_1 + (\theta - \beta_3 + 1)\ln(labor) + \beta_3 \ln(capital) + u$$

Hence,

$$\ln(output/labor) = \beta_1 + \theta \ln(labor) + \beta_3 \ln(capital/labor) + u$$

Therefore, to test whether there are constant returns to scale is equivalent to carrying out a significance test on the coefficient of ln(labor) in the previous model. The strategy of rewriting the model so that it contains the parameter of interest works in all cases and is usually easy to implement. If we apply this transformation to this example, we obtain the results of Table 4.6. As can be seen, we obtain the same result:

$$t_{\hat\theta} = \frac{\hat\theta}{se(\hat\theta)} = -0.34$$

TABLE 4.6. Estimation output for the production function: reparameterized model.
Variable             Coefficient   Std. Error   t-statistic   Prob.
constant              1.170644     0.326782      3.582339     0.0015
ln(labor)            -0.021290     0.062577     -0.340227     0.7366
ln(capital/labor)     0.375710     0.085346      4.402204     0.0002

EXAMPLE 4.8 Advertising or incentives?
The Bush Company is engaged in the sale and distribution of gifts imported from the Near East. The most popular item in the catalog is the Guantanamo bracelet, which has some relaxing properties. The sales agents receive a commission of 3% of total sales amount.
In order to increase sales without expanding the sales network, the company established special incentives for those agents who exceeded a sales target during the last year. Advertising spots were broadcast on radio in different areas to strengthen the promotion of sales. In those spots special emphasis was placed on highlighting the well-being of wearing a Guantanamo bracelet.

The manager of the Bush Company wonders whether a dollar spent on special incentives has a higher incidence on sales than a dollar spent on advertising. To answer that question, the company's econometrician suggests the following model to explain sales:

$$sales = \beta_1 + \beta_2\, advert + \beta_3\, incent + u$$

where incent are incentives to the salesmen and advert are expenditures on advertising. The variables sales, incent and advert are expressed in thousands of dollars.

Using a sample of 18 sale areas (workfile advincen), we have obtained the output and the covariance matrix of the coefficients that appear in table 4.7 and in table 4.8 respectively.

TABLE 4.7. Standard output of the regression for example 4.8.
Variable    Coefficient   Std. Error   t-statistic   Prob.
constant    396.5945      3548.111     0.111776      0.9125
advert      18.63673      8.924339     2.088304      0.054
incent      30.69686      3.604420     8.516448      0.0000

TABLE 4.8. Covariance matrix for example 4.8.
            C           ADVERT    INCENT
constant    12589095    -6674     -711
advert      -6674       79.644    2.941
incent      -711        2.941     12.992

In this model, the coefficient $\beta_2$ indicates the increase in sales produced by a dollar increase in spending on advertising, while $\beta_3$ indicates the increase caused by a dollar increase in the special incentives, holding fixed in both cases the other regressor.

To answer the question posed in this example, the null and the alternative hypotheses are the following:

$$H_0: \beta_3 - \beta_2 = 0 \qquad H_1: \beta_3 - \beta_2 > 0$$

The t statistic is built using information about the covariance matrix of the estimators:

$$t_{\hat\beta_3-\hat\beta_2} = \frac{\hat\beta_3 - \hat\beta_2}{se(\hat\beta_3 - \hat\beta_2)}$$

$$se(\hat\beta_3 - \hat\beta_2) = \sqrt{79.644 + 12.992 - 2 \times 2.941} = 9.3142$$

$$t_{\hat\beta_3-\hat\beta_2} = \frac{30.697 - 18.637}{9.3142} = 1.295$$

For $\alpha=0.10$, we find that $t_{15}^{0.10} = 1.341$. As t<1.341, we do not reject $H_0$ for $\alpha=0.10$, nor for $\alpha=0.05$ or $\alpha=0.01$. Therefore, there is no empirical evidence that a dollar spent on special incentives has a higher incidence on sales than a dollar spent on advertising.

EXAMPLE 4.9 Testing the hypothesis of homogeneity in the demand for fish
In the case study in chapter 2, models for demand for dairy products have been estimated from cross-sectional data, using disposable income as an explanatory variable. However, the price of the product itself and, to a greater or lesser extent, the prices of other goods are determinants of the demand. The demand analysis based on cross-sectional data has precisely the limitation that it is not possible to examine the effect of prices on demand because prices remain constant, since the data refer to the same point in time. To analyze the effect of prices it is necessary to use time series data or, alternatively, panel data. We will briefly examine some aspects of the theory of demand for a good and then move to the estimation of a demand function with time series data.
As a postscript to this case, we will test one of the hypotheses which, under certain circumstances, a theoretical model must satisfy.

The demand for a commodity (say good j) can be expressed, according to an optimization process carried out by the consumer, in terms of disposable income, the price of the good and the prices of the other goods. Analytically:

$$q_j = f_j(p_1, p_2, \ldots, p_j, \ldots, p_m, R)$$ (4-21)

where
- R is the disposable income of the consumer.
- $p_1, p_2, \ldots, p_j, \ldots, p_m$ are the prices of the goods which are taken into account by consumers when they acquire good j.

Logarithmic models are attractive in studies on demand, because the coefficients are directly elasticities. The log model is given by

$$\ln(q_j) = \beta_1 + \beta_2 \ln(p_1) + \beta_3 \ln(p_2) + \cdots + \beta_{j+1}\ln(p_j) + \cdots + \beta_{m+1}\ln(p_m) + \beta_{m+2}\ln(R) + u$$ (4-22)

It is clear that all coefficients, excluding the constant term, are elasticities of different types and therefore are independent of the units of measurement of the variables. When there is no money illusion, if all prices and income grow at the same rate, the demand for a good is not affected by these changes. Thus, assuming that prices and income are multiplied by $\lambda$, if the consumer has no money illusion, the following should be satisfied

$$f_j(\lambda p_1, \lambda p_2, \ldots, \lambda p_j, \ldots, \lambda p_m, \lambda R) = f_j(p_1, p_2, \ldots, p_j, \ldots, p_m, R)$$ (4-23)

From a mathematical point of view, the above condition implies that the demand function must be homogeneous of degree 0. This condition is called the restriction of homogeneity. Applying Euler's theorem, the restriction of homogeneity in turn implies that the sum of the demand/income elasticity and of all demand/price elasticities is zero, i.e.:

$$\sum_{h=1}^{m} \varepsilon_{q_j/p_h} + \varepsilon_{q_j/R} = 0$$ (4-24)

This restriction applied to the logarithmic model (4-22) implies that

$$\beta_2 + \beta_3 + \cdots + \beta_{m+1} + \beta_{m+2} = 0$$ (4-25)

In practice, when estimating a demand function, the prices of many goods are not included, but only those that are closely related, either because they are complementary or substitute goods. It is also well known that the budgetary allocation of spending is carried out in several stages.

Next, the demand for fish in Spain will be studied by using a model similar to (4-22). Let us consider that in a first assignment, the consumer distributes their income between total consumption and savings. In a second stage, the consumption expenditure by function is performed taking into account the total consumption and the relevant prices in each function. Specifically, we assume that the only relevant prices in the demand for fish are the price of the good (fish) and the price of the most important substitute (meat).

Given the above considerations, the following model is formulated:

$$\ln(fish) = \beta_1 + \beta_2 \ln(fishpr) + \beta_3 \ln(meatpr) + \beta_4 \ln(cons) + u$$ (4-26)

where fish is fish expenditure at constant prices, fishpr is the price of fish, meatpr is the price of meat and cons is total consumption at constant prices.

The workfile fishdem contains information about these series for the period 1964-1991.
Prices are index numbers with 1986 as a base, and fish and cons are magnitudes at constant prices with 1986 as a base also. The results of estimating model (4-26) are as follows (standard errors in parentheses):

ln(fish) = 7.788 - .46 ln(fishpr) + .554 ln(meatpr) + .3 ln(cons)
           (.3)    (.133)           (.11)             (.137)

As can be seen, the signs of the elasticities are correct: the elasticity of demand is negative with respect to the price of the good, while the elasticities with respect to the price of the substitute good and total consumption are positive.

In model (4-26) the homogeneity restriction implies the following null hypothesis:

$$\beta_2 + \beta_3 + \beta_4 = 0$$ (4-27)

To carry out this test, we will use a similar procedure to the reparameterization used in example 4.7. Now, the parameter $\theta$ is defined as follows

$$\theta = \beta_2 + \beta_3 + \beta_4$$ (4-28)

Setting $\beta_2 = \theta - \beta_3 - \beta_4$, the following model has been estimated:

$$\ln(fish) = \beta_1 + \theta \ln(fishpr) + \beta_3 \ln(meatpr/fishpr) + \beta_4 \ln(cons/fishpr) + u$$ (4-29)

The results obtained were the following:

ln(fish) = 7.788 - .4596 ln(fishpr) + .554 ln(meatpr) + .3 ln(cons)
           (.3)    (.1334)            (.11)             (.137)

Using (4-28), testing the null hypothesis (4-27) is equivalent to testing that the coefficient of ln(fishpr) in (4-29) is equal to 0. Since the t statistic for this coefficient is equal to -3.44 and $t_{24}^{0.01/2} = 2.80$, we reject the hypothesis of homogeneity regarding the demand for fish.
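The step from the fish demand model (4-26) to the reparameterized model (4-29) can be written out explicitly. The following derivation is a sketch that uses only the definition $\theta = \beta_2 + \beta_3 + \beta_4$ from (4-28):

$$
\begin{aligned}
\ln(fish) &= \beta_1 + (\theta - \beta_3 - \beta_4)\ln(fishpr) + \beta_3\ln(meatpr) + \beta_4\ln(cons) + u \\
&= \beta_1 + \theta\ln(fishpr) + \beta_3\left[\ln(meatpr) - \ln(fishpr)\right] + \beta_4\left[\ln(cons) - \ln(fishpr)\right] + u \\
&= \beta_1 + \theta\ln(fishpr) + \beta_3\ln\!\left(\frac{meatpr}{fishpr}\right) + \beta_4\ln\!\left(\frac{cons}{fishpr}\right) + u
\end{aligned}
$$

The coefficient on $\ln(fishpr)$ in the last line is $\theta$ itself, so its ordinary t statistic tests the homogeneity restriction directly, without needing the covariances between $\hat\beta_2$, $\hat\beta_3$ and $\hat\beta_4$.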

4.2.4 Economic importance versus statistical significance
Up until now we have emphasized statistical significance. However, it is important to remember that we should pay attention to the magnitude and the sign of the estimated coefficient in addition to t statistics.

Statistical significance of a variable $x_j$ is determined entirely by the size of $t_{\hat\beta_j}$, whereas the economic significance of a variable is related to the size (and sign) of $\hat\beta_j$.

Too much focus on statistical significance can lead to the false conclusion that a variable is important for explaining y, even though its estimated effect is modest.

Therefore, even if a variable is statistically significant, you need to discuss the magnitude of the estimated coefficient to get an idea of its practical or economic importance.

4.3 Testing multiple linear restrictions using the F test.
So far, we have only considered hypotheses involving a single restriction. But frequently, we wish to test multiple hypotheses about the underlying parameters $\beta_1, \beta_2, \beta_3, \ldots, \beta_k$.

In multiple linear restrictions, we will distinguish three types: exclusion restrictions, model significance and other linear restrictions.

4.3.1 Exclusion restrictions
Null and alternative hypotheses; unrestricted and restricted model
We begin with the leading case of testing whether a set of independent variables has no partial effect on the dependent variable, y. These are called exclusion restrictions. Thus, considering the model

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + u$$ (4-30)

the null hypothesis in a typical example of exclusion restrictions could be the following:

$$H_0: \beta_4 = \beta_5 = 0$$

This is an example of a set of multiple restrictions, because we are putting more than one restriction on the parameters in the above equation. A test of multiple restrictions is called a joint hypothesis test.

The alternative hypothesis can be expressed in the following way

$$H_1: H_0 \text{ is not true}$$

It is important to remark that we test the above $H_0$ jointly, not individually. Now, we are going to distinguish between unrestricted (UR) and restricted (R) models. The unrestricted model is the reference model or initial model. In this example the unrestricted model is the model given in (4-30).
The restricted model is obtained by imposing $H_0$ on the original model. In the above example, the restricted model is

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

By definition, the restricted model always has fewer parameters than the unrestricted one. Moreover, it is always true that

$$RSS_R \ge RSS_{UR}$$

where $RSS_R$ is the RSS of the restricted model, and $RSS_{UR}$ is the RSS of the unrestricted model. Remember that, because OLS estimates are chosen to minimize the sum of squared residuals, the RSS never decreases (and generally increases) when certain restrictions (such as dropping variables) are introduced into the model.

The increase in the RSS when the restrictions are imposed can tell us something about the likely truth of $H_0$. If we obtain a large increase, this is evidence against $H_0$, and this hypothesis will be rejected. If the increase is small, this is not evidence against $H_0$, and this hypothesis will not be rejected. The question is therefore whether the observed increase in the RSS when the restrictions are imposed is large enough, relative to the RSS in the unrestricted model, to warrant rejecting $H_0$.

The answer depends on $\alpha$, but we cannot carry out the test at a chosen $\alpha$ until we have a statistic whose distribution is known, and is tabulated, under $H_0$. Thus, we need a way to combine the information in $RSS_R$ and $RSS_{UR}$ to obtain a test statistic with a known distribution under $H_0$.

Now, let us look at the general case, where the unrestricted model is

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + u$$ (4-31)

Let us suppose that there are q exclusion restrictions to test. $H_0$ states that q of the variables have zero coefficients. Assuming that they are the last q variables, $H_0$ is stated as

$$H_0: \beta_{k-q+1} = \beta_{k-q+2} = \cdots = \beta_k = 0$$ (4-32)

The restricted model is obtained by imposing the q restrictions of $H_0$ on the unrestricted model.

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_{k-q} x_{k-q} + u$$ (4-33)

$H_1$ is stated as

$$H_1: H_0 \text{ is not true}$$ (4-34)

Test statistic: F ratio
The F statistic, or F ratio, is defined by

$$F = \frac{(RSS_R - RSS_{UR})/q}{RSS_{UR}/(n-k)}$$ (4-35)

where $RSS_R$ is the RSS of the restricted model, $RSS_{UR}$ is the RSS of the unrestricted model and q is the number of restrictions; that is to say, the number of equalities in the null hypothesis.

In order to use the F statistic for hypothesis testing, we have to know its sampling distribution under $H_0$ in order to choose the value c for a given $\alpha$, and determine the rejection rule.
It can be shown that, under $H_0$, and assuming the CLM assumptions hold, the F statistic is distributed as a Snedecor's F random variable with (q,n-k) df. We write this result as

$$F \mid H_0 \sim F_{q,n-k}$$ (4-36)
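As a minimal sketch of the F ratio (4-35), the snippet below plugs in the residual sums of squares reported later in example 4.10 ($RSS_R = 6.250$, $RSS_{UR} = 5.954$, q = 2, n - k = 48); these inputs are read from the text and the function names are hypothetical. For the special case of two numerator degrees of freedom, the upper-tail probability of the F distribution has the closed form $(1 + 2f/d)^{-d/2}$, which lets us gauge how unusual the statistic is under $H_0$ using only the standard library; for other values of q a statistical table or library would be needed.

```python
def f_ratio(rss_r, rss_ur, q, df_ur):
    """F statistic of (4-35): ((RSS_R - RSS_UR) / q) / (RSS_UR / (n - k))."""
    return ((rss_r - rss_ur) / q) / (rss_ur / df_ur)


def f_upper_tail_q2(f_value, df_den):
    """Pr(F' > f_value) for an F with 2 numerator and df_den denominator df.

    Closed form valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f_value / df_den) ** (-df_den / 2.0)


# Assumed inputs, taken from example 4.10 (53 observations, 5 parameters).
f_stat = f_ratio(rss_r=6.250, rss_ur=5.954, q=2, df_ur=53 - 5)
tail_prob = f_upper_tail_q2(f_stat, df_den=48)

print(f"F = {f_stat:.3f}, Pr(F' > F | H0) = {tail_prob:.3f}")
```

With these inputs the tail probability is well above 0.10, in line with the conclusion of example 4.10 that the two exclusion restrictions are not rejected.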

A Snedecor's F with q degrees of freedom in the numerator and n-k degrees of freedom in the denominator is equal to

$$F_{q,n-k} = \frac{\chi^2_q / q}{\chi^2_{n-k} / (n-k)}$$ (4-37)

where $\chi^2_q$ and $\chi^2_{n-k}$ are Chi-square distributions that are independent of each other.

In (4-35) we see that the degrees of freedom corresponding to $RSS_{UR}$ ($df_{UR}$) are n-k. Remember that

$$\hat\sigma^2_{UR} = \frac{RSS_{UR}}{n-k}$$ (4-38)

On the other hand, the degrees of freedom corresponding to $RSS_R$ ($df_R$) are n-k+q, because in the restricted model k-q parameters are estimated. The degrees of freedom corresponding to $RSS_R - RSS_{UR}$ are

(n-k+q)-(n-k) = q = numerator degrees of freedom = $df_R - df_{UR}$

Thus, in the numerator of F, the difference in RSSs is divided by q, which is the number of restrictions imposed when moving from the unrestricted to the restricted model. In the denominator of F, $RSS_{UR}$ is divided by $df_{UR}$. In fact, the denominator of F is simply the unbiased estimator of $\sigma^2$ in the unrestricted model.

The F ratio must be greater than or equal to 0, since $RSS_R - RSS_{UR} \ge 0$.

It is often useful to have a form of the F statistic that can be computed from the $R^2$ of the restricted and unrestricted models.

Using the fact that $RSS_R = TSS(1 - R^2_R)$ and $RSS_{UR} = TSS(1 - R^2_{UR})$, we can write (4-35) as the following

$$F = \frac{(R^2_{UR} - R^2_R)/q}{(1 - R^2_{UR})/(n-k)}$$ (4-39)

since the TSS term is cancelled.

This is called the R-squared form of the F statistic.

Whereas the R-squared form of the F statistic is very useful for testing exclusion restrictions, it cannot be applied for testing all kinds of linear restrictions. For example, the F ratio (4-39) cannot be used when the model does not have an intercept or when the functional form of the endogenous variable in the unrestricted model is not the same as in the restricted model.

Decision rule
The $F_{q,n-k}$ distribution is tabulated and available in statistical tables, where we look for the critical value ($F^{\alpha}_{q,n-k}$), which depends on $\alpha$ (the significance level), q (the df of the numerator), and n-k (the df of the denominator). Taking into account the above, the decision rule is quite simple.

Decision rule
If $F \ge F^{\alpha}_{q,n-k}$, reject $H_0$
If $F < F^{\alpha}_{q,n-k}$, do not reject $H_0$ (4-40)

Therefore, we reject $H_0$ in favor of $H_1$ at $\alpha$ when $F \ge F^{\alpha}_{q,n-k}$, as can be seen in figure 4.15. It is important to remark that as $\alpha$ decreases, $F^{\alpha}_{q,n-k}$ increases. If $H_0$ is rejected, then we say that $x_{k-q+1}, x_{k-q+2}, \ldots, x_k$ are jointly statistically significant, or just jointly significant, at the selected significance level.

This test alone does not allow us to say which of the variables has a partial effect on y; they may all affect y or only one may affect y. If $H_0$ is not rejected, then we say that $x_{k-q+1}, x_{k-q+2}, \ldots, x_k$ are jointly not statistically significant, or simply jointly not significant, which often justifies dropping them from the model. The F statistic is often useful for testing the exclusion of a group of variables when the variables in the group are highly correlated.

[Figures 4.15 and 4.16: density of $F_{q,n-k}$. Figure 4.15 marks the non rejection region (NRR) to the left of $F^{\alpha}_{q,n-k}$ and the rejection region (RR) to its right. Figure 4.16 marks the p-value as the area to the right of the observed F: $H_0$ is rejected for $\alpha$ > p-value and not rejected for $\alpha$ < p-value.]
FIGURE 4.15. Rejection region and non rejection region using F distribution.
FIGURE 4.16. p-value using F distribution.

In the F testing context, the p-value is defined as

$$p\text{-value} = \Pr(F' > F \mid H_0)$$

where F is the actual value of the test statistic and $F'$ denotes a Snedecor's F random variable with (q,n-k) df.

The p-value still has the same interpretation as for t statistics. A small p-value is evidence against $H_0$, while a large p-value is not evidence against $H_0$. Once the p-value has been computed, the F test can be carried out at any significance level. In figure 4.16 this alternative approach is represented. As can be seen by observing the figure, the determination of the p-value is the inverse operation to finding the value in the statistical tables for a given significance level. Once the p-value has been determined, we know that $H_0$ is rejected for any level of significance $\alpha$ > p-value, whereas the null hypothesis is not rejected when $\alpha$ < p-value.

EXAMPLE 4.10 Wage, experience, tenure and age
The following model has been built to analyze the determinant factors of wage:

$$\ln(wage) = \beta_1 + \beta_2\, educ + \beta_3\, exper + \beta_4\, tenure + \beta_5\, age + u$$

where wage is monthly earnings, educ is years of education, exper is years of work experience, tenure is years with current employer, and age is age in years.

The researcher is planning to exclude tenure from the model, since in many cases it is equal to experience, and also age, because it is highly correlated with experience. Is the exclusion of both variables acceptable?

The null and alternative hypotheses are the following:

$$H_0: \beta_4 = \beta_5 = 0 \qquad H_1: H_0 \text{ is not true}$$

The restricted model corresponding to this $H_0$ is

$$\ln(wage) = \beta_1 + \beta_2\, educ + \beta_3\, exper + u$$

Using a sample consisting of 53 observations from workfile wage, we have the following estimations for the unrestricted and for the restricted models:

ln(wage) = 6.476 + .658 educ + .67 exper - .94 tenure - .9 age     RSS = 5.954
ln(wage) = 6.157 + .457 educ + .11 exper     RSS = 6.250

The F ratio obtained is the following:

$$F = \frac{(RSS_R - RSS_{UR})/q}{RSS_{UR}/(n-k)} = \frac{(6.250 - 5.954)/2}{5.954/48} = 1.193$$

Given that the F statistic is low, let us see what happens with a significance level of 0.10. In this case the degrees of freedom for the denominator are 48 (53 observations minus 5 estimated parameters). If we look in the F statistical table for 2 df in the numerator and 45 df in the denominator, we find $F_{2,48}^{0.10} \approx F_{2,45}^{0.10} = 2.42$. As F<2.42, we do not reject $H_0$. If we do not reject $H_0$ for 0.10, we will not reject $H_0$ for 0.05 or 0.01, as can be seen in figure 4.17. Therefore, we cannot reject $H_0$ in favor of $H_1$. In other words, tenure and age are not jointly significant.

[Figure 4.17: non rejection region (NRR) and rejection region (RR) of the F distribution, with critical values 2.44 ($\alpha$=0.10), 3.23 ($\alpha$=0.05) and 5.18 ($\alpha$=0.01).]
FIGURE 4.17. Example 4.10: Rejection region using F distribution ($\alpha$ values are from a $F_{2,40}$).

4.3.2 Model significance
Testing model significance, or overall significance, is a particular case of testing exclusion restrictions. Model significance means global significance of the model. One could think that the $H_0$ in this test is the following:

$$H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0$$ (4-41)

However, this is not the adequate $H_0$ to test for the global significance of the model. If $\beta_2 = \beta_3 = \cdots = \beta_k = 0$, then the restricted model would be the following:

$$y = \beta_1 + u$$ (4-42)

If we take expectations in (4-42), then we have

$$E(y) = \beta_1$$ (4-43)

Thus, $H_0$ in (4-41) states not only that the explanatory variables have no influence on the endogenous variable, but also that the mean of the endogenous variable (for example, the consumption mean) is equal to 0.

Therefore, if we want to know whether the model is globally significant, the $H_0$ must be the following:

$$H_0: \beta_2 = \beta_3 = \cdots = \beta_k = 0$$ (4-44)

The corresponding restricted model given in (4-42) does not explain anything and, therefore, $R^2_R$ is equal to 0. Testing the $H_0$ given in (4-44) is very easy by using the R-squared form of the F statistic:

$$F = \frac{R^2/(k-1)}{(1 - R^2)/(n-k)}$$ (4-45)

where $R^2$ is the $R^2_{UR}$, since only the unrestricted model needs to be estimated, because the $R^2$ of the model (4-42), the restricted model, is 0.

EXAMPLE 4.11 Salaries of CEOs
Consider the following equation to explain salaries of Chief Executive Officers (CEOs) as a function of annual firm sales, return on equity (roe, in percent form), and return on the firm's stock (ros, in percent form):

$$\ln(salary) = \beta_1 + \beta_2 \ln(sales) + \beta_3\, roe + \beta_4\, ros + u$$

The question posed is whether the performance of the company (sales, roe and ros) is crucial to set the salaries of CEOs. To answer this question, we will carry out an overall significance test. The null and alternative hypotheses are the following:

$$H_0: \beta_2 = \beta_3 = \beta_4 = 0 \qquad H_1: H_0 \text{ is not true}$$

Table 4.9 shows the complete E-views output for least squares (ls) using the workfile ceosal1. At the bottom the F-statistic can be seen for the overall significance test, as well as Prob, which is the p-value corresponding to this statistic. In this case the p-value is equal to 0, that is to say, $H_0$ is rejected for all significance levels (see figure 4.18). Therefore, we can reject that the performance of a company has no influence on the salary of a CEO.

[Figure 4.18: density of $F_{3,205}$ with critical values 2.12 ($\alpha$=0.10), 2.67 ($\alpha$=0.05) and 3.92 ($\alpha$=0.01); the observed statistic 26.93 has p-value 0.0000.]
FIGURE 4.18. Example 4.11: p-value using F distribution ($\alpha$ values are for a $F_{3,140}$).
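The overall significance statistic (4-45) can be checked by hand from the summary figures of a regression output. The sketch below assumes the values read from Table 4.9 for example 4.11 (R-squared 0.282685, 209 observations, 4 parameters, sum of squared residuals 47.86082 and S.D. of the dependent variable 0.566374); the function names are hypothetical. It also recomputes F from the residual sums of squares, using the fact that the restricted model (4-42) has RSS equal to TSS.

```python
def overall_f_from_r2(r2, n, k):
    """R-squared form (4-45): (R^2 / (k - 1)) / ((1 - R^2) / (n - k))."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


def overall_f_from_rss(tss, rss_ur, n, k):
    """RSS form (4-35) with RSS_R = TSS and q = k - 1 restrictions."""
    q = k - 1
    return ((tss - rss_ur) / q) / (rss_ur / (n - k))


# Assumed inputs, read from Table 4.9.
n, k = 209, 4
r2 = 0.282685
rss_ur = 47.86082
sd_dependent = 0.566374
tss = sd_dependent ** 2 * (n - 1)   # TSS recovered from the sample S.D. of ln(salary)

f_from_r2 = overall_f_from_r2(r2, n, k)
f_from_rss = overall_f_from_rss(tss, rss_ur, n, k)
print(f"F (R-squared form) = {f_from_r2:.2f}, F (RSS form) = {f_from_rss:.2f}")
```

Both routes give approximately the same value (about 26.93), which illustrates why only the unrestricted model needs to be estimated for this test.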

TABLE 4.9. Complete output from E-views in example 4.11.

Dependent Variable: LOG(SALARY)
Method: Least Squares
Date: 4/1/1 Time: 19:39
Sample: 1 209
Included observations: 209

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             4.31171       .315433      13.66919      0.0000
LOG(SALES)    .8315         .353         7.93646       0.0000
ROE           .17417        .49          4.55977       0.0000
ROS           .4            .54          .446          0.6561

R-squared            0.282685    Mean dependent var      6.95386
Adjusted R-squared   0.272188    S.D. dependent var      0.566374
S.E. of regression   0.483185    Akaike info criterion   1.4118
Sum squared resid    47.868      Schwarz criterion       1.46686
Log likelihood       -14.513     F-statistic             26.9293
Durbin-Watson stat   .33496      Prob(F-statistic)       0.0000

4.3.3 Testing other linear restrictions

So far, we have tested hypotheses with exclusion restrictions using the F statistic. But we can also test hypotheses with linear restrictions of any kind. Thus, in the same test we can combine exclusion restrictions, restrictions that impose given values on the parameters, and restrictions on linear combinations of parameters. Let us consider the following model and null hypothesis:

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + u$$

$$H_0: \begin{cases} \beta_2 + \beta_3 = 1 \\ \beta_4 = 3 \\ \beta_5 = 0 \end{cases}$$

The restricted model corresponding to this null hypothesis is

$$(y - x_2 - 3x_4) = \beta_1 + \beta_3 (x_3 - x_2) + u$$

In example 4.12, the null hypothesis consists of two restrictions: a linear combination of parameters and an exclusion restriction.

EXAMPLE 4.12 An additional restriction in the production function (continuation of example 4.7)

In the Cobb-Douglas production function, we are going to test the following $H_0$, which has two restrictions:

$$H_0: \begin{cases} \beta_2 + \beta_3 = 1 \\ \beta_1 = 0 \end{cases} \qquad H_1: H_0 \text{ is not true}$$

In the first restriction we impose that there are constant returns to scale. In the second restriction we impose that $\beta_1$, the parameter linked to total factor productivity, is equal to 0.

Substituting the restrictions of $H_0$ in the original model (unrestricted model), we have

$$\ln(output) = (1 - \beta_3)\ln(labor) + \beta_3 \ln(capital) + u$$

Operating, we obtain the restricted model:

$$\ln(output/labor) = \beta_3 \ln(capital/labor) + u$$

In estimating the unrestricted and restricted models, we get $RSS_R = 3.1101$ and $RSS_{UR} = 0.8516$. Therefore, the F ratio is

$$F = \frac{(RSS_R - RSS_{UR})/q}{RSS_{UR}/(n-k)} = \frac{(3.1101 - 0.8516)/2}{0.8516/(27-3)} = 31.82$$

There are two reasons for not using $R^2$ in this case. First, the restricted model has no intercept. Second, the regressand of the restricted model is different from the regressand of the unrestricted model.

Since the F value is relatively high, let us start by testing with a level of 1%. For $\alpha = 0.01$, $F^{0.01}_{2,24} = 5.61$. Given that F > 5.61, we reject $H_0$ in favour of $H_1$. Therefore, we reject the joint hypothesis that there are constant returns to scale and that the parameter $\beta_1$ is equal to 0. If $H_0$ is rejected for $\alpha = 0.01$, it will also be rejected for levels of 5% and 10%.

4.3.4 Relation between F and t statistics

So far, we have seen how to use the F statistic to test several restrictions in the model, but it can also be used to test a single restriction. In this case, we can choose between using the F statistic or the t statistic to carry out a two-tail test. The conclusions would, nevertheless, be exactly the same.

But what is the relationship between an F with one degree of freedom in the numerator (to test a single restriction) and a t? It can be shown that

$$t^2_{n-k} = F_{1,n-k} \tag{4-46}$$

This fact is illustrated in figure 4.19. We observe that the tail of the F splits into the two tails of the t. Hence, the two approaches lead to exactly the same outcome, provided that the alternative hypothesis is two-sided. However, the t statistic is more flexible for testing a single hypothesis, because it can be used to test $H_0$ against one-tail alternatives.

FIGURE 4.19. Relationship between an $F_{1,n-k}$ and a $t_{n-k}$.

Moreover, since t statistics are also easier to obtain than F statistics, there is no good reason for using an F statistic to test a hypothesis with a single restriction.
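To round off the discussion of F tests, the residual-sum-of-squares form of the statistic can be sketched in a few lines of Python. The function name is hypothetical, and the inputs are an assumption taken from the wage example above (RSS of the restricted model 6.250, RSS of the unrestricted model 5.954, two restrictions, 53 observations and 5 parameters); the critical value 2.42 is the table value quoted in that example rather than something the code computes.

```python
def f_statistic_rss(rss_r: float, rss_ur: float, q: int, n: int, k: int) -> float:
    """F statistic for q linear restrictions from restricted/unrestricted RSS.

    n is the sample size and k the number of parameters of the
    unrestricted model, so the denominator has n - k degrees of freedom.
    """
    if q < 1 or n <= k:
        raise ValueError("need q >= 1 and n > k")
    if rss_ur <= 0 or rss_r < rss_ur:
        raise ValueError("need 0 < rss_ur <= rss_r")
    return ((rss_r - rss_ur) / q) / (rss_ur / (n - k))


# Assumed inputs from the wage example (exclusion of tenure and age).
f_wage = f_statistic_rss(rss_r=6.250, rss_ur=5.954, q=2, n=53, k=5)

# Critical value at the 10% level quoted in the text (F with 2 and 45 df).
critical_10pct = 2.42
reject_h0 = f_wage > critical_10pct

print(round(f_wage, 3), reject_h0)
```

With a single restriction (q = 1) the same function returns the square of the corresponding two-sided t statistic, which is the relationship stated in (4-46).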

4.4 Testing without normality

The normality of the OLS estimators depends crucially on the normality assumption of the disturbances. What happens if the disturbances do not have a normal distribution? We have seen that, under the Gauss-Markov assumptions, the OLS estimators are asymptotically normally distributed, i.e. approximately normally distributed in large samples, even when the disturbances are not normal.

If the disturbances are not normal, the t statistic will only have an approximate t distribution rather than an exact one. As can be seen in the Student's t table, for a sample size of 60 observations the critical points are practically equal to those of the standard normal distribution.

Similarly, if the disturbances are not normal, the F statistic will only have an approximate F distribution rather than an exact one, when the sample size is large enough and the Gauss-Markov assumptions are fulfilled. Therefore, we can use the F statistic to test linear restrictions in linear models as an approximate test.

There are other asymptotic tests (the likelihood ratio, Lagrange multiplier and Wald tests), based on the likelihood function, that can be used in testing linear restrictions if the disturbances are non-normally distributed. These three tests can also be applied when a) the restrictions are nonlinear; and b) the model is nonlinear in the parameters. For nonlinear restrictions, in linear and nonlinear models, the most widely used test is the Wald test.

For testing the assumptions of the model (for example, homoskedasticity and no autocorrelation) the Lagrange multiplier (LM) test is usually applied. In the application of the LM test, an auxiliary regression is often run. The name auxiliary regression means that the coefficients are not of direct interest: only the $R^2$ is retained. In an auxiliary regression the regressand is usually the residuals (or functions of the residuals) obtained in the OLS estimation of the original model, while the regressors are often the regressors (and/or functions of them) of the original model.

4.5 Prediction

In this section two types of prediction will be examined: point and interval prediction.
4.5.1 Point prediction

Obtaining a point prediction does not pose any special problems, since it is a simple extrapolation operation in the context of descriptive methods.

Let $x_2^0, x_3^0, \ldots, x_k^0$ denote the particular values of each of the regressors used for prediction; these may or may not correspond to an actual data point in our sample. If we substitute these values in the multiple regression model, we have

$$y^0 = \beta_1 + \beta_2 x_2^0 + \beta_3 x_3^0 + \cdots + \beta_k x_k^0 + u^0 = \theta^0 + u^0 \tag{4-47}$$

Therefore, the expected, or mean, value of $y^0$ is given by

$$E(y^0) = \theta^0 = \beta_1 + \beta_2 x_2^0 + \beta_3 x_3^0 + \cdots + \beta_k x_k^0 \tag{4-48}$$

The point prediction is obtained straightaway by replacing the parameters of (4-48) by the corresponding OLS estimators:

$$\hat{\theta}^0 = \hat{\beta}_1 + \hat{\beta}_2 x_2^0 + \hat{\beta}_3 x_3^0 + \cdots + \hat{\beta}_k x_k^0 \tag{4-49}$$

To obtain (4-49) we did not need any assumption. But, if we adopt assumptions 1 to 6, we will immediately find that $\hat{\theta}^0$ is an unbiased predictor of $\theta^0$:

$$E\left[\hat{\theta}^0\right] = E\left[\hat{\beta}_1 + \hat{\beta}_2 x_2^0 + \hat{\beta}_3 x_3^0 + \cdots + \hat{\beta}_k x_k^0\right] = \beta_1 + \beta_2 x_2^0 + \beta_3 x_3^0 + \cdots + \beta_k x_k^0 = \theta^0 \tag{4-50}$$

On the other hand, adopting the Gauss-Markov assumptions (1 to 8), it can be proved that this point predictor is the best linear unbiased estimator (BLUE).

We have a point prediction for $\theta^0$, but what is the point prediction for $y^0$? To answer this question, we have to predict $u^0$. As the error is not observable, the best predictor for $u^0$ is its expected value, which is 0. Therefore,

$$\hat{y}^0 = \hat{\theta}^0 \tag{4-51}$$

4.5.2 Interval prediction

Point predictions made with an econometric model will in general not coincide with the observed values, due to the uncertainty surrounding economic phenomena.

The first source of uncertainty is that we cannot use the population regression function, because we do not know the parameters $\beta_j$. Instead we have to use the sample regression function. The confidence interval for the expected value (i.e. for $\theta^0$), which we will examine next, includes only this type of uncertainty.

The second source of uncertainty is that in an econometric model, in addition to the systematic part, there is a disturbance which is not observable. The prediction interval for an individual value (i.e. for $y^0$), which will be discussed later on, includes both the uncertainty arising from the estimation and the disturbance term.

A third source of uncertainty may come from not knowing exactly what values the explanatory variables will take for the prediction we want to make. This third source of uncertainty, which is not addressed here, complicates the calculations for the construction of intervals.

Confidence interval for the expected value

If we are predicting the expected value of $y^0$, which is $\theta^0$, then the prediction error $\hat{e}_1^0$ will be $\hat{e}_1^0 = \theta^0 - \hat{\theta}^0$. According to (4-50), the expected prediction error is zero.
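Under the assumptions of the CLM,

$$\frac{\hat{e}_1^0}{se(\hat{\theta}^0)} = \frac{\theta^0 - \hat{\theta}^0}{se(\hat{\theta}^0)} \sim t_{n-k}$$

Therefore, we can write that

$$\Pr\left[-t^{\alpha/2}_{n-k} \le \frac{\theta^0 - \hat{\theta}^0}{se(\hat{\theta}^0)} \le t^{\alpha/2}_{n-k}\right] = 1 - \alpha$$

Operating, we can construct a $100(1-\alpha)\%$ confidence interval (CI) for $\theta^0$ with the following structure: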

$$\Pr\left[\hat{\theta}^0 - se(\hat{\theta}^0)\, t^{\alpha/2}_{n-k} \le \theta^0 \le \hat{\theta}^0 + se(\hat{\theta}^0)\, t^{\alpha/2}_{n-k}\right] = 1 - \alpha \tag{4-52}$$

To obtain a CI for $\theta^0$, we need to know the standard error $se(\hat{\theta}^0)$ of $\hat{\theta}^0$. There is an easy way to calculate it. Solving (4-48) for $\beta_1$, we find that $\beta_1 = \theta^0 - \beta_2 x_2^0 - \beta_3 x_3^0 - \cdots - \beta_k x_k^0$. Plugging this into the multiple regression model, we obtain

$$y = \theta^0 + \beta_2 (x_2 - x_2^0) + \beta_3 (x_3 - x_3^0) + \cdots + \beta_k (x_k - x_k^0) + u \tag{4-53}$$

Applying OLS to (4-53), in addition to the point prediction, we obtain $se(\hat{\theta}^0)$, which is the standard error corresponding to the intercept in this regression. This method allows us to put a CI around the OLS estimate of E(y) for any values of the x's.

Prediction interval for an individual value

We are now going to construct an interval for $y^0$, usually called the prediction interval for an individual value or, for short, the prediction interval. According to (4-47), $y^0$ has two components:

$$y^0 = \theta^0 + u^0 \tag{4-54}$$

The interval for the expected value built before is a confidence interval around $\theta^0$, which is a combination of the parameters. In contrast, the interval for $y^0$ is random, because one of its components, $u^0$, is random. Therefore, the interval for $y^0$ is a probabilistic interval and not a confidence interval. The mechanics for obtaining it are the same, but bear in mind that now we are going to consider that the set $x_2^0, x_3^0, \ldots, x_k^0$ is outside the sample used to estimate the regression.

The prediction error $\hat{e}_2^0$ in using $\hat{y}^0$ to predict $y^0$ is

$$\hat{e}_2^0 = y^0 - \hat{y}^0 = \theta^0 + u^0 - \hat{y}^0 \tag{4-55}$$

Taking into account (4-51) and (4-50), and that $E(u^0) = 0$, the expected prediction error is zero. In finding the variance of $\hat{e}_2^0$, it must be taken into account that $u^0$ is uncorrelated with $\hat{y}^0$, because $x_2^0, x_3^0, \ldots, x_k^0$ is not in the sample. Therefore, the variance of the prediction error (conditional on the x's) is the sum of the variances:

$$Var(\hat{e}_2^0) = Var(\hat{y}^0) + Var(u^0) = Var(\hat{y}^0) + \sigma^2 \tag{4-56}$$

There are two sources of variation in $\hat{e}_2^0$:

1. The sampling error in $\hat{y}^0$, which arises because we have estimated the $\beta_j$'s.
2. The ignorance of the unobserved factors that affect $y^0$, which is reflected in $\sigma^2$.

Under the CLM assumptions, $\hat{e}_2^0$ is also normally distributed.
Using the unbiased estimator of $\sigma^2$ and taking into account that $var(\hat{y}^0) = var(\hat{\theta}^0)$, we can define the standard error (se) of $\hat{e}_2^0$ as

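$$se(\hat{e}_2^0) = \left\{\left[se(\hat{\theta}^0)\right]^2 + \hat{\sigma}^2\right\}^{1/2} \tag{4-57}$$

Usually $\hat{\sigma}^2$ is larger than $[se(\hat{\theta}^0)]^2$. Under the assumptions of the CLM,

$$\frac{\hat{e}_2^0}{se(\hat{e}_2^0)} \sim t_{n-k} \tag{4-58}$$

Therefore, we can write that

$$\Pr\left[-t^{\alpha/2}_{n-k} \le \frac{\hat{e}_2^0}{se(\hat{e}_2^0)} \le t^{\alpha/2}_{n-k}\right] = 1 - \alpha \tag{4-59}$$

Plugging $\hat{e}_2^0 = y^0 - \hat{y}^0$ into (4-59) and rearranging gives a $100(1-\alpha)\%$ prediction interval for $y^0$:

$$\Pr\left[\hat{y}^0 - se(\hat{e}_2^0)\, t^{\alpha/2}_{n-k} \le y^0 \le \hat{y}^0 + se(\hat{e}_2^0)\, t^{\alpha/2}_{n-k}\right] = 1 - \alpha \tag{4-60}$$

EXAMPLE 4.13 What is the expected score in the final exam with 7 marks in the first short exam?

The following model has been estimated to compare the marks in the final exam (finalmrk) and in the first short exam (shortex1) of Econometrics:

$$\widehat{finalmrk} = \underset{(0.715)}{4.155} + \underset{(0.123)}{0.491}\, shortex1 \qquad \hat{\sigma} = 1.649 \quad R^2 = 0.533 \quad n = 16$$

To estimate the expected final mark for a student with a mark of 7 in the first short exam (shortex1 = 7), the following model, according to (4-53), was estimated:

$$\widehat{finalmrk} = \underset{(0.497)}{7.593} + \underset{(0.123)}{0.491}\,(shortex1 - 7) \qquad \hat{\sigma} = 1.649 \quad R^2 = 0.533 \quad n = 16$$

The point prediction for shortex1 = 7 is $\hat{\theta}^0 = 7.593$, and the lower and upper bounds of a 95% CI are, respectively,

$$\hat{\theta}^0 - se(\hat{\theta}^0)\, t^{0.05/2}_{14} = 7.593 - 0.497 \times 2.14 = 6.5$$

$$\hat{\theta}^0 + se(\hat{\theta}^0)\, t^{0.05/2}_{14} = 7.593 + 0.497 \times 2.14 = 8.7$$

Therefore, with 95% confidence, the expected final mark of a student with 7 marks in the first short exam lies between 6.5 and 8.7.

The point prediction could also be obtained from the first estimated equation:

$$\widehat{finalmrk} = 4.155 + 0.491 \times 7 = 7.593$$

Now, we are going to estimate a 95% probability interval for the individual value. The se of $\hat{e}_2^0$ is equal to

$$se(\hat{e}_2^0) = \left\{\left[se(\hat{y}^0)\right]^2 + \hat{\sigma}^2\right\}^{1/2} = \sqrt{0.497^2 + 1.649^2} = 1.72$$

where 1.649 is the S.E. of regression obtained directly from the E-views output. The lower and upper bounds of a 95% probability interval are, respectively,

$$\hat{y}^0 - se(\hat{e}_2^0)\, t^{0.05/2}_{14} = 7.593 - 1.72 \times 2.14 = 3.9$$

$$\hat{y}^0 + se(\hat{e}_2^0)\, t^{0.05/2}_{14} = 7.593 + 1.72 \times 2.14 = 11.3$$

You must take into account that this probability interval is quite large because the size of the sample is very small.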
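The two intervals of this example can be reproduced with a short Python sketch. The function names are hypothetical; the inputs (point prediction 7.593, standard error 0.497, S.E. of regression 1.649 and critical value 2.14 for 14 degrees of freedom) are the figures quoted in the example, with the critical value taken from the t table rather than computed.

```python
import math


def confidence_interval(theta_hat: float, se_theta: float, t_crit: float):
    """CI for the expected value: theta_hat -/+ se(theta_hat) * t."""
    half_width = se_theta * t_crit
    return theta_hat - half_width, theta_hat + half_width


def prediction_interval(y_hat: float, se_theta: float, sigma_hat: float, t_crit: float):
    """Interval for an individual value, using se(e) = sqrt(se(theta)^2 + sigma^2)."""
    se_error = math.sqrt(se_theta ** 2 + sigma_hat ** 2)
    half_width = se_error * t_crit
    return y_hat - half_width, y_hat + half_width


# Figures quoted in the final-exam example (shortex1 = 7).
theta_hat, se_theta, sigma_hat, t_crit = 7.593, 0.497, 1.649, 2.14

ci = confidence_interval(theta_hat, se_theta, t_crit)
pi = prediction_interval(theta_hat, se_theta, sigma_hat, t_crit)

print([round(b, 1) for b in ci])
print([round(b, 1) for b in pi])
```

The prediction interval is much wider than the confidence interval because the disturbance variance dominates the sampling variance of the point prediction, which is the comparison made after (4-57).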

EXAMPLE 4.14 Predicting the salary of CEOs

Using data on the most important US companies taken from Forbes (workfile ceoforbes), the following equation has been estimated to explain the salaries (including bonuses) earned yearly (thousands of dollars) in 1999 by the CEOs of these companies:

$$\widehat{salary} = \underset{(14)}{1381} + \underset{(.13)}{.8377}\, assets + \underset{(8.671)}{3.58}\, tenure + \underset{(.538)}{.35}\, profits \qquad \hat{\sigma} = 156 \quad R^2 = .44 \quad n = 447$$

where assets are the total assets of the firm in millions of dollars, tenure is the number of years as CEO in the company, and profits are in millions of dollars.

Table 4.10 shows descriptive measures of the explanatory variables of the model on CEOs' salaries.

TABLE 4.10. Descriptive measures of variables of the model on CEOs' salary.

               assets    tenure   profits
Mean           754       7.8      7
Median         7811      5.       333
Maximum        668641    6.       71
Minimum        718       .        -669
Observations   447       447      447

The predicted salaries and the corresponding $se(\hat{\theta}^0)$ for selected values (maximum, mean, median and minimum), using a model such as (4-53), appear in table 4.11.

TABLE 4.11. Predictions for selected values.

                  Prediction   Std. Error
Mean values       6            71
Median value      1688         78
Maximum values    1414         111
Minimum values    76           195

4.5.3 Predicting y in a ln(y) model

Consider the model in logs:

$$\ln(y) = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k + u \tag{4-61}$$

Obtaining OLS estimates, we predict ln(y) as

$$\widehat{\ln(y)} = \hat{\beta}_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k \tag{4-62}$$

Applying exponentiation to (4-62), we obtain the prediction value

$$\tilde{y} = \exp\left(\widehat{\ln(y)}\right) = \exp\left(\hat{\beta}_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k\right) \tag{4-63}$$

However, this prediction is biased and inconsistent because it will systematically underestimate the expected value of y. Let us see why. If we apply exponentiation in (4-61), we have

$$y = \exp\left(\beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k\right) \times \exp(u) \tag{4-64}$$

Before taking expectations in (4-64), we must take into account that if $u \sim N(0, \sigma^2)$, then $E(\exp(u)) = \exp(\sigma^2/2)$. Therefore, under the CLM assumptions 1 through 9, we have

$$E(y) = \exp\left(\beta_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_k x_k\right) \times \exp(\sigma^2/2) \tag{4-65}$$
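The lognormal mean used to reach (4-65) can be checked numerically. The Python sketch below integrates exp(u) against the normal density with Simpson's rule and compares the result with exp(σ²/2). The value σ = 0.55, the integration range and the number of steps are illustrative assumptions, not values from the text.

```python
import math


def mean_of_exp_normal(sigma: float, n_steps: int = 2000, width: float = 12.0) -> float:
    """Approximate E[exp(u)] for u ~ N(0, sigma^2) with composite Simpson's rule.

    The integral is truncated to [-width * sigma, width * sigma], outside of
    which the integrand is negligible for moderate sigma.
    """
    if n_steps % 2 != 0:
        raise ValueError("n_steps must be even for Simpson's rule")
    lower, upper = -width * sigma, width * sigma
    step = (upper - lower) / n_steps

    def integrand(u: float) -> float:
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        return math.exp(u) * density

    total = integrand(lower) + integrand(upper)
    for i in range(1, n_steps):
        weight = 4.0 if i % 2 == 1 else 2.0
        total += weight * integrand(lower + i * step)
    return total * step / 3.0


sigma = 0.55  # illustrative standard deviation of the disturbance
numeric = mean_of_exp_normal(sigma)
closed_form = math.exp(sigma ** 2 / 2.0)

print(round(numeric, 6), round(closed_form, 6))
```

Since the factor exp(σ²/2) is larger than 1 whenever σ is positive, exponentiating the predicted logarithm alone falls short of E(y), which is the underestimation described above.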

Taking (4-65) as a reference, the adequate predictor of y is

$$\hat{y} = \exp\left(\hat{\beta}_1 + \hat{\beta}_2 x_2 + \cdots + \hat{\beta}_k x_k\right) \times \exp(\hat{\sigma}^2/2) = \tilde{y} \times \exp(\hat{\sigma}^2/2) \tag{4-66}$$

where $\hat{\sigma}^2$ is the unbiased estimator of $\sigma^2$. It is important to remark that although $\hat{y}$ is a biased predictor, it is consistent, while $\tilde{y}$ is biased and inconsistent.

EXAMPLE 4.15 Predicting the salary of CEOs with a log model (continuation of 4.14)

Using the same data as in example 4.14, the following model was estimated:

$$\widehat{\ln(salary)} = \underset{(.1)}{5.5168} + \underset{(.3)}{.1885}\ln(assets) + \underset{(.3)}{.15}\, tenure + \underset{(.195)}{.7}\, profits \qquad \hat{\sigma} = 0.5499 \quad R^2 = .68 \quad n = 447$$

Salary and assets are taken in natural logs, while profits are in levels because some observations are negative and thus it is not possible to take logs.

First, we are going to calculate the inconsistent prediction, according to (4-63), for a CEO working in a corporation with assets=1, tenure=1 years and profits=1:

$$\widetilde{salary} = \exp\left(\widehat{\ln(salary)}\right) = \exp(5.5168 + .1885\ln(1) + .15 \times 1 + .7 \times 1) = 1716$$

Using (4-66), we obtain a consistent prediction:

$$\widehat{salary} = \exp(0.5499^2/2) \times 1716 = 1996$$

4.5.4 Forecast evaluation and dynamic prediction

In this section we will compare predictions made using an econometric model with the actual values in order to evaluate the predictive ability of the model. We will also examine dynamic prediction in models in which lagged endogenous variables are included as regressors.

Forecast evaluation statistics

Suppose that the forecast sample is i = n+1, n+2, ..., n+h, and denote the actual and forecasted values in period i as $y_i$ and $\hat{y}_i$, respectively. We now present some of the more common statistics used for forecast evaluation.

Mean absolute error (MAE)

The MAE is defined as the average of the absolute values of the errors:

$$MAE = \frac{\sum_{i=n+1}^{n+h} \left|\hat{y}_i - y_i\right|}{h} \tag{4-67}$$

Absolute values are taken so that positive errors are not compensated by negative ones.

Mean absolute percentage error (MAPE)

$$MAPE = \frac{\sum_{i=n+1}^{n+h} \left|\dfrac{\hat{y}_i - y_i}{y_i}\right|}{h} \times 100 \tag{4-68}$$

Root of the mean squared error (RMSE)

This statistic is defined as the square root of the mean of the squared errors:

$$RMSE = \sqrt{\frac{\sum_{i=n+1}^{n+h} (\hat{y}_i - y_i)^2}{h}} \tag{4-69}$$

As the errors are squared, the compensation between positive and negative errors is avoided. It is important to remark that the RMSE places a greater penalty on large forecast errors than the MAE.

Theil Inequality Coefficient (U)

This coefficient is defined as follows:

$$U = \frac{\sqrt{\dfrac{\sum_{i=n+1}^{n+h} (\hat{y}_i - y_i)^2}{h}}}{\sqrt{\dfrac{\sum_{i=n+1}^{n+h} \hat{y}_i^2}{h}} + \sqrt{\dfrac{\sum_{i=n+1}^{n+h} y_i^2}{h}}} \tag{4-70}$$

The smaller U is, the more accurate the predictions are. The scaling of U is such that it will always lie between 0 and 1. If U=0, then $y_i = \hat{y}_i$ for all forecasts; if U=1, the predictive performance is as bad as it can be.

Theil's U statistic can be rescaled and decomposed into three proportions: bias, variance and covariance. Of course, the sum of these three proportions is 1. The interpretation of these three proportions is as follows:

1) The bias reflects systematic errors. Whatever the value of U, we would hope that the bias is close to 0. A large bias suggests a systematic over- or under-prediction.
2) The variance also reflects systematic errors. The size of this proportion is an indication of the inability of the forecasts to replicate the variability of the variable to be forecasted.
3) The covariance measures unsystematic errors. Ideally, this should have the highest proportion of the Theil inequality.

In addition to the coefficient defined in (4-70), Theil proposed other coefficients for forecast evaluation.

Dynamic prediction

Let the following model be given:

$$y_t = \beta_1 + \beta_2 x_t + \beta_3 y_{t-1} + u_t \tag{4-71}$$

Suppose that the forecast sample is i = n+1, ..., n+h, and denote the actual and forecasted values in period i as $y_i$ and $\hat{y}_i$, respectively. The forecast for period n+1 is

$$\hat{y}_{n+1} = \hat{\beta}_1 + \hat{\beta}_2 x_{n+1} + \hat{\beta}_3 y_n \tag{4-72}$$

As we can see, for this prediction we use the observed value of y ($y_n$), because it is inside the sample used in the estimation. For the remainder of the forecast periods we

use the recursively computed forecast of the lagged value of the dependent variable (dynamic prediction), that is to say,

$$\hat{y}_{n+i} = \hat{\beta}_1 + \hat{\beta}_2 x_{n+i} + \hat{\beta}_3 \hat{y}_{n+i-1} \qquad i = 2, 3, \ldots, h \tag{4-73}$$

Thus, from period n+2 to n+h the forecast made for one period is used to forecast the endogenous variable in the following period.

Exercises

Exercise 4.1 To explain the housing price in an American town, the following model is formulated:

$$price = \beta_1 + \beta_2\, rooms + \beta_3\, lowstat + \beta_4\, crime + u$$

where rooms is the number of rooms in the house, lowstat is the percentage of people of lower status in the area and crime is crimes committed per capita in the area. Prices of houses are measured in dollars.

Using the data in hprice, the following model has been estimated:

price = -15694 + 6788 rooms - 68 lowstat - 3854 crime
(8) (111) (81) (96)
R² = .771 n = 55

(The numbers in parentheses are standard errors of the estimators.)

a) Interpret the meaning of the coefficients $\hat{\beta}_2$, $\hat{\beta}_3$ and $\hat{\beta}_4$.
b) Does the percentage of people of lower status have a negative influence on the price of houses in that area?
c) Does the number of rooms have a positive influence on the price of houses?

Exercise 4.2 Consider the following model:

$$\ln(fruit) = \beta_1 + \beta_2 \ln(inc) + \beta_3\, hhsize + \beta_4\, punder5 + u$$

where fruit is expenditure on fruit, inc is disposable income of a household, hhsize is the number of household members and punder5 is the proportion of children under five in the household.

Using the data in workfile demand, the following model has been estimated:

ln(fruit) = -9.768 + .5 ln(inc) - 1.5 hhsize - .179 punder5
(3.71) (.51) (.179) (.13)
R² = .78 n = 4

(The numbers in parentheses are standard errors of the estimators.)

a) Interpret the meaning of the coefficients $\hat{\beta}_2$, $\hat{\beta}_3$ and $\hat{\beta}_4$.
b) Does the number of household members have a statistically significant effect on the expenditure on fruit?
c) Is the proportion of children under five in the household a factor that has a negative influence on the expenditure on fruit?
d) Is fruit a luxury good?

Exercise 4.3 (Continuation of exercise .5). Given the model

$$y_i = \beta_1 + \beta_2 x_i + u_i \qquad i = 1, 2, \ldots, n$$

the following results have been obtained with a sample size of 11 observations:

n 1 x n 1 y n n 1 n 1 1 x B yx y x 1 1 (Remember that ˆ 1 ) n n x x x n 1 y E

a) Build a statistic to test H : against H1:.
b) Test the hypothesis of question a) when EB F.
c) Test the hypothesis of question a) when EB F.

Exercise 4.4 The following model has been formulated to explain the spending on food (food):

$$food = \beta_1 + \beta_2\, inc + \beta_3\, rpfood + u$$

where inc is disposable income and rpfood is the relative price index of food compared to other consumer products. Taking a sample of observations for successive years, the following results are obtained:

food = 1.4 + .16 inc - .36 rpfood
(4.9) (.1) (.7)
R² = .996; $\sum \hat{u}_t^2$ = .196

(The numbers in parentheses are standard errors of the estimators.)

a) Test the null hypothesis that the coefficient of rpfood is less than 0.
b) Obtain a 95% confidence interval for the marginal propensity to consume food in relation to income.
c) Test the joint significance of the model.

Exercise 4.5 The following demand function for rental housing is formulated:

$$\ln(srenhous) = \beta_1 + \beta_2 \ln(prenhous) + \beta_3 \ln(inc) + \varepsilon$$

where srenhous is spending on rental housing, prenhous is the rental price, and inc is disposable income. Using a sample of 43 observations, we obtain the following results:

ln( srenhous ) 1.7ln prenhous .9ln inc 1. R² = .39 cov( β̂ ) .9 .85 .85 .9

a) Interpret the coefficients on ln(prenhous) and ln(inc).
b) Using a .1 significance level, test the null hypothesis that $\beta_2 = \beta_3 = 0$.
c) Test the null hypothesis that $\beta_2 = 0$ against the alternative that $\beta_2 < 0$.
d) Test the null hypothesis that $\beta_3 = 1$ against the alternative that $\beta_3 \neq 1$.
e) Test the null hypothesis that a simultaneous increase in housing prices and income has no proportional effect on housing demand.

Exercise 4.6 The following estimated models corresponding to average cost (ac) functions have been obtained, using a sample of 3 firms:

(1) ac = 17.46 + 35.7 qty
(11.97) (3.7)
R² = .838 RSS = 89

(2) ac = 31.7 - 85.39 qty + 6.73 qty² - 1.4 qty³
(9.44) (33.81) (11.61) (1.)
R² = .978 RSS = 197

where ac is the average cost and qty is the quantity produced. (The numbers in parentheses are standard errors of estimators.)

a) Test whether the quadratic and cubic terms of the quantity produced are significant in determining the average cost.
b) Test the overall significance of the model.

Exercise 4.7 Using a sample of 35 observations, the following models have been estimated to explain expenditures on coffee:

(1) ln(coffee) = 1.3 + .11 ln(inc) - 1.33 ln(cprice) + 1.35 ln(tprice)
(.1) (.3)
R² = .95 RSS = 54

(2) ln(coffee) = 19.9 + .14 ln(inc) - 1.4 ln(cprice)
(.) (.1)
RSS = 59

where inc is disposable income, cprice is coffee price and tprice is tea price. (The numbers in parentheses are standard errors of estimators.)

a) Test the overall significance of model (1).
b) The standard error of ln(tprice) is missing in model (1). Can you calculate it?
c) Test whether the price of tea is statistically significant.
d) How would you test the assumption that the price elasticity of coffee is equal but opposite to the price elasticity of tea? Detail the procedure.

Exercise 4.8 The following model has been formulated to analyse the determinants of air quality (airqual) in 3 Standard Metropolitan Statistical Areas (SMSA) of California:

$$airqual = \beta_1 + \beta_2\, popln + \beta_3\, medincm + \beta_4\, poverty + \beta_5\, fueoil + \beta_6\, valadd + u$$

where airqual is weight in μg/m³ of suspended particulate matter, popln is population in thousands, medincm is medium per capita income in dollars, poverty is the percentage of families with income less than poverty levels, fueloil is thousands of barrels of fuel oil consumed in industrial manufacturing, and valadd is value added by industrial manufactures in 197 in thousands of dollars.
Using the data in workfile airqualy, the above model has been estimated:

airqual = 97.35 .956 popln .17 medincm .54 poverty .31 fueoil .11 valadd
(1.19) (.311) (.55) (.89) (.17) (.5)

R² = .415 n = 3

(The numbers in parentheses are standard errors of the estimators.)

a) Interpret the coefficients on medincm, poverty and valadd.
b) Are the slope coefficients individually significant at 1%?
c) Test the joint significance of fueloil and valadd, knowing that

airqual = 97.67 .566 popln .1 medincm .174 poverty
(1.41) (.) (.39) (.78)
R² = .339 n = 3

d) If you omit the variable poverty in the first model, the following results are obtained:

airqual = 8.98 .53 popln .97 medincm .63 fueoil .37 valadd
(1.) (.31) (.55) (.17) (.8)
R² = .18 n = 3

Are the slope coefficients individually significant at 1% in the new model? Do you consider these results to be reasonable in comparison with those obtained in part b)? Comparing the R² of the two estimated models, what is the role played by poverty in determining air quality?
e) If you regress airqual using as regressors only the intercept and poverty, you will obtain R² = .37. Do you consider this value to be reasonable, taking into account the results obtained in part d)?

Exercise 4.9 With a sample of 39 observations, the following production functions were estimated by OLS:

1.3.3 outputt alabort captalt trendt ˆ 1.41.47 outputt blabort captalt = ˆ exp(.55 ) R =.9945 = R =.9937 output = g ˆ exp(.55 trendt ) R =.9549

a) Test the joint significance of labor and capital.
b) Test the significance of the coefficient of the variable trend.
c) Identify the statistical assumptions under which the tests carried out in the two previous sections are correct.

A further question: specify the population model of the first of the three previous specifications.

Exercise 4.10 A researcher has developed the following model:

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

Using a sample of 43 observations, the following results were obtained:

y ˆ -.6 1.44 x .48 x 1.111.7.5 1 ( XX ) .31.16.1 y 444 yˆ 44.9

a) Test that the intercept is less than 0.

b) Test that =.
c) Test the null hypothesis that $\beta_2 + 3\beta_3$ =.

Exercise 4.11 Given the function of production

$$q = a\, k^{\alpha}\, l^{\beta} \exp(u)$$

and using data from the Spanish economy over the past years, the following results were obtained:

ln(q) = .15 + .73 ln(k) + .47 ln(l)

419 95 66 1 95 3 5 XX RSS .17 66 5 19

a) Test the individual significance of the coefficients on k and l.
b) Test whether the parameter α is significantly different from 1.
c) Test whether there are increasing returns to scale.

Exercise 4.12 Let the following multiple regression model be:

y x x u 1 1

With a sample of 33 observations, this model is estimated by OLS, obtaining the following results:

yˆ 1.7 14.x .1x 1 4.1 .95 .66 ˆ XX 1 .95 3.8 .5 .66 .5 1.9

a) Test the null hypothesis = 1.
b) Test whether 1 7
c) Are the coefficients , 1, and individually significant?

Exercise 4.13 Using a sample of 3 companies, the following cost functions have been estimated:

a) cost = 17.46 + 35.7 x
(11.97) (3.7)
R² = .838 adjusted R² = .89 RSS = 89

b) cost = 31.7 - 85.39 x + 6.73 x² - 1.4 x³
(9.44) (33.81) (11.61) (1.)
R² = .978 adjusted R² = .974 RSS = 197

where cost is the average cost and x is the quantity produced. (The numbers in parentheses are standard errors of estimators.)

a) Which of the two models would you choose? What would be the criteria?
b) Test whether the quadratic and cubic terms of the quantity produced are significant in determining the average cost.
c) Test the overall significance of model b).

Exercise 4.14 A researcher formulates the following model:

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + u$$

Using a sample of 13 observations the following results are obtained:

yˆ 1. 1.8x .36x3 R² .5 n 13

var( β̂ ): .5 .1 .4 / .1 .16 .15 / .4 .15 .81

a) Test the null hypothesis that against the alternative hypothesis that.
b) Test the null hypothesis that 3 1 against the alternative hypothesis that 3 1, with a significance level of 5%.
c) Is the whole model significant?
d) Assuming that the variables in the estimated model are measured in natural logarithms, what is the interpretation of the coefficient for $x_3$?

Exercise 4.15 With a sample of 5 automotive companies the following production functions were estimated, taking the gross value added of automobile production (gva) as the endogenous variable and labor input (labor) and capital input (capital) as explanatory variables.

1) ln(gva) = 3.87 + .8 ln(labor) + 1.4 ln(capital)
(.11) (.4)
RSS = 54 R² = .75 adjusted R² = .7

2) ln(gva) = 19.9 + 1.4 ln(capital)
RSS = 59 R² = .84 adjusted R² = .81

3) ln(gva/labor) = 15. + .87 ln(capital/labor)
RSS = 38

(The numbers in parentheses are standard errors of estimators.)

a) Test the joint significance of both factors in the production function.
b) Test whether labor has a significant positive influence on the gross value added of automobile production.
c) Test the hypothesis of constant returns to scale. Explain your answer.

Exercise 4.16 With a sample of 35 annual observations two demand functions of Rioja wine have been estimated. The endogenous variable is spending on Rioja reserve wine (wine) and the explanatory variables are disposable income (inc), the average price of a bottle of Rioja reserve wine (pwinrio) and the average price of a bottle of Ribera Duero reserve wine (pwinduer). The results are as follows:

(1) ln(vino) = 1.3 + .11 ln(renta) - 1.33 ln(pvinrio) + 1.35 ln(pvinduer)
(.1) (.3) (.33)
R² = .95 RSS = 54

ln(vino) = 19.9 + .14 ln(renta) - 1.4 ln(pvinrio)
(.) (.1)
RSS = 59

(The numbers in parentheses are standard errors of the estimators.)

a) Test the joint significance of the first model.

b) Test whether the price of wine from Ribera del Duero has a significant influence, using two statistics that do not use the same information. Show that both procedures are equivalent.
c) How would you test the hypothesis that the price elasticity of Rioja wine is the same but with an opposite sign to the price elasticity of Ribera del Duero wine? Detail the procedure to follow.

Exercise 4.17 To analyze the demand for Ceylon tea (teceil) the following econometric model is formulated:

$$\ln(teceil) = \beta_1 + \beta_2 \ln(inc) + \beta_3 \ln(pteceil) + \beta_4 \ln(pteind) + \beta_5 \ln(pcobras) + u$$

where inc is the disposable income, pteceil is the price of tea in Ceylon, pteind is the price of tea in India and pcobras is the price of Brazilian coffee. With a sample of observations the following estimates were made:

ln(teceil) = .83 + .5 ln(inc) - 1.48 ln(pteceil) + 1.18 ln(pteind) + .19 ln(pcofbras)
(.17) (.98) (.69) (.16)
RSS = .477

ln(teceil · pteceil) = .74 + .6 ln(inc) + . ln(pcofbras)
(.16) (.15)
RSS = .6788

(The numbers in parentheses are standard errors of the estimators.)

a) Test the significance of disposable income.
b) Test the hypothesis that $\beta_3 = -1$ and $\beta_4 = 0$, and explain the procedure applied.
c) If instead of having information on RSS, only R² was known for each model, how would you proceed to test the hypothesis of section b)?

Exercise 4.18 The following fitted models are obtained to explain the deaths of children under 5 years per 1 live births (deathu5) using a sample of 64 countries.

1) deathun5 = 63.64 - .56 inc + .3 fertrate; R² = .777
(.19) (.1)

2) deathun5 = 168.31 - .55 inc + 1.76 femilrat + 1.87 fertrate; R² = .7474
(.18) (.5)

where inc is income per capita, femiltrat is the female illiteracy rate, and fertrate is the fertility rate. (The numbers in parentheses are standard errors of the estimators.)

a) Test the joint significance of income, illiteracy and fertility rates.
b) Test the significance of the fertility rate.
c) Which of the two models would you choose? Explain your answer.
Exercise 4.19 Using a sample of 3 annual observations, the following estimations were obtained to explain the car sales (car) of a particular brand:

car = 14.8 - 6.64 pcar + .98 adv
(6.48) (3.19) (.16)
$\sum \hat{u}^2$ = 185.; $\sum (car - \overline{car})^2$ = 13581.4

where pcar is the price of cars and adv is spending on advertising. (The numbers in parentheses are standard errors of the estimators.)

a) Are price and advertising expenditures significant together? Explain your answer.
b) Can you accept that prices have a negative influence on sales? Explain your answer.
c) Describe in detail how you would test the hypothesis that the impact of advertising expenditures on sales is greater than minus .4 times the impact of the price.

Exercise 4.20 In a study of the production costs (cost) of 6 coal mines, the following results are obtained:

cost = . - .14 dmec + 3.48 geodif + .14 absent
(3.4) (.5) (.) (.15)
cp ˆ cp u 19.6 18.48

where dmec is the degree of mechanization, geodif is a measurement of geological difficulties and absent is the percentage of absenteeism.

a) Test the significance of each of the model coefficients.
b) Test the overall significance of the model.

Exercise 4.21 With fifteen observations, the following estimation was obtained:

yˆ 8.4 .46 x .3 x 3
(1.) (.6)
R² .3

where the values between parentheses are standard deviations and the coefficient of determination is the adjusted one.

a) Is the coefficient of the variable $x_2$ significant?
b) Is the coefficient of the variable $x_3$ significant?
c) Discuss the joint significance of the model.

Exercise 4.22 Consider the following econometric specification:

$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + u$$

With a sample of 6 observations, the following estimations were obtained:

1) yˆ 3.5x .7x x u R² = .98
(1.9) (.) (1.5)

2) yˆ 1.5 3 ( x x ) .6 x u R² = .876
(.7) (.4)

(The t statistics are between brackets.)

a) Show that the following expressions for the F-statistic are equivalent:

$$F = \frac{(RSS_R - RSS_{UR})/q}{RSS_{UR}/(n-k)} \qquad F = \frac{(R^2_{UR} - R^2_R)/q}{(1 - R^2_{UR})/(n-k)}$$

b) Test the null hypothesis $\beta_2 = \beta_3$.

Exercise 4.23 In the estimation of the Brown model in exercise 3.19, using the workfile consumsp, we obtained the following results:

$$\widehat{conspc}_t = \underset{(84.88)}{7.156} + \underset{(.857)}{0.3965}\, incpc_t + \underset{(.93)}{0.5771}\, conspc_{t-1}$$

R² = .997 RSS = 18913 n = 56

Two additional estimations are now obtained:

$$\widehat{conspc_t - conspc_{t-1}} = \underset{(84.43)}{98.13} + \underset{(.83)}{.757}\,(incpc_t - conspc_{t-1})$$

R² = .179 RSS = 199474 n = 56

$$\widehat{conspc_t - incpc_t} = \underset{(84.88)}{7.156} - \underset{(.9)}{0.0264}\, incpc_t + \underset{(.93)}{0.5771}\,(conspc_{t-1} - incpc_t)$$

R² = .657 RSS = 18913 n = 56

(The numbers in parentheses are standard errors of the estimators.)

a) Test the significance of each of the coefficients for the first model.
b) Test that the coefficient on incpc in the first model is smaller than 0.5.
c) Test the overall significance of the first model.
d) Is it admissible that $\beta_2 + \beta_3 = 1$?
e) Show that by operating in the third model you can reach the same coefficients as in the first model.

Exercise 4.24 The following model was formulated to analyze the determinants of the median base salary in $ for graduating classes of 1 from the best American business schools (salMBAgr):

$$salMBAgr = \beta_1 + \beta_2\, tuition + \beta_3\, salMBApr + u$$

where tuition is tuition fees including all required fees for the entire program (but excluding living expenses) and salMBApr is the median annual salary in $ for incoming classes in 1. Using the data in MBAtu1, the previous model has been estimated:

salMBAgr = 4489 .1881 tuition .599 salMBApr
(5415) (.68) (.115)
R² = .73 n = 39

(The numbers in parentheses are standard errors of the estimators.)

a) Which of the regressors included in the above model are individually significant at 1% and at 5%?
b) Test the overall significance of the model.
c) What is the predicted value of salMBAgr for a graduate student who paid 1$ tuition fees in a two-year MBA master and previously had a salMBApr equal to 7$? How many years of work does the student require to offset tuition expenses? To answer this question, suppose that the discount rate equals the expected rate of salary increase and that the student received no wage income during the two master courses.
d) If we added the regressor rank1 (the rank of each business school in 1), the following results were obtained:

salMBAgr = 613 .19 tuition .466 salMBApr 3.6 rank1
(85) (.66) (.155) (85.13)
R² = .755 n = 39

Which of the regressors included in this model are individually significant at 5%?
What s the nterpretaton of the coeffcent on rank1? 44

e) The variable rank1 is based on three components: gradpoll is a rank based on surveys of MBA grads and contributes 45 percent to the final ranking; corppoll is a rank based on surveys of MBA recruiters and contributes 45 percent to the final ranking; and intellec is a rank based on a review of faculty research published over a five-year period in top academic journals and faculty books reviewed in The New York Times, The Wall Street Journal, and Bloomberg Businessweek over the same period; this last rank contributes 1 percent to the final ranking. In the following estimated model rank1 has been replaced by its three components:
salmbagr 7994.35tuition.3751salMBApr
(17) (.696) (.17)
33.8 gradpoll 33.89corppoll 113.36intellec
(94.54) (61.6) (64.9)
R² = .797 n=39
What is the weight in percentage of each one of these three components in determining salmbagr? Compare the results with the contribution of each in defining rank1.
f) Are gradpoll, corppoll and intellec jointly significant at 5%? Are they individually significant at 5%?

Exercise 4.5 (Continuation of exercise 3.1). The population model corresponding to this exercise is:
$\ln(\mathit{wage}) = \beta_1 + \beta_2\,\mathit{educ} + \beta_3\,\mathit{tenure} + \beta_4\,\mathit{age} + u$
Using workfile wage6sp, the previous model was estimated:
ln( wage) 1.565.448educ.177tenure.65age
(.73) (.35) (.19) (.16)
R² = .337 n=8
(The numbers in parentheses are standard errors of the estimators.)
a) Test the overall significance of the model.
b) Is tenure statistically significant at 1%? Is age positively significant at 1%?
c) Is it admissible that the coefficient of educ is equal to that of tenure? Is it admissible that the coefficient of educ is triple that of tenure? To answer these questions you have the following additional information:
ln( wage) 1.565.71educ.177( educ tenure).65age
(.73) (.4) (.19) (.16)
ln( wage) 1.565.8educ.177(3 educ tenure).65age
(.73) (.71) (.19) (.16)
Can you calculate the R² in the two equations in part c)? Please do it.

Exercise 4.6 (Continuation of exercise 3.13). Let us take the population model of this exercise as the reference model. In the estimated model, using workfile housecan, the standard errors of the coefficients appear between brackets:
price 418587bedrooms 1975bathrms 5.411lotsize
(3379) (17) (1785) (.388)
R² = .486 n=546
a) Test the overall significance of this model.

b) Test the null hypothesis that an additional bathroom has the same influence on housing prices as four additional bedrooms. Alternatively, test that an additional bathroom has more influence on housing prices than four additional bedrooms. (Additional information: $\operatorname{var}(\hat\beta_2)$ = 1455813; $\operatorname{var}(\hat\beta_3)$ = 318653; and $\operatorname{var}(\hat\beta_2, \hat\beta_3)$ = -764846).
c) If we add the regressor stories (number of stories excluding the basement) to the model, the following results are obtained:
price 4185bedrooms 1715bathrms
(363) (115) (1734)
5.49lotsize 7635stories
(.369) (18)
R² = .536 n=546
What do you think about the sign and magnitude of the coefficient on stories? Do you find it surprising? What is the interpretation of this coefficient? Test whether the number of stories has a significant influence on housing prices.
d) Repeat the tests in part b) with the model estimated in part c). (Additional information: $\operatorname{var}(\hat\beta_2)$ = 1475758; $\operatorname{var}(\hat\beta_3)$ = 386; and $\operatorname{var}(\hat\beta_2, \hat\beta_3)$ = -554381).

Exercise 4.7 (Continuation of exercise 3.14). Let us take the population model of this exercise as the reference model. Using workfile ceoforbes, the estimated model was the following:
ln( salary) 4.641.54roa.893ln( sales ).564 profits.1tenure
(.377) (.33) (.45) (.) (.3)
R² = .3 n=447
(The numbers in parentheses are standard errors of the estimators.)
a) Does roa have a significant effect on salary? Does roa have a significant positive effect on salary? Carry out both tests at the 1% and 5% significance levels.
b) If roa increases by points, by what percentage is salary predicted to increase?
c) Test the null hypothesis that the elasticity salary/sales is equal to .4.
d) If we add the regressor age, the following results are obtained:
ln( salary) 4.159.55roa.93ln( sales ).539 profits
(.44) (.33) (.43) (.)
.94tenure.88 age
(.35) (.43)
R² = .4 n=447
Are the estimated coefficients very different from the estimates in the reference model? What about the coefficient on tenure? Explain it.
e) Does age have a significant effect on the salary of a CEO?
f) Is it admissible that the coefficient of age is equal to the coefficient of tenure? (Additional information: $\operatorname{var}(\hat\beta_5)$ = 1.4E-5; $\operatorname{var}(\hat\beta_6)$ = 1.8E-5; and $\operatorname{var}(\hat\beta_5, \hat\beta_6)$ = -6.9E-6).

Exercise 4.8 (Continuation of exercise 3.15). Let us take the population model of this exercise as the reference model. Using workfile rdspan, the estimated model was the following:

rdntens 1.8168.148 ln( sales ).11exponsal
(.48) (.78) (.1)
R² = .48 n=1983
(The numbers in parentheses are standard errors of the estimators.)
a) Is the sales variable individually significant at 1%?
b) Test the null hypothesis that the coefficient on sales is equal to.?
c) Test the overall significance of the reference model.
d) If we add the regressor ln(workers), the following results are obtained:
rdntens.48.8585ln( sales).149exponsal.34 ln( workers)
(.75) (.687) (.1) (.9198)
R² = .55 n=1983
Is sales individually significant at 1% in the new estimated model?
e) Test the null hypothesis that the coefficient on ln(workers) is greater than .5.

Exercise 4.9 (Continuation of exercise 3.16). Let us take the population model of this exercise as the reference model. Using workfile hedcarsp, the corresponding fitted model is the following:
ln( price) 14.4.581cd.383hpweight.7854 fueleff
(.154) (.438) (.79) (.1)
R² = .83 n=14
(The numbers in parentheses are standard errors of the estimators.)
a) Which of the regressors included in the reference model are individually significant at 1%?
b) Add the variable volume to the reference model. Does volume have a statistically significant effect on ln(price)? Does volume have a statistically significant positive effect on ln(price)?
c) Is it admissible that the coefficient of volume estimated in part b) is equal in magnitude but opposite in sign to the coefficient of fueleff?
d) Add the variables length, width and height to the model estimated in part b). Taking into account that volume = length × width × height, is there perfect multicollinearity in the new model? Why or why not? Estimate the new model if it is possible.
e) Add the variable ln(volume) to the reference model. Test the null hypothesis that the price/volume elasticity is equal to 1.
f) What happens if you add the regressors ln(length), ln(width) and ln(height) to the model estimated in part e)?

Exercise 4.3 (Continuation of exercise 3.17). Let us take the population model of this exercise as the reference model. Using workfile tmuse3, the corresponding fitted model is the following:
houswork 141.93.85educ.917 hhnc 1.767 age.89 padwork
(3.7) (1.61) (.539) (.311) (.9)
R² = .144 n=1
(The numbers in parentheses are standard errors of the estimators.)
a) Which of the regressors included in the reference model are individually significant at 5% and at 1%?

b) Estimate a model in which you could test directly whether one additional year of education has the same effect on time devoted to housework as two additional years of age. What is your conclusion?
c) Test the joint significance of educ and hhnc.
d) Run a regression in which you add the variable chldup3 (number of children up to three years) to the reference model. In the new model, which of the regressors are individually significant at 5% and at 1%?
e) In the model formulated in d), what is the most influential variable? Why?

Exercise 4.31 (Continuation of exercise 3.18). Let us take the population model of this exercise as the reference model. Using workfile hdr1, the corresponding fitted model is the following:
stsfglo.375.7 gnpc.858lfexpec
(.584) (.617) (.9)
R² = .64 n=144
(The numbers in parentheses are standard errors of the estimators.)
a) Which of the regressors included in the reference model are individually significant at 1%?
b) Run a regression by adding the variables popnosan (population in percentage without access to improved sanitation services) and gnrank (rank in gn) to the reference model. Which of the regressors included in the new model are individually significant at 1%? Interpret the coefficients on popnosan and gnrank.
c) Are popnosan and gnrank jointly significant?
d) Test the overall significance of the model formulated in b).

Exercise 4.3 Using a sample of 4 observations, the following model has been estimated:
ŷ t 67.591 1.8xt
For observation 43, it is known that the value of x is 1571.9.
a) Calculate the point predictor for observation 43.
b) Knowing that the variance of the prediction error $\hat e_{43} = y_{43} - \hat y_{43}$ is equal to (4.948), calculate a 9% probability interval for the individual value.

Exercise 4.33 Besides the estimation presented in exercise 4.3, the following estimation of the Brown consumption function is also available:
conspc t 179.3965( ncpct 135).5771( conspct 1 1793.6)
(64.35) (.857) (.93)
R² = .997 RSS=18913 n=56
(The numbers in parentheses are standard errors of the estimators.)
a) Obtain the point predictor for consumption per capita in 11, knowing that ncpc 11 =135 and conspc 1 =1793.6.
b) Obtain a 95% confidence interval for the expected value of consumption per capita in 11.
c) Obtain a 95% prediction interval for the individual value of consumption per capita in 11.
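In the estimation given in exercise 4.33 the regressors enter as deviations from the values at which the prediction is made, so the intercept can be read as the point predictor and its standard error as the standard error of the expected value. Under that reading, parts b) and c) differ only in the standard error used. The Python sketch below illustrates the arithmetic; the function name `interval_sketch`, its argument names and the numbers in the usage lines are hypothetical placeholders rather than values from the exercise, and the critical value `t_crit` is assumed to be looked up in a t table with n - k degrees of freedom.

```python
import math


def interval_sketch(y_hat, se_mean, rss, n, k, t_crit):
    """Return (confidence interval, prediction interval) around y_hat.

    y_hat   : point predictor (intercept of the reparametrized regression)
    se_mean : standard error of that intercept
    rss     : residual sum of squares of the regression
    n, k    : number of observations and of estimated coefficients
    t_crit  : two-sided critical value from a t table (assumed supplied)
    """
    # Unbiased estimate of the disturbance variance.
    sigma2_hat = rss / (n - k)
    # The prediction error adds the disturbance variance to the
    # sampling variance of the estimated expected value.
    se_pred = math.sqrt(se_mean ** 2 + sigma2_hat)
    ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
    pi = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    return ci, pi


# Hypothetical numbers, chosen only to keep the arithmetic easy to follow.
ci, pi = interval_sketch(y_hat=100.0, se_mean=3.0, rss=1600.0,
                         n=103, k=3, t_crit=2.0)
print("CI for the expected value:", ci)
print("Prediction interval:", pi)
```

The prediction interval is always wider than the confidence interval, because it has to cover the disturbance of the individual observation as well as the sampling error of the estimated mean.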

Exercise 4.34 (Continuation of exercise 4.3) Answer the following questions:
a) Using the first estimation in exercise 4.3, obtain a prediction for houswork (minutes devoted to housework per day), when you plug into the equation educ=1 (years), hhnc=1 (euros per month), age=5 (years) and padwork=4 (minutes per day).
b) Run a regression, using workfile tmuse3, which allows you to calculate a 95% CI with the characteristics used in part a).
c) Obtain a 95% prediction interval for the individual value of houswork with the characteristics used in part a).

Exercise 4.35 (Continuation of exercise 4.9) Answer the following questions:
a) Plug into the first equation of exercise 4.9 the values cd= (cubic inch displacement), hpweight=1 (ratio horsepower/weight in kg expressed as percentage), and fueleff=6 (minutes per day). Obtain the point predictor of consumption per capita in 11, knowing that ncpc 11 =1793.6 and conspc 1 =135.
b) Obtain a consistent estimate of price with the characteristics used in part a).
c) Run a regression that allows you to calculate a 95% CI with the characteristics used in part a).
d) Obtain a 95% prediction interval for the individual value of the consumption per capita 11.
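For part b) of exercise 4.35, one common way to move from a prediction of ln(price) to a consistent prediction of price is to multiply the exponentiated prediction by exp(σ̂²/2), where σ̂² is the estimated disturbance variance of the log model. This adjustment relies on the assumption that the disturbances are normally distributed, and it may not be the method the exercise has in mind. The Python sketch below shows the calculation; the function name and the numbers in the usage lines are hypothetical and are not taken from the exercise.

```python
import math


def level_prediction_from_log(ln_y_hat, sigma2_hat):
    """Return (naive, adjusted) predictions of y from a model for ln(y).

    ln_y_hat   : predicted value of ln(y) from the fitted log model
    sigma2_hat : estimated disturbance variance of the log model,
                 RSS / (n - k)
    """
    # Simply exponentiating tends to underestimate the expected level.
    naive = math.exp(ln_y_hat)
    # Correction factor under the (assumed) normality of the disturbances.
    adjusted = math.exp(sigma2_hat / 2.0) * naive
    return naive, adjusted


# Hypothetical inputs: a predicted log price and a disturbance variance.
naive, adjusted = level_prediction_from_log(ln_y_hat=math.log(100.0),
                                            sigma2_hat=0.5)
print("naive prediction:", naive)
print("adjusted prediction:", adjusted)
```

Because exp(σ̂²/2) is greater than one whenever σ̂² is positive, the adjusted prediction always lies above the naive one, and the gap grows with the residual variance of the log model.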