THE TWO-VARIABLE LINEAR REGRESSION MODEL


Herman J. Bierens
Pennsylvania State University
April 30, 202

[1] These lecture notes are based on lecture notes that I wrote while teaching at the University of California, San Diego, in the winter of 1987.

1. Introduction

Suppose you are an economics or business major in a college close to the beach in the southern part of the US, for example southern California, where the weather is almost always nice the whole year around. In order to support yourself through college, you have started your own (weekend) business: an ice cream parlor on the beach. You have experienced that on hot weekends you usually sell more ice cream than on cold weekends. Also, you have recorded the average temperature and the sales of ice cream during eight weekends. Let $Y_j$ be the sales of ice cream on weekend $j$, measured in \$100, and let $X_j$ be the average temperature on weekend $j$, measured in units of 10 degrees Fahrenheit:

Table 1: Ice cream data

  Weekend j    Sales Y_j (unit = $100)    Temperature X_j (unit = 10 degrees)
  1             8                          5
  2            10                          7
  3             8                          6
  4            13                          8
  5            15                         10
  6            14                          9
  7            11                          7
  8             9                          8

You want to use this information to forecast next weekend's sales of ice cream, given a good forecast of next weekend's temperature. Such a forecast of the sales will enable you to
reduce your cost by adjusting your purchase of ice cream to the expected demand, because the ice cream you don't sell has to be thrown away.

Let your forecasting scheme be $\hat{Y} = \hat{\alpha} + \hat{\beta} X$, i.e., given the temperature of $X$ times 10 degrees and given the values of $\hat{\alpha}$ and $\hat{\beta}$, $\hat{Y}$ times \$100 will be your forecast of the sales of ice cream. This forecasting scheme, together with the points $(X_j, Y_j)$, $j = 1,2,\dots,8$, is plotted in Figure 1.

Figure 1: Scatter plot of $(X_j, Y_j)$, $j = 1,2,\dots,8$, together with the line $\hat{Y} = \hat{\alpha} + \hat{\beta} X$.

The best values for $\hat{\alpha}$ and $\hat{\beta}$ are those for which the forecast error (= actual sales minus forecasted sales) is minimal. However, you do not know yet the actual sales in the next weekend, but you do know the actual sales in the eight weekends for which you have recorded your sales and the corresponding temperature. So what you could do is to forecast the sales of ice cream on each of these eight weekends and to determine $\hat{\alpha}$ and $\hat{\beta}$ such that the forecast errors are minimal. Because forecast errors can be positive and negative, as can be seen from Figure 1, the sum of the forecast errors is not a good measure of the performance of your forecasting
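scheme, because large positive errors can be offset by large negative errors. Therefore, use the sum of squared errors as your measure of the accuracy of your forecasts:

$$Q(\hat{\alpha},\hat{\beta}) = \sum_{j=1}^{n} (Y_j - \hat{Y}_j)^2 = \sum_{j=1}^{n} (Y_j - \hat{\alpha} - \hat{\beta} X_j)^2,$$

where $n$ is the sample size ($n = 8$ in our example), and minimize $Q(\hat{\alpha},\hat{\beta})$ with respect to $\hat{\alpha}$ and $\hat{\beta}$. It can be shown (see the Appendix) that $Q(\hat{\alpha},\hat{\beta})$ is minimal for

$$\hat{\beta} = \frac{\sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y})}{\sum_{j=1}^{n} (X_j - \bar{X})^2} = \frac{\sum_{j=1}^{n} (X_j - \bar{X}) Y_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2}, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}, \tag{1}$$

where $\bar{X} = (1/n)\sum_{j=1}^{n} X_j$ and $\bar{Y} = (1/n)\sum_{j=1}^{n} Y_j$.

In the ice cream parlor case we have $n = 8$, $\bar{X} = 7.5$, $\bar{Y} = 11$, $\sum_{j} X_j^2 = 468$, $\sum_{j} X_j Y_j = 687$,

$$\sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y}) = \sum_{j=1}^{n} X_j Y_j - n\bar{X}\bar{Y} = 27, \qquad \sum_{j=1}^{n} (X_j - \bar{X})^2 = \sum_{j=1}^{n} X_j^2 - n\bar{X}^2 = 18,$$

so that $\hat{\beta} = 1.5$ and $\hat{\alpha} = -0.25$. Thus, our best forecasting scheme is $\hat{Y} = -0.25 + 1.5X$. This is the straight line in Figure 1.

Now suppose that the forecast of next weekend's temperature is 75 degrees. Then $X = 7.5$, hence $\hat{Y} = -0.25 + 1.5 \times 7.5 = 11$. Therefore, the best forecast of next weekend's sales is $\hat{Y} \times \$100 = \$1,100$.

2. The two-variable linear regression model

In order to answer the question of how good this forecast is, we have to make assumptions about the true relationship between the dependent variable $Y_j$ and the independent variable $X_j$ (also called the explanatory variable). The true relationship we are going to assume is the two-variable linear regression model:

$$Y_j = \alpha + \beta X_j + U_j, \qquad j = 1,2,\dots,n. \tag{2}$$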
The $U_j$'s are random error variables, called error terms, for which we assume:

Assumption I: The $U_j$'s are independent and identically distributed (i.i.d.) random variables.

Assumption II: The mathematical expectation of $U_j$ equals zero: $E(U_j) = 0$ for $j = 1,2,\dots,n$.

Assumption III: The variance $\sigma^2 = \mathrm{var}(U_j) = E[(U_j - E(U_j))^2] = E[U_j^2]$ of the $U_j$'s is constant and finite.

Regarding the explanatory variables $X_j$ we shall assume for the time being that

Assumption IV: The independent variables $X_j$ are nonrandom.

This assumption is not strictly necessary, and is actually quite unrealistic in economics, but will be made for the sake of convenience, as it will ease the argument. Finally, we will assume that the errors are normally distributed:

Assumption V: The errors $U_j$ are $N(0,\sigma^2)$ distributed.

In particular, we shall need the latter assumption in order to say something about the reliability of the forecast. These assumptions will be relaxed later on.

3. The properties of $\hat{\alpha}$ and $\hat{\beta}$

Although we have motivated model (2) by the need to forecast out-of-sample values of the dependent variables $Y_j$, a linear regression model is more often used for testing economic hypotheses. For example, let $Y_j$ be the hourly wage of wage earner $j$ in a random sample of size $n$ of wage earners, and let $X_j$ be a gender indicator, say $X_j = 1$ if person $j$ is a female, and $X_j = 0$ if person $j$ is a male. If you suspect gender discrimination in the workplace, you can test this suspicion by testing the null hypothesis that $\beta = 0$ (no gender discrimination) against one of
three possible alternative hypotheses:

(a) $\beta \neq 0$: women are paid different hourly wages than men, either higher or lower;
(b) $\beta > 0$: women are paid higher hourly wages than men;
(c) $\beta < 0$: women are paid lower hourly wages than men.

The last hypothesis is usually what is meant by gender discrimination. A test for the null hypothesis $\beta = 0$ against one of these alternative hypotheses can be based on the estimate $\hat{\beta}$ of $\beta$, provided that we know how $\hat{\beta}$ is related to $\beta$. It will be shown below that $\hat{\alpha}$ and $\hat{\beta}$ are indeed reasonable approximations of $\alpha$ and $\beta$, respectively, possessing particular desirable properties.

In general an estimator of an unknown parameter is a function of the data that serves as an approximation of the parameter involved. It follows from (1) that $\hat{\alpha}$ and $\hat{\beta}$ are functions of the data $(Y_1,X_1),\dots,(Y_n,X_n)$. Because $\hat{\alpha}$ and $\hat{\beta}$ will be used as approximations of $\alpha$ and $\beta$, respectively, and were obtained by minimizing the sum of squared errors, we will call $\hat{\alpha}$ and $\hat{\beta}$ the Ordinary [2] Least Squares (OLS) estimators of $\alpha$ and $\beta$, respectively.

3.1 Unbiasedness

The first property of $\hat{\alpha}$ and $\hat{\beta}$ is that they are unbiased estimators of $\alpha$ and $\beta$:

Proposition 1. Under Assumptions II and IV the OLS estimators $\hat{\alpha}$ and $\hat{\beta}$ are unbiased, which means that $E[\hat{\alpha}] = \alpha$ and $E[\hat{\beta}] = \beta$.

This result follows from the fact that we can write

$$\hat{\alpha} = \alpha + \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j, \qquad \hat{\beta} = \beta + \frac{\sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \tag{3}$$

See the Appendix.

[2] The estimators $\hat{\alpha}$ and $\hat{\beta}$ are called "Ordinary" least squares estimators to distinguish them from "Nonlinear" least squares estimators.
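To make concrete the remark that the OLS estimators are functions of the data, here is a minimal Python sketch of formula (1), applied to the ice cream data of Table 1. The function name `ols` and the variable names are illustrative choices, not part of the original notes.

```python
def ols(x, y):
    """Return (alpha_hat, beta_hat) computed from formula (1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of beta_hat in formula (1)
    sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
    sxx = sum((xj - x_bar) ** 2 for xj in x)
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Ice cream data from Table 1 (temperature in units of 10 degrees F, sales in $100)
temperature = [5, 7, 6, 8, 10, 9, 7, 8]
sales = [8, 10, 8, 13, 15, 14, 11, 9]

alpha_hat, beta_hat = ols(temperature, sales)
print(alpha_hat, beta_hat)  # the notes report -0.25 and 1.5
```

Any other data set of paired observations can be passed to the same function, which is all that "function of the data" means here.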
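3.2 The variances of $\hat{\alpha}$ and $\hat{\beta}$

Our next issue concerns the variances of $\hat{\alpha}$ and $\hat{\beta}$. For deriving these variances the following two lemmas are convenient.

Lemma 1. Let $U_1, U_2, \dots, U_n$ be independent random variables with zero mathematical expectation (thus $E(U_j) = 0$) and variance $\sigma^2$ (thus $E[(U_j - E(U_j))^2] = E(U_j^2) = \sigma^2$). Let $v_1, v_2, \dots, v_n$ and $w_1, w_2, \dots, w_n$ be given constants. Then

$$E\left[ \left( \sum_{j=1}^{n} v_j U_j \right) \left( \sum_{j=1}^{n} w_j U_j \right) \right] = \sigma^2 \sum_{j=1}^{n} v_j w_j.$$

Proof. See the Appendix.

Note that if we choose $v_j = w_j$ for $j = 1,2,\dots,n$ in Lemma 1 then it reads:

Lemma 2. Let $U_1, U_2, \dots, U_n$ be independent random variables with zero mathematical expectation and variance $\sigma^2$. Let $w_1, w_2, \dots, w_n$ be given constants. Then

$$E\left[ \left( \sum_{j=1}^{n} w_j U_j \right)^2 \right] = \sigma^2 \sum_{j=1}^{n} w_j^2.$$

Using (3) and Lemmas 1 and 2 it can be shown that

Proposition 2. Under Assumptions I-IV,

$$\mathrm{var}(\hat{\alpha}) = \frac{\sigma^2 \sum_{j=1}^{n} X_j^2}{n \sum_{j=1}^{n} (X_j - \bar{X})^2} = \sigma^2_{\hat{\alpha}}, \text{ say}, \qquad \mathrm{var}(\hat{\beta}) = \frac{\sigma^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} = \sigma^2_{\hat{\beta}}, \text{ say},$$

and

$$\mathrm{cov}(\hat{\alpha},\hat{\beta}) = \frac{-\sigma^2 \bar{X}}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \tag{4}$$

Proof. See the Appendix.

3.3 Normality of $\hat{\alpha}$ and $\hat{\beta}$

If we also assume normality of the error terms $U_j$ then $\hat{\alpha}$ and $\hat{\beta}$ are also normally distributed. This result follows from the following lemma.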
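Lemma 3. Let $Z_1, Z_2, \dots, Z_m$ be independent $N(\mu,\sigma^2)$ distributed random variables and let $w_1, \dots, w_m$ be constants. Then $\sum_{j=1}^{m} w_j Z_j$ is distributed $N\left[ \left( \sum_{j=1}^{m} w_j \right)\mu, \left( \sum_{j=1}^{m} w_j^2 \right)\sigma^2 \right]$.

The proof of this lemma requires advanced probability theory and is therefore omitted. It now follows straightforwardly from Proposition 2, Lemma 3, and (3) that:

Proposition 3. Under Assumptions I-V,

$$\hat{\alpha} - \alpha \sim N\left[ 0, \frac{\sigma^2 \sum_{j=1}^{n} X_j^2}{n \sum_{j=1}^{n} (X_j - \bar{X})^2} \right], \qquad \hat{\beta} - \beta \sim N\left[ 0, \frac{\sigma^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right], \tag{5}$$

where $\sim$ is the symbol for "is distributed as". Moreover, applying Lemma 3 again for $m = 1$ it follows from (5) (Exercise: Why?) that

Proposition 4. Under Assumptions I-V,

$$\frac{(\hat{\alpha} - \alpha)\sqrt{n \sum_{j=1}^{n} (X_j - \bar{X})^2}}{\sigma \sqrt{\sum_{j=1}^{n} X_j^2}} \sim N[0,1], \qquad \frac{(\hat{\beta} - \beta)\sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2}}{\sigma} \sim N[0,1]. \tag{6}$$

These results play a key role in testing hypotheses about $\alpha$ and $\beta$. The only problem that prevents us from using these results for testing is that $\sigma$ is unknown. This problem will be addressed in the next section.

4. How to estimate the error variance $\sigma^2$?

If $\alpha$ and $\beta$ were known then we could estimate $\sigma^2$ by

$$\tilde{\sigma}^2 = \frac{1}{n} \sum_{j=1}^{n} (Y_j - \alpha - \beta X_j)^2 = \frac{1}{n} \sum_{j=1}^{n} U_j^2. \tag{7}$$

However, $\alpha$ and $\beta$ are not known, but we do have OLS estimators of $\alpha$ and $\beta$. This suggests to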
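replace $\alpha$ and $\beta$ in (7) by their OLS estimators:

$$\tilde{\sigma}^2 = \frac{1}{n} \sum_{j=1}^{n} (Y_j - \hat{\alpha} - \hat{\beta} X_j)^2 = \frac{1}{n} \sum_{j=1}^{n} \hat{U}_j^2, \tag{8}$$

where

$$\hat{U}_j = Y_j - \hat{\alpha} - \hat{\beta} X_j \tag{9}$$

is called the regression residual. However, the estimator (8) is biased, due to the fact that

Proposition 5. Under Assumptions I-V, $E\left[ \sum_{j=1}^{n} \hat{U}_j^2 \right] = (n-2)\sigma^2$.

Proof: See the Appendix.

This result suggests to use

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{j=1}^{n} \hat{U}_j^2 \tag{10}$$

as an estimator of $\sigma^2$ instead of (8), because then by Proposition 5, $\hat{\sigma}^2$ is an unbiased estimator of $\sigma^2$:

$$E[\hat{\sigma}^2] = \sigma^2. \tag{11}$$

The sum $\sum_{j=1}^{n} \hat{U}_j^2$ is called the Sum of Squared Residuals, shortly SSR, or also the Residual Sum of Squares (RSS), and $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$ is called the Standard Error of the Residuals, shortly SER. Thus,

$$SSR = \sum_{j=1}^{n} \hat{U}_j^2, \qquad SER = \sqrt{\frac{\sum_{j=1}^{n} \hat{U}_j^2}{n-2}} = \sqrt{\frac{SSR}{n-2}} \;(= \hat{\sigma}). \tag{12}$$

Finally, note that the sum of squared residuals can be computed as follows:

$$SSR = \sum_{j=1}^{n} (Y_j - \bar{Y})^2 - \hat{\beta}^2 \sum_{j=1}^{n} (X_j - \bar{X})^2. \tag{13}$$

See the Appendix.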
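As a check on (9), (10), (12) and (13), the following Python sketch computes the residuals for the ice cream data of Table 1 and compares the SSR obtained directly from the residuals with the shortcut formula (13). All variable names are illustrative.

```python
import math

# Ice cream data from Table 1
x = [5, 7, 6, 8, 10, 9, 7, 8]
y = [8, 10, 8, 13, 15, 14, 11, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xj - x_bar) ** 2 for xj in x)
sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

# Residuals, formula (9)
residuals = [yj - alpha_hat - beta_hat * xj for xj, yj in zip(x, y)]

# SSR computed directly and via the shortcut (13)
ssr_direct = sum(u ** 2 for u in residuals)
tss = sum((yj - y_bar) ** 2 for yj in y)
ssr_shortcut = tss - beta_hat ** 2 * sxx

# Unbiased variance estimator (10) and the SER from (12)
sigma2_hat = ssr_direct / (n - 2)
ser = math.sqrt(sigma2_hat)

print(ssr_direct, ssr_shortcut)  # both should equal 11.5 up to rounding
print(sigma2_hat, ser)
```

The divisor n - 2 rather than n reflects the two estimated parameters, as Proposition 5 indicates.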
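5. Standard errors, t-values and p-values of the OLS estimators

The variances of $\hat{\alpha}$ and $\hat{\beta}$ can now be estimated by replacing $\sigma^2$ in (4) by $\hat{\sigma}^2$:

$$\text{Estimated } \mathrm{var}(\hat{\alpha}) = \frac{\hat{\sigma}^2 \sum_{j=1}^{n} X_j^2}{n \sum_{j=1}^{n} (X_j - \bar{X})^2} = \hat{\sigma}^2_{\hat{\alpha}}, \text{ say}, \qquad \text{Estimated } \mathrm{var}(\hat{\beta}) = \frac{\hat{\sigma}^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} = \hat{\sigma}^2_{\hat{\beta}}, \text{ say}. \tag{14}$$

Then $\hat{\sigma}_{\hat{\alpha}} = \sqrt{\hat{\sigma}^2_{\hat{\alpha}}}$ is called the standard error of $\hat{\alpha}$, also denoted by $SE(\hat{\alpha})$, and $\hat{\sigma}_{\hat{\beta}} = \sqrt{\hat{\sigma}^2_{\hat{\beta}}}$ is called the standard error of $\hat{\beta}$, also denoted by $SE(\hat{\beta})$.

If we replace $\sigma$ in Proposition 4 by the SER, $\hat{\sigma}$, the standard normality results involved change:

Proposition 6. Under Assumptions I-V,

$$\frac{\hat{\alpha} - \alpha}{\hat{\sigma}_{\hat{\alpha}}} = \frac{(\hat{\alpha} - \alpha)\sqrt{n \sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma} \sqrt{\sum_{j=1}^{n} X_j^2}} \sim t_{n-2}, \qquad \frac{\hat{\beta} - \beta}{\hat{\sigma}_{\hat{\beta}}} = \frac{(\hat{\beta} - \beta)\sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma}} \sim t_{n-2}. \tag{15}$$

The proof of Proposition 6 is based on the fact that under these assumptions, $SSR/\sigma^2$ is distributed $\chi^2_{n-2}$ and is independent of $\hat{\alpha}$ and $\hat{\beta}$, but the proof involved requires advanced probability theory and is therefore omitted.

For large degrees of freedom the t distribution is approximately equal to the standard normal distribution. Moreover, due to the central limit theorem, Proposition 4 holds approximately if $n$ is large, even if the errors are not normally distributed. Therefore we also have:

Proposition 7. If the sample size $n$ is large then under Assumptions I-IV we have approximately,

$$\frac{\hat{\alpha} - \alpha}{\hat{\sigma}_{\hat{\alpha}}} = \frac{(\hat{\alpha} - \alpha)\sqrt{n \sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma} \sqrt{\sum_{j=1}^{n} X_j^2}} \sim N(0,1), \qquad \frac{\hat{\beta} - \beta}{\hat{\sigma}_{\hat{\beta}}} = \frac{(\hat{\beta} - \beta)\sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma}} \sim N(0,1). \tag{16}$$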
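The estimated variances in (14) are straightforward to evaluate numerically. The sketch below does so for the ice cream data of Table 1 and also evaluates the ratios in (15) at the hypothesized values $\alpha = 0$ and $\beta = 0$; variable names are illustrative.

```python
import math

# Ice cream data from Table 1
x = [5, 7, 6, 8, 10, 9, 7, 8]
y = [8, 10, 8, 13, 15, 14, 11, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xj - x_bar) ** 2 for xj in x)
sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar

# Unbiased error variance estimate, formula (10)
ssr = sum((yj - alpha_hat - beta_hat * xj) ** 2 for xj, yj in zip(x, y))
sigma2_hat = ssr / (n - 2)

# Standard errors: square roots of the estimated variances in (14)
se_alpha = math.sqrt(sigma2_hat * sum(xj ** 2 for xj in x) / (n * sxx))
se_beta = math.sqrt(sigma2_hat / sxx)

# Ratios from (15), evaluated at alpha = 0 and beta = 0
ratio_alpha = alpha_hat / se_alpha
ratio_beta = beta_hat / se_beta

print(se_alpha, se_beta)
print(ratio_alpha, ratio_beta)
```

Under Proposition 6 each ratio would be a drawing from the t distribution with n - 2 = 6 degrees of freedom if the corresponding parameter were zero.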
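The results in Proposition 6 now enable us to test hypotheses about $\alpha$ and $\beta$. In particular the null hypothesis that $\beta = 0$ is of importance, because this hypothesis implies that $X_j$ has no effect on $Y_j$. The test statistic for testing this hypothesis is the t-value (or t-statistic) of $\hat{\beta}$:

$$\hat{t}_{\hat{\beta}} \;(= \text{t-value of } \hat{\beta}) \;\stackrel{\text{def.}}{=}\; \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} = \frac{\hat{\beta}\sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma}} \sim t_{n-2} \text{ if } \beta = 0. \tag{17}$$

If $\beta > 0$ and $n \to \infty$ then the t-value of $\hat{\beta}$ converges in probability to $+\infty$, and if $\beta < 0$ and $n \to \infty$ then the t-value of $\hat{\beta}$ converges in probability to $-\infty$. Moreover, if the sample size $n$ is large then by Proposition 7 we may use the standard normal distribution instead of the t distribution to find critical values of the test.

Similarly,

$$\hat{t}_{\hat{\alpha}} \;(= \text{t-value of } \hat{\alpha}) \;\stackrel{\text{def.}}{=}\; \frac{\hat{\alpha}}{\hat{\sigma}_{\hat{\alpha}}} \sim t_{n-2} \text{ if } \alpha = 0. \tag{18}$$

However, the hypothesis $\alpha = 0$ is often of no interest.

In the ice cream example,

$$\sum_{j=1}^{n} (X_j - \bar{X})^2 = 18 \;\Rightarrow\; \sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2} = 4.2426, \qquad \sum_{j=1}^{n} (Y_j - \bar{Y})^2 = \sum_{j=1}^{n} Y_j^2 - n\bar{Y}^2 = 1020 - 8 \times 11^2 = 52,$$

and by (13),

$$\hat{\sigma}^2 = \frac{1}{n-2} \sum_{j=1}^{n} \hat{U}_j^2 = \frac{1}{n-2} \sum_{j=1}^{n} (Y_j - \bar{Y})^2 - \frac{\hat{\beta}^2}{n-2} \sum_{j=1}^{n} (X_j - \bar{X})^2 = \frac{52 - (1.5)^2 \times 18}{8-2} = 1.9167 \;\Rightarrow\; \hat{\sigma} = 1.3844.$$

Hence,

$$\hat{t}_{\hat{\beta}} = \frac{\hat{\beta}\sqrt{\sum_{j=1}^{n} (X_j - \bar{X})^2}}{\hat{\sigma}} = 4.597. \tag{19}$$

Assuming that the conditions of Proposition 6 hold, the null hypothesis $H_0: \beta = 0$ can be tested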
against the alternative hypothesis $H_1: \beta \neq 0$ using the two-sided t-test at, say, the 5% significance level, as follows. Under the null hypothesis, (19) is a random drawing from the t distribution with $n - 2 = 6$ degrees of freedom. Look up in the table of the t distribution the value $t_*$ such that for $T \sim t_6$, $P[|T| > t_*] = 0.05$. This value is $t_* = 2.447$. Then accept the null hypothesis if $-t_* = -2.447 \leq \hat{t}_{\hat{\beta}} \leq 2.447 = t_*$, and reject the null hypothesis in favor of the alternative hypothesis if $|\hat{t}_{\hat{\beta}}| > t_* = 2.447$. Thus, in the ice cream example we reject the null hypothesis $H_0: \beta = 0$ because $\hat{t}_{\hat{\beta}} = 4.597 > 2.447 = t_*$. This test is illustrated in Figure 2 below. The curved line in Figure 2 is the density of the t distribution with 6 degrees of freedom. The grey areas are each 0.025, so that the total grey area is 0.05.

Figure 2: Two-sided t-test of $H_0: \beta = 0$ against the alternative hypothesis $H_1: \beta \neq 0$.

The null hypothesis $H_0: \beta = 0$ can be tested against the alternative hypothesis $H_1: \beta > 0$ at the 5% significance level by the right-sided t-test. Now look up in the table of the t distribution the value $t_*$ such that for $T \sim t_6$, $P[T > t_*] = 0.05$. This value corresponds to the critical value of the two-sided t-test at the 10% significance level: $t_* = 1.943$. Then accept the null hypothesis if $\hat{t}_{\hat{\beta}} \leq t_* = 1.943$, and reject the null hypothesis in favor of the alternative hypothesis if $\hat{t}_{\hat{\beta}} > t_* = 1.943$. Thus, in the ice cream case we reject the null hypothesis
$H_0: \beta = 0$ in favor of the alternative hypothesis $H_1: \beta > 0$. This right-sided t-test is illustrated in Figure 3 below. Again, the curved line in Figure 3 is the density of the t distribution with 6 degrees of freedom, and the grey area is 0.05.

Figure 3: Right-sided t-test of $H_0: \beta = 0$ against the alternative hypothesis $H_1: \beta > 0$.

If the sample size $n$ is large, so that $\hat{t}_{\hat{\beta}} \sim N(0,1)$ if $\beta = 0$, then an alternative way of testing the null hypothesis $\beta = 0$ against the alternative hypothesis $\beta \neq 0$ is to use the (two-sided) p-value:

$$\hat{p}_{\hat{\beta}} \;(= \text{p-value of } \hat{\beta}) \;\stackrel{\text{def.}}{=}\; P[\,|U| > |\hat{t}_{\hat{\beta}}|\,], \text{ where } U \sim N(0,1). \tag{20}$$

For example, if $\hat{p}_{\hat{\beta}} < 0.05$ we reject the null hypothesis $\beta = 0$ in favor of the alternative hypothesis $\beta \neq 0$ at the 5% significance level, and if $\hat{p}_{\hat{\beta}} \geq 0.05$ we accept the null hypothesis $\beta = 0$. The p-value for $\hat{\alpha}$ is defined and used similarly.

Although a t-value is a test statistic of the null hypothesis that the corresponding coefficient in the regression model is zero, it is quite easy to rebuild the t-value for testing other null hypotheses, as follows. Suppose you want to test the null hypothesis that $\beta = \beta_0$, where $\beta_0$ is a given number, for example $\beta_0 = 1$. Then
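$$\frac{\hat{\beta} - \beta_0}{\hat{\sigma}_{\hat{\beta}}} = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} - \frac{\beta_0}{\hat{\sigma}_{\hat{\beta}}} = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} - \frac{\beta_0}{\hat{\beta}} \cdot \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} = \frac{\hat{\beta} - \beta_0}{\hat{\beta}} \cdot \hat{t}_{\hat{\beta}}, \tag{21}$$

so that by Proposition 6,

$$\hat{t}_{\hat{\beta},\beta=\beta_0} = \frac{\hat{\beta} - \beta_0}{\hat{\beta}} \cdot \hat{t}_{\hat{\beta}} \sim t_{n-2} \text{ if } \beta = \beta_0. \tag{22}$$

For example, suppose that in the ice cream case we want to test the null hypothesis $H_0: \beta = 1$. Then

$$\hat{t}_{\hat{\beta},\beta=1} = \frac{\hat{\beta} - 1}{\hat{\beta}} \cdot \hat{t}_{\hat{\beta}} = \frac{1.5 - 1}{1.5} \times 4.597 = 1.532, \tag{23}$$

which under the null hypothesis $H_0: \beta = 1$ is a random drawing from the t distribution with 6 degrees of freedom. Note that the value of this test statistic is in the acceptance regions in Figures 2 and 3.

This trick is useful if the econometric software you are using only reports the t-values but not the standard errors. If the standard errors are reported, you can compute $\hat{t}_{\hat{\beta},\beta=\beta_0}$ directly as $\hat{t}_{\hat{\beta},\beta=\beta_0} = (\hat{\beta} - \beta_0)/\hat{\sigma}_{\hat{\beta}}$. Of course, if only the standard errors are reported and not the t-values you can compute the t-value of $\hat{\beta}$ as $\hat{t}_{\hat{\beta}} = \hat{\beta}/\hat{\sigma}_{\hat{\beta}}$.

6. The $R^2$

The $R^2$ of a regression model compares the sum of squared residuals (SSR) of the model with the SSR of a regression model without regressors:

$$Y_j = \alpha + U_j, \qquad j = 1,2,\dots,n. \tag{24}$$

It is easy to verify that the OLS estimator $\tilde{\alpha}$ of $\alpha$ in (24) is just the sample mean of the $Y_j$'s:

$$\tilde{\alpha} = \bar{Y} = \frac{1}{n} \sum_{j=1}^{n} Y_j. \tag{25}$$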
Therefore, the SSR of regression model (24) is $\sum_{j=1}^{n} (Y_j - \bar{Y})^2$, which is called the Total Sum of Squares (TSS):

$$TSS = \sum_{j=1}^{n} (Y_j - \bar{Y})^2. \tag{26}$$

The $R^2$ is now defined as:

$$R^2 \;\stackrel{\text{def.}}{=}\; 1 - \frac{SSR}{TSS}. \tag{27}$$

The $R^2$ is always between zero and one, because $SSR \leq TSS$. (Exercise: Why?) If $SSR = TSS$, so that $R^2 = 0$, then model (24) explains the dependent variables $Y_j$ equally well as model (2). In other words, the explanatory variables $X_j$ in (2) do not matter: $\beta = 0$. The other extreme case is where $R^2 = 1$, which corresponds to $SSR = 0$. Then the dependent variable $Y_j$ in model (2) is completely explained by $X_j$, without error: $Y_j \equiv \alpha + \beta X_j$. Thus, the $R^2$ measures how well the explanatory variables $X_j$ are able to explain the corresponding dependent variables $Y_j$.

For example, in the ice cream case, $SSR = 11.5$ and $TSS = 52$, hence $R^2 = 0.7788$. Loosely speaking, this means that about 78% of the variation of ice cream sales can be explained by the variation in temperature.

7. Presenting regression results

When you need to report regression results you should include, next to the OLS estimates of course, either the corresponding t-values or the standard errors, the sample size $n$, the standard error of the residuals (SER), and the $R^2$, because this information will enable the reader to judge your results. For example, our ice cream estimation results should be displayed as either

  Sales = -0.25 + 1.5 Temp.,   n = 8, SER = 1.3844, R^2 = 0.7788
         (-0.100)  (4.597)
  (t-values between brackets)

or
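  Sales = -0.25 + 1.5 Temp.,   n = 8, SER = 1.3844, R^2 = 0.7788
         (2.4958)  (0.3263)
  (standard errors between brackets)

It is helpful to the reader if you indicate whether you have displayed the t-values or the standard errors between brackets, but you only need to mention this once.

8. Out-of-sample forecasting

The linear regression model was introduced as a forecasting scheme. The question we now address is: How reliable is an out-of-sample forecast?

Consider the linear regression model (2), and suppose we observe $X_{n+1}$. Then the forecast of $Y_{n+1}$ is $\hat{Y}_{n+1} = \hat{\alpha} + \hat{\beta} X_{n+1}$, where the OLS estimators $\hat{\alpha}$ and $\hat{\beta}$ are computed on the basis of the observations for $j = 1,2,\dots,n$. The actual but unknown value of $Y_{n+1}$ is $Y_{n+1} = \alpha + \beta X_{n+1} + U_{n+1}$, so that the forecast error is:

$$Y_{n+1} - \hat{Y}_{n+1} = U_{n+1} - (\hat{\alpha} - \alpha) - (\hat{\beta} - \beta) X_{n+1} = U_{n+1} - \sum_{j=1}^{n} \left( \frac{1}{n} + \frac{(X_{n+1} - \bar{X})(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j. \tag{28}$$

See the Appendix for the latter equality. It now follows from Lemma 3 that under Assumptions I through V, $Y_{n+1} - \hat{Y}_{n+1} \sim N[0, \sigma^2_{Y_{n+1} - \hat{Y}_{n+1}}]$, where

$$\sigma^2_{Y_{n+1} - \hat{Y}_{n+1}} = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(X_{n+1} - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right). \tag{29}$$

See the Appendix. Denoting

$$\hat{\sigma}^2_{Y_{n+1} - \hat{Y}_{n+1}} = \hat{\sigma}^2 \left( 1 + \frac{1}{n} + \frac{(X_{n+1} - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right), \tag{30}$$

it follows now similar to Proposition 6 that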
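Proposition 8. Under Assumptions I-V, $(Y_{n+1} - \hat{Y}_{n+1}) / \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \sim t_{n-2}$.

This result can be used to construct a 95% confidence interval, say, of $Y_{n+1}$. Look up in the table of the t distribution the critical value $t_*$ of the two-sided t-test with $n - 2$ degrees of freedom. Then it follows from Proposition 8 that

$$\begin{aligned} 0.95 &= P\left[ -t_* \leq (Y_{n+1} - \hat{Y}_{n+1}) / \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \leq t_* \right] \\ &= P\left[ -t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \leq Y_{n+1} - \hat{Y}_{n+1} \leq t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \right] \\ &= P\left[ \hat{Y}_{n+1} - t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \leq Y_{n+1} \leq \hat{Y}_{n+1} + t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \right]. \end{aligned} \tag{31}$$

Thus, the 95% confidence interval of $Y_{n+1}$ is $[\hat{Y}_{n+1} - t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}},\; \hat{Y}_{n+1} + t_* \hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}}]$. Observe from (30) that $\hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}}$ increases with $(X_{n+1} - \bar{X})^2$, and so does the width of the confidence interval. Thus, the farther $X_{n+1}$ is away from $\bar{X}$, the more unreliable the forecast $\hat{Y}_{n+1}$ of $Y_{n+1}$ becomes. Also observe from (30) that $\hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}} \geq \hat{\sigma}$, and that $\hat{\sigma}_{Y_{n+1} - \hat{Y}_{n+1}}$ gets close to $\hat{\sigma}$ if $n$ is large, because $\lim_{n \to \infty} \sum_{j=1}^{n} (X_j - \bar{X})^2 = \infty$.

9. Relaxing the nonrandom regressor assumption

As said before, the assumption that the regressors $X_j$ are nonrandom is too strong an assumption in economics. Therefore, we now assume that the $X_j$'s are random variables. This requires the following modifications of Assumptions I-IV:

Assumption I*: The pairs $(X_j, Y_j)$, $j = 1,2,3,\dots,n$, are independent and identically distributed.

Assumption II*: The conditional expectations $E[U_j | X_j]$ are equal to zero: $E[U_j | X_j] \equiv 0$.

Assumption III*: The conditional expectations $E[U_j^2 | X_j]$ do not depend on the $X_j$'s and are finite, constant and equal: $E[U_j^2 | X_j] \equiv \sigma^2 < \infty$. (This is called the homoscedasticity assumption.)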
Assumption IV*: Conditional on $X_j$, $U_j$ is $N(0,\sigma^2)$ distributed.

Assumptions I* and II* imply that for $j = 1,\dots,n$,

$$E[U_j | X_1, X_2, \dots, X_n] \equiv 0, \tag{32}$$

and similarly Assumptions I* and III* imply that for $j = 1,\dots,n$,

$$E[U_j^2 | X_1, X_2, \dots, X_n] \equiv \sigma^2. \tag{33}$$

Because (loosely speaking) conditioning on $X_1, X_2, \dots, X_n$ is effectively the same as treating them as given constants, most of the previous propositions carry over:

Proposition 9. Under Assumptions I*-IV*, Propositions 1 and 4 through 7 carry over, and the results in Propositions 2 and 3 now hold conditional on $X_1, X_2, \dots, X_n$.

However, without Assumption IV* we need an additional condition in Proposition 6 in order to use the central limit theorem, namely:

Proposition 10. If the sample size $n$ is large then under Assumptions I*-III* and the additional condition $E[X_j^2] < \infty$ the approximate normality results in Proposition 7 carry over.

Moreover, without Assumption IV* Propositions 6 and 8 are no longer true. As to Proposition 6, this is not a big deal, as in large samples we can still use Proposition 7, but without Assumption IV* we can no longer derive confidence intervals for the forecasts, as these confidence intervals are based on Proposition 8. It is therefore important to test the normality assumption.

10. Testing the normality assumption

For a normal random variable $U$ with zero expectation and variance $\sigma^2$ it can be shown that
$$\text{Kurtosis} \;\stackrel{\text{def.}}{=}\; E[U^4]/\sigma^4 - 3 = 0, \qquad \text{Skewness} \;\stackrel{\text{def.}}{=}\; E[U^3] = 0. \tag{34}$$

Therefore, the normality condition can be tested by testing whether the kurtosis and the skewness of the model errors are zero, using the residuals. This is the idea behind the Jarque-Bera [3] and Kiefer-Salmon [4] tests. Under the null hypothesis (34) the test statistic involved has a $\chi^2_2$ distribution.

[3] Jarque, C.M. and A.K. Bera (1980), "Efficient Tests for Normality, Homoscedasticity and Serial Independence of Regression Residuals", Economics Letters 6, 255-259.
[4] Kiefer, N. and M. Salmon (1983), "Testing Normality in Econometric Models", Economics Letters 11, 123-127.

11. Heteroscedasticity [5]

We say that the errors $U_j$ of regression model (2) are heteroskedastic if Assumption III* does not hold:

$$E[U_j^2 | X_j] = R(X_j) \text{ for some function } R(\cdot). \tag{35}$$

[5] Also spelled as "Heteroskedasticity."

Heteroscedasticity often occurs in practice. It is actually the rule rather than the exception. The main consequence of heteroscedasticity is that the conditional variance formulas in Propositions 2 and 3 no longer hold, although the unbiasedness result in Proposition 1 is not affected by heteroscedasticity. Therefore, Propositions 4-8 are no longer valid either. In particular, the conditional variance of $\hat{\beta}$ [see (60)] under heteroscedasticity takes the form

$$\mathrm{var}(\hat{\beta} | X_1,\dots,X_n) = E[(\hat{\beta} - \beta)^2 | X_1,\dots,X_n] = \frac{\sum_{j=1}^{n} (X_j - \bar{X})^2 R(X_j)}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2}. \tag{36}$$

A cure for the heteroscedasticity problem is to replace the standard error of $\hat{\beta}$ by
$$\tilde{\sigma}_{\hat{\beta}} = \sqrt{ \frac{n}{n-2} \cdot \frac{\sum_{j=1}^{n} (X_j - \bar{X})^2 \hat{U}_j^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} }. \tag{37}$$

This is known as the Heteroscedasticity Consistent (H.C.) standard error. The H.C. t-value then becomes $\tilde{t}_{\hat{\beta}} = \hat{\beta} / \tilde{\sigma}_{\hat{\beta}}$. Under the null hypothesis $\beta = 0$ this t-value is no longer t distributed, but the standard normal approximation remains valid if the sample size $n$ is large.

A popular test for heteroscedasticity is the Breusch-Pagan [6] test. Given that

$$E[U_j^2 | X_j] = g(\gamma_0 + \gamma_1 X_j) \text{ for some unknown function } g(\cdot), \tag{38}$$

the Breusch-Pagan test tests the null hypothesis

$$H_0: \gamma_1 = 0 \;\Leftrightarrow\; E[U_j^2 | X_j] = g(\gamma_0) = \sigma^2, \text{ say}, \tag{39}$$

against the alternative hypothesis

$$H_1: \gamma_1 \neq 0 \;\Leftrightarrow\; E[U_j^2 | X_j] = g(\gamma_0 + \gamma_1 X_j) = R(X_j), \text{ say}. \tag{40}$$

Under the null hypothesis (39) of homoskedasticity the test statistic of the Breusch-Pagan test has a $\chi^2_1$ distribution [7], and the test is conducted right-sided.

[6] Breusch, T. and A. Pagan (1979), "A Simple Test for Heteroscedasticity and Random Coefficient Variation", Econometrica 47, 1287-1294.
[7] In the multiple regression case the degrees of freedom is equal to the number of parameters minus 1 for the intercept.

12. How close are OLS estimators?

The ice cream data in Table 1 is not based on any actual observations on sales and temperature; I have picked the numbers for $X_j$ and $Y_j$ quite arbitrarily. Therefore, there is no way to find out how close the OLS estimates $\hat{\alpha} = -0.25$, $\hat{\beta} = 1.5$ are to the unknown parameters $\alpha$ and $\beta$. Actually, we do not even know whether the linear regression model (2) and its assumptions are applicable to this artificial data.

In order to show how well OLS estimators approximate the corresponding parameters I
have generated random samples [8] $(Y_1,X_1),\dots,(Y_n,X_n)$ for three sample sizes: $n = 10$, $n = 100$ and $n = 1000$, as follows. The explanatory variables $X_j$ have been drawn independently from the $\chi^2_1$ distribution, the regression errors $U_j$ have been drawn independently from the $N(0,1)$ distribution, and the $Y_j$'s have been generated by

$$Y_j = 1 + X_j + U_j, \qquad j = 1,2,\dots,n. \tag{41}$$

Thus, in this case the parameters $\alpha$ and $\beta$ in model (2) are $\alpha = 1$ and $\beta = 1$, and the standard error of $U_j$ is $\sigma = 1$. Moreover, note that Assumptions I*-IV* hold for model (41).

[8] Via the EasyReg International menus File → Choose an input file → Create artificial data. Rather than generating one random sample of size $n = 1000$ and then using subsamples of sizes $n = 10$ and $n = 100$, these samples have been generated separately for $n = 10$, $n = 100$ and $n = 1000$.

The true $R^2$ can be defined by

$$R_0^2 = 1 - \frac{E[SSR]}{E[TSS]} = 1 - \frac{(n-2)\sigma^2}{\sum_{j=1}^{n} E[(Y_j - \bar{Y})^2]}.$$

In the case (41), $\sigma^2 = 1$, $\mu_Y = E(Y_j) = 1 + E(X_j) = 2$,

$$\sum_{j=1}^{n} E[(Y_j - \bar{Y})^2] = E \sum_{j=1}^{n} \left( (Y_j - \mu_Y) - (\bar{Y} - \mu_Y) \right)^2 = E\left[ \sum_{j=1}^{n} (Y_j - \mu_Y)^2 - n(\bar{Y} - \mu_Y)^2 \right] = (n-1)\,\mathrm{var}(Y_j),$$

and

$$\mathrm{var}(Y_j) = E[(X_j - 1 + U_j)^2] = E[(X_j - 1)^2] + E[U_j^2] = E[(X_j - 1)^2] + 1 = 3,$$

because $X_j$ is $\chi^2_1$ distributed and therefore has the same distribution as $U^2$ for a standard normal random variable $U$, and it can be shown that for standard normal random variables $U$, $E[(U^2 - 1)^2] = 2$. Thus, the true $R^2$ in this case is

$$R_0^2 = 1 - \frac{n-2}{3(n-1)} = \frac{2n-1}{3(n-1)} = \begin{cases} 0.7037 & \text{for } n = 10, \\ 0.6700 & \text{for } n = 100, \\ 0.6670 & \text{for } n = 1000. \end{cases}$$

The estimation results involved are given in Table 2:
Table 2: Artificial regression estimation results

  n       row          β̂          α̂          SER (= σ̂)    R^2
  10      estimate:
          (t-value):   (7.87)     (.675)
  100     estimate:
          (t-value):   (2.753)    (8.237)
  1000    estimate:
          (t-value):   (47.24)    (26.037)

Even for a sample size of $n = 10$ the OLS estimator $\hat{\beta}$ is already pretty close to its true value, and the same applies to $\hat{\sigma}$, but $\hat{\alpha}$ is too far away from the true value $\alpha = 1$. However, for $n = 100$ the OLS estimators $\hat{\beta}$ and $\hat{\alpha}$ deviate only about ±4% from their true values $\alpha = \beta = 1$, and $\hat{\sigma}$ deviates about 1% from its true value. In the case $n = 1000$ these deviations reduce to about ±2%.

The $R^2$'s are too high, and only for $n = 1000$ is the $R^2$ reasonably close to its true value. However, the $R^2$ is only a descriptive statistic; it does not play a role in hypothesis testing, so that the unreliability of the $R^2$ in small samples is harmless.

Notice the quite dramatic increase of the t-values. Recall that these t-values are the test statistics of the null hypotheses that the corresponding parameters are zero. Because the true parameters are equal to 1, what you see in Table 2 is the increase of the power of the t-test with the sample size.
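The design (41) can be mimicked with a short Python sketch. It is not the EasyReg procedure used for Table 2, and because it relies on a different random number generator and an arbitrarily chosen seed, it will not reproduce the numbers in Table 2; it only illustrates how the estimates tend to approach $\alpha = \beta = \sigma = 1$ as $n$ grows. The function name `simulate` and the seed are assumptions made for this illustration.

```python
import math
import random


def simulate(n, rng):
    """Draw one sample of size n from design (41) and return
    (alpha_hat, beta_hat, SER, R^2)."""
    # X_j ~ chi-square(1): the square of a standard normal draw
    x = [rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + xj + uj for xj, uj in zip(x, u)]

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xj - x_bar) ** 2 for xj in x)
    sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
    beta_hat = sxy / sxx
    alpha_hat = y_bar - beta_hat * x_bar

    ssr = sum((yj - alpha_hat - beta_hat * xj) ** 2 for xj, yj in zip(x, y))
    tss = sum((yj - y_bar) ** 2 for yj in y)
    return alpha_hat, beta_hat, math.sqrt(ssr / (n - 2)), 1.0 - ssr / tss


rng = random.Random(2024)  # arbitrary seed, chosen only for repeatability
for n in (10, 100, 1000):
    a, b, ser, r2 = simulate(n, rng)
    true_r2 = (2 * n - 1) / (3 * (n - 1))  # true R^2 derived in the text
    print(f"n={n:5d} alpha={a:.3f} beta={b:.3f} SER={ser:.3f} "
          f"R2={r2:.3f} (true R2={true_r2:.4f})")
```

Rerunning with other seeds gives a feel for the sampling variability at each sample size.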
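APPENDIX

Proof of (1): The first-order conditions for a minimum of $Q(\hat{\alpha},\hat{\beta}) = \sum_{j=1}^{n} (Y_j - \hat{\alpha} - \hat{\beta} X_j)^2$ are:

$$\begin{aligned} dQ(\hat{\alpha},\hat{\beta})/d\hat{\alpha} = 0 &\Leftrightarrow \sum_{j=1}^{n} 2(Y_j - \hat{\alpha} - \hat{\beta} X_j)(-1) = 0 \\ &\Leftrightarrow \sum_{j=1}^{n} (Y_j - \hat{\alpha} - \hat{\beta} X_j) = 0 \\ &\Leftrightarrow \sum_{j=1}^{n} Y_j - n\hat{\alpha} - \hat{\beta} \sum_{j=1}^{n} X_j = 0 \\ &\Leftrightarrow \frac{1}{n} \sum_{j=1}^{n} Y_j = \hat{\alpha} + \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} X_j \\ &\Leftrightarrow \bar{Y} = \hat{\alpha} + \hat{\beta}\bar{X}, \end{aligned} \tag{42}$$

and

$$\begin{aligned} dQ(\hat{\alpha},\hat{\beta})/d\hat{\beta} = 0 &\Leftrightarrow \sum_{j=1}^{n} 2(Y_j - \hat{\alpha} - \hat{\beta} X_j)(-X_j) = 0 \\ &\Leftrightarrow \sum_{j=1}^{n} (Y_j X_j - \hat{\alpha} X_j - \hat{\beta} X_j^2) = 0 \\ &\Leftrightarrow \sum_{j=1}^{n} X_j Y_j - \hat{\alpha} \sum_{j=1}^{n} X_j - \hat{\beta} \sum_{j=1}^{n} X_j^2 = 0 \\ &\Leftrightarrow \sum_{j=1}^{n} X_j Y_j = \hat{\alpha} \sum_{j=1}^{n} X_j + \hat{\beta} \sum_{j=1}^{n} X_j^2 \\ &\Leftrightarrow \frac{1}{n} \sum_{j=1}^{n} X_j Y_j = \hat{\alpha}\bar{X} + \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} X_j^2, \end{aligned} \tag{43}$$

where $\bar{X} = (1/n)\sum_{j=1}^{n} X_j$ and $\bar{Y} = (1/n)\sum_{j=1}^{n} Y_j$ are the sample means of the $X_j$'s and $Y_j$'s, respectively. The last equations in (42) and (43) are called the normal equations:

$$\bar{Y} = \hat{\alpha} + \hat{\beta}\bar{X}, \tag{44}$$

$$\frac{1}{n} \sum_{j=1}^{n} X_j Y_j = \hat{\alpha}\bar{X} + \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} X_j^2. \tag{45}$$

To solve these normal equations, substitute $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$ in (45). Then we get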
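$$\frac{1}{n} \sum_{j=1}^{n} X_j Y_j = (\bar{Y} - \hat{\beta}\bar{X})\bar{X} + \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} X_j^2 = \bar{X}\bar{Y} - \hat{\beta}\bar{X}^2 + \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} X_j^2 = \bar{X}\bar{Y} + \hat{\beta} \left( \frac{1}{n} \sum_{j=1}^{n} X_j^2 - \bar{X}^2 \right),$$

hence

$$\frac{1}{n} \sum_{j=1}^{n} X_j Y_j - \bar{X}\bar{Y} = \hat{\beta} \left( \frac{1}{n} \sum_{j=1}^{n} X_j^2 - \bar{X}^2 \right). \tag{46}$$

Equation (46) can also be written as

$$\frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y}) = \hat{\beta} \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})^2, \tag{47}$$

because

$$\begin{aligned} \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y}) &= \frac{1}{n} \sum_{j=1}^{n} X_j Y_j - \bar{X} \frac{1}{n} \sum_{j=1}^{n} Y_j - \bar{Y} \frac{1}{n} \sum_{j=1}^{n} X_j + \bar{X}\bar{Y} \\ &= \frac{1}{n} \sum_{j=1}^{n} X_j Y_j - \bar{X}\bar{Y} - \bar{X}\bar{Y} + \bar{X}\bar{Y} \\ &= \frac{1}{n} \sum_{j=1}^{n} X_j Y_j - \bar{X}\bar{Y}, \end{aligned} \tag{48}$$

and similarly

$$\frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})^2 = \frac{1}{n} \sum_{j=1}^{n} X_j^2 - \bar{X}^2. \tag{49}$$

Moreover,

$$\frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X})(Y_j - \bar{Y}) = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X}) Y_j - \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X}) \bar{Y} = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X}) Y_j - (\bar{X} - \bar{X})\bar{Y} = \frac{1}{n} \sum_{j=1}^{n} (X_j - \bar{X}) Y_j. \tag{50}$$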
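The result (1) now follows from (44) and (46) through (50).

Proof of Proposition 1. Recall from (1) that

$$\hat{\beta} = \frac{\sum_{j=1}^{n} (X_j - \bar{X}) Y_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \tag{51}$$

Substitute model (2) in (51). Then

$$\begin{aligned} \hat{\beta} &= \frac{\sum_{j=1}^{n} (X_j - \bar{X})(\alpha + \beta X_j + U_j)}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \\ &= \frac{\alpha \sum_{j=1}^{n} (X_j - \bar{X}) + \beta \sum_{j=1}^{n} (X_j - \bar{X}) X_j + \sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \\ &= \beta \cdot \frac{\sum_{j=1}^{n} (X_j - \bar{X}) X_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2} + \frac{\sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \\ &= \beta + \frac{\sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2}, \end{aligned} \tag{52}$$

where the third equality uses $\sum_{j=1}^{n} (X_j - \bar{X}) = 0$ and the last step follows from the fact that, similar to (50),

$$\sum_{j=1}^{n} (X_j - \bar{X})^2 = \sum_{j=1}^{n} (X_j - \bar{X})(X_j - \bar{X}) = \sum_{j=1}^{n} (X_j - \bar{X}) X_j. \tag{53}$$

Now take the mathematical expectation at both sides of (52). Then,

$$E[\hat{\beta}] = \beta + E\left[ \frac{\sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right] = \beta + \frac{\sum_{j=1}^{n} (X_j - \bar{X}) E(U_j)}{\sum_{j=1}^{n} (X_j - \bar{X})^2} = \beta, \tag{54}$$

because taking the mathematical expectation of a constant ($\beta$) does not affect that constant, and taking the mathematical expectation of a linear function of random variables is equal to taking the linear function of the mathematical expectation of these random variables. The last conclusion in (54) follows from Assumption II, and the second step in (54) can be taken because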
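we have assumed that the $X_j$'s are nonrandom (Assumption IV).

Next consider $\hat{\alpha}$. We have already established that $\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$. Substituting the right-hand side of (52) for $\hat{\beta}$ in this equation yields

$$\hat{\alpha} = \bar{Y} - \left( \beta + \frac{\sum_{j=1}^{n} (X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right) \bar{X} = \bar{Y} - \beta\bar{X} - \frac{\sum_{j=1}^{n} \bar{X}(X_j - \bar{X}) U_j}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \tag{55}$$

Substituting

$$\bar{Y} = \frac{1}{n} \sum_{j=1}^{n} Y_j = \frac{1}{n} \sum_{j=1}^{n} (\alpha + \beta X_j + U_j) = \alpha + \beta\bar{X} + \frac{1}{n} \sum_{j=1}^{n} U_j$$

in (55) yields

$$\hat{\alpha} = \alpha + \frac{1}{n} \sum_{j=1}^{n} U_j - \frac{\sum_{j=1}^{n} \bar{X}(X_j - \bar{X}) U_j}{\sum_{i=1}^{n} (X_i - \bar{X})^2} = \alpha + \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j. \tag{56}$$

Similarly to the case of $\hat{\beta}$ we therefore have:

$$E[\hat{\alpha}] = \alpha + \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) E[U_j] = \alpha. \tag{57}$$

This completes the proof of Proposition 1.

Proof of Lemma 1: We have

$$E\left[ \left( \sum_{j=1}^{n} v_j U_j \right) \left( \sum_{j=1}^{n} w_j U_j \right) \right] = E\left[ \sum_{i=1}^{n} \sum_{j=1}^{n} v_i w_j U_i U_j \right] = \sum_{i=1}^{n} \sum_{j=1}^{n} v_i w_j E(U_i U_j) = \sigma^2 \sum_{j=1}^{n} v_j w_j, \tag{58}$$

where the last equality in (58) follows from

$$E(U_i U_j) = \begin{cases} E(U_i)E(U_j) = 0 & \text{if } i \neq j, \\ E(U_j^2) = \sigma^2 & \text{if } i = j. \end{cases} \tag{59}$$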
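Proof of Proposition 2: It follows from formula (52) and Lemma 2 that

$$\begin{aligned} \mathrm{var}(\hat{\beta}) = E[(\hat{\beta} - \beta)^2] &= E\left[ \left( \sum_{j=1}^{n} \frac{X_j - \bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} U_j \right)^2 \right] = \sigma^2 \sum_{j=1}^{n} \left( \frac{X_j - \bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right)^2 \\ &= \sigma^2 \frac{\sum_{j=1}^{n} (X_j - \bar{X})^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} = \frac{\sigma^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \end{aligned} \tag{60}$$

Similarly, it follows from formula (56) and Lemma 2 that

$$\begin{aligned} \mathrm{var}(\hat{\alpha}) = E[(\hat{\alpha} - \alpha)^2] &= E\left[ \left( \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j \right)^2 \right] = \sigma^2 \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right)^2 \\ &= \sigma^2 \sum_{j=1}^{n} \left( \frac{1}{n^2} - \frac{2}{n} \cdot \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} + \frac{\bar{X}^2 (X_j - \bar{X})^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} \right) \\ &= \sigma^2 \left( \frac{1}{n} - \frac{2\bar{X}(1/n)\sum_{j=1}^{n} (X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} + \frac{\bar{X}^2 \sum_{j=1}^{n} (X_j - \bar{X})^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} \right) \\ &= \sigma^2 \left( \frac{1}{n} + \frac{\bar{X}^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right) = \sigma^2 \frac{(1/n)\sum_{j=1}^{n} (X_j - \bar{X})^2 + \bar{X}^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} = \frac{\sigma^2 (1/n) \sum_{j=1}^{n} X_j^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}, \end{aligned} \tag{61}$$

where the last equality follows from the fact that $(1/n)\sum_{j=1}^{n} (X_j - \bar{X})^2 = (1/n)\sum_{j=1}^{n} X_j^2 - \bar{X}^2$. Finally, it follows from Lemma 1 and the formulas (52) and (56) that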
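$$\begin{aligned} \mathrm{cov}(\hat{\alpha},\hat{\beta}) = E[(\hat{\alpha} - \alpha)(\hat{\beta} - \beta)] &= E\left[ \left( \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j \right) \left( \sum_{j=1}^{n} \frac{X_j - \bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2} U_j \right) \right] \\ &= \sigma^2 \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) \frac{X_j - \bar{X}}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \end{aligned} \tag{62}$$

which can be rewritten as

$$\mathrm{cov}(\hat{\alpha},\hat{\beta}) = \sigma^2 \left( \frac{(1/n)\sum_{j=1}^{n} (X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} - \frac{\bar{X} \sum_{j=1}^{n} (X_j - \bar{X})^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} \right) = \frac{-\sigma^2 \bar{X}}{\sum_{j=1}^{n} (X_j - \bar{X})^2}. \tag{63}$$

Proof of Proposition 5. Observe first from (44) and (9) that

$$\frac{1}{n} \sum_{j=1}^{n} \hat{U}_j = \bar{Y} - \hat{\alpha} - \hat{\beta}\bar{X} = 0, \tag{64}$$

so that we can write

$$\hat{U}_j = \hat{U}_j - \frac{1}{n} \sum_{i=1}^{n} \hat{U}_i = (Y_j - \bar{Y}) - \hat{\beta}(X_j - \bar{X}). \tag{65}$$

Next, observe from (2) that

$$Y_j - \bar{Y} = U_j - \bar{U} + \beta(X_j - \bar{X}), \text{ where } \bar{U} = \frac{1}{n} \sum_{j=1}^{n} U_j.$$

Substituting the former equation in (65) yields

$$\hat{U}_j = (U_j - \bar{U}) - (\hat{\beta} - \beta)(X_j - \bar{X}), \tag{66}$$

hence

$$\begin{aligned} \sum_{j=1}^{n} \hat{U}_j^2 &= \sum_{j=1}^{n} \left( (U_j - \bar{U}) - (\hat{\beta} - \beta)(X_j - \bar{X}) \right)^2 \\ &= \sum_{j=1}^{n} (U_j - \bar{U})^2 - 2(\hat{\beta} - \beta) \sum_{j=1}^{n} (X_j - \bar{X})(U_j - \bar{U}) + (\hat{\beta} - \beta)^2 \sum_{j=1}^{n} (X_j - \bar{X})^2 \\ &= \sum_{j=1}^{n} (U_j - \bar{U})^2 - 2(\hat{\beta} - \beta) \sum_{j=1}^{n} (X_j - \bar{X}) U_j + (\hat{\beta} - \beta)^2 \sum_{j=1}^{n} (X_j - \bar{X})^2, \end{aligned} \tag{67}$$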
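where the last equality follows from the fact that $\sum_{j=1}^{n} (X_j - \bar{X})\bar{U} = 0$. It follows from (52), (67) and the equality $\sum_{j=1}^{n} (U_j - \bar{U})^2 = \sum_{j=1}^{n} U_j^2 - n\bar{U}^2$ that

$$\begin{aligned} \sum_{j=1}^{n} \hat{U}_j^2 &= \sum_{j=1}^{n} (U_j - \bar{U})^2 - (\hat{\beta} - \beta)^2 \sum_{j=1}^{n} (X_j - \bar{X})^2 \\ &= \sum_{j=1}^{n} U_j^2 - n\bar{U}^2 - (\hat{\beta} - \beta)^2 \sum_{j=1}^{n} (X_j - \bar{X})^2 \\ &= \sum_{j=1}^{n} U_j^2 - \frac{1}{n} \left( \sum_{i=1}^{n} U_i \right)^2 - (\hat{\beta} - \beta)^2 \sum_{j=1}^{n} (X_j - \bar{X})^2. \end{aligned} \tag{68}$$

Taking expectations and using Lemma 2 and Proposition 2 it now follows from (68) that

$$E\left[ \sum_{j=1}^{n} \hat{U}_j^2 \right] = \sum_{j=1}^{n} E[U_j^2] - \frac{1}{n} E\left[ \left( \sum_{i=1}^{n} U_i \right)^2 \right] - E[(\hat{\beta} - \beta)^2] \sum_{j=1}^{n} (X_j - \bar{X})^2 = n\sigma^2 - \sigma^2 - \sigma^2 = (n-2)\sigma^2. \tag{69}$$

Proof of (13):

$$\begin{aligned} SSR = \sum_{j=1}^{n} \hat{U}_j^2 &= \sum_{j=1}^{n} (Y_j - \hat{\alpha} - \hat{\beta} X_j)^2 = \sum_{j=1}^{n} \left( Y_j - (\bar{Y} - \hat{\beta}\bar{X}) - \hat{\beta} X_j \right)^2 \\ &= \sum_{j=1}^{n} \left( (Y_j - \bar{Y}) - \hat{\beta}(X_j - \bar{X}) \right)^2 \\ &= \sum_{j=1}^{n} (Y_j - \bar{Y})^2 - 2\hat{\beta} \sum_{j=1}^{n} (Y_j - \bar{Y})(X_j - \bar{X}) + \hat{\beta}^2 \sum_{j=1}^{n} (X_j - \bar{X})^2 \\ &= \sum_{j=1}^{n} (Y_j - \bar{Y})^2 - \hat{\beta}^2 \sum_{j=1}^{n} (X_j - \bar{X})^2, \end{aligned} \tag{70}$$

where the last equality follows from (1), because $\sum_{j=1}^{n} (Y_j - \bar{Y})(X_j - \bar{X}) = \hat{\beta} \sum_{j=1}^{n} (X_j - \bar{X})^2$.

Proof of (28): It follows from (3) that

$$\begin{aligned} Y_{n+1} - \hat{Y}_{n+1} &= U_{n+1} - (\hat{\alpha} - \alpha) - (\hat{\beta} - \beta) X_{n+1} \\ &= U_{n+1} - \sum_{j=1}^{n} \left( \frac{1}{n} - \frac{\bar{X}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j - \sum_{j=1}^{n} \frac{X_{n+1}(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} U_j \\ &= U_{n+1} - \sum_{j=1}^{n} \left( \frac{1}{n} + \frac{(X_{n+1} - \bar{X})(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right) U_j. \end{aligned} \tag{71}$$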
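Proof of (29): It follows from (28) and Lemma 3 that

$$\begin{aligned} \sigma^2_{Y_{n+1} - \hat{Y}_{n+1}} &= \sigma^2 + \sum_{j=1}^{n} \left( \frac{1}{n} + \frac{(X_{n+1} - \bar{X})(X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right)^2 \sigma^2 \\ &= \sigma^2 \left( 1 + \frac{1}{n} + \frac{2}{n} \cdot \frac{(X_{n+1} - \bar{X}) \sum_{j=1}^{n} (X_j - \bar{X})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} + \frac{(X_{n+1} - \bar{X})^2 \sum_{j=1}^{n} (X_j - \bar{X})^2}{\left( \sum_{i=1}^{n} (X_i - \bar{X})^2 \right)^2} \right) \\ &= \sigma^2 \left( 1 + \frac{1}{n} + \frac{(X_{n+1} - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2} \right), \end{aligned} \tag{72}$$

where the middle term vanishes because $\sum_{j=1}^{n} (X_j - \bar{X}) = 0$.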
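To see the variance formula (29) at work, with $\sigma^2$ replaced by $\hat{\sigma}^2$ as in (30), the sketch below computes a 95% forecast interval for the ice cream example at a forecast temperature of 75 degrees ($X_{n+1} = 7.5$). The critical value 2.447 is the t-table value for 6 degrees of freedom quoted in Section 5; it is hard-coded here rather than computed. Variable names are illustrative.

```python
import math

# Ice cream data from Table 1
x = [5, 7, 6, 8, 10, 9, 7, 8]
y = [8, 10, 8, 13, 15, 14, 11, 9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xj - x_bar) ** 2 for xj in x)
sxy = sum((xj - x_bar) * (yj - y_bar) for xj, yj in zip(x, y))
beta_hat = sxy / sxx
alpha_hat = y_bar - beta_hat * x_bar
ssr = sum((yj - alpha_hat - beta_hat * xj) ** 2 for xj, yj in zip(x, y))
sigma2_hat = ssr / (n - 2)

x_new = 7.5     # forecast temperature of 75 degrees, in units of 10 degrees
t_crit = 2.447  # two-sided 5% critical value of t with 6 d.f., from the text

forecast = alpha_hat + beta_hat * x_new

# Estimated forecast error variance, formula (30)
var_forecast = sigma2_hat * (1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
se_forecast = math.sqrt(var_forecast)

lower = forecast - t_crit * se_forecast
upper = forecast + t_crit * se_forecast
print(forecast, se_forecast)
print(lower, upper)  # interval in units of $100
```

Because the forecast temperature here equals the sample mean of the temperatures, the last term in (30) is zero, so this is the narrowest interval the formula can give for these data; moving `x_new` away from 7.5 widens it.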
More information1 Correlation and Regression Analysis
1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio
More informationChapter 6: Variance, the law of large numbers and the MonteCarlo method
Chapter 6: Variace, the law of large umbers ad the MoteCarlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value
More information1. C. The formula for the confidence interval for a population mean is: x t, which was
s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : pvalue
More informationChapter 7 Methods of Finding Estimators
Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of
More informationConfidence Intervals for One Mean
Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a
More informationOutput Analysis (2, Chapters 10 &11 Law)
B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should
More informationPSYCHOLOGICAL STATISTICS
UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics
More informationInference on Proportion. Chapter 8 Tests of Statistical Hypotheses. Sampling Distribution of Sample Proportion. Confidence Interval
Chapter 8 Tests of Statistical Hypotheses 8. Tests about Proportios HT  Iferece o Proportio Parameter: Populatio Proportio p (or π) (Percetage of people has o health isurace) x Statistic: Sample Proportio
More informationOnesample test of proportions
Oesample test of proportios The Settig: Idividuals i some populatio ca be classified ito oe of two categories. You wat to make iferece about the proportio i each category, so you draw a sample. Examples:
More informationLesson 17 Pearson s Correlation Coefficient
Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) types of data scatter plots measure of directio measure of stregth Computatio covariatio of X ad Y uique variatio i X ad Y measurig
More informationConfidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.
Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).
More informationCenter, Spread, and Shape in Inference: Claims, Caveats, and Insights
Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the
More informationSoving Recurrence Relations
Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree
More informationTHE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n
We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample
More informationNonlife insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring
Nolife isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy
More informationUniversity of California, Los Angeles Department of Statistics. Distributions related to the normal distribution
Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chisquare (χ ) distributio.
More informationMaximum Likelihood Estimators.
Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio
More information1 Computing the Standard Deviation of Sample Means
Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.
More informationA Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as:
A Test of Normality Textbook Referece: Chapter. (eighth editio, pages 59 ; seveth editio, pages 6 6). The calculatio of p values for hypothesis testig typically is based o the assumptio that the populatio
More informationNormal Distribution.
Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued
More informationNow here is the important step
LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"
More informationOverview of some probability distributions.
Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability
More informationStatistical inference: example 1. Inferential Statistics
Statistical iferece: example 1 Iferetial Statistics POPULATION SAMPLE A clothig store chai regularly buys from a supplier large quatities of a certai piece of clothig. Each item ca be classified either
More informationPractice Problems for Test 3
Practice Problems for Test 3 Note: these problems oly cover CIs ad hypothesis testig You are also resposible for kowig the samplig distributio of the sample meas, ad the Cetral Limit Theorem Review all
More informationChapter 7: Confidence Interval and Sample Size
Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum
More informationIn nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008
I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces
More informationIncremental calculation of weighted mean and variance
Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically
More informationPROCEEDINGS OF THE YEREVAN STATE UNIVERSITY AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM
PROCEEDINGS OF THE YEREVAN STATE UNIVERSITY Physical ad Mathematical Scieces 2015, 1, p. 15 19 M a t h e m a t i c s AN ALTERNATIVE MODEL FOR BONUSMALUS SYSTEM A. G. GULYAN Chair of Actuarial Mathematics
More informationDetermining the sample size
Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors
More informationOverview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals
Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of
More information15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011
15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes highdefiitio
More informationCS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations
CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad
More informationChapter 14 Nonparametric Statistics
Chapter 14 Noparametric Statistics A.K.A. distributiofree statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they
More informationWeek 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable
Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5
More informationApproximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find
1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.
More informationThe Stable Marriage Problem
The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,
More informationBASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)
BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet
More informationSampling Distribution And Central Limit Theorem
() Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,
More informationUC Berkeley Department of Electrical Engineering and Computer Science. EE 126: Probablity and Random Processes. Solutions 9 Spring 2006
Exam format UC Bereley Departmet of Electrical Egieerig ad Computer Sciece EE 6: Probablity ad Radom Processes Solutios 9 Sprig 006 The secod midterm will be held o Wedesday May 7; CHECK the fial exam
More informationThe following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles
The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio
More informationA probabilistic proof of a binomial identity
A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two
More informationAnnuities Under Random Rates of Interest II By Abraham Zaks. Technion I.I.T. Haifa ISRAEL and Haifa University Haifa ISRAEL.
Auities Uder Radom Rates of Iterest II By Abraham Zas Techio I.I.T. Haifa ISRAEL ad Haifa Uiversity Haifa ISRAEL Departmet of Mathematics, Techio  Israel Istitute of Techology, 3000, Haifa, Israel I memory
More informationMeasures of Spread and Boxplots Discrete Math, Section 9.4
Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,
More informationLECTURE 13: Crossvalidation
LECTURE 3: Crossvalidatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Threeway data partitioi Itroductio to Patter Aalysis Ricardo GutierrezOsua Texas A&M
More information, a Wishart distribution with n 1 degrees of freedom and scale matrix.
UMEÅ UNIVERSITET Matematiskstatistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 00409 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that
More informationMath C067 Sampling Distributions
Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters
More informationChapter 5: Inner Product Spaces
Chapter 5: Ier Product Spaces Chapter 5: Ier Product Spaces SECION A Itroductio to Ier Product Spaces By the ed of this sectio you will be able to uderstad what is meat by a ier product space give examples
More informationOur aim is to show that under reasonable assumptions a given 2πperiodic function f can be represented as convergent series
8 Fourier Series Our aim is to show that uder reasoable assumptios a give periodic fuctio f ca be represeted as coverget series f(x) = a + (a cos x + b si x). (8.) By defiitio, the covergece of the series
More informationA Mathematical Perspective on Gambling
A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal
More informationLecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)
18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the BruMikowski iequality for boxes. Today we ll go over the
More informationSequences and Series
CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their
More informationDepartment of Computer Science, University of Otago
Departmet of Computer Sciece, Uiversity of Otago Techical Report OUCS200609 Permutatios Cotaiig May Patters Authors: M.H. Albert Departmet of Computer Sciece, Uiversity of Otago Micah Colema, Rya Fly
More informationDiscrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13
EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may
More informationDefinition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean
1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.
More informationChapter 5: Basic Linear Regression
Chapter 5: Basic Liear Regressio 1. Why Regressio Aalysis Has Domiated Ecoometrics By ow we have focused o formig estimates ad tests for fairly simple cases ivolvig oly oe variable at a time. But the core
More informationConfidence intervals and hypothesis tests
Chapter 2 Cofidece itervals ad hypothesis tests This chapter focuses o how to draw coclusios about populatios from sample data. We ll start by lookig at biary data (e.g., pollig), ad lear how to estimate
More informationCentral Limit Theorem and Its Applications to Baseball
Cetral Limit Theorem ad Its Applicatios to Baseball by Nicole Aderso A project submitted to the Departmet of Mathematical Scieces i coformity with the requiremets for Math 4301 (Hoours Semiar) Lakehead
More informationTHE ROLE OF EXPORTS IN ECONOMIC GROWTH WITH REFERENCE TO ETHIOPIAN COUNTRY
 THE ROLE OF EXPORTS IN ECONOMIC GROWTH WITH REFERENCE TO ETHIOPIAN COUNTRY BY: FAYE ENSERMU CHEMEDA EthioItalia Cooperatio ArsiBale Rural developmet Project Paper Prepared for the Coferece o Aual Meetig
More informationGCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.
GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea  add up all
More informationCHAPTER 7: Central Limit Theorem: CLT for Averages (Means)
CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:
More informationLesson 15 ANOVA (analysis of variance)
Outlie Variability betwee group variability withi group variability total variability Fratio Computatio sums of squares (betwee/withi/total degrees of freedom (betwee/withi/total mea square (betwee/withi
More informationInfinite Sequences and Series
CHAPTER 4 Ifiite Sequeces ad Series 4.1. Sequeces A sequece is a ifiite ordered list of umbers, for example the sequece of odd positive itegers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29...
More informationQuadrat Sampling in Population Ecology
Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may
More informationAsymptotic Growth of Functions
CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll
More informationA Recursive Formula for Moments of a Binomial Distribution
A Recursive Formula for Momets of a Biomial Distributio Árpád Béyi beyi@mathumassedu, Uiversity of Massachusetts, Amherst, MA 01003 ad Saverio M Maago smmaago@psavymil Naval Postgraduate School, Moterey,
More informationMEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)
MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:
More informationWHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER?
WHEN IS THE (CO)SINE OF A RATIONAL ANGLE EQUAL TO A RATIONAL NUMBER? JÖRG JAHNEL 1. My Motivatio Some Sort of a Itroductio Last term I tought Topological Groups at the Göttige Georg August Uiversity. This
More informationCHAPTER 3 THE TIME VALUE OF MONEY
CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all
More informationMARTINGALES AND A BASIC APPLICATION
MARTINGALES AND A BASIC APPLICATION TURNER SMITH Abstract. This paper will develop the measuretheoretic approach to probability i order to preset the defiitio of martigales. From there we will apply this
More informationConfidence Intervals
Cofidece Itervals Cofidece Itervals are a extesio of the cocept of Margi of Error which we met earlier i this course. Remember we saw: The sample proportio will differ from the populatio proportio by more
More informationFactoring x n 1: cyclotomic and Aurifeuillian polynomials Paul Garrett <garrett@math.umn.edu>
(March 16, 004) Factorig x 1: cyclotomic ad Aurifeuillia polyomials Paul Garrett Polyomials of the form x 1, x 3 1, x 4 1 have at least oe systematic factorizatio x 1 = (x 1)(x 1
More informationSystems Design Project: Indoor Location of Wireless Devices
Systems Desig Project: Idoor Locatio of Wireless Devices Prepared By: Bria Murphy Seior Systems Sciece ad Egieerig Washigto Uiversity i St. Louis Phoe: (805) 6985295 Email: bcm1@cec.wustl.edu Supervised
More informationTHE ABRACADABRA PROBLEM
THE ABRACADABRA PROBLEM FRANCESCO CARAVENNA Abstract. We preset a detailed solutio of Exercise E0.6 i [Wil9]: i a radom sequece of letters, draw idepedetly ad uiformly from the Eglish alphabet, the expected
More informationUniversal coding for classes of sources
Coexios module: m46228 Uiversal codig for classes of sources Dever Greee This work is produced by The Coexios Project ad licesed uder the Creative Commos Attributio Licese We have discussed several parametric
More informationSection 11.3: The Integral Test
Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult
More informationSubject CT5 Contingencies Core Technical Syllabus
Subject CT5 Cotigecies Core Techical Syllabus for the 2015 exams 1 Jue 2014 Aim The aim of the Cotigecies subject is to provide a groudig i the mathematical techiques which ca be used to model ad value
More informationThe analysis of the Cournot oligopoly model considering the subjective motive in the strategy selection
The aalysis of the Courot oligopoly model cosiderig the subjective motive i the strategy selectio Shigehito Furuyama Teruhisa Nakai Departmet of Systems Maagemet Egieerig Faculty of Egieerig Kasai Uiversity
More informationMultiserver Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu
Multiserver Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio coectio
More informationLecture 4: Cauchy sequences, BolzanoWeierstrass, and the Squeeze theorem
Lecture 4: Cauchy sequeces, BolzaoWeierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits
More informationTheorems About Power Series
Physics 6A Witer 20 Theorems About Power Series Cosider a power series, f(x) = a x, () where the a are real coefficiets ad x is a real variable. There exists a real oegative umber R, called the radius
More informationHypergeometric Distributions
7.4 Hypergeometric Distributios Whe choosig the startig lieup for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you
More informationUnit 8: Inference for Proportions. Chapters 8 & 9 in IPS
Uit 8: Iferece for Proortios Chaters 8 & 9 i IPS Lecture Outlie Iferece for a Proortio (oe samle) Iferece for Two Proortios (two samles) Cotigecy Tables ad the χ test Iferece for Proortios IPS, Chater
More informationHere are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.
This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio
More informationA modified KolmogorovSmirnov test for normality
MPRA Muich Persoal RePEc Archive A modified KolmogorovSmirov test for ormality Zvi Drezer ad Ofir Turel ad Dawit Zerom Califoria State UiversityFullerto 22. October 2008 Olie at http://mpra.ub.uimueche.de/14385/
More informationPlugin martingales for testing exchangeability online
Plugi martigales for testig exchageability olie Valetia Fedorova, Alex Gammerma, Ilia Nouretdiov, ad Vladimir Vovk Computer Learig Research Cetre Royal Holloway, Uiversity of Lodo, UK {valetia,ilia,alex,vovk}@cs.rhul.ac.uk
More informationLecture 5: Span, linear independence, bases, and dimension
Lecture 5: Spa, liear idepedece, bases, ad dimesio Travis Schedler Thurs, Sep 23, 2010 (versio: 9/21 9:55 PM) 1 Motivatio Motivatio To uderstad what it meas that R has dimesio oe, R 2 dimesio 2, etc.;
More informationEkkehart Schlicht: Economic Surplus and Derived Demand
Ekkehart Schlicht: Ecoomic Surplus ad Derived Demad Muich Discussio Paper No. 200617 Departmet of Ecoomics Uiversity of Muich Volkswirtschaftliche Fakultät LudwigMaximiliasUiversität Müche Olie at http://epub.ub.uimueche.de/940/
More informationTrading the randomness  Designing an optimal trading strategy under a drifted random walk price model
Tradig the radomess  Desigig a optimal tradig strategy uder a drifted radom walk price model Yuao Wu Math 20 Project Paper Professor Zachary Hamaker Abstract: I this paper the author iteds to explore
More information*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.
Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.
More informationSAMPLE QUESTIONS FOR FINAL EXAM. (1) (2) (3) (4) Find the following using the definition of the Riemann integral: (2x + 1)dx
SAMPLE QUESTIONS FOR FINAL EXAM REAL ANALYSIS I FALL 006 3 4 Fid the followig usig the defiitio of the Riema itegral: a 0 x + dx 3 Cosider the partitio P x 0 3, x 3 +, x 3 +,......, x 3 3 + 3 of the iterval
More informationTrigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is
0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values
More informationParametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal)
6 Parametric (theoretical) probability distributios. (Wilks, Ch. 4) Note: parametric: assume a theoretical distributio (e.g., Gauss) Noparametric: o assumptio made about the distributio Advatages of assumig
More information