CHAPTER 14 MORE ABOUT REGRESSION


We learned in Chapter 5 that often a straight line describes the pattern of a relationship between two quantitative variables. For instance, in Example 5.1 we explored the relationship between the handspans (cm) and heights (inches) of 167 college students, and found that the pattern of the relationship in this sample could be described by the equation

Average handspan = -3 + 0.35 Height

An equation like the one relating handspan to height is called a regression equation, and the term simple regression is sometimes used to describe the analysis of a straight-line relationship (linear relationship) between a response variable (y-variable) and an explanatory variable (x-variable).

In Chapter 5, we only used regression methods to describe a sample and did not make statistical inferences about the larger population. Now, we consider how to make inferences about a relationship in the population represented by the sample. Some questions involving the population that we might ask when analyzing a relationship are:

1. Does the observed relationship also occur in the population? For example, is the observed relationship between handspan and height strong enough to conclude that the relationship also holds in the population?
2. For a linear relationship, what is the slope of the regression line in the population? For example, in the larger population, what is the slope of the regression line that connects handspans to heights?
3. What is the mean value of the response variable (y) for individuals with a specific value of the explanatory variable (x)? For example, what is the mean handspan in a population of people 65 inches tall?
4. What interval of values predicts the value of the response variable (y) for an individual with a specific value of the explanatory variable (x)? For example, what interval predicts the handspan of an individual 65 inches tall?

14.1 Sample and Population Regression Models

A regression model describes the relationship between a quantitative response variable (the y-variable) and one or more explanatory variables (x-variables).
The y-variable is sometimes called the dependent variable, and because regression models may be used to make predictions, the x-variables may be called the predictor variables. The labels response variable and explanatory variable may be used for the variables on the y-axis and x-axis, respectively, even if there is not an obvious way to assign these labels in the usual sense.

Any regression model has two important components. The most obvious component is the equation that describes how the mean value of the y-variable is connected to specific values of the x-variable. The equation stated before for the connection between handspan and height, Average handspan = -3 + 0.35 Height, is an example. In this chapter, we focus on linear relationships so a straight-line equation will be used, but it is important to note that some relationships are curvilinear.

The second component of a regression model describes how individuals vary from the regression line. Figure 14.1, which is identical to Figure 5.6, displays the raw data for the sample of n = 167 handspans and heights along with the regression line that estimates how the mean handspan is connected to specific heights. Notice that most individuals vary from the line. When
we examine sample data, we will find it useful to estimate the general size of the deviations from the line. When we consider a model for the relationship within the population represented by a sample, we will state assumptions about the distribution of deviations from the line.

If the sample represents a larger population, we need to distinguish between the regression line for the sample and the regression line for the population. The observed data can be used to determine the regression line for the sample, but the regression line for the population can only be imagined. Because we do not observe the whole population, we will not know numerical values for the intercept and slope of the regression line in the population. As in nearly every statistical problem, the statistics from a sample are used to estimate the unknown population parameters, which in this case are the slope and intercept of the regression line.

Figure 14.1 Regression Line Linking Handspan and Height for a Sample of College Students

The Regression Line for the Sample

In Chapter 5, we introduced this notation for the regression line that describes sample data: $\hat{y} = b_0 + b_1 x$. In any given situation, the sample is used to determine values for $b_0$ and $b_1$.

- $\hat{y}$ is spoken as "y-hat" and it is also referred to either as predicted y or estimated y.
- $b_0$ is the intercept of the straight line. The intercept is the value of $\hat{y}$ when x = 0.
- $b_1$ is the slope of the straight line. The slope tells us how much of an increase (or decrease) there is for $\hat{y}$ when the x-variable increases by one unit. The sign of the slope tells us whether $\hat{y}$ increases or decreases when x increases. If the slope is 0, there is no linear relationship between x and y because $\hat{y}$ is the same for all values of x.

The equation describing the relationship between handspan and height for the sample of college students can be written as $\hat{y} = -3 + 0.35x$. In this equation:

- $\hat{y}$ estimates the average handspan for any specific height x. If height = 70 inches, for instance, $\hat{y} = -3 + 0.35(70) = 21.5$ cm.
- The intercept is $b_0 = -3$. While necessary for the line, this value does not have a useful statistical interpretation in this example. It estimates the average handspan for individuals who have height = 0 inches, an impossible height far from the range of the observed heights. It also is an impossible handspan.
- The slope is $b_1 = 0.35$. This value tells us that the average increase in handspan is 0.35 centimeters for every one-inch increase in height.

Reminder: The Least-Squares Criterion

In Chapter 5, we described the least-squares criterion. This mathematical criterion is used to determine numerical values of the intercept and slope of a sample regression line. The least-squares line is the line, among all possible lines, that has the smallest sum of squared differences between the sample values of y and the corresponding values of $\hat{y}$.

Deviations from the Regression Line in the Sample

The terms random error, residual variation, and residual error all are used as synonyms for the term deviation. Most commonly, the word residual is used to describe the deviation of an observed y-value from the sample regression line. A residual is easy to compute. It simply is the difference between the observed y-value for an individual and the value of $\hat{y}$ determined from the x-value for that individual.

Example 1. Residuals in the Handspan and Height Regression

Consider a person 70 inches tall whose handspan is 23 centimeters. The sample regression line is $\hat{y} = -3 + 0.35x$, so $\hat{y} = -3 + 0.35(70) = 21.5$ cm for this person. The residual = observed y - predicted y = $y - \hat{y} = 23 - 21.5 = 1.5$ cm. Figure 14.2 illustrates this residual.

For an observation $y_i$ in the sample, the residual is $e_i = y_i - \hat{y}_i$.
$y_i$ = the value of the response variable for the observation.
$\hat{y}_i = b_0 + b_1 x_i$, where $x_i$ is the value of the explanatory variable for the observation.

Technical Note: The sum of the residuals is 0 for any least-squares regression line. The least-squares formulas for determining the equation always result in $\sum y_i = \sum \hat{y}_i$, so $\sum e_i = 0$.
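The residual calculation in Example 1 and the Technical Note can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, and the five (height, handspan) pairs used to check the sum-to-zero property are hypothetical values, not data from the 167 students.

```python
from statistics import mean

def predict(x, b0=-3.0, b1=0.35):
    """Predicted handspan (cm) from the sample line y-hat = b0 + b1 * x."""
    return b0 + b1 * x

def residual(x, y, b0=-3.0, b1=0.35):
    """Residual e = observed y minus predicted y-hat."""
    return y - predict(x, b0, b1)

def least_squares(xs, ys):
    """Intercept and slope of the least-squares line for paired data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Example 1: a person 70 inches tall with a 23 cm handspan.
print(round(predict(70), 2), round(residual(70, 23), 2))

# Hypothetical data (assumption): residuals from a least-squares fit sum to 0.
heights = [62, 65, 68, 70, 73]
spans = [18.5, 20.0, 21.0, 23.0, 22.5]
b0, b1 = least_squares(heights, spans)
residual_sum = sum(residual(x, y, b0, b1) for x, y in zip(heights, spans))
print(abs(residual_sum) < 1e-9)
```

The first print reproduces the 21.5 cm prediction and the 1.5 cm residual from the example; the second confirms, up to floating-point rounding, that the residuals of a fitted least-squares line cancel out.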
Figure 14.2 Residual for a person 70 inches tall with a handspan of 23 centimeters. The residual is the difference between observed y = 23 and $\hat{y}$ = 21.5, the predicted value for a person 70 inches tall.

The Regression Line for the Population

The regression equation for a simple linear relationship in a population can be written as:

$E(Y_i) = \beta_0 + \beta_1 x_i$

- $E(Y_i)$ represents the mean or expected value of y for individuals in the population who all have the same particular value of x. Note that $\hat{y}_i$ is an estimate of $E(Y_i)$.
- $\beta_0$ is the intercept of the straight line in the population.
- $\beta_1$ is the slope of the line in the population. Note that if the slope $\beta_1 = 0$, there is no linear relationship in the population.

Unless we measure the entire population, we cannot know the numerical values of $\beta_0$ and $\beta_1$. These are population parameters that we estimate using the corresponding sample statistics. In the handspan and height example, $b_1 = 0.35$ is a sample statistic that estimates the population parameter $\beta_1$, and $b_0 = -3$ is a sample statistic that estimates the population parameter $\beta_0$.

Deviations from the Regression Line in the Population

To make statistical inferences about the population, two assumptions about how the y-values vary from the population regression line are necessary. First, we assume that the general size of the deviation of y-values from the line is the same for all values of the explanatory variable (x), an assumption called the constant variance assumption. This assumption may or may not be correct in any particular situation, and a scatter plot should be examined to see if it is reasonable or not. In Figure 14.1, the constant variance assumption looks reasonable because the magnitude of the deviation from the line appears to be about the same across the range of observed heights.

The second assumption about the population is that for any specific value of x, the distribution of y-values is a normal distribution. Equivalently, this assumption is that deviations from the population regression line have a normal curve distribution.
Figure 14.3 illustrates this assumption along with the other elements of the population regression model for a linear
relationship. The line $E(Y_i) = \beta_0 + \beta_1 x_i$ describes the mean of y, and the normal curves describe deviations from the mean.

Figure 14.3 Regression Model for Population

Summary of the Simple Regression Model

A useful format for expressing the components of the population regression model is

Y = MEAN + DEVIATION.

This conceptual equation states that for any individual, the value of the response variable (y) can be constructed by combining two components:

- The MEAN, which in the population is the line $E(Y_i) = \beta_0 + \beta_1 x_i$ if the relationship is linear. There are other possible relationships, such as curvilinear, a special case of which is a quadratic relationship, $E(Y_i) = \beta_0 + \beta_1 x_i + \beta_2 x_i^2$. Relationships that are not linear will not be discussed in this book.
- The individual's DEVIATION = y - MEAN, which is what is left unexplained after accounting for the mean y-value at that individual's x-value.

This format also applies to the sample, although technically we should use the term "estimated mean" when referring to the sample regression line.

Example 1 Continued. MEAN and DEVIATION for Height and Handspan Regression

Recall that the sample regression line for handspans and heights is $\hat{y} = -3 + 0.35x$. Although it is not likely to be true, let's assume for convenience that this equation also holds in the population. If your height is x = 70 inches and your handspan is y = 23 cm, then:

MEAN = -3 + 0.35(70) = 21.5,
DEVIATION = y - MEAN = 23 - 21.5 = 1.5, and
y = 23 = MEAN + DEVIATION = 21.5 + 1.5.

In other words, your handspan is 1.5 cm above the mean for people with your height.
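To make the Y = MEAN + DEVIATION idea concrete, here is a small simulation sketch. It assumes, as the example does for convenience, that the sample line also holds in the population, and it additionally assumes a standard deviation of 1.5 cm for the deviations; that value, the seed, and the function name are hypothetical choices made only for illustration.

```python
import random
from statistics import mean, stdev

def simulate_population(heights, beta0=-3.0, beta1=0.35, sigma=1.5, seed=0):
    """Generate (x, MEAN, DEVIATION, y) tuples with y = MEAN + DEVIATION.

    sigma is an assumed value; deviations are normal with mean 0 and the
    same spread at every x (constant variance).
    """
    rng = random.Random(seed)
    rows = []
    for x in heights:
        mean_y = beta0 + beta1 * x          # point on the population line
        deviation = rng.gauss(0.0, sigma)   # individual departure from the line
        rows.append((x, mean_y, deviation, mean_y + deviation))
    return rows

# 20,000 simulated individuals with heights cycling through 60..76 inches.
heights = [60 + (i % 17) for i in range(20000)]
rows = simulate_population(heights)
deviations = [d for _, _, d, _ in rows]
print(round(mean(deviations), 2), round(stdev(deviations), 2))
```

With this many simulated individuals, the average deviation should land close to 0 and the standard deviation of the deviations close to the assumed 1.5, which is the behavior the population model describes.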
In the theoretical development of procedures for making statistical inferences for a regression model, the collection of all DEVIATIONS in the population is assumed to have a normal distribution with mean 0 and standard deviation $\sigma$ (so, the variance is $\sigma^2$). The value of the standard deviation $\sigma$ is an unknown population parameter that is estimated using the sample. This standard deviation can be interpreted in the usual way that we interpret a standard deviation. It is roughly the average distance between individual values of y and the mean of y as described by the regression line. In other words, it is roughly the size of the average deviation across all individuals in the range of x-values.

Keeping the regression notation straight for populations and samples can be confusing. Although we have not yet introduced all relevant notation, a summary at this stage will help you keep it straight.

Simple Linear Regression Model

For $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, a sample of n observations of the explanatory variable x and the response variable y from a large population, the simple linear regression model describing the relationship between y and x is:

Population version
Mean: $E(Y_i) = \beta_0 + \beta_1 x_i$
Individual: $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i = E(Y_i) + \varepsilon_i$
The deviations $\varepsilon_i$ are assumed to follow a normal distribution with mean 0 and standard deviation $\sigma$.

Sample version
Mean: $\hat{y}_i = b_0 + b_1 x_i$
Individual: $y_i = b_0 + b_1 x_i + e_i = \hat{y}_i + e_i$
where $e_i$ is the residual for individual i. The sample statistics $b_0$ and $b_1$ estimate the population parameters $\beta_0$ and $\beta_1$. The mean of the residuals is 0, and the residuals can be used to estimate the population standard deviation $\sigma$.

14.2 Estimating the Standard Deviation From the Mean

Recall that the standard deviation in the regression model measures, roughly, the average deviation of y-values from the mean (the regression line). Expressed another way, the standard deviation for regression measures the general size of the residuals.
This is an important and useful statistic for describing individual variation in a regression problem, and it also provides information about how accurately the regression equation might predict y-values for individuals. A relatively small standard deviation from the regression line indicates that individual data points generally fall close to the line, so predictions based on the line will be close to the actual values.

The calculation of the estimate of standard deviation is based on the sum of the squared residuals for the sample. This quantity is called the sum of squared errors and is denoted by SSE. Synonyms for sum of squared errors are residual sum of squares or sum of squared residuals. To find the SSE, residuals are calculated for all observations, then the residuals are squared and summed. The standard deviation for the sample is

$s = \sqrt{\dfrac{\text{Sum of Squared Residuals}}{n-2}} = \sqrt{\dfrac{SSE}{n-2}}$,

and this sample statistic estimates the population standard deviation $\sigma$.
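As a minimal sketch of that calculation, the function below squares and sums a list of residuals and divides by n - 2 before taking the square root. The five residuals are hypothetical (they come from a made-up five-point data set, not from any example in this chapter), and the function name is an illustrative choice.

```python
import math

def regression_sd(residuals):
    """Estimate sigma as s = sqrt(SSE / (n - 2)) from the sample residuals."""
    n = len(residuals)
    sse = sum(e ** 2 for e in residuals)   # sum of squared errors
    return math.sqrt(sse / (n - 2))

# Hypothetical residuals (assumption); they sum to 0 as least-squares residuals do.
example_residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]
s = regression_sd(example_residuals)
print(round(s, 4))
```

Here SSE = 2.4 and n - 2 = 3, so s is the square root of 0.8. Dividing by n - 2 rather than n reflects the two quantities (intercept and slope) estimated from the data.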
Estimating the Standard Deviation for a Simple Regression Model

$SSE = \sum (y_i - \hat{y}_i)^2 = \sum e_i^2$

$s = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n-2}}$

The statistic s is an estimate of the population standard deviation $\sigma$.

Remember that in the regression context, $\sigma$ is the standard deviation of the y-values at each x, not the standard deviation of the whole population of y-values.

Example 2. Relationship Between Height and Weight for College Men

Figure 14.4 displays regression results from the Minitab program and a scatter plot for the relationship between y = weight (pounds) and x = height (inches) in a sample of n = 43 men in a Penn State statistics class. The regression line for the sample, $\hat{y} = b_0 + b_1 x$, is drawn onto the plot. We see from the plot that there is considerable variation from the line at any given height. The standard deviation, shown in the row of computer output immediately above the plot, is "S = 24.00." This value roughly measures, for any given height, the general size of the deviations of individual weights from the mean weight for the height.

The standard deviation from the regression line can be interpreted in conjunction with the Empirical Rule for bell-shaped data stated in Section 2.7. Recall, for instance, that about 95% of individuals will fall within two standard deviations of the mean. As an example, consider men who are 72 inches tall. For men with this height, the estimated average weight determined from the regression equation is $\hat{y} = b_0 + b_1(72) = 186$ pounds. The estimated standard deviation from the regression line is s = 24 pounds, so we can estimate that about 95% of men 72 inches tall have weights within 2 × 24 = 48 pounds of 186 pounds, which is 186 ± 48, or 138 to 234 pounds. Think about whether this makes sense for all the men you know who are 72 inches (6 feet) tall.
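The Empirical Rule arithmetic in this example is simple enough to check directly. The short sketch below uses only the two numbers stated in the example (estimated mean weight 186 pounds at 72 inches, s = 24 pounds); the function name is hypothetical.

```python
def empirical_rule_interval(estimated_mean, s, k=2):
    """Interval estimated_mean +/- k*s; with k=2 it covers roughly 95%
    of individuals when deviations from the line are bell-shaped."""
    return estimated_mean - k * s, estimated_mean + k * s

low, high = empirical_rule_interval(186, 24)
print(low, high)  # 138 234
```

This is a rough description of individual variation at one height, not a formal interval; it ignores the uncertainty in the estimated line itself.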
Figure 14.4 The Relationship Between Weight and Height for n = 43 College Men (scatter plot with the regression line, preceded by Minitab regression output: regression equation, coefficient table with columns Predictor, Coef, SE Coef, T, P for Constant and Height, and the summary line S = 24.00, R-Sq = 32.3%, R-Sq(adj) = 30.7%)

The Proportion of Variation Explained by x

In Chapter 5, we learned that a statistic denoted as $r^2$ is used to measure how well the explanatory variable actually does explain the variation in the response variable. This statistic is also denoted as $R^2$ (rather than $r^2$), and the value is commonly expressed as a percent. Researchers typically use the phrase "proportion of variation explained by x" in conjunction with the value of $r^2$. For example, if $r^2 = 0.60$ (or 60%), the researcher may write that the explanatory variable explains 60% of the variation in the response variable.

The formula for $r^2$ presented in Chapter 5 was

$r^2 = \dfrac{SSTO - SSE}{SSTO}$

The quantity SSTO is the sum of squared differences between observed y-values and the sample mean $\bar{y}$. It measures the size of the deviations of the y-values from the overall mean of y, whereas SSE measures the deviations of the y-values from the predicted values $\hat{y}$.
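The formula can be illustrated with a short sketch. The five observed values and their fitted values below are hypothetical: the fitted values are what the least-squares line 2.2 + 0.6x gives for x = 1, 2, 3, 4, 5 on this made-up data set, chosen so that the result matches the 60% used as an illustration above.

```python
from statistics import mean

def r_squared(ys, y_hats):
    """Proportion of variation explained: (SSTO - SSE) / SSTO."""
    y_bar = mean(ys)
    ssto = sum((y - y_bar) ** 2 for y in ys)                # variation around the overall mean
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))   # variation around the line
    return (ssto - sse) / ssto

# Hypothetical observations and their least-squares fitted values (assumption).
observed = [2, 4, 5, 4, 5]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
print(round(r_squared(observed, fitted), 3))
```

For these values SSTO = 6 and SSE = 2.4, so the line removes 3.6 of the 6 units of squared variation, giving 0.6.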
Example 2 Continued. $R^2$ for Heights and Weights of College Men

In Figure 14.4, we can find the information "R-Sq = 32.3%" for the relationship between weight and height. A researcher might write "the variable height explains 32.3% of the variation in the weights of college men." This isn't a particularly impressive statistic. As we noted before, there is substantial deviation of individual weights from the regression line, so a prediction of a college man's weight based on height may not be particularly accurate.

Example 3. Driver Age and Highway Sign Reading Distance

In Example 5.2, we examined data for the relationship between y = maximum distance (feet) at which a driver can read a highway sign and x = the age of the driver. There were n = 30 observations in the data set. Figure 14.5 displays Minitab regression output for these data. The equation describing the linear relationship in the sample is

Average distance = 577 - 3.01 Age

From the output, we learn that the standard deviation from the regression line is s = 49.76 and R-Sq = 64.2%. Roughly, the average deviation from the regression line is about 50 feet, and the proportion of variation in sign reading distances explained by age is 0.642, or 64.2%.

Figure 14.5 Minitab Output: Sign Reading Distance and Driver Age (regression equation; coefficient table with columns Predictor, Coef, SE Coef, T, P for Constant and Age; summary line S = 49.76, R-Sq = 64.2%, R-Sq(adj) = 62.9%; Analysis of Variance table with rows Regression, Residual Error, Total and columns DF, SS, MS, F, P; and one unusual observation flagged R, where R denotes an observation with a large standardized residual)

The "Analysis of Variance" table provides the pieces needed to compute $r^2$ and s. SSE is the sum of squares in the Residual Error row and SSTO is the sum of squares in the Total row:

SSE = 69334, so $s = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{69334}{28}} = 49.76$

$r^2 = \dfrac{SSTO - SSE}{SSTO} = 0.642$, or 64.2%
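The standard deviation quoted for Example 3 can be reproduced from the two numbers that survive in the text, SSE = 69334 and n = 30. The sketch below does only that arithmetic; the variable names are illustrative.

```python
import math

sse = 69334        # Residual Error sum of squares from the Analysis of Variance table
n = 30             # number of drivers in the sample
s_distance = math.sqrt(sse / (n - 2))
print(round(s_distance, 2))  # 49.76
```

The result agrees with the "S = 49.76" line of the Minitab output, which is a useful check that the residual degrees of freedom are n - 2 = 28.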
14.3 Inference about the Linear Regression Relationship

When researchers do a regression analysis, they occasionally know based on past research or common sense that the variables are indeed related. In some instances, however, it may be necessary to do a hypothesis test in order to make the generalization that two variables are related in the population represented by the sample. The statistical significance of a linear relationship can be evaluated by testing whether or not the slope is 0. Recall that if the slope is 0 in a simple regression model, the two variables are not related because changes in the x-variable will not lead to changes in the y-variable.

The usual null and alternative hypotheses about $\beta_1$, the slope of the population line $E(Y_i) = \beta_0 + \beta_1 x_i$, are:

$H_0$: $\beta_1 = 0$ (the population slope is 0, so y and x are not linearly related.)
$H_a$: $\beta_1 \neq 0$ (the population slope is not 0, so y and x are linearly related.)

The alternative hypothesis may be one-sided or two-sided, although most statistical software uses the two-sided alternative.

The test statistic used to do the hypothesis test is a t-statistic with the same general format that we saw in Chapter 13. That format, and its application to this situation, is

$t = \dfrac{\text{sample statistic} - \text{null value}}{\text{standard error}} = \dfrac{b_1 - 0}{s.e.(b_1)}$

This is a standardized statistic for the difference between the sample slope and 0, the null value. Notice that a large value of the sample slope (either positive or negative) relative to its standard error will give a large value of t. If the mathematical assumptions about the population model described in Section 14.1 are correct, the statistic has a t-distribution with n - 2 degrees of freedom. The p-value for the test is determined using that distribution.

Hand calculations of the sample slope and its standard error are cumbersome. Fortunately, the regression analysis of most statistical software includes a t-statistic and a p-value for this significance test.
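For readers who want to see what the software is doing, here is a minimal sketch of the slope test statistic computed from raw data. It uses the standard least-squares results $b_1 = S_{xy}/S_{xx}$ (algebraically the same as the correlation times $s_y/s_x$) and $s.e.(b_1) = s/\sqrt{S_{xx}}$; the five data points and the function name are hypothetical, and the p-value step is left to a t-table or software because it needs the t-distribution with n - 2 degrees of freedom.

```python
import math
from statistics import mean

def slope_t_statistic(xs, ys):
    """Return (b1, s.e.(b1), t) for testing H0: beta1 = 0."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                      # least-squares slope
    b0 = y_bar - b1 * x_bar             # least-squares intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))        # standard deviation from the line
    se_b1 = s / math.sqrt(sxx)          # standard error of the slope
    return b1, se_b1, (b1 - 0) / se_b1

# Hypothetical data (assumption), n = 5 so df = 3.
b1, se_b1, t = slope_t_statistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(b1, 2), round(se_b1, 4), round(t, 3))
```

With only five points the slope of 0.6 is about 2.12 standard errors from 0, which would not be unusual under a t-distribution with 3 degrees of freedom; a larger sample with the same slope and scatter would give a larger t.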
Technical Note: In case you ever need to compute the values by hand, here are the formulas for the sample slope and its standard error:

$b_1 = r \dfrac{s_y}{s_x}$

$s.e.(b_1) = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$, where $s = \sqrt{\dfrac{SSE}{n-2}}$

In the formula for the sample slope, $s_x$ and $s_y$ are the sample standard deviations of the x and y values respectively, and r is the correlation between x and y.

Example 3 Continued. Driver Age and Highway Sign Reading Distance

Figure 14.5 presents the Minitab output for the regression of sign reading distance and driver age. The sample estimate of the slope is $b_1 = -3.01$. This sample slope is different from 0, but is it different enough to enable us to generalize that a linear relationship exists in the population represented by this sample?

The part of the Minitab output that can be used to test the statistical significance of the relationship is shown in bold in Figure 14.5, and the relevant p-value is underlined (by the authors of this text, not by Minitab). This line of the output provides information about the sample slope, the standard error of the sample slope, the t-statistic for testing statistical significance, and the p-value for the test of:
$H_0$: $\beta_1 = 0$ (the population slope is 0, so y and x are not linearly related.)
$H_a$: $\beta_1 \neq 0$ (the population slope is not 0, so y and x are linearly related.)

The test statistic is:

$t = \dfrac{\text{sample statistic} - \text{null value}}{\text{standard error}} = \dfrac{b_1 - 0}{s.e.(b_1)} = \dfrac{-3.01 - 0}{s.e.(b_1)} = -7.09$

The p-value is, to 3 decimal places, 0.000. This means the probability is virtually 0 that the observed slope could be as far from 0 or farther than it is if there is no linear relationship in the population. So, as we might expect for these variables, we can conclude that the relationship between the two variables in the sample represents a real relationship in the population.

Confidence Interval for the Population Slope

The significance test of whether or not the population slope is 0 only tells us if we can declare the relationship to be statistically significant. If we decide that the true slope is not 0, we might ask, "What is the value of the slope?" We can answer this question with a confidence interval for $\beta_1$, the population slope.

The format for this confidence interval is the same as the general format used in Chapters 10 and 12, which is

sample estimate ± multiplier × standard error

The estimate of the population slope $\beta_1$ is $b_1$, the slope of the least-squares regression line for the sample. As shown already, the standard error formula is complicated and we'll usually rely on statistical software to determine this value. The multiplier will be labeled t* and is determined using a t-distribution with df = n - 2. Table 12.1 can be used to find the multiplier for the desired confidence level.

Formula for Confidence Interval for $\beta_1$, the Population Slope

A confidence interval for $\beta_1$ is

$b_1 \pm t^* \times s.e.(b_1)$

The multiplier t* is found using a t-distribution with n - 2 degrees of freedom, and is such that the probability between -t* and +t* equals the confidence level for the interval.

Example 3 Continued. 95% Confidence Interval for Slope Between Age and Sign Reading Distance

In Figure 14.5, we see the estimated slope $b_1 = -3.01$ and its standard error $s.e.(b_1)$. There are n = 30 observations so df = 28 for finding t*. For a 95% confidence level, t* = 2.05 (see Table 12.1).
The 95% confidence interval for the population slope is

$-3.01 \pm 2.05 \times s.e.(b_1)$
$-3.01 \pm 0.87$
$-3.88$ to $-2.14$

With 95% confidence, we can estimate that in the population of drivers represented by this sample, the mean sign reading distance decreases by somewhere between 2.14 and 3.88 feet for each one-year increase in age.
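The interval arithmetic can be sketched as follows. One caveat: the standard error of the slope is not legible in this copy of the output, so the value 0.4243 used below is an assumption, chosen because it is consistent with both the reported t-statistic (-3.01 / 0.4243 is about -7.09) and the reported margin of 0.87. The function name is also hypothetical.

```python
def slope_confidence_interval(b1, se_b1, t_star):
    """Confidence interval b1 +/- t* x s.e.(b1) for the population slope."""
    margin = t_star * se_b1
    return b1 - margin, b1 + margin

ASSUMED_SE_B1 = 0.4243   # assumption: back-computed, not read from the output
low, high = slope_confidence_interval(-3.01, ASSUMED_SE_B1, 2.05)
print(f"{low:.2f} to {high:.2f}")  # -3.88 to -2.14
```

Because the whole interval lies below 0, it tells the same story as the significance test: a slope of 0 is not a plausible value for the population.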
Testing Hypotheses about the Correlation Coefficient

In Chapter 5, we learned that the correlation coefficient is 0 when the regression line is horizontal. In other words, if the slope of the regression line is 0, the correlation is 0. This means that the results of a hypothesis test for the population slope can also be interpreted as applying to equivalent hypotheses about the correlation between x and y in the population.

As we did for the regression model, we use different notation to distinguish between a correlation computed for a sample and a correlation within a population. It is commonplace to use the symbol $\rho$ (pronounced "rho") to represent the correlation between two variables within a population. Using this notation, null and alternative hypotheses of interest are:

$H_0$: $\rho = 0$ (x and y are not correlated)
$H_a$: $\rho \neq 0$ (x and y are correlated)

The results of the hypothesis test described before for the population slope $\beta_1$ can be used for these hypotheses as well. If we reject $H_0$: $\beta_1 = 0$, we also reject $H_0$: $\rho = 0$. If we decide in favor of $H_a$: $\beta_1 \neq 0$, we also decide in favor of $H_a$: $\rho \neq 0$.

Many statistical software programs, including Minitab, will give a p-value for testing whether the population correlation is 0 or not. This p-value will be the same as the p-value given for testing whether the population slope is 0 or not. In the following Minitab output for the relationship between pulse rate and weight in a sample of 35 college women, notice that the same p-value is given for testing that the slope is 0 (look under P in the regression results) and for testing that the correlation is 0. Because this is not a small p-value, we cannot reject the null hypotheses for the slope and the correlation.

Regression Analysis: Pulse versus Weight (Minitab output: regression equation; coefficient table with columns Predictor, Coef, SE Coef, T, P for Constant and Weight; followed by "Correlations: Pulse, Weight" giving the Pearson correlation of Pulse and Weight and its P-Value)

The Effect of Sample Size on Significance

The size of a sample always affects whether a specific observed result achieves statistical significance.
For example, r = 0.183 is not a statistically significant correlation for a sample size of n = 35, as in the pulse and weight example, but it would be statistically significant if n = 1,000. With very large sample sizes, weak relationships with low correlation values can be statistically significant. The moral of the story here is that with a large sample size, it may not be saying much to say that two variables are significantly related. This only means that we think the correlation is not 0. To assess the practical significance of the result, we should carefully examine the observed strength of the relationship.
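A quick calculation shows why the same correlation behaves so differently at the two sample sizes. The sketch below relies on a standard identity that is not stated in this chapter: the t-statistic for testing a zero slope (or zero correlation) can be written as $t = r\sqrt{n-2}/\sqrt{1-r^2}$. The function name is illustrative.

```python
import math

def t_from_correlation(r, n):
    """t-statistic for H0: rho = 0, using t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

for n in (35, 1000):
    t = t_from_correlation(0.183, n)
    print(n, round(t, 2))
# 35 1.07
# 1000 5.88
```

For a two-sided test at the 0.05 level the cutoff for |t| is roughly 2 at these degrees of freedom, so t near 1.07 falls well short while t near 5.88 is far beyond it, even though the strength of the relationship (r = 0.183) is identical in both cases.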
14.4 Predicting the Value of Y for an Individual

An important use of a regression equation is to estimate or predict the unknown value of a response variable for an individual with a known specific value of the explanatory variable. Using the data described in Example 3, for instance, we can predict the maximum distance at which an individual can read a highway sign by substituting his or her age for x in the sample regression equation. Consider a person 21 years old. The predicted distance is approximately $\hat{y} = 577 - 3.01(21) = 514$ feet.

There will be variation among 21-year-olds with regard to the sign reading distance, so the predicted distance of 514 feet is not likely to be the exact distance for the next 21-year-old who views the sign. Rather than predicting that the distance will be exactly 514 feet, we should instead predict that the distance will be within a particular interval of values.

A 95% prediction interval for the value of the response variable (y) accounts for the variation among individuals with a particular value of x. This interval can be interpreted in two equivalent ways.

- The 95% prediction interval estimates the central 95% of the values of y for members of the population with a specified value of x.
- The probability is 0.95 that a randomly selected individual from the population with a specified value of x falls into the corresponding 95% prediction interval.

Notice that a prediction interval differs conceptually from a confidence interval. A confidence interval estimates an unknown population parameter, which is a numerical characteristic or summary of the population. An example in this chapter is a confidence interval for the slope of the population line. A prediction interval, however, does not estimate a parameter; instead it estimates the potential data value for an individual. Equivalently, it describes an interval into which a specified percentage of the population may fall.

As with most regression calculations, the "by hand" formulas for prediction intervals are formidable. Statistical software can be used to create the interval.
Figure 14.6 shows Minitab output that includes the 95% prediction intervals for three different ages (21 years old, 30 years old, and 45 years old). The intervals are toward the bottom right side of the display in a column labeled "95% PI" and are highlighted with bold type. (Note: The term "Fit" is a synonym for $\hat{y}$, the estimate of the average response at the specific x value.) Here is what we can conclude:

- The probability is 0.95 that a randomly selected 21-year-old will read the sign at somewhere between roughly 407 and 620 feet.
- The probability is 0.95 that a randomly selected 30-year-old will read the sign at somewhere between roughly 381 and 592 feet.
- The probability is 0.95 that a randomly selected 45-year-old will read the sign at somewhere between roughly 338 and 545 feet.

We can also interpret each interval as an estimate of the sign reading distances for the central 95% of a population of drivers with a specified age. For instance, about 95% of all drivers 21 years old will be able to read the sign at a distance somewhere between 407 and 620 feet.
Figure 14.6 Minitab output showing prediction interval of distance (the regression output of Figure 14.5, followed by "Predicted Values for New Observations" with columns New Obs, Fit, SE Fit, 95.0% CI, and 95.0% PI for three new observations, and "Values of Predictors for New Observations" listing the corresponding ages)

We're not limited to using only 95% prediction intervals. With Minitab, we can describe any central percentage of the population that we wish. For example, Minitab can give 50% prediction intervals for the sign reading distance at the three specific ages we considered above.

(Table of 50% prediction intervals with columns Age, Fit, and 50.0% PI, one row for each of the three ages)

For each specific age, the 50% prediction interval estimates the central 50% of the maximum sign reading distances in a population of drivers with that age. For example, we can estimate that 50% of drivers 21 years old would have a maximum sign reading distance somewhere between about 478 feet and 549 feet. The distances for the other 50% of 21-year-old drivers would be predicted to be outside this range, with 25% beyond 549 feet and 25% below 478 feet.

Interpretation of a Prediction Interval

A prediction interval estimates the value of y for an individual with a particular value of x, or equivalently, the range of values of the response variable for a specified central percentage of a population with a particular value of x.
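As a rough sketch of how software might build such an interval, the function below applies the usual textbook formula: the fitted value plus or minus $t^*\sqrt{s^2 + [s.e.(fit)]^2}$, with $s.e.(fit) = s\sqrt{1/n + (x - \bar{x})^2 / \sum (x_i - \bar{x})^2}$. The data are hypothetical, the function name is made up, and the multiplier t* is passed in from a t-table rather than computed (3.182 is the 95% multiplier for 3 degrees of freedom).

```python
import math
from statistics import mean

def prediction_interval(xs, ys, x_new, t_star):
    """Prediction interval for an individual y at x_new (t_star from a t-table, df = n - 2)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    se_fit = s * math.sqrt(1 / n + (x_new - x_bar) ** 2 / sxx)
    fit = b0 + b1 * x_new
    half_width = t_star * math.sqrt(s ** 2 + se_fit ** 2)
    return fit - half_width, fit + half_width

# Hypothetical data (assumption), n = 5 so df = 3 and t* = 3.182 for 95%.
xs, ys = [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
at_mean = prediction_interval(xs, ys, 3, 3.182)
at_edge = prediction_interval(xs, ys, 5, 3.182)
print([round(v, 2) for v in at_mean], [round(v, 2) for v in at_edge])
```

Comparing the two results shows the interval widening as the chosen x moves away from the mean of the observed x-values, because s.e.(fit) grows with that distance.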
Technical Note: The formula for the prediction interval for y at a specific x is:

$\hat{y} \pm t^* \sqrt{s^2 + [s.e.(fit)]^2}$

where

$s.e.(fit) = s \sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$

The multiplier t* is found using a t-distribution with n - 2 degrees of freedom, and is such that the probability between -t* and +t* equals the desired level for the interval.

Note: The s.e.(fit), and thus the width of the interval, depends upon how far the specified x-value is from $\bar{x}$. The further the specific x is from the mean, the wider the interval. When n is large, s.e.(fit) will be small, and the prediction interval will be approximately $\hat{y} \pm t^* s$.

14.5 Estimating the Mean Y at a Specified X

In the previous section, we focused on the estimation of the values of the response variable for individuals. A researcher may instead want to estimate the mean value of the response variable for individuals with a particular value of the explanatory variable. We might ask, "What is the mean weight for college men who are 6 feet tall?" This question only asks about the mean weight in a group with a common height, and it is not concerned with the deviations of individuals from that mean.

In technical terms, we wish to estimate the population mean E(Y) for a specific value of x that is of interest to us. To make this estimate, we use a confidence interval. The format for this confidence interval is again:

sample estimate ± multiplier × standard error

The sample estimate of E(Y) is the value of $\hat{y}$ determined by substituting the x-value of interest into $\hat{y} = b_0 + b_1 x$, the least-squares regression line for the sample. The standard error of $\hat{y}$ is the s.e.(fit) shown in the Technical Note in the previous section, and its value is usually provided by statistical software. The multiplier is found using a t-distribution with df = n - 2, and Appendix A3 can be used to determine its value.

Example 2 Revisited. Estimating Mean Weight of College Men at Various Heights

Based on the sample of n = 43 college men in Example 2, let's estimate the mean weight in the population of college men for each of three different heights: 68 inches, 70 inches, and 72 inches.
Figure 14.7 shows Minitab output that includes the three different confidence intervals for these three different heights. These intervals are toward the bottom of the display in a column labeled "95% CI." The first entry in that column is the estimate of the population mean weight for men who are 68 inches tall. With 95% confidence, we can estimate that the mean weight of college men 68 inches tall is somewhere between the two endpoints given in that entry. The second row under "95% CI" contains the 95% confidence interval for the mean weight of college men 70 inches tall, and the third row contains the 95% confidence interval for the mean weight of men 72 inches tall.

Again, it is important to realize that the confidence intervals for E(Y) do not describe the variation among individuals. They only are estimates of the mean weights for specific heights. The prediction intervals for individual responses describe the variation among individuals. You may have noticed that 95% prediction intervals, labeled "95% PI," are next to the confidence
16 ntervals n the output. Among men 70 nches tall, for nstance, we would estmate that 95% of the ndvdual weghts would be n the nterval from about 122 to about 221 pounds. Fgure 14.7 Mntab Output wth Confdence Intervals For Mean Weght The regresson equaton s Weght = Heght Predctor Coef SE Coef T P Constant Heght S = RSq = 32.3% RSq(adj) = 30.7%  Some Output Omtted  Predcted Values for New Observatons New Obs Ft SE Ft 95.0% CI 95.0% PI ( , ) ( , ) ( , ) ( , ) ( , ) ( , ) Values of Predctors for New Observatons New Obs Heght Checkng Condtons for Usng Regresson Models for Inference There are a few condtons that should be at least approxmately true when we use a regresson model to make an nference about a populaton. Of the fve condtons that follow, the frst two are partcularly crucal. Condtons for Lnear Regresson 1. The form of the equaton that lnks the mean value of y to x must be correct. For nstance, we won t make proper nferences f we use a straght lne to descrbe a curved relatonshp. 2. There should not be any extreme outlers that nfluence the results unduly. 3. The standard devaton of the values of y from the mean y s the same regardless of the value of the x varable. In other words, y values are smlarly spread out at all values of x. 4. For ndvduals n the populaton wth the same partcular value of x, the dstrbuton of the values of y s a normal dstrbuton. Equvalently, the dstrbuton of devatons from the mean value of y s a normal dstrbuton. Ths condton can be relaxed f the sample sze s large. 5. Observatons n the sample are ndependent of each other. 646
Checking the Conditions with Plots

A scatter plot of the raw data and plots of the residuals provide information about the validity of the assumptions. Remember that a residual is the difference between an observed value and the predicted value for that observation, and that some assumptions made for a linear regression model have to do with how y-values deviate from the regression line. If the properties of the residuals for the sample appear to be consistent with the mathematical assumptions made about deviations within the population, we can use the model to make statistical inferences.

Conditions 1, 2 and 3 can be checked using two useful plots:

• A scatter plot of y versus x for the sample (y vs x)
• A scatter plot of the residuals versus x for the sample (resids vs x)

If Condition 1 holds for a linear relationship, then:

• The plot of y vs x should show points randomly scattered around an imaginary straight line.
• The plot of resids vs x should show points randomly scattered around a horizontal line at resid = 0.

If Condition 2 holds, extreme outliers should not be evident in either plot.

If Condition 3 holds, neither plot should show increasing or decreasing spread in the points as x increases.

Example 2 Continued. Checking the Conditions for the Weight and Height Problem

Figure 14.4 displayed a scatter plot of the weights and heights of n = 43 college men. In that plot, it appears that a straight line is a suitable model for how mean weight is linked to height. In Figure 14.8 there is a plot of the residuals (e_i) versus the corresponding values of height for these 43 men. This plot is further evidence that the right model has been used. If the right model has been used, the way in which individuals deviate from the line (residuals) will not be affected by the value of the explanatory variable. The somewhat random-looking blob of points in Figure 14.8 is the way a plot of residuals versus x should look if the right equation for the mean has been used.
Both plots (Figures 14.4 and 14.8) also show that there are no extreme outliers and that the weights have approximately the same variance across the range of heights in the sample. Therefore, Conditions 2 and 3 appear to be met.

Figure 14.8 Plot of Residuals versus X for Example 2. The Absence of a Pattern Indicates the Right Model Has Been Used
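As a rough numeric companion to the residual plot, the sketch below computes the residuals for a least-squares line and compares their typical size for the smaller and larger x-values. The data are hypothetical, the helper names `residuals` and `spread_by_half` are invented for illustration, and the half-by-half comparison is an informal check assumed here, not a procedure from this chapter. In practice the residuals would be plotted against x.

```python
import math


def residuals(xs, ys):
    """Residuals e_i = y_i - (b0 + b1 * x_i) from the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def root_mean_square(values):
    """Typical size of the values, measured around zero."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def spread_by_half(xs, res):
    """Residual spread for the lower half and the upper half of the x-values."""
    ordered = [e for _, e in sorted(zip(xs, res))]
    mid = len(ordered) // 2
    return root_mean_square(ordered[:mid]), root_mean_square(ordered[mid:])


# Hypothetical heights (inches) and weights (pounds).
heights = [65, 66, 68, 69, 70, 72, 74, 75]
weights = [140, 155, 150, 172, 165, 185, 178, 200]

res = residuals(heights, weights)
low_spread, high_spread = spread_by_half(heights, res)
print("residuals:", [round(e, 1) for e in res])
print("spread for smaller x:", round(low_spread, 1))
print("spread for larger x:", round(high_spread, 1))
```

Two spreads of very different size would suggest that Condition 3 deserves a closer look in the plots.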
Condition 4, which is that deviations from the regression line are normally distributed, is difficult to verify but it is also the least important of the conditions because the inference procedures for regression are robust. This means that if there are no major outliers or extreme skewness, the inference procedures work well even if the distribution of y-values is not a normal distribution. In Chapters 12 and 13, we saw that confidence intervals and hypothesis tests for a mean or a difference between two means also were robust.

To examine the distribution of the deviations from the line, a histogram of the residuals is useful, although for small samples a histogram may not be informative. A more advanced plot called a normal probability plot can also be used to check whether the residuals are normally distributed, but we do not provide the details in this text. Figure 14.9 displays a histogram of the residuals for Example 2. It appears that the residuals are approximately normally distributed, so Condition 4 is met.

Figure 14.9 Histogram of Residuals for Example 2

Condition 5 follows from the data collection process. It is met as long as the units are measured independently. It would not be met if the same individuals were measured across the range of x-values, such as if x = average speed and y = gas mileage were to be measured for multiple tanks of gas on the same cars. More complicated models are needed for dependent observations, and those models will not be discussed in this book.

Corrections When Conditions Are Not Met

There are some steps that can be taken if Conditions 1, 2 or 3 are not met. If Condition 1 is not met, more complicated models can be used. For instance, Figure 14.10 shows a typical plot of residuals that occurs when a straight-line model is used to describe data that are curvilinear. It may help to think of the residuals as prediction errors that would occur if we use the regression line to predict the value of y for the individuals in the sample.
In the plot shown in Figure 14.10, the prediction errors are all negative in the central region of X and nearly all positive for outer values of X. This occurs because the wrong model is being used to make the predictions. A curvilinear model, such as the quadratic model discussed earlier, may be more appropriate.

Figure 14.10 A Residual Plot Indicating the Wrong Model Has Been Used
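The residual pattern described above can be reproduced with a small sketch. It assumes made-up data that follow an exact quadratic, y = x², and fits a straight line to them anyway; the function name `line_residuals` is hypothetical. The residuals come out positive at the outer x-values and negative in the central region.

```python
def line_residuals(xs, ys):
    """Residuals from a least-squares straight line fitted to (xs, ys)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical curvilinear data: y is exactly x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

res = line_residuals(xs, ys)

# Print the sign of each residual as a crude text version of the residual plot.
for x, e in zip(xs, res):
    sign = "+" if e > 0 else "-" if e < 0 else "0"
    print(f"x = {x:>2}  residual sign: {sign}")
```

A systematic run of signs like this, rather than a random scatter around zero, is the signal that the straight-line form of the model is wrong.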
Condition 2, that there are no influential outliers, can be checked graphically with the scatter plot of y versus x and the plot of residuals versus x. The appropriate correction if there are outliers depends on the reason for the outliers. The same considerations and corrective action discussed in Chapter 2 would be taken, depending on the cause of the outlier. For instance, Figure shows a scatter plot and a residual plot for the data of Exercise 38 in Chapter 5. A potential outlier is seen in both plots. In this example, the x-variable is weight and the y-variable is time to chug a beverage. The outlier probably represents a legitimate data value. The relationship appears to be linear for weights ranging up to about 210 pounds, but then it appears to change. It could either become quadratic, or it could level off. We do not have enough data to determine what happens for higher weights. The solution in this case would be to remove the outlier, and use the linear regression relationship only for body weights under about 210 pounds. Determining the relationship for higher body weights would require a larger sample of individuals in that range.
Figure Scatter Plot and Residual Plot With an Outlier

If either Condition 1 or Condition 3 is not met, a transformation may be required. This is equivalent to using a different model. Fortunately, often the same transformation will correct problems with Conditions 1, 3, and 4. For instance, when the response variable is monetary, such as salaries, it is often more appropriate to use the relationship

$$\ln(y) = b_0 + b_1 x + e$$

In other words, to assume that there is a linear relationship between the natural log of y and the x values. This is called a log transformation on the y's. We will not pursue transformations further in this book.
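A minimal sketch of the log transformation, under stated assumptions: the salaries below are invented so that they grow by exactly 8% per year of experience, and the names `fit_line` and `predict_salary` are hypothetical. Fitting a straight line to ln(y) rather than y recovers a linear relationship on the log scale.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: salary starts at 30000 and grows 8% per year of experience.
years = [0, 2, 4, 6, 8, 10]
salaries = [30000 * 1.08 ** x for x in years]

# Log transformation on the y's: regress ln(salary) on years.
log_salaries = [math.log(y) for y in salaries]
b0, b1 = fit_line(years, log_salaries)


def predict_salary(x):
    """Back-transform the fitted log-scale value to the original salary scale."""
    return math.exp(b0 + b1 * x)


print("slope on the log scale:", b1)
print("predicted salary at 5 years:", predict_salary(5))
```

On the log scale the slope b1 is a constant additive change per unit of x, which corresponds to a constant percentage change in y on the original scale.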
HTF: Ch3, 7 B: Ch3 Lnear Regresson, Regularzaton BasVarance Tradeoff Thanks to C Guestrn, T Detterch, R Parr, N Ray 1 Outlne Lnear Regresson MLE = Least Squares! Bass functons Evaluatng Predctors Tranng
More informationQuality Adjustment of Secondhand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index
Qualty Adustment of Secondhand Motor Vehcle Applcaton of Hedonc Approach n Hong Kong s Consumer Prce Index Prepared for the 14 th Meetng of the Ottawa Group on Prce Indces 20 22 May 2015, Tokyo, Japan
More informationChapter 2. Determination of appropriate Sample Size
Chapter Determnaton of approprate Sample Sze Dscusson of ths chapter s on the bass of two of our publshed papers Importance of the sze of sample and ts determnaton n the context of data related to the
More informationStatistical Methods to Develop Rating Models
Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and
More information+ + +   This circuit than can be reduced to a planar circuit
MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to
More informationEvaluating credit risk models: A critique and a new proposal
Evaluatng credt rsk models: A crtque and a new proposal Hergen Frerchs* Gunter Löffler Unversty of Frankfurt (Man) February 14, 2001 Abstract Evaluatng the qualty of credt portfolo rsk models s an mportant
More informationAn Empirical Study of Search Engine Advertising Effectiveness
An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan RmmKaufman, RmmKaufman
More information