SIMPLE LINEAR REGRESSION


Documents prepared for use in course B0.305, New York University, Stern School of Business

Fictitious example, n = 10
    This shows the arithmetic for fitting a simple linear regression.

Summary of simple regression arithmetic
    This document shows the formulas for simple linear regression, including the calculations for the analysis of variance table.

Another example of regression arithmetic
    This example illustrates the use of wolf tail lengths to assess weights. Yes, these data are fictitious.

An illustration of residuals
    This example shows an experiment relating the height of suds in a dishpan to the quantity of soap placed into the water. This also shows how you can get Minitab to list the residuals.

The simple linear regression model
    This section shows the very important linear regression model. It's very helpful to understand the distinction between parameters and estimates.

Regression noise terms
    What are those epsilons all about? What do they mean? Why do we need to use them?

More about noise in a regression
    Random noise obscures the exact relationship between the dependent and independent variables. Here are pictures showing the consequences of increasing noise standard deviation. There is a technical discussion of the consequences of measurement noise in an independent variable. This entire discussion is done for simple regression, but the ideas carry over in a complicated way to multiple regression.

Does regression indicate causality?
    This shows a convincing relationship between X and Y. Do you think that this should be interpreted as cause and effect?

An interpretation for residuals
    The residuals in this example have a very concrete interpretation.

Elasticity
    The economic notion of elasticity is generally obtained from linear regression. Here's how.

Summary of regression notions for one predictor
    This is a quick one-page summary as to what we are trying to do with a simple regression.

The residual versus fitted plot
    Checking the residual versus fitted plot is now standard practice in doing linear regressions.

An example of the residual versus fitted plot
    This shows that the methods explored in the preceding section can be useful for real data problems. Indeed, the expanding residuals situation is very common.

Transforming the dependent variable
    Why does taking the log of the dependent variable cure the problem of expanding residuals? The math is esoteric, but these pages lay out the details for you.

The correlation coefficient
    These pages provide the calculation formulas for finding the correlation coefficient. There is also a discussion of interpretation, along with a detailed look at the role of the correlation coefficient in making investment diversification decisions. Part of this section is a prelude to the discussion of the regression effect (below).

Covariance
    The covariance calculation is part of the arithmetic used to obtain a correlation. Covariances are not that easy to interpret.

The regression effect
    The regression effect is everywhere. What is it? Why does it happen? The correlation coefficient has the very important role of determining the rate of regression back to average.

Cover photo: Montauk lighthouse
Revised AUG 2004. Gary Simon, 2004

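FICTITIOUS EXAMPLE, n = 10

Consider a set of 10 data points:

    x: […]
    Y: […]

Begin by finding

    \(\sum x_i = 43\)
    \(\sum x_i^2 = […]\)
    \(\sum y_i = 102\)
    \(\sum y_i^2 = 1{,}094\)
    \(\sum x_i y_i = 476\)

Then find \(\bar{x} = 43/10 = 4.3\) and \(\bar{y} = 102/10 = 10.2\).

Next

    \(S_{xx} = \sum x_i^2 - \dfrac{(\sum x_i)^2}{n} = […]\)
    \(S_{xy} = \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n} = 476 - \dfrac{(43)(102)}{10} = 37.4\)
    \(S_{yy} = \sum y_i^2 - \dfrac{(\sum y_i)^2}{n} = 1{,}094 - \dfrac{102^2}{10} = 53.6\)

This leads to \(b_1 = S_{xy}/S_{xx} = […]\) and then \(b_0 = \bar{y} - b_1 \bar{x} = […]\).

The regression line can be reported as Y = […]. If the spurious precision annoys you, report the line instead as Y = […].

The quantity \(S_{yy}\) was not used here. It has many other uses in regression calculations, so it is worth the trouble to find it early in the work.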

SUMMARY OF SIMPLE REGRESSION ARITHMETIC

Here are the calculations needed to do a simple regression.

Aside: The word "simple" here refers to the use of just one x to predict y. Problems in which two or more variables are used to predict y are called multiple regression.

The input data are \((x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\). The outputs in which we are interested (so far) are the values of \(b_1\) (estimated regression slope) and \(b_0\) (estimated regression intercept). These will allow us to write the fitted regression line \(Y = b_0 + b_1 x\).

(1) Find the five sums
    \(\sum_{i=1}^{n} x_i\), \(\sum_{i=1}^{n} y_i\), \(\sum_{i=1}^{n} x_i^2\), \(\sum_{i=1}^{n} y_i^2\), \(\sum_{i=1}^{n} x_i y_i\).

(2) Find the five expressions \(\bar{x}\), \(\bar{y}\), and
    \(S_{xx} = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\),
    \(S_{xy} = \sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}\),
    \(S_{yy} = \sum_{i=1}^{n} y_i^2 - \dfrac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\).

(3) Give the slope estimate as \(b_1 = \dfrac{S_{xy}}{S_{xx}}\) and the intercept estimate as \(b_0 = \bar{y} - b_1 \bar{x}\).

(4) For later use, record \(S_{yy|x} = S_{yy} - \dfrac{(S_{xy})^2}{S_{xx}}\).

Virtually all the calculations for simple regression are based on the five quantities found in step (2).

The regression fitting procedure is known as least squares. It gets this name because the resulting values of \(b_0\) and \(b_1\) minimize the expression

    \(\sum_{i=1}^{n} \left( y_i - [b_0 + b_1 x_i] \right)^2\).

This is a good criterion to optimize for many reasons, but understanding these reasons will force us to go into the regression model.
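The least squares property can be checked numerically. The sketch below is only an illustration: the data values are invented (they do not come from this handout), and the function names `fit_line` and `sum_squared_errors` are hypothetical. It fits the line with the formulas of steps (1) to (3) and then confirms that nudging either coefficient never lowers the criterion.

```python
def fit_line(xs, ys):
    """Return (b0, b1) using the sums and the S_xx, S_xy formulas of steps (1)-(3)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    s_xx = sum_xx - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    b1 = s_xy / s_xx
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


def sum_squared_errors(xs, ys, b0, b1):
    """The least squares criterion: sum of (y_i - [b0 + b1 x_i])^2."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for this illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, b0, b1)

# Criterion values at the fitted coefficients and at eight nearby nudged pairs.
nudged = [
    sum_squared_errors(xs, ys, b0 + d0, b1 + d1)
    for d0 in (-0.1, 0.0, 0.1)
    for d1 in (-0.1, 0.0, 0.1)
]
print("b0 =", b0, "b1 =", b1)
print("criterion at fit:", best, "| smallest nudged value:", min(nudged))
```

A grid of nudges is not a proof, of course; it only shows that the formulas land on the bottom of the bowl for this data set.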

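As an example, consider a data set with n = 10 and with

    \(\sum x_i = 200\)
    \(\sum x_i^2 = 4{,}250\)
    \(\sum y_i = 1{,}000\)
    \(\sum y_i^2 = 106{,}250\)
    \(\sum x_i y_i = 20{,}750\)

It follows that

    \(\bar{x} = \dfrac{200}{10} = 20\)
    \(\bar{y} = \dfrac{1{,}000}{10} = 100\)
    \(S_{xx} = 4{,}250 - \dfrac{200^2}{10} = 250\)
    \(S_{xy} = 20{,}750 - \dfrac{200 \times 1{,}000}{10} = 750\)

It follows next that \(b_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{750}{250} = 3\) and \(b_0 = \bar{y} - b_1 \bar{x} = 100 - 3(20) = 40\).

The fitted regression line would be given as \(Y = 40 + 3x\).

We could note also \(S_{yy} = 106{,}250 - \dfrac{1{,}000^2}{10} = 6{,}250\). Then

    \(S_{yy|x} = 6{,}250 - \dfrac{750^2}{250} = 4{,}000\).

We use \(S_{yy|x}\) to get \(s_\varepsilon\), the estimate of the noise standard deviation. The relationship is \(s_\varepsilon = \sqrt{\dfrac{S_{yy|x}}{n-2}}\), and here that value is \(\sqrt{\dfrac{4{,}000}{10-2}} = \sqrt{500} \approx 22.36\).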
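The arithmetic of this example can be replayed in a few lines. This is a minimal sketch, assuming the five sums read n = 10, Σx = 200, Σy = 1,000, Σx² = 4,250, Σy² = 106,250 and Σxy = 20,750; the function name `regression_from_sums` and the dictionary keys are hypothetical labels chosen for the illustration.

```python
import math


def regression_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Steps (2)-(4) of the summary, starting from the five sums of step (1)."""
    x_bar = sum_x / n
    y_bar = sum_y / n
    s_xx = sum_xx - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    s_yy = sum_yy - sum_y ** 2 / n
    b1 = s_xy / s_xx                      # slope estimate
    b0 = y_bar - b1 * x_bar               # intercept estimate
    s_yy_given_x = s_yy - s_xy ** 2 / s_xx
    s_eps = math.sqrt(s_yy_given_x / (n - 2))  # noise standard deviation estimate
    return {
        "s_xx": s_xx, "s_xy": s_xy, "s_yy": s_yy,
        "b0": b0, "b1": b1,
        "s_yy_given_x": s_yy_given_x, "s_eps": s_eps,
    }


result = regression_from_sums(
    n=10, sum_x=200, sum_y=1000, sum_xx=4250, sum_yy=106250, sum_xy=20750
)
for name, value in result.items():
    print(f"{name:>13} = {value:.4f}")
```

Working from the sums rather than the raw data mirrors the hand calculation; with raw data one would compute the five sums first and then call the same function.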

In fact, we can use these simple quantities to compute the regression analysis of variance table. The table is built on the identity

    SS_total = SS_regression + SS_residual

The quantity SS_residual is often named SS_error. The subscripts are often abbreviated. Thus, you will see reference to SS_tot, SS_regr, SS_resid, and SS_err.

For the simple regression case, these are computed as

    \(SS_{tot} = S_{yy}\)
    \(SS_{regr} = \dfrac{(S_{xy})^2}{S_{xx}}\)
    \(SS_{resid} = S_{yy} - \dfrac{(S_{xy})^2}{S_{xx}}\)

The analysis of variance table for simple regression is set up as follows:

    Source of Variation   Degrees of freedom   Sum of Squares             Mean Squares                         F
    Regression            1                    (S_xy)^2 / S_xx            (S_xy)^2 / S_xx                      MS Regression / MS Resid
    Residual              n - 2                S_yy - (S_xy)^2 / S_xx     [S_yy - (S_xy)^2 / S_xx] / (n - 2)
    Total                 n - 1                S_yy

For the data set used here, the analysis of variance table would be

    Source of Variation   Degrees of freedom   Sum of Squares   Mean Squares   F
    Regression            1                    2,250            2,250          4.50
    Residual              8                    4,000            500
    Total                 9                    6,250

Just for the record, let's note some other computations commonly done for regression. The information given next applies to regressions with K predictors. To see the forms for simple regression, just use K = 1 as needed.

The estimate for the noise standard deviation is the square root of the mean square in the residual line. This is \(\sqrt{500} \approx 22.36\), as noted previously. The symbol s is frequently used for this, as are \(s_{Y|X}\) and \(s_\varepsilon\).

The \(R^2\) statistic is the ratio \(\dfrac{SS_{regr}}{SS_{tot}}\), which is here \(\dfrac{2{,}250}{6{,}250} = 0.36\).

The standard deviation of Y can be given as \(s_Y = \sqrt{\dfrac{SS_{tot}}{n-1}}\), which is here \(\sqrt{\dfrac{6{,}250}{9}} \approx 26.35\).

It is sometimes interesting to compare \(s_\varepsilon\) (the estimate for the noise standard deviation) to \(s_Y\) (the standard deviation of Y). It can be shown that the ratio of these is

    \(\dfrac{s_\varepsilon}{s_Y} = \sqrt{\dfrac{n-1}{n-1-K}\left(1 - R^2\right)}\)

The quantity \(1 - \left(\dfrac{s_\varepsilon}{s_Y}\right)^2 = 1 - \dfrac{n-1}{n-1-K}\left(1 - R^2\right)\) is called the adjusted \(R^2\) statistic, \(R^2_{adj}\).
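The table and the follow-up statistics can be reproduced from three numbers. The sketch below assumes the values of the running example (n = 10, S_xx = 250, S_xy = 750, S_yy = 6,250) and covers only the one-predictor case, K = 1; the function name `anova_simple_regression` and the dictionary keys are hypothetical.

```python
import math


def anova_simple_regression(n, s_xx, s_xy, s_yy):
    """Analysis of variance quantities for simple regression (K = 1 predictor)."""
    ss_tot = s_yy
    ss_regr = s_xy ** 2 / s_xx
    ss_resid = ss_tot - ss_regr
    ms_regr = ss_regr / 1            # regression line has 1 degree of freedom
    ms_resid = ss_resid / (n - 2)    # residual line has n - 2 degrees of freedom
    s_eps = math.sqrt(ms_resid)      # noise standard deviation estimate
    s_y = math.sqrt(ss_tot / (n - 1))
    r_sq = ss_regr / ss_tot
    r_sq_adj = 1 - (s_eps / s_y) ** 2
    return {
        "ss_regr": ss_regr, "ss_resid": ss_resid, "ss_tot": ss_tot,
        "ms_regr": ms_regr, "ms_resid": ms_resid, "f": ms_regr / ms_resid,
        "s_eps": s_eps, "s_y": s_y, "r_sq": r_sq, "r_sq_adj": r_sq_adj,
    }


table = anova_simple_regression(n=10, s_xx=250.0, s_xy=750.0, s_yy=6250.0)
for name, value in table.items():
    print(f"{name:>9} = {value:.4f}")
```

The adjusted R² is computed here as 1 − (s_ε/s_Y)², which is the same thing as the formula with (n − 1)/(n − 1 − K) when K = 1.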

ANOTHER EXAMPLE OF REGRESSION ARITHMETIC

The following data are found in the file X:\SOR\B0305\M\WOLVES.MTP:

    TLength   […]
    Weight    […]

These refer to the tail lengths (in inches) and the weights (in pounds) of 10 wolves. The idea is to predict weight from tail lengths.

Here are some useful summaries:

    Descriptive Statistics

    Variable    N    Mean    Median    Tr Mean    StDev    SE Mean
    TLength     […]
    Weight      […]

    Variable    Min    Max    Q1    Q3
    TLength     […]
    Weight      […]

    Correlations (Pearson)

    Correlation of TLength and Weight = […]

Here are the results of a regression request:

    Regression Analysis: Weight versus TLength

    The regression equation is
    Weight = […] + […] TLength

    Predictor    Coef    SE Coef    T    P
    Constant     […]
    TLength      […]

    S = 22.36    R-Sq = 36.0%    R-Sq(adj) = 28.0%

    Analysis of Variance

    Source            DF    SS    MS    F    P
    Regression        […]
    Residual Error    […]
    Total             […]

    Unusual Observations
    Obs    TLength    Weight    Fit    SE Fit    Residual    St Resid
    […]                                                       R

    R denotes an observation with a large standardized residual

We must, of course, examine scatterplots.

Formally, the regression activity is using the model \(\text{WEIGHT}_i = \beta_0 + \beta_1 \text{TLENGTH}_i + \varepsilon_i\), where i = 1, 2, …, 10, where \(\beta_0\) and \(\beta_1\) are unknown parameters, and where \(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{10}\) are statistical noise terms. It is assumed that the noise terms are independent with mean 0 and unknown standard deviation \(\sigma\).

The fitted regression equation is that obtained from the computer output. Namely, it's WEIGHT = 40 + 3 TLENGTH. Here \(b_0 = 40\) is the estimate of \(\beta_0\), and \(b_1 = 3\) is the estimate of \(\beta_1\). (We sometimes replace the symbols \(b_0\) and \(b_1\) by \(\hat\beta_0\) and \(\hat\beta_1\).)

If you wish to check the computational formulas, use x for the TLength variable and use y for the Weight variable. Then, it happens that

    \(\sum x_i = 200\)
    \(\sum x_i^2 = 4{,}250\)
    \(\sum y_i = 1{,}000\)
    \(\sum y_i^2 = 106{,}250\)
    \(\sum x_i y_i = 20{,}750\)

It follows that

    \(\bar{x} = \dfrac{200}{10} = 20\)
    \(\bar{y} = \dfrac{1{,}000}{10} = 100\)
    \(S_{xx} = 4{,}250 - \dfrac{200^2}{10} = 250\)
    \(S_{xy} = 20{,}750 - \dfrac{200 \times 1{,}000}{10} = 750\)

It follows then that \(b_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{750}{250} = 3\) and \(b_0 = \bar{y} - b_1 \bar{x} = 100 - 3(20) = 40\).

AN ILLUSTRATION OF RESIDUALS

The data below give the suds height in millimeters as a function of grams of soap used in a standard dishpan.

    SOAP   […]
    SUDS   […]

Let's fit the ordinary regression model and examine the residuals. You can arrange to have the residuals saved by doing

    Stat ⇒ Regression ⇒ Regression ⇒ Storage [ Residuals  OK ]

Here is the regression output:

    Regression Analysis

    The regression equation is
    SUDS = […] + […] SOAP

    Predictor    Coef    StDev    T    P
    Constant     […]
    SOAP         […]

    S = […].835    R-Sq = 98.0%    R-Sq(adj) = 97.8%

    Analysis of Variance

    Source        DF    SS    MS    F    P
    Regression    […]
    Error         […]
    Total         […]

The fitted model is SUDS = […] + […] SOAP. Using this fitted model we can get the residuals as

    \(e_i = \text{SUDS}_i - [\,b_0 + b_1\,\text{SOAP}_i\,]\)
        = [ Actual SUDS value for point i ] - [ Retro-fit SUDS value for point i ]

where \(b_0\) and \(b_1\) are the fitted coefficients. For instance, for point 1, this value is […] - [ […] + […] (3.5) ].

Actually, our Storage request to Minitab did the arithmetic. The residuals were left in a new column in Minitab's Data window under the name RESI1. (Residuals from subsequent regressions would have different names, and you also have the option of editing the name RESI1.) Here are the actual values for this data set:

    SOAP    SUDS    RESI1
    […]
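The residual arithmetic that Minitab stores can be sketched directly. The data values below are invented for the illustration (the handout's own SOAP and SUDS values are not reproduced here), and the names `fit_line` and `residuals` are hypothetical. The sketch also shows two facts that hold for any least squares fit with an intercept: the residuals sum to zero, and they are uncorrelated with the predictor.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope from S_xx and S_xy."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


def residuals(xs, ys, b0, b1):
    """e_i = actual value minus retro-fit value for each point."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical soap (grams) and suds (millimeters) values, invented for illustration.
soap = [3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
suds = [24.0, 32.0, 37.0, 46.0, 51.0, 60.0]

b0, b1 = fit_line(soap, suds)
resi = residuals(soap, suds, b0, b1)

for x, y, e in zip(soap, suds, resi):
    print(f"SOAP={x:4.1f}  SUDS={y:5.1f}  RESI={e:8.4f}")

print("sum of residuals:", sum(resi))
print("sum of SOAP * residual:", sum(x * e for x, e in zip(soap, resi)))
```

The centered forms of S_xx and S_xy used here are algebraically the same as the sum-based forms in the arithmetic summary.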

THE SIMPLE LINEAR REGRESSION MODEL

The data for a Y-on-x regression problem come in the form \((x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)\). These may be conveniently laid out in a matrix or spreadsheet:

    Case    x      Y
    1       x_1    Y_1
    2       x_2    Y_2
    …       …      …
    n       x_n    Y_n

The word "case" might be replaced by "point" or "data point" or "sequence number" or might even be completely absent. The labels x and Y could be other names, such as "year" or "sales." In a data file in Minitab, the values for the x's and y's will be actual numbers, rather than algebra symbols. In an Excel spreadsheet, these could be either numbers or implicit values.

If a computer program is asked for the regression of Y on x, then numeric calculations will be done. These calculations have something to say about the regression model, which we discuss now.

The most common linear regression model is this.

The values \(x_1, x_2, \ldots, x_n\) are known non-random quantities which are measured without error. If in fact the x values really are random, then we assume that they are fixed once we have observed them. This is a verbal sleight of hand; technically we say we are doing the analysis conditional on the x's.

The Y-values are independent of each other, and they are related to the x's through the model equation

    \(Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\)   for i = 1, 2, 3, …, n

The symbols \(\beta_0\) and \(\beta_1\) in the model equation are nonrandom unknown parameters.

The symbols \(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n\) are called statistical noise or errors. The ε-values prevent us from seeing the exact linear relationship between x and Y. These ε-values are unobserved random quantities. They are assumed to be statistically independent of each other, and they are assumed to have expected value zero. It is also assumed that (using SD for standard deviation) \(SD(\varepsilon_1) = SD(\varepsilon_2) = \cdots = SD(\varepsilon_n) = \sigma_\varepsilon\). The symbol \(\sigma_\varepsilon\) is another nonrandom unknown parameter.

The calculations that we will do for a regression will make statements about the model. For example, the estimated regression slope \(b_1 = \dfrac{S_{xy}}{S_{xx}}\) is an estimate of the parameter \(\beta_1\). Here is a summary of a few regression calculations, along with the statements that they make about the model.

    Calculation                                                    What it means
    b_1 = S_xy / S_xx  (β̂_1 used also)                             Estimate of regression slope β_1
    b_0 = ȳ - b_1 x̄  (β̂_0 used also)                               Estimate of regression intercept β_0
    Residual mean square                                           Estimate of σ²
    Root mean square residual (standard error of regression)       Estimate of σ
    Standard error of an estimated coefficient                     Estimate of the standard deviation of that coefficient
    t (of an estimated coefficient)                                Estimated coefficient, divided by its standard error
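To make the last three rows of the table concrete, here is a minimal sketch using the summary quantities of the earlier arithmetic example (n = 10, S_xx = 250, S_xy = 750, S_yy = 6,250). One assumption goes beyond this section: the standard error of the slope is taken as s_ε / √S_xx, which is the usual textbook formula for simple regression but is not written out in the text above. The variable names are hypothetical.

```python
import math

# Summary quantities from the earlier arithmetic example.
n = 10
s_xx, s_xy, s_yy = 250.0, 750.0, 6250.0

b1 = s_xy / s_xx                                    # estimate of the slope beta_1
residual_mean_square = (s_yy - s_xy ** 2 / s_xx) / (n - 2)   # estimate of sigma^2
s_eps = math.sqrt(residual_mean_square)             # estimate of sigma

# Assumption: usual textbook standard error of the slope in simple regression.
se_b1 = s_eps / math.sqrt(s_xx)

# t statistic: estimated coefficient divided by its standard error.
t_b1 = b1 / se_b1

print("residual mean square:", residual_mean_square)
print("s_eps:", round(s_eps, 4))
print("SE(b1):", round(se_b1, 4))
print("t for b1:", round(t_b1, 4))
print("t squared:", round(t_b1 ** 2, 4))
```

In simple regression the square of this t statistic equals the F statistic of the analysis of variance table, which gives a handy cross-check on hand calculations.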

NOISE IN A REGRESSION

The linear regression model with one predictor says that

    \(Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\)   for i = 1, 2, …, n

The ε's represent noise terms. These are assumed to be drawn from a population with mean 0 and with standard deviation σ.

Let's make the initial observation that if σ = 0, then all the ε's are zero and we should see the line exactly. Here is such a situation:

[Figure: scatterplot of Y against X in which every point falls exactly on a straight line.]

Indeed, if you do the regression computations, you'll get to see the true line exactly.

[Figure: Regression Plot of Y against X with the fitted line; S = 0, R-Sq = 100.0%, R-Sq(adj) = 100.0%.]

The equation is revealed as Y = […].

Now, what if there really were some noise? Suppose that σ = […]0. The picture below shows what might happen.

[Figure: Regression Plot of Y against X with the fitted line; R-Sq = 94.0%.]

The points stray from the true line. As a result, the fitted line we get, here Y = […], is somewhat different from the true line.

What would happen if we had large, disturbing noise? Suppose that σ = […]50. The picture below shows this problem:

[Figure: Regression Plot of Y against X with the fitted line, drawn on a wider vertical scale.]

You might notice the change in the vertical scale! We didn't do a very good job of finding the correct line. The points in this picture are so scattered that it's not even clear that we have any relationship at all between Y and x.
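The three pictures can be imitated with a small simulation. Everything specific in this sketch is an assumption made for illustration: the true line (intercept 50, slope 3), the x values 1 through 30, the noise levels 0, 10 and 150, the random seed, and the function names `fit_with_r_squared` and `simulate`. With σ = 0 the fit recovers the true line exactly; as σ grows, the fitted coefficients wander and R² typically drops.

```python
import random


def fit_with_r_squared(xs, ys):
    """Return (b0, b1, R^2) for the least squares line of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r_sq = s_xy ** 2 / (s_xx * s_yy)   # SS_regr / SS_tot
    return b0, b1, r_sq


def simulate(sigma, n=30, beta0=50.0, beta1=3.0, seed=1):
    """Generate Y = beta0 + beta1 * x + noise with SD sigma, then fit the line."""
    rng = random.Random(seed)
    xs = [float(i) for i in range(1, n + 1)]
    ys = [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]
    return fit_with_r_squared(xs, ys)


# Hypothetical noise levels: none, moderate, and large.
results = {sigma: simulate(sigma) for sigma in (0.0, 10.0, 150.0)}
for sigma, (b0, b1, r_sq) in results.items():
    print(f"sigma={sigma:6.1f}  b0={b0:9.3f}  b1={b1:7.3f}  R-Sq={100 * r_sq:6.1f}%")
```

Changing the seed changes the noisy fits but not the noise-free one, which is a quick way to see which features of a fit are signal and which are luck.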

MORE ABOUT NOISE IN A REGRESSION

There are many contexts in which regression analysis is used to estimate fixed and variable costs for complicated processes. The following data set involves the quantities produced and the costs for the production of a livestock food mix for each of […]0 days. The quantities produced were measured in the obvious way, and the costs were calculated directly as labor costs + raw material costs + lighting + heating + equipment costs. The equipment costs were computed by amortizing purchase costs over the useful lifetimes, and the other costs are reasonably straightforward.

In fact, the actual fixed cost (per day) was $[…],500, and the variable cost was $[…]00/ton. Thus the exact relationship we see should be

    Cost = $[…],500 + ($[…]00/ton) × Quantity.

Here is a picture of this exact relationship:

[Figure: True cost plotted against Quantity (tons); the points lie on a straight line.]

It happens, however, that there is statistical noise in assessing cost, and this noise has a standard deviation of $[…]00. Schematically, we can think of our original picture as being spread out with vertical noise:

[Figure: cost plotted against Quantity (tons), with the true line spread out vertically by noise.]

Here then are the data which we actually see:

    Quantity    Cost    Quantity    Cost
    […]

The quantities are in tons, and the costs are in dollars.

Here is a scatterplot for the actual data:

[Figure: "Costs in dollars to produce feed quantities in tons"; Cost plotted against Quantity. Footnote: there is a noise standard deviation of $[…]00 in computing costs.]

The footnote shows that in the process of assessing costs, there is noise with a standard deviation of $[…]00. In spite of this noise, the picture is fairly clean. The fitted regression line is

    Côst = $[…],088 + ($[…]/ton) × Quantity.

The value of \(R^2\) is 9[…].7%, so we know that this is a good regression. We would assess the daily fixed cost at $[…],088, and we would assess the variable cost at $[…]/ton. Please bear in mind that this discussion hinges on knowing the exact fixed and variable costs and knowing about the $[…]00 noise standard deviation; in other words, this is a simulation in which we really know the facts. An analyst who sees only these data would not know the exact answer. Of course, the analyst would compute \(s_\varepsilon\) = $83.74, so that

    Quantity                    True value     Value estimated from data
    Fixed cost                  $[…],500       b_0 = $[…],088
    Variable cost               $[…]00/ton     b_1 = $[…]/ton
    Noise standard deviation    $[…]00         s_ε = $83.74

All in all, this is not bad.

As an extension of this hypothetical exercise, we might ask how the data would behave with a $[…]00 standard deviation associated with assessing costs. Here is that scatterplot:

[Figure: "Cost in dollars to produce feed quantities in tons"; Cost plotted against Quantity (tons). Footnote: there is a noise standard deviation of $[…]00 in computing costs.]

For this scatterplot, the fitted regression equation is

    Côst = $3,[…] + ($65/ton) × Quantity.

Also for this regression we have \(R^2\) = 55.4%. Our estimates of fixed and variable costs are still statistically unbiased, but they are infected with more noise. Thus, our fixed cost estimate of $3,[…] and our variable cost estimate of $65/ton are not all that good. Of course, one can overcome the larger standard deviation in computing the cost by taking more data. For this problem, the analyst would see \(s_\varepsilon\) = $[…].

    Quantity                    True value     Value estimated from data
    Fixed cost                  $[…],500       b_0 = $3,[…]
    Variable cost               $[…]00/ton     b_1 = $65/ton
    Noise standard deviation    $[…]00         s_ε = $[…]

This is not nearly as good as the above, but this may be more typical.

It is important to note that noise in assessing cost, the vertical variable, still gives us a statistically valid procedure. The uncertainty can be overcome with a larger sample size.

We will now make a distinction between noise in the vertical direction (noise in computing cost) and noise in the horizontal direction (noise in measuring quantity).

A more serious problem occurs when the horizontal variable, here quantity produced, is not measured exactly. It is certainly plausible that one might make such measuring errors when dealing with merchandise such as livestock feed. For these data, the set of […]0 quantities has a standard deviation of […].39 tons. This schematic illustrates the notion that our quantities, the horizontal variable, might not be measured precisely:

[Figure: True cost plotted against Quantity (tons), with the points spread out horizontally.]

Here is a picture showing the hypothetical situation in which costs experienced a standard deviation of measurement of $[…]00 while the feed quantities had a standard deviation of measurement of […].5 tons.

[Figure: "Cost in dollars to produce feed quantities in tons"; Cost plotted against measured Quantity. Footnote: there is a noise standard deviation of $[…]00 in computing costs, and quantities have been measured with a SD of […].5 tons.]

For this picture the relationship is much less convincing. In fact, the fitted regression equation is

    Côst = $7,5[…] + ($74.0/ton) × Quantity.

Also, this has \(s_\varepsilon\) = $[…]. This has not helped:

    Quantity                    True value     Value estimated from data
    Fixed cost                  $[…],500       b_0 = $7,5[…]
    Variable cost               $[…]00/ton     b_1 = $74.0/ton
    Noise standard deviation    $[…]00         s_ε = $[…]

The value of \(R^2\) here is 34.0%, which suggests that the fit is not good.

Clearly, we would like both cost and quantity to be assessed perfectly. However,

    noise in measuring costs leaves our procedure valid (unbiased) but with imprecision that can be overcome with large sample sizes;

    noise in measuring quantities makes our procedure biased.

The data do not generally provide clues as to the situation.

Here then is a summary of our situation. Suppose that the relationship is

    True cost = \(\beta_0 + \beta_1 \times\) True quantity

where \(\beta_0\) is the fixed cost and \(\beta_1\) is the variable cost.

Suppose that we observe

    Y = True cost + ε

where ε represents the noise in measuring or assessing the cost, with standard deviation \(\sigma_\varepsilon\), and

    x = True quantity + ζ

where ζ represents the noise in measuring or assessing the quantity, with standard deviation \(\sigma_\zeta\).

Let us also suppose that the True quantities themselves are drawn from a population with mean \(\mu_x\) and standard deviation \(\sigma_x\).

You will do least squares to find the fitted line \(Y = b_0 + b_1 x\).

It happens that \(b_1\), the sample version of the variable cost, estimates

    \(\beta_1 \dfrac{\sigma_x^2}{\sigma_x^2 + \sigma_\zeta^2}\).

Of course, if \(\sigma_\zeta = 0\) (no measuring error in the quantities), then \(b_1\) estimates \(\beta_1\). It is important to observe that if \(\sigma_\zeta > 0\), then \(b_1\) is biased closer to zero.

It happens that \(b_0\), the sample version of the fixed cost, estimates

    \(\beta_0 + \beta_1 \mu_x \dfrac{\sigma_\zeta^2}{\sigma_x^2 + \sigma_\zeta^2}\).

If \(\sigma_\zeta = 0\), then \(b_0\) correctly estimates the fixed cost \(\beta_0\).

The impact in accounting problems is that we will tend to underestimate the variable cost and overestimate the fixed cost.
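The two bias formulas can be checked by simulation. The sketch below is an illustration under invented settings, not the handout's cost data: the parameter values (β0 = 5, β1 = 2, μx = 10, and all three standard deviations equal to 1), the sample size, the seed, and the helper name `fit_line` are all assumptions. With these settings the slope should come out near 2 × 1/(1 + 1) = 1 and the intercept near 5 + 2 × 10 × 1/(1 + 1) = 15.

```python
import random


def fit_line(xs, ys):
    """Least squares intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1


# Hypothetical settings, chosen only to make the bias easy to see.
beta0, beta1 = 5.0, 2.0
mu_x, sigma_x = 10.0, 1.0      # population of true quantities
sigma_zeta = 1.0               # noise in measuring the quantity (horizontal)
sigma_eps = 1.0                # noise in assessing the cost (vertical)
n = 100_000

rng = random.Random(2004)
true_x = [rng.gauss(mu_x, sigma_x) for _ in range(n)]
ys = [beta0 + beta1 * t + rng.gauss(0.0, sigma_eps) for t in true_x]
observed_x = [t + rng.gauss(0.0, sigma_zeta) for t in true_x]

b0, b1 = fit_line(observed_x, ys)

shrink = sigma_x ** 2 / (sigma_x ** 2 + sigma_zeta ** 2)
expected_b1 = beta1 * shrink
expected_b0 = beta0 + beta1 * mu_x * (1.0 - shrink)

print(f"b1 = {b1:.3f}   (formula: {expected_b1:.3f}, true beta1: {beta1})")
print(f"b0 = {b0:.3f}   (formula: {expected_b0:.3f}, true beta0: {beta0})")
```

Setting `sigma_zeta = 0.0` in the same script brings both estimates back to the true values, while raising `sigma_eps` alone only makes them noisier.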

You can see that the critical ratio here is \(\sigma_\zeta^2 / \sigma_x^2\), the ratio of the variance of the noise in x relative to the variance of the population from which the x's are drawn.

In the real situation, you've got one set of data, and you have no idea about the values of \(\beta_0\), \(\beta_1\), \(\sigma_x\), \(\sigma_\zeta\), or \(\sigma_\varepsilon\). If you have a large value of \(R^2\), say over 90%, then you can be pretty sure that \(b_1\) and \(b_0\) are useful as estimates of \(\beta_1\) and \(\beta_0\). If the value of \(R^2\) is not large, you simply do not know whether to attribute this to a large \(\sigma_\varepsilon\), to a large \(\sigma_\zeta\), or to both.

The four possibilities, with Quantity as the independent variable X and Cost as the dependent variable Y:

    Small σ_ζ/σ_x (quantity measured precisely relative to its background variation)
        Small σ_ε (cost measured precisely):    b_0 and b_1 nearly unbiased with their own standard deviations low; R² will be large
        Large σ_ε (cost measured imprecisely):  b_0 and b_1 nearly unbiased but their own standard deviations may be large; R² will not be large

    Large σ_ζ/σ_x (quantity measured imprecisely relative to its background variation)
        Small σ_ε (cost measured precisely):    b_1 seriously biased downward and b_0 seriously biased upward; R² will not be large
        Large σ_ε (cost measured imprecisely):  b_1 seriously biased downward and b_0 seriously biased upward; R² will not be large

Do you have any recourse here?

If you know or suspect that \(\sigma_\varepsilon\) will be large, meaning poor precision in assessing costs, you can simply recommend a larger sample size.

If you know or suspect that \(\sigma_\zeta\) will be large relative to \(\sigma_x\), there are two possible actions:

    By obtaining multiple readings of x for a single true quantity, it may be possible to estimate \(\sigma_\zeta\) and thus undo the bias. You will need to obtain the services of a serious statistical expert, and he or she should certainly be well paid.

    You can spread out the x-values so as to enlarge \(\sigma_x\) (presumably without altering the value of \(\sigma_\zeta\)). In the situation of our animal feed example, it may be procedurally impossible to do this.

DOES REGRESSION SHOW CAUSALITY?

The following data file shows information on 30 male MBA candidates at the University of Pittsburgh. The first column gives height in inches, and the second column gives the monthly income of the initial post-MBA job. (These appeared in the Wall Street Journal, 30 DEC 86.)

    […]

Here is a scatterplot:

[Figure: scatterplot of Income against Height.]

This certainly suggests some form of relationship! The results of the regression are these:

    Regression Analysis

    The regression equation is
    Income = […] + […] Height

    Predictor    Coef    StDev    T    P
    Constant     […]
    Height       […]

    S = […]    R-Sq = 71.4%    R-Sq(adj) = 70.3%

    Analysis of Variance

    Source        DF    SS    MS    F    P
    Regression    […]
    Error         […]
    Total         […]

    Unusual Observations
    Obs    Height    Income    Fit    StDev Fit    Residual    St Resid
    […]                                                         R
    […]                                                         R

    R denotes an observation with a large standardized residual

The section below gives fitted values corresponding to HEIGHT_new = 66.5. Note that Minitab lists the 66.5 value in its output; this is to remind you that you asked for this particular prediction.

    Predicted Values for New Observations

    New Obs    Fit       SE Fit    95.0% CI              95.0% PI
    1          2885.8    […]       (2833.3, 2938.3)      (2681.5, 3090.1)

    Values of Predictors for New Observations

    New Obs    Height
    1          66.5

We see that the fitted equation is INCÔME = […] + […] HEIGHT. The obvious interpretation is that each additional inch of height is worth $50.[…] per month. Can we believe any cause-and-effect here?

The \(R^2\) value is reasonably large, so that this is certainly a useful regression.

Suppose that you wanted a 95% confidence interval for the true slope \(\beta_1\). This would be given as \(b_1 \pm t_{\alpha/2;\,n-2}\, SE(b_1)\), which is […] ± […] × […], or […] ± […].307. You should be able to locate the slope estimate and its standard error in the listing above.

Suppose that you'd like to make a prediction for a person with height 66.5. Minitab will give this to you, and reminds you by repeating the value 66.5 in its Session window. You can see from the above that the fit (or point prediction) is 2,885.8. You could of course have obtained this by evaluating the fitted equation at HEIGHT = 66.5.

Minitab provides several other facts near this 2,885.8 figure. The only thing likely to be useful to you is identified as the 95.0% PI, meaning 95% prediction interval. This is (2,681.5, 3,090.1), meaning that you're 95% sure that the INCOME for a person 66.5 inches tall would be between $2,681.5 and $3,090.1.

The residual-versus-fitted plot must of course be examined. It's not shown here, just to save space, but it should be noted that this plot showed no difficulties.

AN INTERPRETATION FOR RESIDUALS

The data listed below give two numbers for each of 50 middle-level managers at a particular company. The first number is annual salary, and the second number is years of experience.

We are going to examine the relationship between salary and years of experience. Then we'll use the residuals to identify individuals whose salary is out of line with their experience.

    […]

Note that the dependent variable is SALARY, and the independent variable is YEARS, meaning years of experience.

Here is a scatterplot showing these data:

[Figure: scatterplot of SALARY against YEARS.]

Certainly there seems to be a relationship!

We can get the regression work by doing this:

    Stat ⇒ Regression ⇒ Regression ⇒ [ Response: SALARY   Predictors: YEARS   Storage [ Residuals  OK ]   OK ]

There are other interesting features of these values, and this exercise will not exhaust the work that might be done. Here are the regression results:

    Regression Analysis

    The regression equation is
    SALARY = […] + […] YEARS

    Predictor    Coef    SE Coef    T    P
    Constant     […]
    YEARS        […]

    S = […]    R-Sq = 78.7%    R-Sq(adj) = 78.2%

    Analysis of Variance

    Source        DF    SS    MS    F    P
    Regression    […]
    Error         […]
    Total         […]

    Unusual Observations
    Obs    YEARS    SALARY    Fit    SE Fit    Residual    St Resid
    […]                                                     X
    […]                                                     R
    […]                                                     R

    R denotes an observation with a large standardized residual
    X denotes an observation whose X value gives it large influence.

There's more information here than we need right now, but you can at least read the fitted regression line as

    SALÂRY = […],369 + […] YEARS

with the interpretation that a year of experience is worth $[…].

Let's give a 95% confidence interval for the regression slope. This particular topic will be pursued vigorously later. The slope is estimated as […], and its standard error is given as […]. A standard error, or SE, is simply a data-based estimate of a standard deviation of a statistic. Thus, the 95% confidence interval is \(b_1 \pm t_{0.025;\,n-2} \times SE(b_1)\), which works out to […] ± […]. This precision is clearly not called for, so we can give the interval as simply […] ± […]. You might observe that this is very close to (estimate) ± 2 SE.

The value of S, meaning \(s_\varepsilon\), is […]. The interpretation is that the regression explains salary to within a noise with standard deviation $[…]. The standard deviation of SALARY (for which the computation is not shown) is $[…]8,530; the regression would thus have to be appraised as quite successful.

The fitted value for the i-th person in the list is SALÂRY_i = […],369 + […] YEARS_i, and the residual for the i-th person is SALARY_i - SALÂRY_i, which we will call \(e_i\). Clearly the value of \(e_i\) indicates how far above or below the regression line this person is.

The Minitab request Storage [ Residuals ] caused the residuals to be calculated and saved. These will appear in a new column in the spreadsheet, RESI1. (Subsequent uses would create RESI2, RESI3, and so on.)

Here are the values, displayed next to the original data:

    SALARY    YEARS    RESI1    SALARY    YEARS    RESI1    SALARY    YEARS    RESI1
    […]

It's easy to pick out the most extreme residuals. These are -$[…]7,665.4[…] (point 45, a […]0-year person with salary $36,530) and $[…]7,83[…] (point 35, an 8-year person with salary $99,39[…]). These two points are reported after the regression as being unusual observations, and they are noted to have large residuals.

Point […]3 is a high influence point, but that issue is not the subject of this document.

ELASTICITY

Let's think of the role of \(b_1\) in the fitted model \(Y = b_0 + b_1 x\). This says that as x goes up by one unit, Y (tends to) go up by \(b_1\). In fact, it's precisely what we call \(\dfrac{dY}{dx}\) in calculus. This measures the sensitivity of Y to x.

Suppose that x is the price at which something is sold and Y is the quantity that clears the market. It helps to rewrite this as \(Q = b_0 + b_1 P\). We'd expect \(b_1\) to be negative, since the quantities consumed will usually decrease as the price rises.

If currently the price is \(P_0\) and currently the quantity is \(Q_0\), then \(Q_0 = b_0 + b_1 P_0\). Now a change of price from \(P_0\) to \(P_0 + \theta\) leads to a change in quantity from \(Q_0\) to \(b_0 + b_1 (P_0 + \theta) = b_0 + b_1 P_0 + b_1 \theta = Q_0 + b_1 \theta\). Thus, the change in quantity is \(b_1 \theta\). The proportional change ratio

    \(-\dfrac{(\text{change in } Q)/Q_0}{(\text{change in } P)/P_0} = -\dfrac{b_1 \theta / Q_0}{\theta / P_0}\)

is called an elasticity. (We use the minus sign since they tend to move in opposite directions, and we like positive elasticities.) Of course, we can simplify this a bit, writing it finally as \(-b_1 \dfrac{P_0}{Q_0}\).

Suppose that a demand "curve" has equation Q = 400,000 - 2,000 P. We put "curve" in quotes as this equation is actually a straight line. Indeed the illustrations in most microeconomics texts use straight lines. Suppose that the current price is \(P_0\) = $60. The quantity corresponding to this is \(Q_0\) = 400,000 - 2,000 × 60 = 280,000.

Now suppose that the price rises by 1%, to $60.60. The quantity is now 400,000 - 2,000 × 60.60 = 278,800. This is a decrease of 1,200 in quantity consumed, which is a decrease of 1,200/280,000 ≈ 0.43%. Thus a 1% increase in price led to a decrease in quantity of 0.43%, so we would give the elasticity as 0.43. This is less than 1, so that the demand is inelastic.

We could of course obtain this as \(-b_1 \dfrac{P_0}{Q_0} = 2{,}000 \times \dfrac{60}{280{,}000} \approx 0.43\).

One approach follows through on the "1% change in price" idea. The other approach simply uses \(-b_1 \dfrac{P_0}{Q_0}\). Worked by hand with rounded percentages, these will not give exactly the same numerical result, but the values will be very close.

Observe that the elasticity calculation depends here on where we started. Suppose that we had started at \(P_0\) = $80. The quantity consumed at this price of $80 is \(Q_0\) = 400,000 - 2,000 × 80 = 240,000. An increase in price by 1%, meaning to $80.80, leads to a new quantity of 400,000 - 2,000 × 80.80 = 238,400. This is a decrease of 1,600 in quantity consumed. In percentage terms, this decrease is 1,600/240,000 ≈ 0.67%. We would give the elasticity as 0.67. This is also found as \(-b_1 \dfrac{P_0}{Q_0} = 2{,}000 \times \dfrac{80}{240{,}000} \approx 0.67\).

We can see that the elasticity for this demand curve Q = 400,000 - 2,000 P depends on where we start.

[Figure: the demand line, QTTY plotted against PRICE, marked "Elasticity = 0.43 here" and "Elasticity = 0.67 here".]

You might check that at starting price \(P_0\) = $120, the elasticity would be computed as 1.50, and we would now claim that the demand is (highly) elastic.

Details: Find \(Q_0\) = 400,000 - 2,000 × 120 = 160,000. The 1% increase in price would lead to a consumption decrease of 1.20 × 2,000 = 2,400. This is a percentage decrease of 2,400/160,000 = 0.015 = 1.5%.

What is the shape of a demand curve on which the elasticity is the same at all prices? This is equivalent to asking about the curve for which \(\dfrac{dQ}{dP} \cdot \dfrac{P}{Q} = -c\), where c is constant. (The minus sign is used simply as a convenience; of course c will be positive.) This condition can be expressed as

    \(\dfrac{dQ}{Q} = -c\, \dfrac{dP}{P}\)

for which the solution is \(\log Q = -c \log P + d\). This uses the calculus fact that \(\dfrac{d}{dt} \log f(t) = \dfrac{f'(t)}{f(t)}\). The result can be reexpressed as

    \(Q = e^{\log Q} = e^{-c \log P + d} = m P^{-c}\),   where \(m = e^d\).

The picture below shows the graph of \(Q = 4{,}000{,}000\, P^{-0.8}\).

[Figure: QTTY plotted against PRICE for this curve, captioned "This curve has elasticity 0.80 at every price".]

The equation \(\log Q = -c \log P + d\) is a simple linear relationship between log Q and log P. Thus, if we base our work on a regression of log(quantity) on log(price), then we can claim that the resulting elasticity is the same at every price.

If you began by taking logs of BOTH variables and fitted the regression (log-on-log), you'd get the elasticity directly from the slope (with no need to worry about \(P_0\) or \(Q_0\)). That is, in a log-on-log regression, the elasticity is exactly \(-b_1\).

Now, where would we get the information to actually perform one of these regressions?

It would be wonderful to have prices and quantities in a number of different markets which are believed to be otherwise homogeneous. For example, we could consider per capita cigarette sales in states with different tax rates. Unfortunately, it is rather rare that we can get cleanly organized price and quantity on a geographic basis.

It is more common by far to have price and quantity information over a large area at many points in time. In such a case, we would also have to worry about the possibility that the market changes over time.
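The elasticity calculations can be sketched in a few lines. The demand line Q = 400,000 - 2,000 P and the curve Q = 4,000,000 P^(-0.8) are taken from the discussion above; the function names are hypothetical, and the pair of prices (10 and 50) used for the constant-elasticity check is an arbitrary choice for the illustration.

```python
import math


def point_elasticity(b0, b1, p0):
    """Elasticity -b1 * P0 / Q0 for the straight-line demand Q = b0 + b1 * P."""
    q0 = b0 + b1 * p0
    return -b1 * p0 / q0


def percent_change_elasticity(b0, b1, p0, pct=0.01):
    """Elasticity found by following through on a small proportional price rise."""
    q0 = b0 + b1 * p0
    p_new = p0 * (1.0 + pct)
    q_new = b0 + b1 * p_new
    return -((q_new - q0) / q0) / ((p_new - p0) / p0)


def log_log_slope_elasticity(m, c, p_low, p_high):
    """Minus the slope of log Q against log P for the curve Q = m * P**(-c)."""
    q_low = m * p_low ** (-c)
    q_high = m * p_high ** (-c)
    return -(math.log(q_high) - math.log(q_low)) / (math.log(p_high) - math.log(p_low))


# Straight-line demand from the text: elasticity depends on the starting price.
for price in (60.0, 80.0, 120.0):
    e_point = point_elasticity(400_000.0, -2_000.0, price)
    e_pct = percent_change_elasticity(400_000.0, -2_000.0, price)
    print(f"P0 = {price:6.2f}: -b1*P0/Q0 = {e_point:.4f}, 1% approach = {e_pct:.4f}")

# Constant-elasticity curve from the text: same answer between any two prices.
print("log-log elasticity:", log_log_slope_elasticity(4_000_000.0, 0.8, 10.0, 50.0))
```

For the straight-line demand the two approaches agree to within floating-point rounding; the small differences seen in hand work come from rounding the percentages.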

SUMMARY OF REGRESSION NOTIONS WITH ONE PREDICTOR

Measures of quality for a regression:

$R^2$: want big (as close as possible to 100%)
$s_\varepsilon$: want small
$s_\varepsilon / s_Y$: want small (as close as possible to 0%)
F: want big (up to ∞)
t statistic for $b_1$: want far away from 0

Finally, here's the summary of what to do with the regression "game" with one independent variable.

Begin by making a scatterplot of (X, Y). You might see the need to transform. Indications: excessive curvature or extreme clustering of points in one region of the plot.

Note SD(Y).

Perform regression of Y on X. Note $R^2$, t for $b_1$, and $s_\varepsilon$.

Another useful summary is the correlation between x and Y. Formally, this is computed as $r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}$. It happens that $b_1$ and r have the same sign.

Plot residual versus fitted. If you have pathologies, correct them and start over.

Use the regression to make your relevant inference. We'll check up on this later.
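The quality measures in this summary can all be computed from the three sums $S_{xx}$, $S_{yy}$, and $S_{xy}$. Here is a minimal sketch in plain Python; the function name and the five data points are made up for illustration, and the formulas are the standard ones for one predictor.

```python
import math

def simple_regression_summary(x, y):
    """Return the one-predictor regression summaries for paired data x, y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = s_xy / s_xx                          # estimated slope
    b0 = y_bar - b1 * x_bar                   # estimated intercept
    ss_regression = b1 * s_xy                 # regression sum of squares
    ss_residual = s_yy - ss_regression        # residual sum of squares
    s_eps = math.sqrt(ss_residual / (n - 2))  # residual standard deviation
    s_y = math.sqrt(s_yy / (n - 1))           # SD(Y)
    return {
        "b0": b0,
        "b1": b1,
        "r": s_xy / math.sqrt(s_xx * s_yy),
        "R2": ss_regression / s_yy,
        "s_eps": s_eps,
        "s_eps_over_s_y": s_eps / s_y,
        "F": ss_regression / (ss_residual / (n - 2)),
        "t_b1": b1 / (s_eps / math.sqrt(s_xx)),
    }

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
summary = simple_regression_summary(x, y)
for name, value in summary.items():
    print(name, round(value, 4))
```

With one predictor, $R^2$ is the square of r and F is the square of the t statistic for $b_1$, so these measures carry overlapping information.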

THE RESIDUAL VERSUS FITTED PLOT

It is now standard practice to examine the plot of the residuals against the fitted values to check for appropriateness of the regression model. Patterns in this plot are used to detect violations of assumptions. The plot is obtained easily in most computer packages. If you are working in Minitab, click on Stat ⇒ Regression ⇒ Regression, indicate the dependent and independent variable(s), then click on Graphs, and check off Residuals versus fits. In the resulting plot, you can identify individual points by using Editor ⇒ Brush.

You HOPE that plot looks like this:

[Figure: Residuals Versus the Fitted Values; Residual against Fitted Value, no visible pattern.]

This picture would be described as patternless.

There are a number of common pathologies that you will encounter. The next picture shows curvature:

[Figure: Residuals Versus the Fitted Values (response is C3); Residual against Fitted Value, curved pattern.]

The cure for curvature consists of revising the model to be curved in one or more of the predictor variables. Generally this is done by using $X^2$ (in addition to X), but it can also be done by replacing X by log X or by some other nonlinear function.

A very frequent problem is that of residuals that expand rightward on the residual versus fitted plot.

[Figure: Residuals Versus the Fitted Values (response is C3); Residual against Fitted Value, scatter widening to the right.]

On this picture there is considerably more scatter on the right side. It appears that large values of Y also have widely scattered residuals. Since the residuals are supposed to be independent of all other parts of the problem, this picture suggests that there is a violation of assumptions. This problem can be cured by replacing Y by its logarithm.

The picture below shows a destructive outlier.

[Figure: Residuals Versus the Fitted Values; Residual against Fitted Value, one point far away at the lower right.]

The point at the lower right has a very unusual value in the X-space, and it also has an inappropriate Y. This point must be examined. If it is incorrect, it should be corrected. If the correct value cannot be determined, the point must certainly be removed. If the point is in fact correct as coded, then it must be set aside. You can observe that this point is masking a straight-line relationship which would otherwise be incorporated in the regression coefficient estimates.

In the picture below, there is a very unusual point, but it is not destructive.

[Figure: Residuals Versus the Fitted Values; Residual against Fitted Value, one isolated point.]

The unusual point in this plot must be checked, of course, but it is not particularly harmful to the regression. At worst, it slightly elevates the value of $b_0$, the estimated intercept.

On the picture below, there are several vertical stripes. This indicates that the X-values consisted of a small finite number (here 4) of different patterns.

[Figure: Residuals Versus the Fitted Values; Residual against Fitted Value, points arranged in vertical stripes.]

The required action depends on the discreteness in the X-values that caused the stripes.

If the X-values are quantitative, then the regression is appropriate. The user should check for curvature, of course.

If the X-values are numerically-valued steps of an ordered categorical variable, then the regression is appropriate. The user should check for curvature.

If the X-values are numerically-valued levels of a qualitative non-ordinal categorical variable, then the problem should be recast as an analysis of variance or as an analysis of covariance.

Finally, we have a pattern in which imperfect stripes go across the residual-versus-fitted pattern.

[Figure: Residuals Versus the Fitted Values (response is C5); Residual against Fitted Value, two roughly horizontal stripes.]

These stripes can be generally horizontal, as above, or they can be oblique:

[Figure: Residuals Versus the Fitted Values (response is C4); Residual against Fitted Value, two oblique stripes.]

These pictures suggest that there are only two values of Y. In such a situation, the appropriate tool is logistic regression. If there are three or more such stripes, indicating the same number of possible Y-values, then the recommended technique is multiple logistic regression. Multiple logistic regression can be either ordinal or nominal according to whether Y is ordinal or nominal.
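If you are not working in Minitab, the points on a residual versus fitted plot are easy to compute yourself. The sketch below uses made-up data in which the scatter grows with the mean, then compares the spread of the residuals in the upper half of the fitted values with the spread in the lower half. The function names and the half-versus-half comparison are assumptions made for this illustration, not a standard diagnostic; looking at the plot is still the real check.

```python
import statistics

def fit_line(x, y):
    """Least squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def fitted_and_residuals(x, y):
    """Fitted values and residuals: the two axes of the plot."""
    b0, b1 = fit_line(x, y)
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, residuals

def spread_ratio(fitted, residuals):
    """SD of residuals for the larger fitted values over SD for the smaller ones."""
    pairs = sorted(zip(fitted, residuals))
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return statistics.pstdev(high) / statistics.pstdev(low)

# Made-up data: the mean is 10 + 2x and the disturbance is 15% of the mean,
# alternating in sign, so the scatter expands to the right.
x = list(range(1, 31))
y = [(10 + 2 * xi) * (1 + 0.15 * (-1) ** xi) for xi in x]

fitted, residuals = fitted_and_residuals(x, y)
ratio = spread_ratio(fitted, residuals)
print(round(ratio, 2))
```

A ratio well above 1 corresponds to the expanding residuals picture; a ratio near 1 is what the patternless picture would give.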

AN EXAMPLE OF THE RESIDUAL VERSUS FITTED PLOT

The file X:\SOR\B0305\M\SALARY.MTP gives SALARY and YEARS of experience for a number of middle-level executives. Let's first find the regression of SALARY on YEARS. The data set consists of 50 points and looks like this:

[Data listing: columns CASE, SALARY, YEARS.]

Here is a scatterplot:

[Figure: scatterplot of SALARY against YEARS.]

The regression results from Minitab are these:

The regression equation is
SALARY = 11369 + 2141 YEARS

S = 8642   R-Sq = 78.7%   R-Sq(adj) = 78.2%

[Minitab output: coefficient table (Predictor, Coef, SE Coef, T, P for Constant and YEARS); Analysis of Variance table (Source, DF, SS, MS, F, P for Regression, Residual Error, Total); Unusual Observations listing (Obs, YEARS, SALARY, Fit, SE Fit, Residual, St Resid).]

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

The fitted regression equation is certainly $\widehat{\text{SALARY}} = 11{,}369 + 2{,}141\ \text{YEARS}$. What interpretation can we give for the estimated regression slope? The slope 2,141 suggests that each year of experience translates into $2,141 of salary.

Here is the residual versus fitted plot. It shows a common pathological pattern.

[Figure: Residuals Versus the Fitted Values (response is SALARY); Residual against Fitted Value, scatter widening to the right.]

The residuals spread out to a greater degree at the right end of the graph. This is very common. This same observation might have been made in the original plot of Salary vs Years. In the case of simple regression (just one independent variable) the residual-versus-fitted plot is a rescaled and tilted version of the original plot.

We'll handle this violation of model assumptions by making a transformation on SALARY. The usual solution is to use the logarithm of salary. Here, we'll let LSALARY be the log (base e) of SALARY. The output from Minitab is this:

Regression Analysis

S = 0.1541   R-Sq = 86.4%   R-Sq(adj) = 86.1%

[Minitab output: the regression equation for LSALARY on YEARS; coefficient table (Predictor, Coef, SE Coef, T, P for Constant and YEARS); Analysis of Variance table (Source, DF, SS, MS, F, P for Regression, Error, Total).]

[Minitab output, continued: Unusual Observations listing (Obs, YEARS, LSALARY, Fit, StDev Fit, Residual, St Resid).]

R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.

The residual-versus-fitted plot for this revised regression is shown next:

[Figure: Residuals Versus the Fitted Values (response is LSALARY); Residual against Fitted Value.]

This plot shows that the problem of expanding residuals has been cured. Let's also note that in the original regression, $R^2 = 78.7\%$, whereas in the regression using LSALARY, $R^2 = 86.4\%$.

The fitted regression is now $\widehat{\text{LSALARY}} = b_0 + b_1\,\text{YEARS}$, in which the slope can be rounded to $b_1 \approx 0.05$. This suggests the interpretation that each year of experience is worth 0.05 in the logarithm of salary. You can exponentiate the expression above to get

$\widehat{\text{SALARY}} = e^{\widehat{\text{LSALARY}}} = e^{b_0 + 0.05\,\text{YEARS}} = e^{b_0} \left(e^{0.05}\right)^{\text{YEARS}}$

In this form, increasing YEARS by 1 causes the fitted salary to be multiplied by $e^{0.05}$. You can use a calculator to get a number out of this, but we have a simple approximation $e^{0.05} \approx 1.05$. Thus, accumulating a year of experience suggests that salary is to be multiplied by 1.05; this is a 5% raise.

You might be helped by this useful approximation. If t is near zero, then $e^{t} \approx 1 + t$. Thus, getting a coefficient of 0.05 in this regression leads directly to the 5% raise interpretation.
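The calculator step is short enough to write out. This sketch, with a made-up function name, turns a slope from a regression of log(Y) on x into the multiplier per unit of x and compares it with the approximation 1 + t. The slope 0.05 is the rounded value discussed in the text.

```python
import math

def multiplier_per_unit(slope):
    """Factor by which the fitted Y is multiplied when x increases by 1
    in a regression of log(Y) on x."""
    return math.exp(slope)

slope = 0.05                       # rounded slope from the LSALARY regression
exact = multiplier_per_unit(slope)
approximate = 1 + slope            # the approximation e**t ~ 1 + t for small t

print(round(exact, 4), approximate, round(exact - approximate, 4))
```

The exact multiplier is a little above 1.05, so the "5% raise" reading is slightly conservative. The gap grows roughly like $t^2/2$, so the shortcut is best kept for small coefficients.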

Let's reconsider the consequences of our work in regressing SALARY on YEARS. The regression model for this problem (in original units) was

$\text{SALARY}_i = \beta_0 + \beta_1\,\text{YEARS}_i + \varepsilon_i$   [1]

The noise terms $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ were assumed to be a sample from a population with mean zero and with standard deviation σ. Our fitted regression was

$\widehat{\text{SALARY}} = 11{,}369 + 2{,}141.3\ \text{YEARS}$

The estimate of σ was computed as 8,641.9. This was labeled S by Minitab, but you'll also see the symbols s or $s_{Y|x}$ or $s_\varepsilon$.

Predicted salaries can be obtained by direct plug-in. For someone with 5 years of experience, the prediction would be $11{,}369 + (2{,}141.3 \times 5) = 22{,}075.5$. Here is a short table showing some predictions; this refers to the column "Predicted SALARY with basic model [1]". Entries marked … are illegible in this copy.

YEARS   Predicted SALARY     Predicted SALARY       95% prediction interval     95% prediction interval
        with basic           with logarithm         for SALARY, using           for SALARY, using
        model [1]            model [2]              basic model [1]             logarithm model [2]
5       22,076               24,130                 4,020.8 to 40,130.7         17,487.6 to 33,…
10      32,782               30,980                 15,037.6 to 50,5…           22,577.3 to 4…
15      43,488               39,775                 25,910.5 to 61,067.…        29,… to 54,…
20      54,195               51,066                 36,635.5 to 7…              37,335.5 to 69,…
25      64,902               65,563                 47,… to 8…                  47,8… to 89,877.…

It happens that model [1] will lead to a residual versus fitted plot with the very common pattern of expanding residuals. The cure comes in replacing SALARY with LSALARY and considering the model

$\text{LSALARY}_i = \beta_0 + \beta_1\,\text{YEARS}_i + \varepsilon_i$   [2]

For this model, the residual versus fitted plot shows a benign pattern, and we are able to believe the assumption that the $\varepsilon_i$'s are a sample from a population with mean 0 and standard deviation σ. The LOG here is base e.

Please note that we have recycled notation. The parameters $\beta_0$, $\beta_1$, and σ, along with the random variables $\varepsilon_1$ through $\varepsilon_n$, do not have the same meanings in [1] and [2].
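The plug-in predictions for the basic model are simple arithmetic. Here is a sketch, assuming the fitted coefficients are 11,369 and 2,141.3 and that the table rows are for 5, 10, 15, 20, and 25 years; the function name is made up for this illustration.

```python
# Assumed fitted coefficients for the basic model, in dollars.
B0 = 11_369.0    # intercept
B1 = 2_141.3     # dollars of salary per year of experience

def predict_salary_basic(years):
    """Plug-in prediction of SALARY from YEARS under the basic model."""
    return B0 + B1 * years

for years in (5, 10, 15, 20, 25):
    print(years, round(predict_salary_basic(years), 1))
```

Each additional five years adds the same dollar amount (5 times the slope) to the prediction, which is the defining feature of the basic model.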

The fitted model corresponding to [2] is

$\widehat{\text{LSALARY}} = b_0 + b_1\,\text{YEARS}$

For this model, the predicted log-salary for someone with 5 years of experience would be $b_0 + 5\,b_1$. In original units, this would be $e^{b_0 + 5 b_1} \approx 24{,}130$. This is the first entry in the column "Predicted SALARY with logarithm model [2]".

So what's the big deal about making a distinction between models [1] and [2]? The fitted values are different, with model [2] giving larger values at the ends and smaller values in the middle. Here are the major answers:

(a) The difference of about $4,000 between the models is not trivial. The two models are certainly not giving very similar answers.

(b) Model [1] has assumed equal noise standard deviations throughout the entire range of YEARS values. The plot of (SALARY, YEARS) suggests that this is not realistic. The residual versus fitted plot for model [1] makes this painfully obvious. The consequences will be seen in predictions. Examine the column "95% prediction interval for SALARY, using basic model [1]". Each of these prediction intervals is about $36,000 wide. This may be realistic for those people with high seniority (high values of YEARS), but it's clearly off the mark for those with low seniority.

(c) Model [2] has assumed equal noise standard deviations, in terms of LSALARY, throughout the entire range of YEARS values. This is more believable. In the column "95% prediction interval for SALARY, using logarithm model [2]" the lengths vary, with the longer intervals associated with longer seniority.
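Why do the interval lengths vary under the logarithm model? An interval that is (nearly) the same width in log units becomes, after exponentiating, an interval whose length is proportional to the predicted salary. The sketch below shows this with two illustrative predicted salaries (24,130 and 65,563) and a hypothetical half-width of 0.32 in log units. The half-width is an assumption held constant for simplicity; in a real regression it changes a little with YEARS.

```python
import math

def back_transformed_interval(log_center, half_width):
    """Exponentiate the ends of an interval computed on the log scale."""
    return math.exp(log_center - half_width), math.exp(log_center + half_width)

def interval_length(predicted, half_width):
    """Length, in original units, of the back-transformed interval
    centered (on the log scale) at log(predicted)."""
    low, high = back_transformed_interval(math.log(predicted), half_width)
    return high - low

HALF_WIDTH = 0.32   # hypothetical half-width on the log scale

for predicted in (24_130, 65_563):
    low, high = back_transformed_interval(math.log(predicted), HALF_WIDTH)
    print(predicted, round(low), round(high), round(high - low))
```

The ratio of the two lengths equals the ratio of the two predicted salaries, so people with longer seniority (and larger predicted salaries) automatically get longer intervals.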

TRANSFORMING THE DEPENDENT VARIABLE

Consider the linear regression problem with data $(x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)$. We form the model

$Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$,   $i = 1, 2, \ldots, n$

The assumptions which accompany this model include these statements about the $\varepsilon_i$'s:

The noise terms $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are independent of each other and of all the other symbols in the problem.

The noise terms $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are drawn from a population with mean zero.

The noise terms $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are drawn from a population with standard deviation σ.

Situations in which $SD(\varepsilon_i)$ appears to systematically vary would violate these assumptions. The residual versus fitted plot can detect these situations; this plot shows the points $(e_1, \hat{Y}_1), (e_2, \hat{Y}_2), \ldots, (e_n, \hat{Y}_n)$. We will sometimes see that the residuals have greater variability when the fitted values are large. The recommended cure is that $Y_i$ be replaced by $\log(Y_i)$. If some of the $Y_i$'s are zero or negative, we'd use $\log(Y_i + c)$ for some c big enough to make all values of $Y_i + c$ positive.

Why does this work? It's a bit of insanely tricky math, but it's fun.

Suppose that the residual versus fitted plot suggests that $SD(Y_i)$ is big when $\beta_0 + \beta_1 x_i$ is big. We'll use the symbol $\mu_i = \beta_0 + \beta_1 x_i$ for the expected value of $Y_i$. The observation is that $SD(Y_i)$ is big when $\mu_i$ is big.

There are many ways of operationalizing the statement "$SD(Y_i)$ is big when $\mu_i$ is big." The description that will work for us is

$SD(Y_i) = \alpha \mu_i$

That is, the standard deviation grows proportional to the mean. The symbol α is just a proportionality constant, and it's quite irrelevant to the ultimate solution.

Let's seek a functional transformation $Y \to g(Y)$ that will solve our problem. The symbol g represents a function to be found; perhaps we'll decide $g(t) = t^3$ or $g(t) = \cos(\pi t)$ or $g(t) = \frac{t}{t + 1}$ or something else.

It's convenient to drop the symbol i for now. We'll be talking about the general phenomenon rather than about our specific data points. By "solve our problem" we are suggesting these two notions:

If $\mu_k \ne \mu_\ell$ then $E(g(Y_k)) \ne E(g(Y_\ell))$; that is, the function g preserves differentness of means (expected values).

If $\mu_k \ne \mu_\ell$ then $SD(g(Y_k)) \approx SD(g(Y_\ell))$; that is, the function g allows $Y_k$ and $Y_\ell$ to have unequal means but approximately equal standard deviations.

We're going to use two mathematical facts and one statistical fact.

MATH FACT 1: If g is any well-behaved function, then

$g(y) \approx g(\mu) + g'(\mu)\,(y - \mu)$

This is just Taylor's theorem. It can be thought of as an approximate version of the mean value theorem in calculus. The symbol µ can be any convenient value. When we use this with a random variable Y, we'll take µ as the mean of the random variable.

MATH FACT 2: If the function g has a derivative for which $g'(t) = \frac{k}{t}$ then the function $g(t) = \log t$ is one possible solution. (There are many possible solutions, but this one is simplest. Also, we'll use base-e logs. The most detailed solution would be $g(t) = A + k \log t$.)
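A small simulation makes the goal concrete before the math is finished. The sketch below is an illustration under assumptions, not part of the derivation: it draws Y from a normal population with mean µ and standard deviation αµ, using α = 0.1 and two made-up means, and compares SD(Y) with SD(log Y). The normal population, the sample size, and the seed are all choices made for this example.

```python
import math
import random
import statistics

def sd_raw_and_log(mu, alpha, n=20_000, seed=0):
    """Simulate Y with mean mu and SD alpha * mu; return (SD of Y, SD of log Y)."""
    rng = random.Random(seed)
    ys = [rng.gauss(mu, alpha * mu) for _ in range(n)]
    logs = [math.log(y) for y in ys]   # safe here: Y stays far above zero when alpha = 0.1
    return statistics.stdev(ys), statistics.stdev(logs)

ALPHA = 0.1
for mu in (10.0, 1000.0):
    sd_y, sd_log_y = sd_raw_and_log(mu, ALPHA)
    print(mu, round(sd_y, 3), round(sd_log_y, 4))
```

The standard deviation of Y grows with the mean, while the standard deviation of log Y stays close to α for both means. That is the "unequal means but approximately equal standard deviations" behavior we are asking g to deliver.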

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

Lesson 17 Pearson s Correlation Coefficient

Lesson 17 Pearson s Correlation Coefficient Outlie Measures of Relatioships Pearso s Correlatio Coefficiet (r) -types of data -scatter plots -measure of directio -measure of stregth Computatio -covariatio of X ad Y -uique variatio i X ad Y -measurig

More information

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the.

Confidence Intervals. CI for a population mean (σ is known and n > 30 or the variable is normally distributed in the. Cofidece Itervals A cofidece iterval is a iterval whose purpose is to estimate a parameter (a umber that could, i theory, be calculated from the populatio, if measuremets were available for the whole populatio).

More information

5: Introduction to Estimation

5: Introduction to Estimation 5: Itroductio to Estimatio Cotets Acroyms ad symbols... 1 Statistical iferece... Estimatig µ with cofidece... 3 Samplig distributio of the mea... 3 Cofidece Iterval for μ whe σ is kow before had... 4 Sample

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles

The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Determining the sample size

Determining the sample size Determiig the sample size Oe of the most commo questios ay statisticia gets asked is How large a sample size do I eed? Researchers are ofte surprised to fid out that the aswer depeds o a umber of factors

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number. GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea - add up all

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means)

CHAPTER 7: Central Limit Theorem: CLT for Averages (Means) CHAPTER 7: Cetral Limit Theorem: CLT for Averages (Meas) X = the umber obtaied whe rollig oe six sided die oce. If we roll a six sided die oce, the mea of the probability distributio is X P(X = x) Simulatio:

More information

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights

Center, Spread, and Shape in Inference: Claims, Caveats, and Insights Ceter, Spread, ad Shape i Iferece: Claims, Caveats, ad Isights Dr. Nacy Pfeig (Uiversity of Pittsburgh) AMATYC November 2008 Prelimiary Activities 1. I would like to produce a iterval estimate for the

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses about

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

CHAPTER 3 THE TIME VALUE OF MONEY

CHAPTER 3 THE TIME VALUE OF MONEY CHAPTER 3 THE TIME VALUE OF MONEY OVERVIEW A dollar i the had today is worth more tha a dollar to be received i the future because, if you had it ow, you could ivest that dollar ad ear iterest. Of all

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

Case Study. Normal and t Distributions. Density Plot. Normal Distributions

Case Study. Normal and t Distributions. Density Plot. Normal Distributions Case Study Normal ad t Distributios Bret Halo ad Bret Larget Departmet of Statistics Uiversity of Wiscosi Madiso October 11 13, 2011 Case Study Body temperature varies withi idividuals over time (it ca

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature. Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals

Overview. Learning Objectives. Point Estimate. Estimation. Estimating the Value of a Parameter Using Confidence Intervals Overview Estimatig the Value of a Parameter Usig Cofidece Itervals We apply the results about the sample mea the problem of estimatio Estimatio is the process of usig sample data estimate the value of

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean

Definition. A variable X that takes on values X 1, X 2, X 3,...X k with respective frequencies f 1, f 2, f 3,...f k has mean 1 Social Studies 201 October 13, 2004 Note: The examples i these otes may be differet tha used i class. However, the examples are similar ad the methods used are idetical to what was preseted i class.

More information

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book) MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information
