Chapter 5: Basic Linear Regression


1. Why Regression Analysis Has Dominated Econometrics

Until now we have focused on forming estimates and tests for fairly simple cases involving only one variable at a time. But the core task of the human sciences is to study the simultaneous interrelationships among several variables. How will an increase in price affect quantity demanded? How will law enforcement affect deviant behavior? How will a change in the Federal deficit affect inflation? These are all questions about the effect of one variable upon another. The only tool we've discussed for such questions is correlation, and we've seen that it has serious drawbacks.

If we were doing an experimental natural science, we might solve this problem by conducting controlled experiments, in which we change only one variable at a time. This might isolate the relationships among variables without needing too much statistical artillery. But usually this isn't an option in the human sciences. So we must develop a statistical "virtual laboratory," a means of coordinating our data so that we can draw conclusions as if the data had been generated by a controlled experiment, teasing out the unique effect of each variable. And regression analysis is the most common approach used by economists to construct such a virtual laboratory to explore simultaneous relationships among several variables.

We'll start with the simplest case: one variable affecting one other variable. In the next chapter we consider the more realistic case, where several independent variables affect a dependent variable.

2. The Basic Regression Paradigm

Let's say that I lead you out to the western sideline of the college soccer field tomorrow morning at 9:00 a.m. with the following instructions:

Last evening at 10:00 p.m. I buried a tiny 1 cm canister under 1/2 cm of soil, somewhere along the opposite edge of this field. The canister contains the entire college endowment. It's yours if you walk directly to it and pick it up on your first try.

As I buried the money, I marked the path from this side of the field to the canister with a narrow piece of double-sided tape. Unfortunately, this tape completely disintegrates 1 hour after use. But, fortunately for you, the tape was covered with the eggs of a large Amazonian flea. These fleas hatch, take one leap, and die. Your mission, therefore, is to reconstruct the position of the tape, using the flea carcasses as your guide.

Why would I do this? OK, I'm a little eccentric sometimes. But this exercise would exactly parallel the duties of regression analysis: We believe, for theoretical reasons, that there is some relationship in the world between two variables, say price and quantity demanded at a Farmers Market booth on a Saturday morning. (Notice that I've put the dependent variable on the vertical axis. Looks like tape on a field, no?)

Unfortunately, the intercept and slope of the line that measures this relationship are population parameters; we cannot directly observe them. Instead we observe samples from this population, and our samples are unlikely to lie exactly on top of the population regression line.

(This kind of picture is called a "scatter plot" or "sample scatter diagram.") There will be random errors that move our observations away from the regression line, because some things we've left out of our model also affect quantity demanded (like changes in income), the population relationship may not be perfectly linear (as we're assuming here), we may make some errors in measuring the two variables, and there may simply be a truly random, nondeterministic component to demand. In any event, the thing we observe is a cloud of measurement around the population regression line, not the line itself. From this cloud of data, our job is to make the best guess about where the invisible population line actually lies.

How to proceed? The majority of econometric work approaches this in typical economist fashion: Let's reduce this problem to a simpler one by making several simplifying assumptions. Then in the succeeding chapters we will learn how to discover whether each assumption is actually justified, identify the problems that arise when each assumption is violated, and try to construct a fix for these problems. It will be a bit like practicing medicine: Whenever you do regression analysis, you'll fall into the pattern of diagnosing violations of assumptions, recognizing the symptoms of these violations, prescribing a treatment, and monitoring the results.

That's the most common approach to doing econometrics. We should mention that there are several less-common approaches to the basic problem of inferring population relationships from sample observations: non-parametric estimation, vector-autoregressive models, and others. These are usually considered beyond the scope of an undergraduate course (and even many graduate courses) but might make good term-paper topics for the right person.

3. The Classical Regression Assumptions

If we were out on the soccer field trying to reconstruct the position of the double-sided tape, we'd have to make some assumptions about how the tape was placed and how Amazonian fleas jump when they hatch. The same is true of reconstructing a population regression line from our sample observations. Eight basic assumptions have proven useful:

Assume that the relationship between the variables is linear

We'll find easy ways to ease this assumption, but it's a helpful place to start. In symbols, we're assuming that the population relationship looks like this:

$$Y = \alpha + \beta X + \varepsilon \tag{5.1.2}$$

That is, the dependent variable is a linear function of an independent variable, plus-or-minus a random error that we are calling $\varepsilon$ (the Greek lower-case epsilon). $\alpha$ and $\beta$ are known as the regression coefficients. $\beta$ represents the marginal effect of X upon Y, the slope of the regression line. $\alpha$ represents the intercept of the line, which incorporates the effect upon Y of all the variables that influence Y but do not appear in our equation. (For a demand curve, these would include income and population size.) If any of these absent variables were to change, $\alpha$ would also change: the entire line would shift.

To be a bit picky, Equation 5.1.2 is a reference to the entire population relationship, the whole line. We'd denote any particular occurrence of the dependent variable along this line with lower-case letters bearing subscripts,

$$y_i = \alpha + \beta x_i + \varepsilon_i,$$

where the lower-case subscript $i$ indicates that this particular occurrence of Y is one of the (upper-case) N occurrences we've measured. Remember: It's usually lower-case letters for references to individual observations, upper-case letters for references to totalities.

Of course we never actually observe the $\alpha$ or the $\beta$ because they're population parameters, so we can never actually know the population line's position or measure the $\varepsilon_i$ (the distance of our observation from the population regression line). Instead we will eventually make estimates of these items, constructing a sample regression line that we'll denote

$$Y = \hat{\alpha} + \hat{\beta} X + \hat{\varepsilon} \tag{5.1.4}$$

So you might say that the basic regression problem is to calculate estimates $\hat{\alpha}$, $\hat{\beta}$ and $\hat{\varepsilon}$ for the population parameters $\alpha$, $\beta$ and $\varepsilon$. We'll also want to calculate the standard errors of these three estimators, so we can conduct hypothesis tests and build confidence intervals.
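To make the distinction between the population line and the sample concrete, here is a minimal Python sketch that plays the role of "nature." It draws observations from a population line plus a normally distributed error. The function name `simulate_sample` and the parameter values (intercept 400, slope -5, error standard deviation 30) are made up for illustration; they are assumptions, not values taken from this chapter's derivations.

```python
import random

def simulate_sample(alpha, beta, sigma, xs, seed=0):
    """Draw one y for each x from y = alpha + beta*x + eps, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)              # fixed seed so the sketch is repeatable
    sample = []
    for x in xs:
        eps = rng.gauss(0.0, sigma)        # the error we never get to observe
        sample.append((x, alpha + beta * x + eps, eps))
    return sample

# Hypothetical demand curve (assumed values, for illustration only).
ALPHA, BETA, SIGMA = 400.0, -5.0, 30.0
prices = [p / 2 for p in range(2, 42)]     # 1.0, 1.5, ..., 20.5
data = simulate_sample(ALPHA, BETA, SIGMA, prices)

for x, y, eps in data[:5]:
    on_line = ALPHA + BETA * x             # point on the population line
    print(f"x={x:5.1f}  observed y={y:7.1f}  population line={on_line:6.1f}  error={eps:6.1f}")
```

In real work only the x and y columns are visible. The regression problem is to recover something close to `ALPHA` and `BETA` without ever seeing the error column.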

Having explained the first major assumption and clarified notation, the remaining seven assumptions will go more quickly.

Assume that the error term is a random variable with a mean of zero:

$$E(\varepsilon_i) = 0$$

This assumes that, on average, the error terms cancel each other out: some positive (placing our observations above the true regression line), some negative, in roughly equal numbers and distances. In other words, the expected value of y depends upon the value of x, and this conditional mean of y is equal to $\alpha + \beta x_i$, given a value of x. On the soccer field, this is like assuming that the fleas jump, in roughly equal numbers, toward either goal post. If you assumed this but they were instead magnetized and all jumped to the north, your estimate of the population line would be mistaken. You'd dig your hole a bit north of the treasure.

Assume that X is not a random variable; it is measured without error

Our sample observations don't lie precisely on top of the population regression line because of random errors, and now we make another limiting assumption about those errors: They all pertain to the measurement of Y only, not the measurement of X. In our graphs, this means that Price is measured with certainty, but the response of Quantity to Price is somewhat random. In effect, we assume that the sample data have jumped vertically away from the regression line before we could measure them. If this is true, it simplifies the process of finding the best estimate of the regression line: Just find a line that minimizes the vertical distances between your data and your line.

This assumption also has a pleasant implication about the covariance between the error term and the independent variable:

$$\operatorname{Cov}(x_i, \varepsilon_i) = E(x_i \varepsilon_i) - E(x_i)E(\varepsilon_i) = x_i E(\varepsilon_i) - x_i E(\varepsilon_i) = 0 \tag{5.3.2}$$

In words: The covariance between our independent variable and the error term is zero; they are not correlated. Equation 5.3.2 will lead to some nice properties for the OLS estimators of the regression line. Maybe you can picture them now: Imagine how hard it would be to find the treasure if the direction of the fleas' jumps (the error term) were correlated with the independent variable (the distance you've walked across the field searching for treasure). If there were, say, a positive correlation, then the closer you get to the treasure (that's increasing the independent variable in our picture), the more the fleas tend to jump above the true regression line, not below it; they would lead you to the left of where the treasure lies, and you'd end up with a biased estimate of the true regression line.

By the way, this assumption isn't quite the same thing as assuming that changes in X cause changes in Y, though it's close. Strictly speaking, regression analysis only indicates an association between two variables, not a cause-and-effect relationship between them. But, though there are formal tests of causality between two variables, the assumption that only one variable experiences random error nearly forces us to think of that variable (Y) as somehow responding to changes in the other variable (X). The next chapter will consider cases in which more than one independent variable affects a single dependent variable (Y). Later in the course we will encounter simultaneous equations models, which allow for more than one dependent variable.

Assume that all random errors are identically distributed (constant variance):

$$\operatorname{Var}(\varepsilon_i) = \sigma^2 \quad \text{for all } i$$

This property is called "homoscedasticity," meaning "equal scatter." If instead, for example, the variance of the error term increased as X increased, our data would fan out, scattering more widely around the line at larger values of X.

Assume that all random errors are independently distributed:

$$\operatorname{Cov}(\varepsilon_i, \varepsilon_s) = 0 \quad \text{for all } i \neq s$$

This property is called "serial independence," meaning that the distribution of any one error term doesn't depend on the random errors elsewhere in the series of data. This assumption is most likely to be violated in time-series data, where one observation may be influenced by preceding observations. Consider national GDP data: If one quarter's GDP is well below potential GDP, it's likely that the next quarter will also be below potential.

Taken together, Assumptions 2, 4 and 5 imply that the random errors are identically and independently distributed (i.i.d.) random variables with a mean of zero and fixed, finite variance. One more assumption about the error terms will prove helpful:

Assume that the random errors are normally distributed

This assumption will allow us to get started in testing hypotheses and forming confidence intervals. Combining these first six assumptions, we can summarize the basic linear regression model:

$$Y = \alpha + \beta X + \varepsilon, \quad \text{where } \varepsilon_i \sim N(0, \sigma^2), \quad \operatorname{Cov}(x_i, \varepsilon_i) = 0 \text{ for all } i, \quad \operatorname{Cov}(\varepsilon_i, \varepsilon_s) = 0 \text{ for all } i \neq s.$$

Folks sometimes summarize these assumptions by saying that they've assumed that the random error term is "well behaved." Those are all the major assumptions of the model per se, but we must make two additional assumptions about the data we insert into this model:

Not all observed values of x may be identical

It would be hard to figure out how price influences quantity demanded if the price never changed. The data would all sit at a single price, and any number of straight lines would fit those observations equally well. Hence it would be impossible to identify the best estimates of $\alpha$ and $\beta$. Stated differently, this assumption implies that the sample variance of the independent variable is not zero:

$$\frac{\sum (x_i - \bar{x})^2}{N-1} \neq 0 \tag{5.7.2}$$

The number of model parameters must not be larger than the number of observations (N)

In the simple linear case, we have two model parameters to estimate: $\alpha$ and $\beta$. We must have at least two sample observations in order to estimate them. Two points determine a line, and any more points are gravy. But try fitting a line to a single point. Any line through that point will work, so it again becomes impossible to identify the best estimates of $\alpha$ and $\beta$.

There you have the eight classical assumptions of simple regression analysis. It's time to think about how to actually calculate estimates of $\alpha$ and $\beta$ under these assumptions.

4. Estimating the Regression Population Parameters: Ordinary Least Squares

As was suggested while discussing Assumption 3, we could find parameter estimates by searching for the line that minimizes the vertical distances between this line and the data. OLS estimation is the most common way of doing this, in which we minimize the squared vertical distances between the data and the estimated regression line. Why square the distances? This converts all errors to positive numbers (which makes the computations more straightforward), and amplifies the influence of outliers, observations that lie farther from the core of our observations. (You might think it's not a good idea to exaggerate these least-typical observations, and we will eventually encounter other ways to estimate the regression parameters.) It also just happens that the OLS estimators have some desirable properties, which we explore in Section 5 of this chapter.

Let's derive the OLS estimators: Our basic linear model holds that each data point can be described (from 5.1.4) as

$$y_i = \hat{\alpha} + \hat{\beta} x_i + \hat{\varepsilon}_i.$$

Since we want to minimize something involving the errors, let's solve this equation for the estimated error (or "residual") term:

$$\hat{\varepsilon}_i = y_i - (\hat{\alpha} + \hat{\beta} x_i). \tag{5.9}$$

Now square this residual, then sum across all observations, giving us the sum of squared errors (also called error sum of squares, abbreviated ESS), which depends upon the numbers we select as estimators of $\alpha$ and $\beta$. (If you choose a bad $\hat{\alpha}$ and $\hat{\beta}$, you'll have bigger residuals.) In symbols:

$$ESS(\hat{\alpha}, \hat{\beta}) = \sum [y_i - (\hat{\alpha} + \hat{\beta} x_i)]^2. \tag{5.10}$$

Now just choose an $\hat{\alpha}$ and $\hat{\beta}$ that will minimize this sum:

$$\min_{\hat{\alpha}, \hat{\beta}} \sum [y_i - (\hat{\alpha} + \hat{\beta} x_i)]^2$$

How would we minimize this? By taking first derivatives with respect to $\hat{\alpha}$ and $\hat{\beta}$, setting these derivatives equal to zero, and solving for the optimal estimators $\hat{\alpha}$ and $\hat{\beta}$. (The derivation is rather tedious, 9 pages in the textbook I like best, but only requires the first and seventh of our eight assumptions. The other assumptions are required to assure some desirable properties for these OLS estimators.) The first derivatives yield so-called "normal equations," which can be solved for $\hat{\alpha}$ and $\hat{\beta}$ to yield:

$$\hat{\beta} = \frac{s_{x,y}}{s_x^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$$

In prose: The OLS estimator of the slope equals the sample covariance between x and y, divided by the sample variance of x. (Isn't that great?! The simple covariance between x and y had serious drawbacks in studying the relationship between two variables, but if we just divide it by the variance of x and allow for an intercept, many of those drawbacks disappear! And we'll see that regression analysis opens up many other avenues of learning that are missed by correlation and covariance. Better Living Through Calculus!) The OLS estimator of the intercept is derived from the estimate of the slope: find the slope first, then find the only intercept that allows a line with that slope to pass through both the average value of y and the average value of x.
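The two estimator formulas above translate almost directly into code. The following is a minimal Python sketch, not a substitute for a statistics package. The function name `ols_fit` and the five price and quantity observations are hypothetical, chosen only so that the arithmetic is easy to follow by hand.

```python
def ols_fit(x, y):
    """Return (intercept, slope) using slope = cov(x, y) / var(x)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations (Assumption 8)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squared deviations and cross-products; the (N - 1) divisors
    # of the sample covariance and variance cancel in the ratio.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical (Assumption 7)")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar      # forces the line through (x_bar, y_bar)
    return intercept, slope

# Hypothetical data: price per bushel and bushels sold.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [396.0, 389.0, 386.0, 379.0, 376.0]

a_hat, b_hat = ols_fit(prices, quantities)
print(f"intercept = {a_hat:.2f}, slope = {b_hat:.2f}")
```

For these made-up numbers the cross-product sum is -50 and the sum of squared deviations of x is 10, so the slope is -5 and the intercept is 385.2 + 5(3) = 400.2. The two guard clauses correspond to the two data assumptions: the fit is undefined when every x is the same or when there are fewer than two observations.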
Long, long ago (1978), in a galaxy far away, young college students like your instructor derived and calculated such estimators by hand. Using slide-rules to square and sum deviations and cross-products, redefining variable units to ease the calculations, keypunching programs on eighty-space IBM cards (one card per line of program), standing in long lines for printed output from screen-less mainframe computers... Those were the days. Now STATA will do this for enormous multi-dimensional data sets, and have the results on your screen before you can glance up. For that reason, we don't do many tedious problems involving the normal equations any more.

While this is generally a good thing, it sometimes leads to The Bubba Effect. It's so easy to do regressions that people become thoughtless, committing what you might call "Type III error": the use of a good model in the wrong situation. When analysis was difficult, there were fewer tools at our disposal, but people didn't use them casually. The silver lining is that, freed from some tedium, we can spend more class time developing your clinical instincts about the actual practice of applied econometrics. That should help minimize the probability of Type III error. This is why you are learning to write original programs in a statistical language rather than just pointing-and-clicking, why you do a major research project for this class, why we sometimes digress into the philosophy of science surrounding econometrics, and why we spend a good deal of group time studying cases of good and bad econometrics. These are all ways of making you a more thoughtful econometrician.

By the way, machines are human too, and when squaring and summing lots of big numbers they tend to make rounding errors. It's therefore a good idea to measure your variables in large units where possible, to keep your numbers small. For example, if GDP is your independent variable, measure it in trillions of dollars rather than in dollars, so that the computer will be working with smaller squared numbers. Just be sure to remember that the resulting slope coefficient measures the marginal effect of a one-trillion-dollar change in GDP, not a one-dollar change.

Having discovered how to compute OLS estimators of the regression parameters, let's move on to the properties of these estimators.

5. Properties of the OLS Estimators

Chapter 4, Section 3 outlined some desirable small- and large-sample properties for estimators. We present, without proof, a scorecard of some of the OLS estimators' attributes:

Unbiased and Consistent (5.14): Assumptions 2 and 3 assure that $\hat{\alpha}$ and $\hat{\beta}$ are unbiased. Assumptions 3 and 7 assure that they are consistent (as long as the sample variance of the independent variable is not infinitely large, which is unlikely).

Efficiency: BLUE estimators (5.15): Given Assumptions 2, 3, 4, 5 and 7, the OLS estimators are the Best (most efficient) Linear Unbiased Estimators (they are "BLUE"). Among the many linear combinations of the data that form unbiased estimators, the OLS estimators are the most efficient. The proof of this property is known as the Gauss-Markov Theorem.

You should be aware that, when some of the assumptions do not hold, the OLS estimators become inferior to other options.
The most common other option would involve so-called maximum likelihood estimation. With maximum likelihood estimation, we completely scrap the idea of minimizing squared deviations between data and estimated line. Instead we specify the probability distribution (or "likelihood function") of the error term (we've assumed it's a normal distribution, but it could be something else). This likelihood function will depend in part on the values of $\alpha$ and $\beta$. We then choose the $\hat{\alpha}$ and $\hat{\beta}$ that will maximize the likelihood that we would have observed the data that were collected. You will be relieved to know the following:

MLE estimators (5.16): If Assumptions 2 through 8 are met, the OLS estimators are identical to the maximum likelihood estimators.

There are three more topics we should consider about the simple linear regression model, and they all concern practical matters of measuring the precision of the OLS estimates. For example, it's cold comfort to know the OLS estimators are the most efficient available if their variances are still very large. The three sections that round out this chapter discuss the precision of the estimators, hypothesis testing, and forecasting.

6. Estimator Variances, Covariances, and Goodness of Fit

The variance and standard error of an estimator are indexes of its reliability and precision. Estimators like $\hat{\alpha}$ and $\hat{\beta}$ are random variables, of course, because they are linear combinations of the estimated error terms, the $\hat{\varepsilon}$ terms, which are normally-distributed random variables. Thus the variances of $\hat{\alpha}$ and $\hat{\beta}$ might be expected to depend upon the variance of the random error term (which we'll simply call $\sigma^2$ rather than $\sigma_\varepsilon^2$ whenever possible). In particular, it can be shown that

$$\operatorname{Var}(\hat{\beta}) = \frac{\sigma^2}{\sum (x_i - \bar{x})^2} \tag{5.17}$$

and

$$\operatorname{Var}(\hat{\alpha}) = \frac{\sigma^2 \sum x_i^2}{N \sum (x_i - \bar{x})^2} \tag{5.18}$$

In words: The estimator of the regression slope, $\hat{\beta}$, becomes more precise (has smaller variance) as 1) we sample a wider variety of x values (which makes the denominator grow), or 2) the error term's variance (in the numerator) is smaller, packing our data more tightly around the regression line.

The estimator of the regression intercept, $\hat{\alpha}$, becomes more precise as 1) we sample a wider variety of x values, 2) the error term's variance shrinks, 3) our observations are nearer the vertical axis, where the intercept is (which shrinks the sum of squared x values in the numerator), or 4) our sample size (N) increases. (Of course, a larger sample size will indirectly make the estimator more reliable, too, by increasing the sum of squared deviations of x. But there's an additional direct effect of sample size upon the precision of $\hat{\alpha}$.)

Our estimates of $\alpha$ and $\beta$ are not independent of each other, as they're computed from the same data sample. Thus they normally have a covariance. It can be shown that

$$\operatorname{Cov}(\hat{\alpha}, \hat{\beta}) = \frac{-\bar{x}\,\sigma^2}{\sum (x_i - \bar{x})^2} \tag{5.19}$$

In words: $\hat{\alpha}$ and $\hat{\beta}$ become less interdependent as 1) we sample a wider variety of x values (swelling the denominator), 2) the error term's variance shrinks, or 3) our observations are nearer the vertical axis, shrinking $\bar{x}$. 4) The covariance between $\hat{\alpha}$ and $\hat{\beta}$ has a sign opposite to the average value of x (since the expression begins with a negative sign, and everything in the expression except $\bar{x}$ is always positive). The covariance is negative, as long as the independent variable is on-average positive.

That all makes sense if you picture a line being fit to a cluster of data. If you were to push down on the intercept while trying to make the line fit well, the slope would have to increase: a negative covariance between slope and intercept (Item 4).
And this bobbing of the regression line would be less pronounced if the data are clustered near the vertical axis (Item 3) and tightly packed together (Items 1 and 2).

All three of these expressions, the variance of $\hat{\alpha}$, the variance of $\hat{\beta}$, and the covariance between $\hat{\alpha}$ and $\hat{\beta}$, involve $\sigma^2$. Unfortunately, that's a population variance of the error term, which is usually invisible, making it hard to actually calculate the estimators' variances and covariances. But we can estimate this population error's variance, using the sample variance of our sample error, $\hat{\varepsilon}$. An unbiased estimator is the sample variance of our error terms:

$$s^2 = \frac{\sum \hat{\varepsilon}_i^2}{N-2} = \frac{\sum [y_i - (\hat{\alpha} + \hat{\beta} x_i)]^2}{N-2}, \tag{5.20}$$

which is also sometimes called $\hat{\sigma}^2$.

(You might have expected that we'd divide by N-1, as we did when calculating the variance of a simple random variable. But in that case we were dividing by the number of degrees of freedom, which was N-1 because we had calculated one parameter estimate already, the sample mean $\bar{x}$, leaving us with N-1 free bits of information. In this case, before we could calculate the $\hat{\varepsilon}$ terms we had to calculate two parameter estimates, $\hat{\alpha}$ and $\hat{\beta}$, so we only have N-2 bits of free information (degrees of freedom) left.)

The square root of this estimated error-term variance is called the "standard error of the residuals" or "standard error of the regression," noted $\hat{\sigma}$ or $s$. As you might expect, if we swap this for the $\sigma$ term in Equations 5.17 through 5.19, we achieve estimates of the variances and covariances of the regression coefficients $\hat{\alpha}$ and $\hat{\beta}$. If we then take the square root of the estimated variances, we have the standard errors of the regression coefficient estimates:

$$s_{\hat{\beta}} = \sqrt{\frac{s^2}{\sum (x_i - \bar{x})^2}} \tag{5.21}$$

$$s_{\hat{\alpha}} = \sqrt{\frac{s^2 \sum x_i^2}{N \sum (x_i - \bar{x})^2}} \tag{5.22}$$

To summarize this sixth section of this chapter: We've learned how to estimate the standard error of the regression, $s_{\hat{\varepsilon}}$ (usually simply noted as $s$), and the standard errors of the parameter estimators, $s_{\hat{\alpha}}$ and $s_{\hat{\beta}}$. These are very useful because once you know something's standard error you can do hypothesis tests and construct confidence intervals. Of course, you'll probably never use Equations 5.20 through 5.22 to calculate these standard errors, because they are routinely calculated by statistical software. But now you know where these numbers come from.

One more loose end to tie off: Because $s_{\hat{\alpha}}$ and $s_{\hat{\beta}}$ give us a measure of the precision of our estimates of $\alpha$ and $\beta$, you might suppose that $s$ is giving us a measure of the precision of the whole regression line, all at once, by measuring how tightly the data are packed around our estimated line. That's the right instinct, but it must be refined because $s$, as an estimated standard error, is sensitive to the units in which our variables are measured. But we can construct a closely-related statistic that's a more reliable measure of the goodness-of-fit between our data and our regression line:

Why are we doing a regression? In order to explain some of the changes in Y by relating them to changes in X. The maximum amount of squared deviations in Y that we could possibly explain would be all of them! Call that number the Total Sum of Squares, TSS:

$$TSS = \sum (y_i - \bar{y})^2 \tag{5.23}$$

How well has our regression done at anticipating and explaining these variations in Y? Why not measure this by adding up our failures to explain, the squared distances of our data from our regression line: the squared errors from our regression analysis. Call that number the Error Sum of Squares:

$$ESS = \sum [y_i - (\hat{\alpha} + \hat{\beta} x_i)]^2 \tag{5.24}$$

The difference between TSS and ESS will be equal to the sum of squared deviations in y that our regression does correctly anticipate. These are squared distances between the sample mean of y and the points on the estimated regression line directly above or below our data, the Regression Sum of Squares:

$$RSS = \sum (\hat{y}_i - \bar{y})^2 = \sum [(\hat{\alpha} + \hat{\beta} x_i) - \bar{y}]^2 \tag{5.25}$$

Look at the definitions of TSS and ESS. The first adds up deviations around our sample mean of y; the second adds up deviations around the numbers that our regression predicts to be the mean of y, given the level of x we've observed. The first measures total variation in y, the second measures total unexplained variation in y. So you could measure the percentage of the variation in y that our regression fails to explain with the ratio $ESS/TSS$. We can restate this by calculating the percentage of variation in y that our regression does explain, the "coefficient of determination," noted $R^2$:

$$R^2 = 1 - \frac{ESS}{TSS} = \frac{RSS}{TSS} \tag{5.26}$$

$R^2$ is a proportion, so it is unaffected by changes in the unit of measurement of our variables; it's a unit-free measure of the goodness of fit between our data and our regression line, because the numerator and denominator are measured in the same units. $R^2$ lies between 0 and 1 because we can't explain more than 100% of the variation in y, or less than 0%.

By the way, if you're straining to see how this is related to $s$, which we set out to improve upon as a measure of the regression's goodness of fit, I can help:

$$R^2 = 1 - \frac{s^2 (N-2)}{s_y^2 (N-1)}$$

If you're stymied by the name $R^2$, it's due to the fact that this statistic is equal to the square of the sample correlation (abbreviated r) between the observed values of y and our regression's predicted values for y (which are $\hat{\alpha} + \hat{\beta} x_i$, or $\hat{y}_i$).
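As a check on the bookkeeping in this section, here is a minimal Python sketch that computes the standard error of the regression, the coefficient standard errors, the estimated covariance, the three sums of squares and $R^2$ from raw data. The function name `regression_summary` and the data are hypothetical; the formulas are the ones stated in this section, with $s^2$ standing in for $\sigma^2$.

```python
import math

def regression_summary(x, y):
    """Simple-regression summary statistics from the Section 6 formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    fitted = [a + b * xi for xi in x]

    ess = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained
    tss = sum((yi - y_bar) ** 2 for yi in y)                 # total
    rss = sum((fi - y_bar) ** 2 for fi in fitted)            # explained
    s2 = ess / (n - 2)                                       # N - 2 degrees of freedom

    return {
        "a": a, "b": b,
        "s": math.sqrt(s2),
        "se_b": math.sqrt(s2 / sxx),
        "se_a": math.sqrt(s2 * sum(xi ** 2 for xi in x) / (n * sxx)),
        "cov_ab": -x_bar * s2 / sxx,
        "TSS": tss, "ESS": ess, "RSS": rss,
        "R2": 1 - ess / tss,
    }

# Hypothetical data: price per bushel and bushels sold.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [396.0, 389.0, 386.0, 379.0, 376.0]

summary = regression_summary(prices, quantities)
for name, value in summary.items():
    print(f"{name:7s} {value:10.4f}")
```

For these made-up numbers TSS is 254.8, which splits into RSS = 250 and ESS = 4.8, so $R^2$ is 250/254.8, or about 0.98. The estimated covariance between intercept and slope comes out negative because the average price is positive, as described above.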
Though $R^2$ is widely used as a measure of goodness of fit, we'll see that it has some limitations. Two of them would become clear after staring at our derivation: You cannot use $R^2$ to compare the goodness of fit between two regressions if 1) one regression contains an intercept and the other does not, or if 2) the dependent variables of the two regressions are not the same (for example, if one uses y and the other uses ln(y)).

7. Hypothesis Testing

We've discussed four regression statistics that you might want to hypothesize about: $\alpha$, $\beta$, $\sigma^2$ and $R^2$. We'll defer tests concerning $R^2$ until the next chapter, when we can be more robust about it. The most common test asks whether $\beta$ is zero. That would indicate that there's no relationship between X and Y, and exploring this relationship was presumably our reason for doing the regression. For completeness I'll summarize the typical tests for all three regression parameters, then present one new twist on hypothesis testing. The tests rely on three premises (the proofs of which you'll find in graduate statistics texts):

$\hat{\alpha}$ and $\hat{\beta}$ are normally distributed (since they're derived from regression errors, which are presumed to be normally distributed).

$\hat{\alpha}$ and $\hat{\beta}$ are distributed independently of $s^2$.

$$\frac{(N-2)\,s^2}{\sigma^2} \sim \chi^2_{N-2}$$

Test for $\beta$ (5.27):

$H_0: \beta = \beta_0$ (where $\beta_0$ is some number you supply, frequently zero)

$H_A: \beta \neq \beta_0$ (two-tailed test), or $H_A: \beta > \beta_0$ or $\beta < \beta_0$ (one-tailed tests)

Test Statistic under $H_0$:

$$t = \frac{\hat{\beta} - \beta_0}{s_{\hat{\beta}}} \sim t_{N-2}$$

Decision Rule: Reject $H_0$ if $|t| > t_c$ (two-tailed test) or $t > t_c$ or $t < -t_c$ (one-tailed tests), where $t_c$ is a critical value determined by the level of significance.

In words: To test whether the regression slope is equal to a particular number ($\beta_0$), find how many standard deviations your estimate lies from that number. The greater this distance between estimate and $\beta_0$, the less believable $H_0$ becomes.

Test for $\alpha$ (5.28): Identical. Replace each $\beta$ in 5.27 with an $\alpha$, and you have it.

Test for $\sigma^2$ (5.29):

$H_0: \sigma^2 = \sigma_0^2$

$H_A: \sigma^2 > \sigma_0^2$

Test Statistic under $H_0$:

$$\frac{(N-2)\,s^2}{\sigma_0^2} \sim \chi^2_{N-2}$$

Decision Rule: Reject $H_0$ if $\dfrac{(N-2)\,s^2}{\sigma_0^2} > \chi^2_c$.

And now the new twist: Recall that there's really no magic level of significance that's universally appropriate. A hypothesis test that simply chooses a level of significance and reports a success or failure leaves us vaguely unsatisfied: If the hypothesis failed, by how much? If it didn't satisfy your tolerance for significance, it might have satisfied mine. For that reason it's become common to report a "p-value" for each estimated coefficient, where the p-value equals the level of significance at which your null hypothesis would have just barely been rejected.

Quick Example: Say that you've done a regression of quantity demanded (measured in bushels of cucumbers) on price, with the following results. (Standard errors of estimates lie below the parameter estimates.)

$$\hat{Q} = \underset{(5.3)}{400} - \underset{(2.5)}{5.0}\,P, \qquad N = 22, \quad R^2 = 0.78, \quad s = 37.8$$
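The t test for the slope and its p-value can be sketched in a few lines of Python. This is a rough illustration using only the standard library: the p-value comes from numerically integrating the t density with Simpson's rule rather than from a statistics package, and the function names are hypothetical. The inputs (a slope estimate of -5.0, a standard error of 2.5 and a sample of 22) are assumed values chosen for illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=2000):
    """P(|T| > |t_stat|): one minus the area between -|t| and +|t|.

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = 2 * upper / steps
    total = t_pdf(-upper, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(-upper + i * h, df)
    return 1.0 - total * h / 3

def slope_t_test(b_hat, se_b, n, beta_0=0.0):
    """Return (t statistic, two-tailed p-value) for H0: beta = beta_0."""
    t_stat = (b_hat - beta_0) / se_b
    return t_stat, two_tailed_p(t_stat, n - 2)   # N - 2 degrees of freedom

# Assumed illustrative inputs, not output from a real data set.
t_stat, p_value = slope_t_test(b_hat=-5.0, se_b=2.5, n=22)
print(f"t = {t_stat:.2f}, two-tailed p-value = {p_value:.4f}")
```

Reporting the p-value alongside the t statistic lets each reader apply a preferred significance level instead of inheriting yours.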

$\hat{\beta}$, your slope estimate in a regression, is equal to -5.0, and the estimated standard error of $\hat{\beta}$ is 2.5. The sample size equals 22. To test the null hypothesis that $\beta = 0$ in the population against the alternative hypothesis that $\beta \neq 0$, your test statistic would be

$$t = \frac{-5.0 - 0}{2.5} = -2.0 \sim t_{20}.$$

$\hat{\beta}$ is 2.0 standard deviations away from the value presumed in your null hypothesis. 2.0 is greater than the .10-significance-test critical value (1.725, from STATA or a t-distribution table), so we'd reject the null hypothesis at a .10 level of significance. But 2.0 is less than the .05-significance-level critical value (2.086), so we'd (barely) retain the null hypothesis at a .05 level of significance. The p-value for this test is about .0593, which gives us much more precise information than either of the other tests.

8. Forecasting

We often do a regression because we'd like to forecast values of the dependent variable. Say you'd done the regression reported in the last quick example,

$$\hat{Q} = \underset{(5.3)}{400} - \underset{(2.5)}{5.0}\,P, \qquad R^2 = 0.78, \quad N = 22,$$

because you wonder how many bushels of cucumbers will be bought at a price of $4 per bushel. Since $\hat{\alpha}$ and $\hat{\beta}$ are unbiased estimators of $\alpha$ and $\beta$, you'd get an unbiased estimate of $E(Q \mid P = 4.00)$ (quantity, given that price equals 4.00) by just setting price equal to 4.00 in your regression equation and solving for the forecast level of Q. This yields a point estimate equal to $400 - 5.0(4.00) = 380$.

You'd naturally like some indication of the reliability of this point estimate. If we knew the standard error of this predictor, we could calculate a confidence interval. Let's say you'd like a 95% confidence interval for the quantity demanded at a price of 4.00. At this point we must be rather precise, because there are two different, closely-related confidence intervals that might interest us:

We could construct an interval that, with 95% certainty, captures the point on the regression line that lies above a price of 4.00.

We could construct an interval that captures 95% of the demand conditions we'll actually experience at a price of 4.00.

The first option gives us a range within which the average of our sales is likely to fall, a confidence interval for the mean predictor. We're 95% sure that, at a price equal to 4.00, the population regression line lies between these two points. If we constructed such a range for each possible level of price, we'd have constructed a space within which we're 95% sure that we've captured the true population regression line.

(The interval gets wider as we move farther from our average observation of X, because we're forecasting farther from the core of the information we've gathered.)

The second option gives us a wider range within which the actual level of our sales is likely to fall, a confidence interval for point forecasts. This has to be a wider range, since actual sales vary around their average levels.

The standard error of the mean estimator (the first option) can be estimated by

$$s_{\hat{y}} = \sqrt{s^2 \left[ \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2} \right]}.$$

I will spare you the proof. In words, we can forecast the mean value of y at any particular value $x_0$ of the independent variable; the standard error of that forecast is larger 1) if the estimated variance $s^2$ of our error term is larger, that is, if the data are widely dispersed around our regression line; 2) if our sample size, N, is small; 3) if we are trying to forecast far away from the average observation of x; and 4) if our observations of x are not very well dispersed.

Since we can compute the standard error of our forecasted mean, we can employ the usual logic to construct a confidence interval for the mean forecast:

$$\hat{y} \pm t_c\, s_{\hat{y}},$$

where $t_c$ is a critical t-statistic value that depends on our level of confidence. In words, we are c% sure that the forecasted mean lies within $t_c$ standard deviations of $\hat{y}$.
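The interval for the mean forecast can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `mean_forecast_interval` and the data are hypothetical, and the critical value `t_crit` must be supplied by the caller from a t table (3.182 is the two-tailed 5% value for 3 degrees of freedom, which matches the five made-up observations used here).

```python
import math

def mean_forecast_interval(x, y, x0, t_crit):
    """Forecast the mean of y at x0 and return (forecast, std. error, (low, high))."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    ess = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = ess / (n - 2)                          # estimated error variance

    y_hat0 = a + b * x0
    # Standard error of the mean predictor: grows with distance from x_bar.
    se_mean = math.sqrt(s2 * (1 / n + (x0 - x_bar) ** 2 / sxx))
    return y_hat0, se_mean, (y_hat0 - t_crit * se_mean, y_hat0 + t_crit * se_mean)

# Hypothetical data: price per bushel and bushels sold.
prices = [1.0, 2.0, 3.0, 4.0, 5.0]
quantities = [396.0, 389.0, 386.0, 379.0, 376.0]

forecast, se, (low, high) = mean_forecast_interval(prices, quantities, x0=4.0, t_crit=3.182)
print(f"forecast = {forecast:.2f}, s.e. = {se:.3f}, 95% interval = ({low:.2f}, {high:.2f})")
```

Calling the function at several values of `x0` shows the widening described above: the standard error is smallest at the average price and grows as the forecast moves away from it.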

For point forecasts (the second option), the relevant squared standard error for a forecast at $x_0$ is

$$s_{\hat{y}}^2 = s^2 \left[ \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} + 1 \right].$$

Compared to the previous standard error, there's one small difference: we've added in an extra $s^2$, because individual observations vary around their expected value, with a variance we've estimated to be $s^2$. By using this $s_{\hat{y}}$ you can compute confidence intervals in the normal way, so that your confidence interval for a point forecast would be

$$\hat{y} \pm t_c \, s_{\hat{y}}.$$

9. Comparing Forecasts

Imagine that two farmers from our market have developed slightly different regressions to forecast their sales. They might want a way to compare the approaches, to decide which makes more reliable forecasts. Call the forecasted value of the dependent variable $y^f$, and the actual value that eventually occurred $y$. Here are three typical scorecards you might compute after making several forecasts. (For all three, a low score is better than a high score.)

Mean Squared Error (MSE):

$$MSE = \frac{\sum (y^f - y)^2}{N}$$

Looks like a variance, doesn't it? Some prefer its square root, root mean squared error.

Mean Absolute Percent Error (MAPE), which only works if all y are positive:

$$MAPE = \frac{1}{N} \sum \left[ \frac{|y^f - y|}{y} \right] \times 100$$

Mean Squared Percentage Error (MSPE):

$$MSPE = \frac{1}{N} \sum \left[ \left( \frac{y^f - y}{y} \right)^2 \right] \times 100$$

Again, some prefer its square root, root mean squared percentage error.

Each approach has its champions and detractors, and your choice of a scorecard for forecasts probably should depend upon the situation. In some cases the amount of error is more important than the percentage error, in others not; in some cases you want to severely penalize large errors by squaring them, but sometimes this would be inappropriate.

If you're impatient and want to evaluate several forecasters without waiting for additional observations, all is not lost. You can estimate your regressions using only a percentage of the observations you have (say, 90%), use the regressions to make forecasts of y at the x values in your unused observations, then compute the MSE or MAPE or MSPE for these so-called post-sample forecasts.

10. More Regressions to Come

Simple linear regression is powerful, but not powerful enough for a complicated world in which we can't run controlled experiments. In the next chapter we consider expanding our model to cases with more than one explanatory variable. In the succeeding chapters we will relax most of our eight simplifying assumptions, learning how to adjust our analysis when the assumptions are not met.
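Returning briefly to the forecast scorecards of Section 9, the three measures can be sketched in Python as follows. The function names, the two farmers' forecasts, and the actual sales figures are hypothetical, chosen only to show that different scorecards can rank the same forecasters differently. The factor of 100 in `mspe` is an assumption about where the percentage scaling belongs; some authors square the percentage error instead.

```python
def mse(forecasts, actuals):
    """Mean squared error: average of (forecast - actual)^2."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(actuals)


def mape(forecasts, actuals):
    """Mean absolute percent error; assumes every actual value is positive."""
    total = sum(abs(f - a) / a for f, a in zip(forecasts, actuals))
    return 100.0 * total / len(actuals)


def mspe(forecasts, actuals):
    """Mean squared percentage error, scaled by 100 in the same way as MAPE."""
    total = sum(((f - a) / a) ** 2 for f, a in zip(forecasts, actuals))
    return 100.0 * total / len(actuals)


# Hypothetical post-sample observations and two competing sets of forecasts.
actual_sales = [100.0, 200.0, 400.0]
farmer_a = [110.0, 190.0, 400.0]  # small misses on the small sales figures
farmer_b = [100.0, 200.0, 440.0]  # one large miss on the largest sales figure

scores = {
    name: (mse(f, actual_sales), mape(f, actual_sales), mspe(f, actual_sales))
    for name, f in (("A", farmer_a), ("B", farmer_b))
}
for name, (m, p, sp) in scores.items():
    print(name, round(m, 2), round(p, 2), round(sp, 4))
```

With these made-up numbers farmer A has the lower MSE while farmer B has the lower MAPE and MSPE, which is one way to see why the choice of scorecard should depend on whether the amount of error or the percentage error matters more.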

Useful STATA Commands:

Values you supply are in italics. Words you type literally are in boldface. Options are in [ ], which you don't type.

Entering data:
* From the keyboard, within STATA:
clear /* to clear any existing data in memory */
input names /* names: 1-8 characters, period for missing values; STATA is case-sensitive */
/* enter observations one at a time, space between variables, starting next line */
end
/* later new observations: input ... end */
/* later new variables: input names */
* Outside STATA, you can either
* enter data in a text file (using something like WordPad), separating each variable by spaces and
* each observation by a carriage return, and save it as
* filename.asc, then use the Stata infile command:
infile variablelist using filename.extension
/* string variables: infile strxx varname where 0<xx<81 and xx=stringlength */
/* in file, strings go in " " marks if they contain blanks */
* or
* put your data into an Excel file, using the first row for variable names. Save it as a tab-delimited
* ASCII file, and use the Stata command
insheet using filename
* Stata will automatically read the variable names from your file.

Seeing data:
list
summarize
describe

Saving data:
save filename /* may not include blanks; saved as filename.dta */
/* use save filename, replace if overwriting an existing file */

Re-using saved Stata data:
use filename /* save or clear current data in memory first if necessary */

Using non-Stata (ASCII) data files that are not in the same directory as STATA:
* For example, if a data file is in a common directory in the lab, take this approach:
* a. Find the data file you want to use, in the common drive. Drag a copy of it onto your desktop if you wish.
* b. Open the STATA program, and issue the first part of the INFILE command:
infile variablenames using
* Now, rather than trying to type out the address of the file correctly, just go up to the FILE menu,
* and choose the FILENAME option. You'll get a dialog box. Navigate to the desktop, and click on
* the name of your data file. This name will automatically appear in the command line you were
* typing, enclosed in " " marks. Now you can just tap your ENTER key, and the command should
* bring the data into the program for you.

Altering Variables:
generate newvariable = expression [if expression] [in range]
replace oldvariable = expression [if expression] [in range]
* For creating or altering parameters, use
scalar scalarname = expression /* you can follow this with scalar list scalarname or scalar drop */
* Expressions can include functions: abs(x), exp(x), ln(x), log10(x), sqrt(x), or statistical functions.
* For correlations, use
correlate [variablelist] [weight] [if expression] [in range] [, means covariance]
* For pairwise correlations only, you can use
pwcorr [variablelist] [weight] [if expression] [in range]

Regression:
regress depvariable indepvars [weight] [if expression] [in range] [, noconstant]
/* Saved results include:
e(N) # observations
e(df_m) model degrees of freedom
e(df_r) residual degrees of freedom
e(r2) R-squared
e(F) regression F-statistic
e(rmse) root mean square error
e(b) coefficient vector
e(V) var-cov matrix of estimators */
* See these with estimates list or, for the matrixes, matrix list matrixname
predict newvariablename [if expression] [in range] [, statistic]
/* generates predicted values, where statistic can be
pr(a,b) probability a<y<b
residuals residuals
rstandard standardized residuals
stdp standard error of the prediction
stdf standard error of forecast
stdr standard error of the residual */

Saving your program and/or results:
log using filename.do [noproc append replace]
/* saves your file as a do file, which can be edited and run again */
/* the noproc option saves only what you type, not any output */
/* append will append your work to the end of an existing file */
/* replace will overwrite an existing file */
* To suspend: log off
* To resume: log on
* To quit logging for good: log close

To quit using STATA:
exit, clear /* but be sure you've saved your data first, if you need to */

Re-running your .do files:
* Just type
do filename.do /* where filename.do is the .do file containing the commands you want to run */
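For readers who want to see what the saved regression results listed above represent, here is a minimal Python sketch for the one-regressor case. It is not Stata's implementation: the function `simple_regress` and the data are hypothetical, and the dictionary keys merely borrow the names of the corresponding e() results so the two can be compared.

```python
import math


def simple_regress(x, y):
    """Least squares with one regressor; returns a dict keyed like Stata's e() names."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    df_m = 1       # one explanatory variable
    df_r = n - 2   # observations minus two estimated coefficients
    return {
        "N": n,
        "df_m": df_m,
        "df_r": df_r,
        "r2": 1.0 - sse / sst,
        "F": ((sst - sse) / df_m) / (sse / df_r),
        "rmse": math.sqrt(sse / df_r),
        "b": (slope, intercept),  # slope first, constant last
    }


# Hypothetical data, not taken from the chapter.
prices = [2.0, 3.0, 4.0, 5.0, 6.0]
quantities = [391.0, 384.0, 381.0, 374.0, 370.0]
results = simple_regress(prices, quantities)
for key, value in results.items():
    print(key, value)
```

The root mean square error here is the square root of the estimated error variance $s^2$ used in the forecasting formulas of Section 8.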


More information

15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011

15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011 15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes high-defiitio

More information

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction THE ARITHMETIC OF INTEGERS - multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,

More information

Repeating Decimals are decimal numbers that have number(s) after the decimal point that repeat in a pattern.

Repeating Decimals are decimal numbers that have number(s) after the decimal point that repeat in a pattern. 5.5 Fractios ad Decimals Steps for Chagig a Fractio to a Decimal. Simplify the fractio, if possible. 2. Divide the umerator by the deomiator. d d Repeatig Decimals Repeatig Decimals are decimal umbers

More information

Forecasting techniques

Forecasting techniques 2 Forecastig techiques this chapter covers... I this chapter we will examie some useful forecastig techiques that ca be applied whe budgetig. We start by lookig at the way that samplig ca be used to collect

More information

5 Boolean Decision Trees (February 11)

5 Boolean Decision Trees (February 11) 5 Boolea Decisio Trees (February 11) 5.1 Graph Coectivity Suppose we are give a udirected graph G, represeted as a boolea adjacecy matrix = (a ij ), where a ij = 1 if ad oly if vertices i ad j are coected

More information

, a Wishart distribution with n -1 degrees of freedom and scale matrix.

, a Wishart distribution with n -1 degrees of freedom and scale matrix. UMEÅ UNIVERSITET Matematisk-statistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 004-0-9 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

INVESTMENT PERFORMANCE COUNCIL (IPC)

INVESTMENT PERFORMANCE COUNCIL (IPC) INVESTMENT PEFOMANCE COUNCIL (IPC) INVITATION TO COMMENT: Global Ivestmet Performace Stadards (GIPS ) Guidace Statemet o Calculatio Methodology The Associatio for Ivestmet Maagemet ad esearch (AIM) seeks

More information

Present Value Factor To bring one dollar in the future back to present, one uses the Present Value Factor (PVF): Concept 9: Present Value

Present Value Factor To bring one dollar in the future back to present, one uses the Present Value Factor (PVF): Concept 9: Present Value Cocept 9: Preset Value Is the value of a dollar received today the same as received a year from today? A dollar today is worth more tha a dollar tomorrow because of iflatio, opportuity cost, ad risk Brigig

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

3 Basic Definitions of Probability Theory

3 Basic Definitions of Probability Theory 3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

More information

FM4 CREDIT AND BORROWING

FM4 CREDIT AND BORROWING FM4 CREDIT AND BORROWING Whe you purchase big ticket items such as cars, boats, televisios ad the like, retailers ad fiacial istitutios have various terms ad coditios that are implemeted for the cosumer

More information

Building Blocks Problem Related to Harmonic Series

Building Blocks Problem Related to Harmonic Series TMME, vol3, o, p.76 Buildig Blocks Problem Related to Harmoic Series Yutaka Nishiyama Osaka Uiversity of Ecoomics, Japa Abstract: I this discussio I give a eplaatio of the divergece ad covergece of ifiite

More information

How To Solve The Homewor Problem Beautifully

How To Solve The Homewor Problem Beautifully Egieerig 33 eautiful Homewor et 3 of 7 Kuszmar roblem.5.5 large departmet store sells sport shirts i three sizes small, medium, ad large, three patters plaid, prit, ad stripe, ad two sleeve legths log

More information

Time Value of Money. First some technical stuff. HP10B II users

Time Value of Money. First some technical stuff. HP10B II users Time Value of Moey Basis for the course Power of compoud iterest $3,600 each year ito a 401(k) pla yields $2,390,000 i 40 years First some techical stuff You will use your fiacial calculator i every sigle

More information

Confidence Intervals for Linear Regression Slope

Confidence Intervals for Linear Regression Slope Chapter 856 Cofidece Iterval for Liear Regreio Slope Itroductio Thi routie calculate the ample ize eceary to achieve a pecified ditace from the lope to the cofidece limit at a tated cofidece level for

More information

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find 1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.

More information

Modified Line Search Method for Global Optimization

Modified Line Search Method for Global Optimization Modified Lie Search Method for Global Optimizatio Cria Grosa ad Ajith Abraham Ceter of Excellece for Quatifiable Quality of Service Norwegia Uiversity of Sciece ad Techology Trodheim, Norway {cria, ajith}@q2s.tu.o

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

Asymptotic Growth of Functions

Asymptotic Growth of Functions CMPS Itroductio to Aalysis of Algorithms Fall 3 Asymptotic Growth of Fuctios We itroduce several types of asymptotic otatio which are used to compare the performace ad efficiecy of algorithms As we ll

More information

Listing terms of a finite sequence List all of the terms of each finite sequence. a) a n n 2 for 1 n 5 1 b) a n for 1 n 4 n 2

Listing terms of a finite sequence List all of the terms of each finite sequence. a) a n n 2 for 1 n 5 1 b) a n for 1 n 4 n 2 74 (4 ) Chapter 4 Sequeces ad Series 4. SEQUENCES I this sectio Defiitio Fidig a Formula for the th Term The word sequece is a familiar word. We may speak of a sequece of evets or say that somethig is

More information

Learning objectives. Duc K. Nguyen - Corporate Finance 21/10/2014

Learning objectives. Duc K. Nguyen - Corporate Finance 21/10/2014 1 Lecture 3 Time Value of Moey ad Project Valuatio The timelie Three rules of time travels NPV of a stream of cash flows Perpetuities, auities ad other special cases Learig objectives 2 Uderstad the time-value

More information

Tradigms of Astundithi and Toyota

Tradigms of Astundithi and Toyota Tradig the radomess - Desigig a optimal tradig strategy uder a drifted radom walk price model Yuao Wu Math 20 Project Paper Professor Zachary Hamaker Abstract: I this paper the author iteds to explore

More information