Confidence intervals and hypothesis tests

Chapter 2: Confidence intervals and hypothesis tests

This chapter focuses on how to draw conclusions about populations from sample data. We'll start by looking at binary data (e.g., polling), and learn how to estimate the true ratio of 1s and 0s with confidence intervals, and then test whether that ratio is significantly different from some baseline value using hypothesis testing. Then, we'll extend what we've learned to continuous measurements.

2.1 Binomial data

Suppose we're conducting a yes/no survey of a few randomly sampled people [1], and we want to use the results of our survey to determine the answers for the overall population.

2.1.1 The estimator

The obvious first choice is just the fraction of people who said yes. Formally, suppose we have $n$ samples $x_1, \ldots, x_n$ that can each be 0 or 1, and the probability that each $x_i$ is 1 is $p$ (in frequentist style, we'll assume $p$ is fixed but unknown: this is what we're interested in finding). We'll assume our samples are independent and identically distributed (i.i.d.), meaning that each one has no dependence on any of the others, and they all have the same probability $p$ of being 1. Then our estimate for $p$, which we'll call $\hat{p}$, or "p-hat", would be

$$\hat{p} = \frac{1}{n} \sum_{i=1}^{n} x_i.$$

Notice that $\hat{p}$ is a random quantity, since it depends on the random quantities $x_i$. In statistical lingo, $\hat{p}$ is known as an estimator for $p$. Also notice that except for the factor of $1/n$ in front, $\hat{p}$ is almost a binomial random variable (that is, $n\hat{p} \sim B(n, p)$). We can compute its expectation and variance using the properties we reviewed:

$$E[\hat{p}] = \frac{1}{n}\, np = p, \tag{2.1}$$

$$\mathrm{var}[\hat{p}] = \frac{1}{n^2}\, np(1-p) = \frac{p(1-p)}{n}. \tag{2.2}$$

[1] We'll talk about how to choose and sample those people in Chapter 7.
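As a quick sanity check on Equations (2.1) and (2.2), the following sketch simulates many surveys and compares the empirical mean and variance of the estimator with the formulas. The function name `simulate_p_hat` and the values of p, n, and the number of trials are illustrative assumptions, not part of the original notes.

```python
import random
from statistics import mean, pvariance

def simulate_p_hat(p, n, trials, seed=0):
    """Draw `trials` independent surveys of n yes/no answers; return each survey's p-hat."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Each answer is 1 with probability p, independently of the others.
        ones = sum(1 for _ in range(n) if rng.random() < p)
        estimates.append(ones / n)
    return estimates

# Hypothetical settings chosen only for illustration.
p, n = 0.3, 50
estimates = simulate_p_hat(p, n, trials=20000)

print("mean of p-hat:    ", mean(estimates), " theory:", p)
print("variance of p-hat:", pvariance(estimates), " theory:", p * (1 - p) / n)
```

With enough trials the empirical values should land close to p and p(1 − p)/n, and rerunning with a larger n should show the variance shrinking.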

Since the expectation of $\hat{p}$ is equal to the true value of what $\hat{p}$ is trying to estimate (namely $p$), we say that $\hat{p}$ is an unbiased estimator for $p$. Reassuringly, we can see that another good property of $\hat{p}$ is that its variance decreases as the number of samples increases.

2.1.2 Central Limit Theorem

The Central Limit Theorem, one of the most fundamental results in probability theory, roughly tells us that if we add up a bunch of independent random variables that all have the same distribution (with finite variance), the result will be approximately Gaussian. We can apply this to our case of a binomial random variable, which is really just the sum of a bunch of independent Bernoulli random variables. As a rough rule of thumb, if $p$ is close to 0.5, the binomial distribution will look almost Gaussian with $n = 10$. If $p$ is closer to 0.1 or 0.9 we'll need a value closer to $n = 50$, and if $p$ is much closer to 1 or 0 than that, a Gaussian approximation might not work very well until we have much more data.

This is useful for a number of reasons. One is that Gaussian variables are completely specified by their mean and variance: that is, if we know those two things, we can figure out everything else about the distribution (probabilities, etc.). So, if we know a particular random variable is Gaussian (or approximately Gaussian), all we have to do is compute its mean and variance to know everything about it.

2.1.3 Sampling Distributions

Going back to binomial variables, let's think about the distribution of $\hat{p}$ (remember that this is a random quantity since it depends on our observations, which are random). Figure 2.1a shows the sampling distribution of $\hat{p}$ for a case where we flip a coin that we hypothesize is fair (i.e. the true value $p$ is 0.5). There are typically two ways we use such sampling distributions: to obtain confidence intervals and to perform significance tests.

2.1.4 Confidence intervals

Suppose we observe a value $\hat{p}$ from our data, and want to express how certain we are that $\hat{p}$ is close to the true parameter $p$.
We can think about how often the random quantity $\hat{p}$ will end up within some distance of the fixed but unknown $p$. In particular, we can ask for an interval around $\hat{p}$ for any sample so that in 95% of samples, the true mean $p$ will lie inside this interval. Such an interval is called a confidence interval. Notice that we chose the number 95% arbitrarily: while this is a commonly used value, the methods we'll discuss can be used for any confidence level.

We've established that the random quantity $\hat{p}$ is approximately Gaussian with mean $p$ and variance $p(1-p)/n$. We also know from last time that the probability of a Gaussian random variable being within about 2 standard deviations of its mean is about 95%. This means that there's a 95% chance of $\hat{p}$ being less than $2\sqrt{p(1-p)/n}$ away from $p$. So, we'll define the interval

$$\hat{p} \pm \underbrace{2}_{\text{coeff.}} \cdot \underbrace{\sqrt{\frac{p(1-p)}{n}}}_{\text{std. dev.}}. \tag{2.3}$$

With probability 95%, we'll get a $\hat{p}$ that gives us an interval containing $p$.

Figure 2.1: (a) The sampling distribution of the estimator $\hat{p}$: i.e. the distribution of values for $\hat{p}$ given a fixed true value $p = 0.5$. (b) The 95% confidence interval for a particular observed $\hat{p}$ of 0.49 (with a true value of $p = 0.5$). Note that in this case, the interval contains the true value $p$. Whenever we draw a set of samples, there's a 95% chance that the interval that we get is good enough to contain the true value $p$.

What if we wanted a 99% confidence interval? Since $\hat{p}$ is approximately Gaussian, its probability of being within 3 standard deviations from its mean is about 99% (more precisely about 99.7%, so a coefficient of 3 is slightly conservative; the exact 99% coefficient is about 2.58). So, the 99% confidence interval for this problem would be

$$\hat{p} \pm \underbrace{3}_{\text{coeff.}} \cdot \underbrace{\sqrt{\frac{p(1-p)}{n}}}_{\text{std. dev.}}. \tag{2.4}$$

We can define similar confidence intervals, where the standard deviation remains the same, but the coefficient depends on the desired confidence. While our variables being Gaussian makes this relationship easy for 95% and 99%, in general we'll have to look up or have our software compute these coefficients.

But, there's a problem with these formulas: they require us to know $p$ in order to compute confidence intervals! Since we don't actually know $p$ (if we did, we wouldn't need a confidence interval), we'll approximate it with $\hat{p}$, so that (2.3) becomes

$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \tag{2.5}$$

This approximation is reasonable if $\hat{p}$ is close to $p$, which we expect to normally be the case. If the approximation is not as good, there are several more robust (but more complex) ways to compute the confidence interval.
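A minimal sketch of Equation (2.5) in code may help make the recipe concrete. The function name `proportion_confidence_interval` and the survey data (49 "yes" answers out of 100, matching the observed value 0.49 used in Figure 2.1b) are assumptions made for illustration.

```python
from math import sqrt

def proportion_confidence_interval(samples, coeff=2.0):
    """Plug-in interval p_hat +/- coeff * sqrt(p_hat * (1 - p_hat) / n), as in Equation (2.5)."""
    n = len(samples)
    p_hat = sum(samples) / n
    std_err = sqrt(p_hat * (1 - p_hat) / n)  # p is unknown, so p_hat stands in for it
    return p_hat - coeff * std_err, p_hat + coeff * std_err

# Hypothetical survey: 49 "yes" (1) and 51 "no" (0) answers.
survey = [1] * 49 + [0] * 51
low, high = proportion_confidence_interval(survey)            # coefficient 2: roughly 95%
low99, high99 = proportion_confidence_interval(survey, 3.0)   # coefficient 3: roughly 99%

print(f"approx. 95% interval: ({low:.3f}, {high:.3f})")
print(f"approx. 99% interval: ({low99:.3f}, {high99:.3f})")
```

Only the coefficient changes between confidence levels; the estimated standard deviation is the same in both calls.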

Figure 2.2: Multiple 95% confidence intervals computed from different sets of data, each with the same true parameter $p = 0.4$ (shown by the horizontal line). Each confidence interval represents what we might have gotten if we had collected new data and then computed a confidence interval from that new data. Across different datasets, about 95% of them contain the true value. But, once we have a confidence interval, we can't draw any conclusions about where in the interval the true value is.

Interpretation

It's important not to misinterpret what a confidence interval is! This interval tells us nothing about the distribution of the true parameter $p$. In fact, $p$ is a fixed (i.e., deterministic) unknown number! Imagine that we sampled $n$ values for $x_i$ and computed $\hat{p}$ along with a 95% confidence interval. Now imagine that we repeated this whole process a huge number of times (including sampling new values for $x_i$). Then about 5% of the confidence intervals constructed won't actually contain the true $p$. Furthermore, if $p$ is in a confidence interval, we don't know where exactly within the interval $p$ is.

Likewise, adding an extra 4% to get from a 95% confidence interval to a 99% confidence interval doesn't mean that there's a 4% chance that it's in the extra little area that you added! The next example illustrates this.

In summary, a 95% confidence interval gives us a region where, had we redone the survey from scratch, then 95% of the time, the true value $p$ will be contained in the interval. This is illustrated in Figure 2.2.

2.1.5 Hypothesis testing

Suppose we have a hypothesized or baseline value $p$ and obtain from our data a value $\hat{p}$ that's smaller than $p$. If we're interested in reasoning about whether $\hat{p}$ is significantly smaller than $p$, one way to quantify this would be to assume the true value were $p$ and then compute the probability of getting a value smaller than or as small as the one we observed (we can do the same thing for the case where $\hat{p}$ is larger). If this probability is very low, we might think the hypothesized value $p$ is incorrect. This is the hypothesis testing framework.
We begin with a null hypothesis, which we call $H_0$ (in this example, this is the hypothesis that the true proportion is in fact $p$) and an alternative hypothesis, which we call $H_1$ or $H_a$ (in this example, the hypothesis that the true proportion is smaller than $p$).
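The "probability of getting a value as small as the one we observed" can be computed exactly for binary data, because under the null hypothesis the number of 1s is binomial. The sketch below does this for a made-up experiment; the helper name `binomial_lower_tail` and the numbers (20 samples, 5 successes, baseline 0.5) are assumptions for illustration only.

```python
from math import comb

def binomial_lower_tail(k, n, p0):
    """P(X <= k) when X ~ Binomial(n, p0), i.e. assuming the null hypothesis is true."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k + 1))

# Hypothetical data: 5 "yes" answers out of 20, tested against a baseline of 0.5.
n, successes, p0 = 20, 5, 0.5
tail_prob = binomial_lower_tail(successes, n, p0)

print(f"p-hat = {successes / n:.2f}")
print(f"probability of a value this small or smaller under the null: {tail_prob:.4f}")
```

A small tail probability here suggests that the baseline value is hard to reconcile with the data; the exact sum avoids relying on the Gaussian approximation, which matters when n is small.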

Usually (but not always), the null hypothesis corresponds to a baseline or "boring" finding, and the alternative hypothesis corresponds to some interesting finding. Once we have the two hypotheses, we'll use the data to test which hypothesis we should believe.

Significance is usually defined in terms of a probability threshold $\alpha$, such that we deem a particular result significant if the probability of obtaining a result at least that extreme under the null distribution is less than $\alpha$. A common value for $\alpha$ is 0.05, corresponding to a 1/20 chance of error when the null hypothesis is true. Once we obtain a particular value and evaluate this probability under the null hypothesis, the probability is known as a p-value.

This framework is typically used when we want to disprove the null hypothesis and show the value we obtained is significantly different from the null value. In the case of polling, this may correspond to showing that a candidate has significantly more than 50% support. In the case of a drug trial, it may correspond to showing that the recovery rate for patients given a particular drug is significantly more than some baseline rate.

Here are some definitions:

- In a one-tailed hypothesis test, we choose one direction for our alternative hypothesis: we either hypothesize that the test statistic is "significantly big", or that the test statistic is "significantly small".
- In a two-tailed hypothesis test, our alternative hypothesis encompasses both directions: we hypothesize that the test statistic is simply different from the predicted value.
- A false positive or Type I error happens when the null hypothesis is true, but we reject it. Note that the probability of a Type I error is $\alpha$.
- A false negative or Type II error happens when the null hypothesis is false, but we fail to reject it [2].
- The statistical power of a test is the probability of rejecting the null hypothesis when it's false (or equivalently, 1 − (probability of a Type II error)). Power is usually computed based on a particular assumed value for the quantity being tested: "if the value is actually ____, then the power of this test is ____." It also depends on the threshold determined by $\alpha$. It's often useful when deciding how many samples to acquire in an experiment, as we'll see later.

[2] Notice our careful choice of words here: if our result isn't significant, we can't say that we accept the null hypothesis. The hypothesis testing framework only lets us say that we fail to reject it.

Figure 2.3: An illustration of statistical power in a one-sided hypothesis test on variable $p$.

Example

The concepts above are illustrated in Figure 2.3. Here, the null hypothesis $H_0$ is that $p = p_0$, and the alternative hypothesis $H_a$ is that $p > p_0$: this is a one-sided test. In particular, we'll use the value $p_a$ as the alternative value so that we can compute power. The null distribution is shown on the left, and an alternative distribution is shown on the right. The $\alpha = 0.05$ threshold for the alternative hypothesis is shown as $p^*$.

When the null hypothesis is true, $\hat{p}$ is generated from the null (left) distribution, and we make the correct decision if $\hat{p} < p^*$ and make a Type I error (false positive) otherwise.

When the alternative hypothesis is true, and if the true proportion $p$ is actually $p_a$, $\hat{p}$ is generated from the right distribution, and we make the correct decision when $\hat{p} > p^*$ and make a Type II error (false negative) otherwise.

The power is the probability of making the correct decision when the alternative hypothesis is true. The probability of a Type I error (false positive) is shown in blue, the probability of a Type II error (false negative) is shown in red, and the power is shown in yellow and blue combined (it's the area under the right curve minus the red part).

Notice that a threshold usually balances between Type I and Type II errors: if we always reject the null hypothesis, then the probability of a Type I error is 1, and the probability of a Type II error is 0, and vice versa if we always fail to reject the null hypothesis.
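The picture in Figure 2.3 can be turned into a small calculation. The sketch below assumes the Gaussian approximation to the sampling distribution of the estimated proportion under both hypotheses; the function name `one_sided_power` and the values p0 = 0.5, pa = 0.6, n = 100 are hypothetical choices, not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_power(p0, pa, n, alpha=0.05):
    """Threshold and power for testing H0: p = p0 against Ha: p > p0 (Gaussian approximation)."""
    # Null sampling distribution of p-hat: mean p0, variance p0(1 - p0)/n.
    null = NormalDist(mu=p0, sigma=sqrt(p0 * (1 - p0) / n))
    # Threshold: the point with probability alpha to its right under the null.
    threshold = null.inv_cdf(1 - alpha)
    # Alternative sampling distribution of p-hat if the true proportion is pa.
    alt = NormalDist(mu=pa, sigma=sqrt(pa * (1 - pa) / n))
    # Power: probability that p-hat exceeds the threshold under the alternative.
    power = 1 - alt.cdf(threshold)
    return threshold, power

threshold, power = one_sided_power(p0=0.5, pa=0.6, n=100)
_, power_more_data = one_sided_power(p0=0.5, pa=0.6, n=400)

print(f"threshold = {threshold:.3f}, power with n=100: {power:.2f}")
print(f"power with n=400: {power_more_data:.2f}")
```

The Type II error probability is one minus the returned power. Increasing n narrows both distributions, which is why the second call reports higher power for the same alternative value.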

Example: Drug therapy results: a warning about data collection

Figure 2.4: Results of a simulated drug trial measuring the effects of statin drugs on lifespan. The top figure shows the lifespan of subjects who did not receive treatment, and the bottom figure shows the lifespan of subjects who did receive it.

Figure 2.4 shows results from a simulated drug trial [a]. At first glance, it seems clear that people who received the drug (bottom) tended to have a higher lifespan than people who didn't (top), but it's important to look at hidden confounds! In this simulation, the drug actually had no effect, but the disease occurred more often in older people: these older people had a higher average lifespan simply because they had to live longer to get the drug. Any statistical test we perform will say that the second distribution has a higher mean than the first one, but this is not because of the treatment, but instead because of how we sampled the data!

[a] Figure from: Støvring, et al. Statin Use and Age at Death: Evidence of a Flawed Analysis. The American Journal of Cardiology.

2.2 Continuous random variables

So far we've only talked about binomial random variables, but what about continuous random variables? Let's focus on estimating the mean of a random variable given observations of it. As you can probably guess, our estimator will be $\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i$.

We'll start with the case where we know the true population standard deviation; call it $\sigma$. This is somewhat unrealistic, but it'll help us set up the more general case.

2.2.1 When σ is known

Consider $n$ random i.i.d. Gaussian samples $x_1, \ldots, x_n$, all with mean $\mu$ and variance $\sigma^2$. We'll compute the sample mean $\hat{\mu}$, and use it to draw conclusions about the true mean $\mu$.

Just like $\hat{p}$, $\hat{\mu}$ is a random quantity. Its expectation, which we computed in Chapter 1, is $\mu$. Its variance is

$$\mathrm{var}[\hat{\mu}] = \mathrm{var}\left[\frac{1}{n}\sum_{i=1}^{n} x_i\right] = \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{var}[x_i] = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}. \tag{2.6}$$

This quantity (or to be exact, the square root of this quantity) is known as the standard error of the mean. In general, the standard deviation of the sampling distribution of a particular statistic is called the standard error of that statistic.

Since $\hat{\mu}$ is the (scaled) sum of many independent random variables, it's approximately Gaussian (exactly Gaussian here, since the samples themselves are Gaussian). If we subtract its mean $\mu$ and divide by its standard deviation $\sigma/\sqrt{n}$ (both of which are deterministic), we'll get a standard normal random variable. This will be our test statistic:

$$z = \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}}. \tag{2.7}$$

Hypothesis testing

In the case of hypothesis testing, we know $\mu$ (it's the mean of the null distribution), and we can compute the probability of getting $z$ or something more extreme. Your software of choice will typically do this by using the fact that $z$ has a standard normal distribution and report the probability to you. This is known as a z-test.

Confidence intervals

What about a confidence interval? Since $z$ is a standard normal random variable, it has probability of about 0.95 of being within 2 standard deviations of its mean. We can compute the confidence interval by manipulating a bit of algebra:

$$P(-2 \le z \le 2) \approx 0.95$$

$$P\left(-2 \le \frac{\hat{\mu} - \mu}{\sigma/\sqrt{n}} \le 2\right) \approx 0.95$$

$$P\left(-2\frac{\sigma}{\sqrt{n}} \le \hat{\mu} - \mu \le 2\frac{\sigma}{\sqrt{n}}\right) \approx 0.95$$

$$P\Big(\hat{\mu} - \underbrace{2}_{\text{coeff.}} \underbrace{\frac{\sigma}{\sqrt{n}}}_{\text{std. dev.}} \le \mu \le \hat{\mu} + \underbrace{2}_{\text{coeff.}} \underbrace{\frac{\sigma}{\sqrt{n}}}_{\text{std. dev.}}\Big) \approx 0.95$$

This says that the probability that $\mu$ is within the interval $\hat{\mu} \pm 2\frac{\sigma}{\sqrt{n}}$ is about 0.95. But remember: the only thing that's random in this story is $\hat{\mu}$! So when we use the word "probability" here, it's referring only to the randomness in $\hat{\mu}$. Don't forget that $\mu$ isn't random!
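To tie Equation (2.7) to the interval just derived, here is a small sketch of a z-test with known σ. The function name `z_test`, the measurements, the null mean 5.0, and σ = 0.3 are all made-up values for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test(samples, mu0, sigma):
    """z statistic, two-sided p-value, and approximate 95% interval when sigma is known."""
    n = len(samples)
    mu_hat = mean(samples)
    std_err = sigma / sqrt(n)                  # standard error of the mean
    z = (mu_hat - mu0) / std_err               # Equation (2.7) with mu set to the null mean
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    interval = (mu_hat - 2 * std_err, mu_hat + 2 * std_err)
    return z, p_two_sided, interval

# Hypothetical measurements with an assumed known sigma.
measurements = [5.1, 4.9, 5.6, 5.3, 5.0, 5.4, 5.2, 5.5, 4.8]
z, p_value, (low, high) = z_test(measurements, mu0=5.0, sigma=0.3)

print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
print(f"approx. 95% interval for mu: ({low:.2f}, {high:.2f})")
```

In this example the sample mean sits exactly 2 standard errors above the null mean, so the null value lands on the edge of the interval: the test threshold and the interval are two views of the same calculation.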

Also, remember that we chose the confidence level 0.95 (and therefore the threshold 2) somewhat arbitrarily, and we could just as easily compute a 99% confidence interval (which would correspond to a threshold of about 2.6, or 3 if we round up conservatively) or an interval for any other level of confidence: we could compute the threshold by using the standard normal distribution.

Finally, note that for a two-tailed hypothesis test, the threshold at which we declare significance for some particular $\alpha$ is the same as the half-width of a confidence interval with confidence level $1 - \alpha$. Can you show why this is true?

Statistical power

If we get to choose the number of observations $n$, how do we pick it to ensure a certain level of statistical power in a hypothesis test? Suppose we choose $\alpha$ and a corresponding threshold $x$. How can we choose $n$, the number of samples, to achieve a desired statistical power? Since the width of the sampling distribution is controlled by $n$, by choosing $n$ large enough, we can achieve enough power for particular values of the alternative mean.

The following example illustrates the effect that sample size has on significance thresholds.

Example: Fertility clinics

Figure 2.5: A funnel plot showing conception statistics from fertility clinics in the UK. The x-axis indicates the sample size; in this case that's the number of conception attempts (cycles). The y-axis indicates the quantity of interest; in this case that's the success rate for conceiving. The funnels (dashed lines) indicate thresholds for being significantly different from the null value of 32% (the national average).

Figure 2.5 is an example of a funnel plot. We see that with a small number of samples, it's difficult to judge any of the clinics as significantly different from the baseline value, since exceptionally high/low values could just be due to chance. However, as the number of cycles increases, the probability of consistently obtaining large values by chance decreases, and we can declare clinics like Lister and CARE Nottingham significantly better than average: while other clinics have similar success rates over fewer cycles, these two have a high success rate over many cycles. So, we can be more certain that the higher success rates are not just due to chance and are in fact meaningful.
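For the known-σ case, the question "how large must n be for a given power?" has a closed-form answer under a one-sided z-test. We reject when the sample mean exceeds the null mean by more than $z_{1-\alpha}\,\sigma/\sqrt{n}$; if the true mean is larger than the null mean by $\delta$, the power is $\Phi(\delta\sqrt{n}/\sigma - z_{1-\alpha})$, which gives $n \ge ((z_{1-\alpha} + z_{\text{power}})\,\sigma/\delta)^2$. The sketch below evaluates this; the function name `required_sample_size` and the numbers are assumptions for illustration, and the formula applies only to this one-sided, known-σ setting.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(delta, sigma, alpha=0.05, power=0.8):
    """Smallest n giving the requested power for a one-sided z-test with known sigma.

    delta is the assumed gap between the alternative mean and the null mean.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # significance threshold in standard units
    z_power = NormalDist().inv_cdf(power)       # quantile matching the desired power
    return ceil(((z_alpha + z_power) * sigma / delta) ** 2)

# Hypothetical planning values: detect a shift of 0.5 when sigma is 2.0.
n_needed = required_sample_size(delta=0.5, sigma=2.0)
n_smaller_effect = required_sample_size(delta=0.25, sigma=2.0)

print("samples needed for delta = 0.5: ", n_needed)
print("samples needed for delta = 0.25:", n_smaller_effect)
```

Halving the effect size roughly quadruples the required sample size, which matches the funnel shape in Figure 2.5: small differences only become distinguishable from chance at large n.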

2.2.2 When σ is unknown

In general, we won't know the true population standard deviation beforehand. We'll solve this problem by using the sample standard deviation. This means using $\hat{\sigma}^2/n$ instead of $\sigma^2/n$ for $\mathrm{var}(\hat{\mu})$. Throughout these notes, we'll refer to this quantity (more precisely, its square root $\hat{\sigma}/\sqrt{n}$) as the standard error of the mean (as opposed to the version given in Equation (2.6)).

But once we replace the fixed $\sigma$ with the random $\hat{\sigma}$ (which we'll also write as $s$), our test statistic (Equation (2.7)) becomes

$$t = \frac{\hat{\mu} - \mu}{\hat{\sigma}/\sqrt{n}}. \tag{2.8}$$

Since the numerator and denominator are both random, this is no longer Gaussian. The denominator is roughly the square root of a $\chi^2$-distributed quantity [3], and the overall statistic is t-distributed. In this case, our t distribution has $n - 1$ degrees of freedom.

Confidence intervals and hypothesis tests proceed just as in the known-$\sigma$ case with only two changes: using $\hat{\sigma}$ instead of $\sigma$ and using a t distribution with $n - 1$ degrees of freedom instead of a Gaussian distribution. The confidence interval requires only $\hat{\mu}$ and the standard error, while the hypothesis test also requires a hypothesis, in the form of a value for $\mu$.

For example, a 95% confidence interval might look like

$$\hat{\mu} \pm k \frac{\hat{\sigma}}{\sqrt{n}}. \tag{2.9}$$

To determine the coefficient $k$, we need to know the value where a t distribution has the central 95% of its probability. This depends on the degrees of freedom (the only parameter of the t distribution) and can easily be looked up in a table or computed from any software package. For example, if $n = 10$, then the t distribution has $n - 1 = 9$ degrees of freedom, and $k \approx 2.26$. Notice that this produces a wider interval than the corresponding Gaussian-based confidence interval from before. If we don't know the standard deviation and we estimate it, we're then less certain about our estimate $\hat{\mu}$.

To derive the t-test, we assumed that our data points were normally distributed. But, the t-test is fairly robust to violations of this assumption.

[3] In fact, the quantity $(n-1)\hat{\sigma}^2/\sigma^2$ is $\chi^2$-distributed with $n - 1$ degrees of freedom, and the test statistic $t = \dfrac{\hat{\mu} - \mu}{\sigma/\sqrt{n}} \Big/ \sqrt{\dfrac{(n-1)\hat{\sigma}^2/\sigma^2}{n-1}}$ is therefore t-distributed.

2.3 Two-sample tests

So far, we've looked at the case of having one sample and determining whether it's significantly greater than some hypothesized amount. But what about the case where we're interested in the difference between two samples? We're usually interested in testing whether the difference is significantly different from zero. There are a few different ways of dealing with this, depending on the underlying data.
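Before moving on to two samples, the one-sample interval from Section 2.2.2 can be sketched in a few lines. The coefficient is passed in explicitly, as if read from a t table: 2.262 is the standard two-sided 95% value for 9 degrees of freedom. The function name `t_confidence_interval` and the ten scores are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_confidence_interval(samples, k):
    """Interval mu_hat +/- k * sigma_hat / sqrt(n), with k looked up for n - 1 degrees of freedom."""
    n = len(samples)
    mu_hat = mean(samples)
    std_err = stdev(samples) / sqrt(n)   # stdev divides by n - 1, giving the sample standard deviation
    return mu_hat - k * std_err, mu_hat + k * std_err

# Hypothetical data with n = 10, so the t distribution has 9 degrees of freedom.
scores = [72, 85, 78, 90, 66, 81, 77, 88, 74, 79]
low, high = t_confidence_interval(scores, k=2.262)            # t-based coefficient (table value)
gauss_low, gauss_high = t_confidence_interval(scores, k=1.96) # Gaussian coefficient, for comparison

print(f"t-based 95% interval:        ({low:.2f}, {high:.2f})")
print(f"Gaussian-coefficient interval: ({gauss_low:.2f}, {gauss_high:.2f})")
```

The t-based interval is wider than the one using the Gaussian coefficient, reflecting the extra uncertainty from estimating the standard deviation. For other sample sizes the coefficient has to be looked up again, since it depends on the degrees of freedom.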

- In the case of matched pairs, we have a "before" value and an "after" value for each data point (for example, the scores of students before and after a class). Matching the pairs helps control the variance due to other factors, so we can simply look at the differences for each data point, $x_i^{\text{post}} - x_i^{\text{pre}}$, and perform a one-sample test against a null mean of 0.

- In the case of two samples with pooled variance, the means of the two samples might be different (this is usually the hypothesis we test), but the variances of each sample are assumed to be the same. This assumption allows us to combine, or pool, all the data points when estimating the sample variance. So, when computing the standard error, we'll use this formula:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}.$$

  Our test statistic is then

$$t = \frac{\hat{\mu}^{(1)} - \hat{\mu}^{(2)}}{s_p\sqrt{(1/n_1) + (1/n_2)}}.$$

  This test still provides reasonably good power, since we're using all the data to estimate $s_p$. In this setting, where the two groups have the same variance, we say the data are homoskedastic.

- In the general case of two samples with separate (not pooled) variance, the variances must be estimated separately. The result isn't quite a t distribution, and this variant is often known as Welch's t-test. It's important to keep in mind that this test will have lower statistical power since we are using less data to estimate each quantity. But, unless you have solid evidence that the variances are in fact equal, it's best to be conservative and stick with this test. In this setting, where the two groups have different variances, we say the data are heteroskedastic.

2.4 Some important warnings for hypothesis testing

- Correcting for multiple comparisons (very important): suppose you conduct 20 tests at a significance level of 0.05. Then on average, just by chance, even if the null hypothesis is true in every case, one of the tests will show a significant difference (see the relevant xkcd comic). There are a few standard ways of addressing this issue:

  - Bonferroni correction: If we're doing $m$ tests, use a significance value of $\alpha/m$ instead of $\alpha$. Note that this is very conservative, and will dramatically reduce the number of results declared significant.

  - False discovery rate (Benjamini-Hochberg): this technique controls the expected fraction of false positives among the rejected hypotheses at level $\alpha$, by using the very small p-values to allow slightly larger ones through as well.

- Rejecting the null hypothesis: You can never be completely sure that the null hypothesis is false from using a hypothesis test! Any statement stronger than "the data do not support the null hypothesis" should be made with extreme caution.

- Practical vs statistical significance: with $n$ large enough, any minutely small difference can be made statistically significant. The first example below demonstrates this point. Sometimes small differences like this matter (e.g., in close elections), but many times they don't.

- Independent and identically distributed: Many of our derivations and methods depend on samples being independent and identically distributed. There are ways of changing the methods to account for dependent samples, but it's important to be aware of the assumptions you need to use a particular method or test.

Example: Practical vs statistical significance

Suppose we are testing the fairness of a coin. Our null hypothesis might be $p = 0.5$. We collect a very large number of data points, observe a sample proportion $\hat{p} = 0.501$, and run a significance test. The large number of samples would lead to a p-value below 0.05, so at a 5% significance level, we would declare this significant. But, for practical purposes, even if the true mean were in fact 0.501, the coin is almost as good as fair. In this case, the strong statistical significance we obtained does not correspond to a "practically" significant difference. Figure 2.6 illustrates the null sampling distribution and the sampling distribution assuming a proportion of $p = 0.501$.

Figure 2.6: Sampling distributions for $p = 0.5$ (black) and $p = 0.501$ (blue) for the large $n$ used in this example. Note the scale of the x-axis: the large number of samples dramatically reduces the variance of each distribution.
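To see how sample size alone drives this effect, the sketch below holds the observed proportion at 0.501 and computes a two-sided p-value against p = 0.5 for several sample sizes, using the Gaussian approximation. The function name `two_sided_p_value` and the three sample sizes are assumptions chosen for illustration; they are not the values used in the original example.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(p_hat, p0, n):
    """Two-sided p-value for an observed proportion p_hat against the null value p0."""
    std_err = sqrt(p0 * (1 - p0) / n)   # standard deviation of p-hat under the null
    z = (p_hat - p0) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same observed proportion, increasingly large (hypothetical) sample sizes.
results = {n: two_sided_p_value(0.501, 0.5, n) for n in (1_000, 100_000, 10_000_000)}
for n, p_val in results.items():
    print(f"n = {n:>10,}: p-value = {p_val:.3g}")
```

The difference of 0.001 is identical in all three rows; only the amount of data changes, and with it the verdict of the test.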

Example: Pitfall of the day: Interpretation fallacies and Sally Clark

In the late 1990s, Sally Clark was convicted of murder after both her sons died suddenly within a few weeks of birth. The prosecutors made two main claims:

- The probability of two children independently dying suddenly from natural causes like Sudden Infant Death Syndrome (SIDS) is 1 in 73 million. Such an event would occur by chance only once every 100 years, which was evidence that the deaths were not natural.
- If the deaths were not due to two independent cases of SIDS (as asserted above), the only other possibility was that they were murdered.

The assumption of independence in the first item was later shown to be incorrect: the two children were not only genetically similar but also were raised in similar environments, causing dependence between the two events. This wrongful assumption of independence is a common error in statistical analysis. The probability then goes up dramatically [a].

Also, showing the unlikeliness of two chance deaths does not imply any particular alternative! Even if it were true, it doesn't make sense to consider the "1 in 73 million" claim by itself: it has to be compared to the probability of two murders (which was later estimated to be even lower). This second error is known as the prosecutor's fallacy. In fact, tests later showed bacterial infection in one of the children!

[a] See "Royal Statistical Society concerned by issues raised in Sally Clark Case", October.
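Returning to the multiple-comparisons warning in Section 2.4, the two corrections named there can be sketched as follows. The Benjamini-Hochberg step-up rule used here is the standard one (sort the p-values, find the largest rank k whose p-value is at most kα/m, reject everything up to that rank); the function names and the six p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each hypothesis whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up procedure: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest p-value
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected

# Hypothetical p-values from six tests.
p_values = [0.001, 0.008, 0.012, 0.041, 0.30, 0.74]
print("Bonferroni rejections:        ", bonferroni(p_values))
print("Benjamini-Hochberg rejections:", benjamini_hochberg(p_values))
```

On these made-up p-values the Bonferroni rule rejects fewer hypotheses than Benjamini-Hochberg, which is the sense in which it is the more conservative of the two.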


The following example will help us understand The Sampling Distribution of the Mean. C1 C2 C3 C4 C5 50 miles 84 miles 38 miles 120 miles 48 miles The followig eample will help us uderstad The Samplig Distributio of the Mea Review: The populatio is the etire collectio of all idividuals or objects of iterest The sample is the portio of the populatio

More information

Chapter 14 Nonparametric Statistics

Chapter 14 Nonparametric Statistics Chapter 14 Noparametric Statistics A.K.A. distributio-free statistics! Does ot deped o the populatio fittig ay particular type of distributio (e.g, ormal). Sice these methods make fewer assumptios, they

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Chapter 7: Confidence Interval and Sample Size

Chapter 7: Confidence Interval and Sample Size Chapter 7: Cofidece Iterval ad Sample Size Learig Objectives Upo successful completio of Chapter 7, you will be able to: Fid the cofidece iterval for the mea, proportio, ad variace. Determie the miimum

More information

Confidence Intervals

Confidence Intervals Cofidece Itervals Cofidece Itervals are a extesio of the cocept of Margi of Error which we met earlier i this course. Remember we saw: The sample proportio will differ from the populatio proportio by more

More information

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number.

GCSE STATISTICS. 4) How to calculate the range: The difference between the biggest number and the smallest number. GCSE STATISTICS You should kow: 1) How to draw a frequecy diagram: e.g. NUMBER TALLY FREQUENCY 1 3 5 ) How to draw a bar chart, a pictogram, ad a pie chart. 3) How to use averages: a) Mea - add up all

More information

Normal Distribution.

Normal Distribution. Normal Distributio www.icrf.l Normal distributio I probability theory, the ormal or Gaussia distributio, is a cotiuous probability distributio that is ofte used as a first approimatio to describe realvalued

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

STA 2023 Practice Questions Exam 2 Chapter 7- sec 9.2. Case parameter estimator standard error Estimate of standard error

STA 2023 Practice Questions Exam 2 Chapter 7- sec 9.2. Case parameter estimator standard error Estimate of standard error STA 2023 Practice Questios Exam 2 Chapter 7- sec 9.2 Formulas Give o the test: Case parameter estimator stadard error Estimate of stadard error Samplig Distributio oe mea x s t (-1) oe p ( 1 p) CI: prop.

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book)

MEI Structured Mathematics. Module Summary Sheets. Statistics 2 (Version B: reference to new book) MEI Mathematics i Educatio ad Idustry MEI Structured Mathematics Module Summary Sheets Statistics (Versio B: referece to ew book) Topic : The Poisso Distributio Topic : The Normal Distributio Topic 3:

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13

Discrete Mathematics and Probability Theory Spring 2014 Anant Sahai Note 13 EECS 70 Discrete Mathematics ad Probability Theory Sprig 2014 Aat Sahai Note 13 Itroductio At this poit, we have see eough examples that it is worth just takig stock of our model of probability ad may

More information

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem

Lecture 4: Cauchy sequences, Bolzano-Weierstrass, and the Squeeze theorem Lecture 4: Cauchy sequeces, Bolzao-Weierstrass, ad the Squeeze theorem The purpose of this lecture is more modest tha the previous oes. It is to state certai coditios uder which we are guarateed that limits

More information

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test)

Mann-Whitney U 2 Sample Test (a.k.a. Wilcoxon Rank Sum Test) No-Parametric ivariate Statistics: Wilcoxo-Ma-Whitey 2 Sample Test 1 Ma-Whitey 2 Sample Test (a.k.a. Wilcoxo Rak Sum Test) The (Wilcoxo-) Ma-Whitey (WMW) test is the o-parametric equivalet of a pooled

More information

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem () Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

More information

3 Basic Definitions of Probability Theory

3 Basic Definitions of Probability Theory 3 Basic Defiitios of Probability Theory 3defprob.tex: Feb 10, 2003 Classical probability Frequecy probability axiomatic probability Historical developemet: Classical Frequecy Axiomatic The Axiomatic defiitio

More information

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed.

Here are a couple of warnings to my students who may be here to get a copy of what happened on a day that you missed. This documet was writte ad copyrighted by Paul Dawkis. Use of this documet ad its olie versio is govered by the Terms ad Coditios of Use located at http://tutorial.math.lamar.edu/terms.asp. The olie versio

More information

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal)

Parametric (theoretical) probability distributions. (Wilks, Ch. 4) Discrete distributions: (e.g., yes/no; above normal, normal, below normal) 6 Parametric (theoretical) probability distributios. (Wilks, Ch. 4) Note: parametric: assume a theoretical distributio (e.g., Gauss) No-parametric: o assumptio made about the distributio Advatages of assumig

More information

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN

Analyzing Longitudinal Data from Complex Surveys Using SUDAAN Aalyzig Logitudial Data from Complex Surveys Usig SUDAAN Darryl Creel Statistics ad Epidemiology, RTI Iteratioal, 312 Trotter Farm Drive, Rockville, MD, 20850 Abstract SUDAAN: Software for the Statistical

More information

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable

Week 3 Conditional probabilities, Bayes formula, WEEK 3 page 1 Expected value of a random variable Week 3 Coditioal probabilities, Bayes formula, WEEK 3 page 1 Expected value of a radom variable We recall our discussio of 5 card poker hads. Example 13 : a) What is the probability of evet A that a 5

More information

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS

Unit 8: Inference for Proportions. Chapters 8 & 9 in IPS Uit 8: Iferece for Proortios Chaters 8 & 9 i IPS Lecture Outlie Iferece for a Proortio (oe samle) Iferece for Two Proortios (two samles) Cotigecy Tables ad the χ test Iferece for Proortios IPS, Chater

More information

Quadrat Sampling in Population Ecology

Quadrat Sampling in Population Ecology Quadrat Samplig i Populatio Ecology Backgroud Estimatig the abudace of orgaisms. Ecology is ofte referred to as the "study of distributio ad abudace". This beig true, we would ofte like to kow how may

More information

A Mathematical Perspective on Gambling

A Mathematical Perspective on Gambling A Mathematical Perspective o Gamblig Molly Maxwell Abstract. This paper presets some basic topics i probability ad statistics, icludig sample spaces, probabilistic evets, expectatios, the biomial ad ormal

More information

15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011

15.075 Exam 3. Instructor: Cynthia Rudin TA: Dimitrios Bisias. November 22, 2011 15.075 Exam 3 Istructor: Cythia Rudi TA: Dimitrios Bisias November 22, 2011 Gradig is based o demostratio of coceptual uderstadig, so you eed to show all of your work. Problem 1 A compay makes high-defiitio

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Incremental calculation of weighted mean and variance

Incremental calculation of weighted mean and variance Icremetal calculatio of weighted mea ad variace Toy Fich faf@cam.ac.uk dot@dotat.at Uiversity of Cambridge Computig Service February 009 Abstract I these otes I eplai how to derive formulae for umerically

More information

OMG! Excessive Texting Tied to Risky Teen Behaviors

OMG! Excessive Texting Tied to Risky Teen Behaviors BUSIESS WEEK: EXECUTIVE EALT ovember 09, 2010 OMG! Excessive Textig Tied to Risky Tee Behaviors Kids who sed more tha 120 a day more likely to try drugs, alcohol ad sex, researchers fid TUESDAY, ov. 9

More information

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas:

Chapter 7 - Sampling Distributions. 1 Introduction. What is statistics? It consist of three major areas: Chapter 7 - Samplig Distributios 1 Itroductio What is statistics? It cosist of three major areas: Data Collectio: samplig plas ad experimetal desigs Descriptive Statistics: umerical ad graphical summaries

More information

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation

hp calculators HP 12C Statistics - average and standard deviation Average and standard deviation concepts HP12C average and standard deviation HP 1C Statistics - average ad stadard deviatio Average ad stadard deviatio cocepts HP1C average ad stadard deviatio Practice calculatig averages ad stadard deviatios with oe or two variables HP 1C Statistics

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

A probabilistic proof of a binomial identity

A probabilistic proof of a binomial identity A probabilistic proof of a biomial idetity Joatho Peterso Abstract We give a elemetary probabilistic proof of a biomial idetity. The proof is obtaied by computig the probability of a certai evet i two

More information

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here).

Example 2 Find the square root of 0. The only square root of 0 is 0 (since 0 is not positive or negative, so those choices don t exist here). BEGINNING ALGEBRA Roots ad Radicals (revised summer, 00 Olso) Packet to Supplemet the Curret Textbook - Part Review of Square Roots & Irratioals (This portio ca be ay time before Part ad should mostly

More information

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction

THE ARITHMETIC OF INTEGERS. - multiplication, exponentiation, division, addition, and subtraction THE ARITHMETIC OF INTEGERS - multiplicatio, expoetiatio, divisio, additio, ad subtractio What to do ad what ot to do. THE INTEGERS Recall that a iteger is oe of the whole umbers, which may be either positive,

More information

The Stable Marriage Problem

The Stable Marriage Problem The Stable Marriage Problem William Hut Lae Departmet of Computer Sciece ad Electrical Egieerig, West Virgiia Uiversity, Morgatow, WV William.Hut@mail.wvu.edu 1 Itroductio Imagie you are a matchmaker,

More information

Research Method (I) --Knowledge on Sampling (Simple Random Sampling)

Research Method (I) --Knowledge on Sampling (Simple Random Sampling) Research Method (I) --Kowledge o Samplig (Simple Radom Samplig) 1. Itroductio to samplig 1.1 Defiitio of samplig Samplig ca be defied as selectig part of the elemets i a populatio. It results i the fact

More information

Hypergeometric Distributions

Hypergeometric Distributions 7.4 Hypergeometric Distributios Whe choosig the startig lie-up for a game, a coach obviously has to choose a differet player for each positio. Similarly, whe a uio elects delegates for a covetio or you

More information

CHAPTER 3 DIGITAL CODING OF SIGNALS

CHAPTER 3 DIGITAL CODING OF SIGNALS CHAPTER 3 DIGITAL CODING OF SIGNALS Computers are ofte used to automate the recordig of measuremets. The trasducers ad sigal coditioig circuits produce a voltage sigal that is proportioal to a quatity

More information

STATISTICAL METHODS FOR BUSINESS

STATISTICAL METHODS FOR BUSINESS STATISTICAL METHODS FOR BUSINESS UNIT 7: INFERENTIAL TOOLS. DISTRIBUTIONS ASSOCIATED WITH SAMPLING 7.1.- Distributios associated with the samplig process. 7.2.- Iferetial processes ad relevat distributios.

More information

Now here is the important step

Now here is the important step LINEST i Excel The Excel spreadsheet fuctio "liest" is a complete liear least squares curve fittig routie that produces ucertaity estimates for the fit values. There are two ways to access the "liest"

More information

, a Wishart distribution with n -1 degrees of freedom and scale matrix.

, a Wishart distribution with n -1 degrees of freedom and scale matrix. UMEÅ UNIVERSITET Matematisk-statistiska istitutioe Multivariat dataaalys D MSTD79 PA TENTAMEN 004-0-9 LÖSNINGSFÖRSLAG TILL TENTAMEN I MATEMATISK STATISTIK Multivariat dataaalys D, 5 poäg.. Assume that

More information

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS

COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S 2 CONTROL CHART FOR THE CHANGES IN A PROCESS COMPARISON OF THE EFFICIENCY OF S-CONTROL CHART AND EWMA-S CONTROL CHART FOR THE CHANGES IN A PROCESS Supraee Lisawadi Departmet of Mathematics ad Statistics, Faculty of Sciece ad Techoology, Thammasat

More information

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES

SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES SECTION 1.5 : SUMMATION NOTATION + WORK WITH SEQUENCES Read Sectio 1.5 (pages 5 9) Overview I Sectio 1.5 we lear to work with summatio otatio ad formulas. We will also itroduce a brief overview of sequeces,

More information

Page 1. Real Options for Engineering Systems. What are we up to? Today s agenda. J1: Real Options for Engineering Systems. Richard de Neufville

Page 1. Real Options for Engineering Systems. What are we up to? Today s agenda. J1: Real Options for Engineering Systems. Richard de Neufville Real Optios for Egieerig Systems J: Real Optios for Egieerig Systems By (MIT) Stefa Scholtes (CU) Course website: http://msl.mit.edu/cmi/ardet_2002 Stefa Scholtes Judge Istitute of Maagemet, CU Slide What

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations

CS103A Handout 23 Winter 2002 February 22, 2002 Solving Recurrence Relations CS3A Hadout 3 Witer 00 February, 00 Solvig Recurrece Relatios Itroductio A wide variety of recurrece problems occur i models. Some of these recurrece relatios ca be solved usig iteratio or some other ad

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics We leared to describe data sets graphically. We ca also describe a data set umerically. Measures of Locatio Defiitio The sample mea is the arithmetic average of values. We deote

More information

THE TWO-VARIABLE LINEAR REGRESSION MODEL

THE TWO-VARIABLE LINEAR REGRESSION MODEL THE TWO-VARIABLE LINEAR REGRESSION MODEL Herma J. Bieres Pesylvaia State Uiversity April 30, 202. Itroductio Suppose you are a ecoomics or busiess maor i a college close to the beach i the souther part

More information

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1) BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet

More information

Example: Probability ($1 million in S&P 500 Index will decline by more than 20% within a

Example: Probability ($1 million in S&P 500 Index will decline by more than 20% within a Value at Risk For a give portfolio, Value-at-Risk (VAR) is defied as the umber VAR such that: Pr( Portfolio loses more tha VAR withi time period t)

More information

Central Limit Theorem and Its Applications to Baseball

Central Limit Theorem and Its Applications to Baseball Cetral Limit Theorem ad Its Applicatios to Baseball by Nicole Aderso A project submitted to the Departmet of Mathematical Scieces i coformity with the requiremets for Math 4301 (Hoours Semiar) Lakehead

More information

Topic 5: Confidence Intervals (Chapter 9)

Topic 5: Confidence Intervals (Chapter 9) Topic 5: Cofidece Iterval (Chapter 9) 1. Itroductio The two geeral area of tatitical iferece are: 1) etimatio of parameter(), ch. 9 ) hypothei tetig of parameter(), ch. 10 Let X be ome radom variable with

More information

The Fundamental Forces of Nature

The Fundamental Forces of Nature Gravity The Fudametal Forces of Nature There exist oly four fudametal forces Electromagetism Strog force Weak force Gravity Gravity 2 The Hierarchy Problem Gravity is far weaker tha ay of the forces! Why?!?

More information

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as:

A Test of Normality. 1 n S 2 3. n 1. Now introduce two new statistics. The sample skewness is defined as: A Test of Normality Textbook Referece: Chapter. (eighth editio, pages 59 ; seveth editio, pages 6 6). The calculatio of p values for hypothesis testig typically is based o the assumptio that the populatio

More information

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009)

Lecture 13. Lecturer: Jonathan Kelner Scribe: Jonathan Pines (2009) 18.409 A Algorithmist s Toolkit October 27, 2009 Lecture 13 Lecturer: Joatha Keler Scribe: Joatha Pies (2009) 1 Outlie Last time, we proved the Bru-Mikowski iequality for boxes. Today we ll go over the

More information

LECTURE 13: Cross-validation

LECTURE 13: Cross-validation LECTURE 3: Cross-validatio Resampli methods Cross Validatio Bootstrap Bias ad variace estimatio with the Bootstrap Three-way data partitioi Itroductio to Patter Aalysis Ricardo Gutierrez-Osua Texas A&M

More information

Present Values, Investment Returns and Discount Rates

Present Values, Investment Returns and Discount Rates Preset Values, Ivestmet Returs ad Discout Rates Dimitry Midli, ASA, MAAA, PhD Presidet CDI Advisors LLC dmidli@cdiadvisors.com May 2, 203 Copyright 20, CDI Advisors LLC The cocept of preset value lies

More information

Convexity, Inequalities, and Norms

Convexity, Inequalities, and Norms Covexity, Iequalities, ad Norms Covex Fuctios You are probably familiar with the otio of cocavity of fuctios. Give a twicedifferetiable fuctio ϕ: R R, We say that ϕ is covex (or cocave up) if ϕ (x) 0 for

More information

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is

Trigonometric Form of a Complex Number. The Complex Plane. axis. ( 2, 1) or 2 i FIGURE 6.44. The absolute value of the complex number z a bi is 0_0605.qxd /5/05 0:45 AM Page 470 470 Chapter 6 Additioal Topics i Trigoometry 6.5 Trigoometric Form of a Complex Number What you should lear Plot complex umbers i the complex plae ad fid absolute values

More information

Confidence Intervals for Linear Regression Slope

Confidence Intervals for Linear Regression Slope Chapter 856 Cofidece Iterval for Liear Regreio Slope Itroductio Thi routie calculate the ample ize eceary to achieve a pecified ditace from the lope to the cofidece limit at a tated cofidece level for

More information

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature.

*The most important feature of MRP as compared with ordinary inventory control analysis is its time phasing feature. Itegrated Productio ad Ivetory Cotrol System MRP ad MRP II Framework of Maufacturig System Ivetory cotrol, productio schedulig, capacity plaig ad fiacial ad busiess decisios i a productio system are iterrelated.

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

Repeating Decimals are decimal numbers that have number(s) after the decimal point that repeat in a pattern.

Repeating Decimals are decimal numbers that have number(s) after the decimal point that repeat in a pattern. 5.5 Fractios ad Decimals Steps for Chagig a Fractio to a Decimal. Simplify the fractio, if possible. 2. Divide the umerator by the deomiator. d d Repeatig Decimals Repeatig Decimals are decimal umbers

More information

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return

where: T = number of years of cash flow in investment's life n = the year in which the cash flow X n i = IRR = the internal rate of return EVALUATING ALTERNATIVE CAPITAL INVESTMENT PROGRAMS By Ke D. Duft, Extesio Ecoomist I the March 98 issue of this publicatio we reviewed the procedure by which a capital ivestmet project was assessed. The

More information

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find

Approximating Area under a curve with rectangles. To find the area under a curve we approximate the area using rectangles and then use limits to find 1.8 Approximatig Area uder a curve with rectagles 1.6 To fid the area uder a curve we approximate the area usig rectagles ad the use limits to fid 1.4 the area. Example 1 Suppose we wat to estimate 1.

More information

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu

Multi-server Optimal Bandwidth Monitoring for QoS based Multimedia Delivery Anup Basu, Irene Cheng and Yinzhe Yu Multi-server Optimal Badwidth Moitorig for QoS based Multimedia Delivery Aup Basu, Iree Cheg ad Yizhe Yu Departmet of Computig Sciece U. of Alberta Architecture Applicatio Layer Request receptio -coectio

More information

Basic Elements of Arithmetic Sequences and Series

Basic Elements of Arithmetic Sequences and Series MA40S PRE-CALCULUS UNIT G GEOMETRIC SEQUENCES CLASS NOTES (COMPLETED NO NEED TO COPY NOTES FROM OVERHEAD) Basic Elemets of Arithmetic Sequeces ad Series Objective: To establish basic elemets of arithmetic

More information

λ λ(1+δ) e λ 2πe λ(1+δ) = eλδ (1 + δ) λ(1+δ) 1/2 2πλ p(x) = e (x λ)2 /(2λ) 2πλ

λ λ(1+δ) e λ 2πe λ(1+δ) = eλδ (1 + δ) λ(1+δ) 1/2 2πλ p(x) = e (x λ)2 /(2λ) 2πλ 2.1.5 Gaussia distributio as a limit of the Poisso distributio A limitig form of the Poisso distributio (ad may others see the Cetral Limit Theorem below) is the Gaussia distributio. I derivig the Poisso

More information

Hypothesis testing using complex survey data

Hypothesis testing using complex survey data Hypotesis testig usig complex survey data A Sort Course preseted by Peter Ly, Uiversity of Essex i associatio wit te coferece of te Europea Survey Researc Associatio Prague, 5 Jue 007 1 1. Objective: Simple

More information

3. Greatest Common Divisor - Least Common Multiple

3. Greatest Common Divisor - Least Common Multiple 3 Greatest Commo Divisor - Least Commo Multiple Defiitio 31: The greatest commo divisor of two atural umbers a ad b is the largest atural umber c which divides both a ad b We deote the greatest commo gcd

More information

Logistic Regression. Chapter 12. 12.1 Modeling Conditional Probabilities

Logistic Regression. Chapter 12. 12.1 Modeling Conditional Probabilities Chapter 12 Logistic Regressio 12.1 Modelig Coditioal Probabilities So far, we either looked at estimatig the coditioal expectatios of cotiuous variables (as i regressio), or at estimatig distributios.

More information