Chapter 16: Confidence intervals

Objective
(1) Learn how to estimate the margin of error in "proportion"-type statistics calculated from a random sample.
(2) Learn how to construct confidence intervals.

Concept briefs:
* Sampling distribution theorem for proportions = When the conditions are met, sample proportions follow the normal model N(p, σ), with p = true population parameter and σ = sqrt[p (1 - p) / n].
* Standard Error (SE) = sqrt[^p (1 - ^p) / n] (^p = sampled statistic).
* Confidence level = % chosen for constructing the confidence interval.
* Critical value (z*) = z-value that corresponds to the confidence level chosen.
* Margin of error (ME) = z* x SE. It indicates how far ^p may be from the true p.
* Confidence interval (CI) = ^p ± ME. It gives the interval within which we estimate the true value of p to lie.
* One-proportion z-interval = General term for a confidence interval for proportions.
* Confidence interval tradeoffs: Be aware of the effect of confidence level and sample size on the margin of error.
Confidence Intervals: Concept Summary

What is a confidence interval (CI)?
It is a numerical range within which we estimate the true value of a population parameter to lie. E.g.,
(1) The true proportion of students who live off-campus is in the interval [5.8%, 19%].
(2) The true mean GPA of students is in the interval [2.90, 3.22].

How accurate are confidence intervals?
A CI is estimated based on probability. There is no guarantee that the true value is contained within the CI. However, we can construct a CI by a method that captures the true parameter with very high probability. In fact, we can construct it to any desired level of confidence. E.g.,
(1) We are 95% confident that the true proportion of students who live off-campus is in the interval [5.8%, 19%].
(2) We are 80% confident that the true proportion is in the interval [8.2%, 16.6%].

What is the margin of error?
The width of the CI tells us the margin of error (ME). Technically, the ME is half the width of the confidence interval.

How does the width relate to the confidence level?
To capture the true parameter with higher probability (i.e., higher confidence level), the CI must be wider. Thus, for example, a 95% CI is always wider than an 80% CI computed from the same sample. Therefore, the margin of error always gets bigger with a higher confidence level.

Two key interpretations of confidence level:
(1) It denotes how confident we can be that the true parameter is contained within the CI.
(2) It denotes the percentage of all random samples (from the population) whose CI will contain the true parameter.
Illustration:
* We survey a random sample of 100 students to determine the % of EC students who live off-campus.
* Suppose our survey estimates this statistic to be 12.4% (so our ^p = 0.124).

Central Limit Theorem says: If you repeat your study many times with different random samples, and take the mean of all your estimates, you'll get (very nearly) the true value you're seeking (i.e., p).

Reality says: What is the best I can do with the statistic calculated from my single study?

The goal: Find the margin of error in our calculated ^p.

Issues/problems
(1) We don't know the true p, so we can't get the sampling distribution model.
(2) Without p, we can't tell how far ^p is from p, or even in which direction.

[Figure: ^p may lie on either side of the unknown p.]
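To make the "repeat your study" idea concrete, here is a minimal simulation sketch. It assumes a hypothetical true proportion of 0.12, which is not from the survey and would be unknown in practice; the names `TRUE_P`, `N_SURVEYS` and `survey_proportion` are illustrative only.

```python
import random

random.seed(0)        # fixed seed so the run is repeatable
TRUE_P = 0.12         # hypothetical true proportion (an assumption, unknown in practice)
N = 100               # students per survey, as in the illustration
N_SURVEYS = 2000      # number of repeated surveys to simulate


def survey_proportion(p, n):
    """Simulate one survey: fraction of n random students who live off-campus."""
    return sum(random.random() < p for _ in range(n)) / n


estimates = [survey_proportion(TRUE_P, N) for _ in range(N_SURVEYS)]
mean_estimate = sum(estimates) / len(estimates)

print(f"mean of all estimates: {mean_estimate:.4f}")
print(f"smallest single estimate: {min(estimates):.2f}")
print(f"largest single estimate:  {max(estimates):.2f}")
```

The mean of the simulated estimates should land very close to the assumed p, while individual surveys scatter on both sides of it. That scatter is what the margin of error is meant to quantify.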
Resolution to problem 1
* We know the true sampling distribution would follow the normal model with mean p and SD = sqrt[p (1 - p) / n].
* Since p is not known, we can't find the mean or SD.
* To compromise, we "fudge" the SD and calculate it using ^p instead of p. This is called the Standard Error (SE) instead of the Standard Deviation (SD).

E.g.: SE = sqrt[^p (1 - ^p) / n] = sqrt[0.124 x (1 - 0.124) / 100] ≈ 0.033
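The standard error calculation above can be checked with a few lines of Python. This is only a sketch; the function name `standard_error` is chosen here for illustration.

```python
from math import sqrt


def standard_error(p_hat, n):
    """SE of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return sqrt(p_hat * (1 - p_hat) / n)


# Off-campus survey: p_hat = 0.124 from a sample of n = 100 students
se = standard_error(0.124, 100)
print(f"SE = {se:.4f}")
```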
Resolution to problem 2
Illustration:
* We don't know the true p, but we know that 95% of all random samples will give an answer within 2σ of this value (provided we know σ).
* We have a decent approximation for σ: SE = sqrt[^p (1 - ^p) / n].
* So, we know the true answer must lie (at worst) within ± 2 SE from the sampled result. 2SE = 6.6%. Thus, the true answer is between 12.4% - 6.6% = 5.8% and 12.4% + 6.6% = 19%.
* Confidence interval says: "We are 95% confident that the % of students who live off-campus is between 5.8% and 19%."

Q: Is it possible to get really unlucky and pick a sample outside the 95% that are within 2SE of the true answer?
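As a quick arithmetic check of the ± 2 SE interval for the off-campus survey, the following sketch recomputes the numbers from ^p = 0.124 and n = 100. It uses the rounded multiplier 2 from this slide rather than a table value of z*.

```python
from math import sqrt

p_hat, n = 0.124, 100                   # sample proportion and sample size
se = sqrt(p_hat * (1 - p_hat) / n)      # standard error, our stand-in for sigma
me = 2 * se                             # "2 SE" rule of thumb for 95% confidence

low, high = p_hat - me, p_hat + me
print(f"2SE = {me:.1%}")
print(f"interval: {low:.1%} to {high:.1%}")
```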
Confidence Interval Recipe

Objective: You have a sampled proportion ^p. You want to predict the range in which the true proportion p lies.

Step 0: Identify the sample proportion ^p, if you haven't already.
Step 1: Determine the confidence level you want for your prediction (e.g., 90%).
Step 2: Find the critical value (z*) for this confidence level.
Step 3: Verify the conditions for the theorem; find SE = sqrt[^p (1 - ^p) / n].
Step 4: Find the margin of error: ME = z* x SE.
Step 5: Find the confidence interval: ^p - ME ≤ p ≤ ^p + ME.
Step 6: Write a sentence (or two) that states and interprets your CI.

Points to note:
(1) Confidence intervals always involve a tradeoff between margin of error and level of confidence.
(2) If you want higher confidence, you buy this by increasing the margin of error (think of this as a "margin of safety"). E.g., for 100% confidence, the interval must cover the whole range from 0% to 100%.
(3) The only way to get high confidence with a smaller margin of error is to find a way to decrease SE. The only way to do that is by increasing the sample size.
(4) There is an important technical name for this confidence interval: one-proportion z-interval.
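The recipe can be sketched as a small Python function. The function name and the "at least 10 expected successes and failures" check are assumptions made for this sketch (the slides only say to verify the conditions); randomness and independence of the sample cannot be checked in code.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(p_hat, n, confidence=0.95):
    """Return (low, high) for a one-proportion z-interval."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Step 3 (partly): sample-size condition, using an assumed threshold of 10
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the normal model")
    # Step 2: z* encloses the central `confidence` area of the standard normal
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    # Step 3: standard error from the sampled proportion
    se = sqrt(p_hat * (1 - p_hat) / n)
    # Steps 4 and 5: margin of error, then the interval
    me = z_star * se
    return p_hat - me, p_hat + me


# Tradeoff 1: higher confidence level gives a wider interval (same sample)
widths = []
for level in (0.80, 0.90, 0.95, 0.99):
    low, high = one_proportion_z_interval(0.124, 100, level)
    widths.append(high - low)
    print(f"{level:.0%} CI: {low:.1%} to {high:.1%}")

# Tradeoff 2: four times the sample size halves the margin of error
low_100, high_100 = one_proportion_z_interval(0.124, 100)
low_400, high_400 = one_proportion_z_interval(0.124, 400)
print(f"95% ME, n=100: {(high_100 - low_100) / 2:.3f}")
print(f"95% ME, n=400: {(high_400 - low_400) / 2:.3f}")
```

Because SE shrinks with the square root of n, cutting the margin of error in half costs four times as many observations.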
How to find critical z* values: More examples

Objective: For a confidence level of x%, we want to find the z-value that encloses the central x% of the standard normal model.

Example 1: z* for 80% confidence level
Confidence level = 80%: the central 80% leaves 10% in each tail, so look up the z-value for area = 90%. Thus, z* ≈ 1.28.

Example 2: z* for 90% confidence level
Confidence level = 90%: the central 90% leaves 5% in each tail, so look up the z-value for area = 95%. Thus, z* ≈ 1.645.

Example 3: z* for 98% confidence level
Confidence level = 98%: the central 98% leaves 1% in each tail, so look up the z-value for area = 99%. Thus, z* ≈ 2.33.
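Instead of a printed table, the same lookup can be done with the inverse normal CDF from Python's standard library. This is a sketch; `critical_value` is a name chosen here for illustration.

```python
from statistics import NormalDist


def critical_value(confidence):
    """z* that encloses the central `confidence` area of the standard normal."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)  # z-value with area (1 - tail) to its left


for level in (0.80, 0.90, 0.98):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```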
Exercise 30, pg. 448
Strategy for (b):
* For the newspaper: ^p = 0.53, n = 1200. Want 95% confidence. To find z*, look up the standard normal table for 97.5% area: z* = 1.96.
  Check conditions: random, independent, sufficiently large.
  SE = sqrt[0.53 x (1 - 0.53) / 1200] = X (calculate this value yourself!)
  ME = 1.96 x X
  Confidence interval: 0.53 - 1.96 X to 0.53 + 1.96 X.
* For the statistics class: ^p = 0.54, n = 450. Confidence level and z* are the same. Check conditions.
  SE = sqrt[0.54 x (1 - 0.54) / 450] = Y (calculate this value yourself!)
  ME = 1.96 x Y
  Confidence interval: 0.54 - 1.96 Y to 0.54 + 1.96 Y.

Answers:
Newspaper CI: [0.5018, 0.5582] OR 50.18% to 55.82%
Statistics class CI: [0.4939, 0.5861] OR 49.39% to 58.61%

Exercise 38, pg. 448
Strategy:
(a) Find z* for the 98% confidence level (i.e., look up the z-score for 99% area). The question wants ME = 0.05. This requires: 0.05 = z* x SE. You know z*, so you can find SE [Check your answer: SE = 0.0215]. Take the worst-case scenario (i.e., the largest SE, which happens for ^p = 0.5). Use SE = sqrt[0.5 x (1 - 0.5) / n] to find n, which gives n = 0.5 x (1 - 0.5) / SE^2. [Answer: n ~ 543]
(b) Very similar strategy. Get answers: SE = 0.0129, n ~ 1503.
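The strategies for both exercises can be checked with a short script. This is a sketch that assumes the table values z* = 1.96 (95%) and z* = 2.33 (98%); the helper names `interval` and `sample_size` are illustrative. Results may differ from the printed answers in the last decimal place depending on where intermediate values are rounded.

```python
from math import ceil, sqrt

Z_95 = 1.96   # table value for 95% confidence
Z_98 = 2.33   # table value for 98% confidence (rounded to two decimals)


def interval(p_hat, n, z_star):
    """Confidence interval p_hat ± z* x SE for a sample proportion."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return p_hat - me, p_hat + me


def sample_size(me, z_star, p_guess=0.5):
    """Smallest n whose margin of error is at most `me`.

    p_guess = 0.5 is the worst case, because it gives the largest SE.
    """
    se = me / z_star                       # SE needed to hit the target ME
    return ceil(p_guess * (1 - p_guess) / se ** 2)


# Exercise 30(b)
newspaper = interval(0.53, 1200, Z_95)
stats_class = interval(0.54, 450, Z_95)
print(f"newspaper:   {newspaper[0]:.4f} to {newspaper[1]:.4f}")
print(f"stats class: {stats_class[0]:.4f} to {stats_class[1]:.4f}")

# Exercise 38(a): ME = 0.05 at 98% confidence
n_a = sample_size(0.05, Z_98)
print(f"required n for 38(a): {n_a}")
```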