Statistics 104 Lecture 16 (IPS 6.1): Confidence intervals, the general concept


Outline for today
Confidence intervals
Confidence intervals for a mean, µ (known σ)
Confidence intervals for a proportion, p
Margin of error and sample size
Review of main topics for the exam

Confidence intervals: the general concept
We want to make a statement (inference) about a population parameter (e.g. µ or p; unknown value) using information from observed sample data (a statistic; an estimate such as $\bar{x}$ or $\hat{p}$).
The general form of a confidence interval is
Point estimate ± margin of error
or more specifically
Point estimate ± (z quantile) × (SD of estimate)
(Lower confidence limit, Upper confidence limit)

Confidence intervals for a mean, µ
Confidence intervals for estimating a population parameter (e.g., mean, µ) are based on the sampling distribution of a statistic (e.g., sample mean).

Confidence intervals
We don't need to take a lot of random samples to rebuild the sampling distribution and find the population mean µ at its center.
[Figure: a sample drawn from a population with mean µ]
All we need is one SRS of size n, and then we rely on the properties of the sampling distribution to make an inference about the population mean.

Confidence interval for a mean, µ
Recall the central limit theorem: if we draw a sample of size n from any population with mean µ and standard deviation σ, then when n is large (n > 30)
$\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
When the standard deviation σ of the population is known, the 95% confidence interval for a mean, based on a sample of size n, is
$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$

Example: credit card debt
Credit card debt for families in the US
Assume we don't know the population mean, µ, but we do know the population SD, σ = $1,420
(Most often in practice both µ and σ are unknown)
We took an SRS of size n = 25 and calculated a sample mean of $2,410
How accurate is this estimate of $2,410 from our sample?
What is the 95% confidence interval for the population mean, µ?
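The interval asked for in the credit card example can be computed directly from the known-σ formula. The following is a minimal sketch rather than part of the lecture: the function name `mean_ci_known_sigma` is made up for illustration, and the z quantile comes from Python's standard-library `NormalDist` (about 1.96 for 95% confidence).

```python
import math
from statistics import NormalDist


def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """z-based confidence interval for a mean when sigma is known.

    Hypothetical helper for illustration; assumes an SRS and that the
    sampling distribution of the sample mean is approximately normal.
    """
    # Two-sided z quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)  # margin of error
    return xbar - margin, xbar + margin


# Credit card example: sample mean $2,410, sigma = $1,420, n = 25.
lower, upper = mean_ci_known_sigma(2410, 1420, 25)
print(f"95% CI: (${lower:,.0f}, ${upper:,.0f})")  # prints: 95% CI: ($1,853, $2,967)
```

Calling the same function with n = 100 or n = 1,000 shows how the interval narrows as the sample size grows.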
Example: credit card debt
The 95% CI is of the form
$\left(\bar{x} - 1.96\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\,\frac{\sigma}{\sqrt{n}}\right)$
Inserting the values for $\bar{x}$, n and σ
$\left(2{,}410 - 1.96\,\frac{1{,}420}{\sqrt{25}},\ 2{,}410 + 1.96\,\frac{1{,}420}{\sqrt{25}}\right)$
So the 95% CI is (2,410 − 557, 2,410 + 557) or ($1,853, $2,967)

Notes on confidence intervals
From the previous example:
When n = 25, the 95% CI is (2,410 − 557, 2,410 + 557)
When n = 100, the 95% CI is (2,410 − 278, 2,410 + 278)
When n = 1,000, the 95% CI is (2,410 − 88, 2,410 + 88)
When n = 10,000, the 95% CI is (2,410 − 28, 2,410 + 28)
The larger the sample size, the tighter the CI, and the more accurate our statement regarding µ
Often in practice we don't know σ, so we have to estimate it with s, which also varies
For large n this works well; for n < 30 we need to use a t distribution rather than a z distribution (in 1 week)

Confidence intervals for a proportion, p
Confidence intervals for estimating a population proportion (p) are based on the sampling distribution of a statistic (the sample proportion, $\hat{p}$)

Confidence intervals for a proportion, p
Recall, if we draw a sample of size n from a binary population (p = P(Success)), then when np ≥ 10 and n(1 − p) ≥ 10
$\hat{p} \sim N\left(p, \sqrt{\frac{p(1-p)}{n}}\right)$
The 95% confidence interval for a proportion, based on a sample of size n, is
$\left(\hat{p} - 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$

Example: poverty rate
The poverty rate is the proportion of households in the US for which the family's total income is less than that family's threshold (1 person $9,973, 2 persons $12,755, 3 persons $15,577, etc.)
The overall US poverty rate was 12.6% in 2005
What is the poverty rate in northern Maine?
A random sample of 100 families in this region was studied
The poverty rate for this sample was 21%
What is the 95% confidence interval for the poverty rate in northern Maine?

Example: poverty rate
The 95% CI is of the form
$\left(\hat{p} - 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$
Inserting the values for n and $\hat{p}$
$\left(0.21 - 1.96\sqrt{\frac{0.21(1-0.21)}{100}},\ 0.21 + 1.96\sqrt{\frac{0.21(1-0.21)}{100}}\right)$
So the 95% CI is (0.21 − 0.08, 0.21 + 0.08) or (13%, 29%)
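The poverty-rate interval follows the same pattern as the interval for a mean, with the SD of the estimate replaced by $\sqrt{\hat{p}(1-\hat{p})/n}$. A minimal sketch, assuming the large-sample normal approximation holds; `proportion_ci` is an illustrative name, not something from the lecture.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95):
    """Large-sample z interval for a population proportion.

    Hypothetical helper for illustration; it does not check the
    np >= 10 and n(1 - p) >= 10 conditions mentioned on the slides.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated SD of p-hat
    margin = z * se
    return p_hat - margin, p_hat + margin


# Poverty-rate example: p-hat = 0.21, shown for several sample sizes.
for n in (100, 400, 6000):
    lower, upper = proportion_ci(0.21, n)
    margin = (upper - lower) / 2
    print(f"n = {n:>5}: ({lower:.3f}, {upper:.3f}), margin of error = {margin:.1%}")
```

For n = 100 the interval is roughly (0.13, 0.29), matching the slide, and the margin shrinks in proportion to $1/\sqrt{n}$.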
Observations
From the previous example:
When n = 100, the 95% CI is (21% − 8%, 21% + 8%)
When n = 400, the 95% CI is (21% − 4%, 21% + 4%)
When n = 6,000, the 95% CI is (21% − 1%, 21% + 1%)
The larger the sample size, the tighter the CI, and the more accurate our statement regarding p
For np ≥ 10 and n(1 − p) ≥ 10 this works well
For other situations we use binomial tables

The margin of error
The general form of a confidence interval
Point estimate ± z* × (SD of estimate)
Point estimate ± margin of error
Margin of error gives the accuracy of the estimate
How to reduce the margin of error
1) Increase sample size
2) Use a lower level of confidence
3) Reduce SD of estimate

Sample size calculation
Confidence interval methodology can be used to determine the sample size needed
Steps
1. Identify the desired precision you want
2. Set the margin of error equal to that precision
3. Solve for n (always round up to an integer)

Example: sample size calculation
In a clinical research setting we want to be able to estimate a response rate to within ± 10% with 95% confidence
How many patients do we need to study?
Set $0.10 = 1.96\sqrt{\frac{p(1-p)}{n}}$ and solve for n
p = 0.5 maximizes p(1 − p) (conservative)
$0.10 = 1.96 \cdot \frac{0.5}{\sqrt{n}}$, which gives $n = 96.04$
So n ≈ 100 patients (at least 97 after rounding up)

Example: sample size calculation
We want to be able to estimate the average credit card debt of US families to within ± $100 with 95% confidence
We know the standard deviation σ = $1,420
How many families do we need to study?
Set $100 = 1.96\,\frac{\sigma}{\sqrt{n}}$ and solve for n
$\sqrt{n} = \frac{(1.96)(1{,}420)}{100} = 27.83$ and $n = 774.6$
So n = 775 families (always round up)

Review of main topics for the exam
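The three sample-size steps above can be sketched in code by solving the margin-of-error equation for n and rounding up. This is an illustrative sketch, not the lecture's own material: both function names are invented, and the z quantile is taken from the standard library rather than rounded to 1.96. The exact calculation gives 97 for the response-rate example, which the slide quotes as roughly 100, and 775 for the credit card example.

```python
import math
from statistics import NormalDist


def z_quantile(confidence=0.95):
    """Two-sided standard normal quantile (about 1.96 for 95%)."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_for_proportion(margin, p=0.5, confidence=0.95):
    """Smallest n with z * sqrt(p(1-p)/n) <= margin.

    p = 0.5 is the conservative default because it maximizes p(1-p).
    """
    z = z_quantile(confidence)
    return math.ceil(z**2 * p * (1 - p) / margin**2)


def sample_size_for_mean(margin, sigma, confidence=0.95):
    """Smallest n with z * sigma / sqrt(n) <= margin (sigma known)."""
    z = z_quantile(confidence)
    return math.ceil((z * sigma / margin) ** 2)


# Response rate to within 10 percentage points, 95% confidence.
print(sample_size_for_proportion(0.10))
# Mean credit card debt to within $100, sigma = $1,420, 95% confidence.
print(sample_size_for_mean(100, 1420))
```

Rounding up with `math.ceil` follows the slide's rule: rounding down would leave the margin of error slightly larger than the target.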
General concepts covered so far
Visualize (Chapters 1-2)
Organizing and displaying data (descriptive statistics)
Conceptualize (Chapter 3)
Methods for data collection (observational studies, sample surveys and controlled experiments)
Analyze (Chapters 4-6.1)
Some probability theory and random variables
Sampling distributions & confidence intervals

Remainder of the course
Tests of hypotheses (Chapters 6.2-7)
Population parameter µ (z & t procedures)
Population proportion p and two-way tables (count data) & misc. topics (Chapters 8-9)
Regression and ANOVA (Chapters 10-13)

IPS Chapter 1
Organizing & displaying single variables/distributions
Categorical versus quantitative variables
Finding the center (mean and median)
Finding the spread ($s^2$, standard deviation (s) & IQR)
Effects of linear transformations (Y = a + bX)
Graphs (boxplots, bar graphs & histograms)
Strengths/weaknesses of graphing techniques
Density curves: a smooth curve describes the distribution
Skewness and outliers

IPS Chapter 1 (continued)
Normal distribution: most common and important
General Y ~ N(µ, σ) and standard normal Z ~ N(0, 1)
The 68-95-99.7 rule
Standardize Z = (Y − µ) / σ and use Z ~ N(0, 1) tables
Z to probabilities and probabilities to Z (quantiles)
Always draw a picture of Z ~ N(0, 1)
Normal quantile plots to assess normality
Transforming data to normality

IPS Chapter 2
Relationships between 2 quantitative variables
The key graph: scatterplots
Correlation (r) and its properties
Response (Y) and explanatory (X) variables
Least-squares regression line (Y = a + bX)
Minimizes the sum of squares in the Y direction
Slope $b = r\,\frac{s_y}{s_x}$ and intercept $a = \bar{y} - b\bar{x}$
Interpreting $r^2$

IPS Chapter 2 (continued)
Residuals and residual plots
Outliers, influential points and extrapolation
Association vs causation (3 common relationships)
Establishing causation by experiment or 5 criteria
The ecological fallacy
Regression assumptions to be checked: 1) Y vs X linear, 2) residuals have constant $\sigma^2$, 3) residuals have a normal distribution
Consider transformations, if necessary
Nonlinear transformations $Y^k$ & $X^k$ for k = −½, log, ½, etc.
IPS Chapter 3
Sources of data: variations in reliability
Observational studies (sample surveys) vs experiments
5 major elements of study design
3 principles of designing experiments: control, randomization & replication
Blocking, placebos, & blinding for experiments
Clinical research: levels of evidence and ethics
IPS Chapter 4
Randomness and probability
Rules of probability and independent events
$P(A^c) = 1 - P(A)$
For disjoint events, $P(A \text{ or } B) = P(A) + P(B)$
For independent events, $P(A \text{ and } B) = P(A) \times P(B)$ [for binomial distribution and sampling]
Simple probability (# elements in A / # elements in S) [for binomial (p = 1/2) & future procedures]
Bayes's rule
$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^c)\,P(A^c)}$

IPS Chapter 4 (continued)
Discrete and continuous random variables
Probability distributions (for both types)
Density curves for continuous random variables
Means and variances of random variables
Example: for discrete r.v.'s, $E(X) = \mu_X = \sum_i x_i p_i$
Rules for variances of independent RVs: $\sigma^2_{X-Y} = \sigma^2_X + \sigma^2_Y$ [will use for sampling distributions from two populations]

IPS Chapter 5.1
Sampling distribution for counts and proportions
The binomial distribution B(n, p)
4 characteristics of the binomial distribution
Probability distribution of # of successes in n trials
$P(X = k) = \binom{n}{k}\,p^k (1-p)^{n-k}$
Mean = np and variance = np(1 − p)
Counts are equivalent to the sample proportion ($\hat{p}$)
Both can be approximated with a normal distribution if np and n(1 − p) > 10 (use a correction for continuity)

IPS Section 5.2
Sampling distribution for sample means
Central limit theorem: if n > 30, $\bar{x} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
Basis of most inference methods for means
Our goal is to make inferences about a population parameter (e.g. µ, p) from information in a sample
We do this by studying the (theoretical) sampling distribution of sample statistics (e.g. $\bar{x}$ and $\hat{p}$)
We only observe a single sample, but once we know the theoretical sampling distribution, we know the reliability of a single sample statistic

IPS Chapter 6.1
Confidence intervals (1st of 2 main types of inference)
General form: point estimate ± margin of error
point estimate ± $z_{\alpha/2}$ × (SD of estimate)
Interpretation of a 95% CI: in repeated sampling, 95% of CIs will cover the population parameter
Confidence interval for mean µ
$\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right)$
CI can be used to determine sample size
Set margin of error = desired precision & solve for n

The last word
Midterm Exam March 14th
Bring one double-sided sheet of notes
You will need a hand-held calculator
Exam is closed-book
Covers Lectures 1-16 and the corresponding IPS chapters
Before next class, look over IPS
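The repeated-sampling interpretation of a 95% CI from the Chapter 6.1 review can be checked with a small simulation. This is only a sketch under stated assumptions: the population is taken to be normal with a made-up mean of $2,500 (the lecture never gives the true µ), σ = $1,420 is borrowed from the credit card example, and `coverage` is an invented helper name.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, reps, confidence=0.95, seed=0):
    """Fraction of simulated known-sigma z intervals that contain mu.

    Assumes a normal population; mu is a hypothetical value chosen
    for the simulation, not a figure from the lecture.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # Draw one SRS of size n and compute its sample mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps


# 2,000 repeated samples of size 25; the result should be close to 0.95.
print(coverage(mu=2500, sigma=1420, n=25, reps=2000))
```

Each individual interval either covers µ or it does not; the 95% describes the long-run fraction of intervals that do.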