STAT 145 (Notes) Al Nosedal Department of Mathematics and Statistics University of New Mexico. Fall 2013


 Kristian McBride
 8 years ago
 Views:
Transcription
1 STAT 145 (Notes) Al Nosedal Department of Mathematics and Statistics University of New Mexico Fall 2013
2 CHAPTER 18 INFERENCE ABOUT A POPULATION MEAN.
3 Conditions for Inference about mean We can regard our data as a simple random sample (SRS) from the population. This condition is very important. Observations from the population have a Normal distribution with mean µ and standard deviation σ. In practice, it is enough that the distribution be symmetric and singlepeaked unless the sample is very small. Both µ and σ are unknown parameters.
4 Standard Error When the standard deviation of a statistic is estimated from data, the result is called the standard error of the statistic. The standard s error of the sample mean x is n.
5 Travel time to work A study of commuting times reports the travel times to work of a random sample of 1000 employed adults. The mean is x = 49.2 minutes and the standard deviation is s = 63.9 minutes. What is the standard error of the mean?
6 Solution The standard error of the mean is s n = = minutes
7 The onesample t statistic and the t distributions Draw an SRS of size n from a large population that has the Normal distribution with mean µ and standard deviation σ. The onesample t statistic t = x µ s/ n has the t distribution with n 1 degrees of freedom.
8 The t distributions The density curves of the t distributions are similar in shape to the Standard Normal curve. They are symmetric about 0, singlepeaked, and bellshaped. The spread of the t distributions is a bit greater than of the Standard Normal distribution. The t distributions have more probability in the tails and less in the center than does the Standard Normal. This is true because substituting the estimate s for the fixed parameter σ introduces more variation into the statistic. As the degrees of freedom increase, the t density curve approaches the N(0, 1) curve ever more closely. This happens because s estimates σ more accurately as the sample size increases. So using s in place of σ causes little extra variation when the sample is large.
9 Density curves Std. Normal t(2) t(9)
10 Density curves Std. Normal t(29)
11 Density curves Std. Normal t(49)
12 Critical values Use Table C or software to find a) the critical value for a onesided test with level α = 0.05 based on the t(4) distribution. b) the critical value for 98% confidence interval based on the t(26) distribution.
13 Solution a) t = b) t = 2.479
14 More critical values You have an SRS of size 30 and calculate the onesample tstatistic. What is the critical value t such that a) t has probability to the right of t? b) t has probability 0.75 to the left of of t?
15 Solution Here, df = 30 1 = 29. a) t = b) t = 0.683
16 The onesample t confidence interval Draw an SRS of size n from a large population having unknown mean µ. A level C confidence interval for µ is x ± t s n where t is the critical value for the t(n 1) density curve with area C between t and t. This interval is exact when the population distribution is Normal and is approximately correct for large n in other cases.
17 Critical values What critical value t from Table C would you use for a confidence interval for the mean of the population in each of the following situations? a) A 95% confidence interval based on n = 12 observations. b) A 99% confidence interval from an SRS of 18 observations. c) A 90% confidence interval from a sample of size 6.
18 Solution a) df = 12 1 = 11, so t = b) df = 18 1 = 17, so t = c) df = 6 1 = 5, so t =
19 Ancient air The composition of the earth s atmosphere may have changed over time. To try to discover the nature of the atmosphere long ago, we can examine the gas in bubbles inside ancient amber. Amber is tree resin that has hardened and been trapped in rocks. The gas in bubbles within amber should be a sample of the atmosphere at the time the amber was formed. Measurements on specimens of amber from the late Cretaceous era (75 to 95 million years ago) give these percents of nitrogen: Assume (this is not yet agreed on by experts) that these observations are an SRS from the late Cretaceous atmosphere. Use a 90% confidence interval to estimate the mean percent of nitrogen in ancient air (Our presentday atmosphere is about 78.1% nitrogen).
20 Solution µ = mean percent of nitrogen in ancient air. We will estimate µ with a 90% confidence interval. With x = , s = , and t = (df = 9 1 = 8), the 90% confidence interval for µ is ( ) ± ± to
21 The onesample t test Draw an SRS of size n from a large population having unknown mean µ. To test the hypothesis H 0 : µ = µ 0, compute the onesample t statistic t = x µ 0 s/ n In terms of a variable T having the t(n 1) distribution, the Pvalue for a test of H 0 against H a : µ > µ 0 is P(T t ). H a : µ < µ 0 is P(T t ). H a : µ µ 0 is 2P(T t ). These Pvalues are exact if the population distribution is Normal and are approximately correct for large n in other cases.
22 Is it significant? The onesample t statistic for testing H 0 : µ = 0 H a : µ > 0 from a sample of n = 20 observations has the value t = a) What are the degrees of freedom for this statistic? b) Give the two critical values t from Table C that bracket t. What are the onesided Pvalues for these two entries? c) Is the value t = 1.84 significant at the 5% level? Is it significant at the 1% level? d) (Optional) If you have access to suitable technology, give the exact onesided Pvalue for t = 1.84?
23 Solution a) df = 201 = 19. b) t = 1.84 is bracketed by t = (with righttail probability 0.05) and t = (with righttail probability 0.025). Hence, because this is a onesided significance test, < Pvalue < c) This test is significant at the 5% level because the Pvalue < It is not significant at the 1% level because the Pvalue > d) From TI84, tcdf(1.84,1000,19), Pvalue =
24 Is it significant? The onesample t statistic from a sample of n = 15 observations for the twosided test of H 0 : µ = 64 H a : µ 64 has the value t = a) What are the degrees of freedom for t? b) Locate the twocritical values t from Table C that bracket t. What are the twosided Pvalues for these two entries? c) is the value t = 2.12 statistically significant at the 10% level? At the 5% level? d) (Optional) If you have access to suitable technology, give the exact twosided Pvalue for t = 2.12.
25 Solution a) df = 151 = 14. b) t = 2.12 is bracketed by t = (with twotail probability 0.10) and t = (with twotail probability 0.05). Hence, because this is a twosided significance test, 0.05 < Pvalue < c) This test is significant at the 10% level because the Pvalue < It is not significant at the 5% level because the Pvalue > d) From TI84, tcdf(2.12,1000,19) = , Pvalue = 2(0.0261) =
26 Ancient air, continued Do the data of Exercise 18.7 (see above) give good reason to think that the percent of nitrogen in the air during the Cretaceous era was different from the present 78.1%? Carry out a test of significance at the 5% significance level.
27 Solution Is there evidence that the percent of nitrogen in ancient air was different from the present 78.1%? 1. State hypotheses. H 0 : µ = 78.1% vs H a : µ 78.1%. 2. Test statistic. t = x µ 0 σ/ n = / 9 = Pvalue. For df = 8, this is beyond anything shown in Table C, so Pvalue < (TI 84 gives ). 4. Conclusion. We have very strong evidence (Pvalue < 0.001) that Cretaceous air is different from nitrogen than modern air.
28 Matched pairs t procedures To compare the responses to the two treatments in a matched pairs design, find the difference between the responses within each pair. Then apply the onesample t procedures to these differences.
29 Genetic engineering for cancer treatment Here s a new idea for treating advanced melanoma, the most serious kind of skin cancer. Genetically engineer white blood cells to better recognize and destroy cancer cells, then infuse these cells into patients. The subjects in a small initial study were 11 patients whose melanoma had not responded to existing treatments. One question was how rapidly the new cells would multiply after infusion, as measured by the doubling time in days. Here are the doubling times: a) Examine the data. Is it reasonable to use the t procedures? b) Give a 90% confidence interval for the mean doubling time. Are you willing to use this interval to make an inference about the mean doubling time in a population of similar patients?
30 Solution a) The stemplot does not show any severe evidence of nonnormality, so t procedures should be safe
31 Solution b) With x = days, s = days, and t = (df = 10), the 90% confidence interval is ( ) ± ± to days
32 Solution There is no indication that the sample represents an SRS of all patients whose melanoma has not responded to existing treatments, so inferring to the population of similar patients may not be reasonable.
33 Genetic engineering for cancer treatment, continued Another outcome in the cancer experiment described above is measured by a test for the presence of cells that trigger an immune response in the body and so may help fight cancer. Here are data for the 11 subjects: counts of active cells per 100,000 cells before and after infusion of the modified cells. The difference (after minus before) is the response variable.
34 Data Before After Difference Difference/
35 Genetic engineering for cancer treatment, continued a) Examine the data. Is it reasonable to use the t procedures? b) If your conclusion in part a) is Yes, do the data give convincing evidence that the count of active cells is higher after treatment?
36 Solution a) The stem plot of differences shows an extreme right skew, and one or two high outliers. The t procedures should not be used
37 Solution b) 1. State hypotheses. H 0 : µ = 0 vs H a : µ > Test statistic. t = x µ 0 σ/ n = / = Pvalue. For df = 10, using Table C we have that: < t < Which implies that: < Pvalue < TI 84 gives Conclusion. We would have strong evidence against H 0 (Pvalue < 0.05). We would conclude that the count of active cells is higher after treatment.
38 Robust Procedures A confidence interval or significance test is called robust if the confidence level or Pvalue does not change very much when the conditions for use of the procedure are violated.
39 Using t procedures Except in the case of small samples, the condition that the data are an SRS from the population of interest is more important than the condition that the population distribution is Normal. Sample size less than 15: Use t procedures if the data appear close to Normal (roughly symmetric single peak, no outliers). If the data are clearly skewed or if outliers are present, do not use t. Sample size at least 15: The t procedures can be used except in the presence of outliers or strong skewness. Large samples: The t procedures can be used even for clearly skewed distributions when the sample is large, roughly n 40.
40 CHAPTER 19 TWOSAMPLE PROBLEMS
41 Twosample problems The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations. We have a separate sample from each treatment or each population.
42 Conditions for inference comparing two means We have two SRSs, from two distinct populations. The samples are independent. That is, one sample has no influence on the other. Matching violates independence, for example. We measure the same response variable for both samples. Both populations are Normally distributed. The means and standard deviations of the populations are unknown. In practice, it is enough that the distributions have similar shapes and that the data have no strong outliers.
43 The TwoSample Procedures Draw an SRS of size n 1 from a Normal population with unknown mean µ 1, and draw and independent SRS of size n 2 from another Normal population with unknown mean µ 2. A confidence interval for µ 1 µ 2 is given by ( x 1 x 2 ) ± t s 2 1 n 1 + s2 2 n 2 Here t is the critical value for the t(k) density curve with area C between t and t. The degrees of freedom k are equal to the smaller of n 1 1 and n 2 1.
44 The TwoSample Procedures To test the hypothesis H 0 : µ 1 = µ 2, calculate the twosample t statistic t = x 1 x 2 s 2 1 n 1 + s2 2 n 2 and use Pvalues or critical values for the t(k) distribution.
45 Degrees of freedom (Option 1) Option 1. With software, use the statistic t with accurate critical values from the approximating t distribution. The distribution of the twosample t statistic is very close to the t distribution with degrees of freedom df given by df = ( ) s n 1 + s2 2 n 2 ( ) ( ) 1 s 2 2 ( ) ( ) 1 n 1 1 n s n 2 1 n 2 This approximation is accurate when both sample sizes n 1 and n 2 are 5 or larger.
46 Degrees of freedom (Option2) Option 2. Without software, use the statistic t with critical values from the t distribution with degrees of freedom equal to the smaller of n 1 1 and n 2 1. These procedures are always conservative for any two Normal populations.
47 Logging in the rain forest Conservationists have despaired over destruction of tropical rain forest by logging, clearing, and burning. These words begin a report on a statistical study of the effects of logging in Borneo. Here are data on the number of tree species in 12 unlogged forest plots and 9 similar plots logged 8 years earlier: Unlogged: Logged : Does logging significantly reduce the mean number of species in a plot after 8 years? State the hypotheses and do a t test. Is the result significant at the 5% level?
48 Solution Does logging significantly reduce the mean number of species in a plot after 8 years? 1. State hypotheses. H 0 : µ 1 = µ 2 vs H a : µ 1 > µ 2, where µ 1 is the mean number of species in unlogged plots and µ 2 is the mean number of species in plots logged 8 years earlier. 2. Test statistic.t = x 1 x 2 = ( x 1 = 17.5, x 2 = , s s2 2 n 1 n2 s 1 = , s 2 = 4.5, n 1 = 12 and n 2 = 9) 3. Pvalue. Using Table C, we have df = 8, and < Pvalue < Using TI84, we have df = , and Pvalue = Conclusion. Since Pvalue < 0.05, we reject H 0. There is strong evidence that the mean number of species in unlogged plots is greater than that for logged plots 8 year after logging.
49 Logging in the rainforest, continued Use the data from the previous exercise to give a 99% confidence interval for the difference in mean number of species between unlogged and logged plots.
50 Solution 1. Find x 1 x 2. From what we did earlier: x 1 x 2 = = Find SE = Standard Error. We already know that: s 1 = , s 2 = 4.5, n 1 = 12 and n 2 = 9 s1 2 SE = + s = n 1 n = Find m = t SE. From Table C, we have df = 8 and 99% confidence level, then t = Hence, m = (3.355)(1.8132) = Find Confidence Interval. x 1 x 2 ± t SE = ± = to TI 84 gives a 99% confidence interval of 1.52 to species, using df =
51 Alcohol and zoning out Healthy men aged 21 to 35 were randomly assigned to one of two groups: half received 0.82 grams of alcohol per kilogram of body weight; half received a placebo. Participants were then given 30 minutes to read up to 34 pages of Tolstoy s War and Peace (beginning at chapter 1, with each page containing approximately 22 lines of text). Every two to four minutes participants were prompted to indicate whether they were zoning out. The proportion of time participants indicated they were zoning out was recorded for each subject. The table below summarizes data on the proportion of episodes of zoning out. (The study report gave the standard error of the mean s/ n, abbreviated as SEM, rather than the standard deviation s.)
52 Alcohol and zoning out Group n x SEM Alcohol Placebo a) What are the two sample standard deviations? b) What degrees of freedom does the conservative Option 2 use for twosample t procedures for these samples? c) Using Option 2, give a 90% confidence interval for the mean difference between the two groups.
53 Solution a) Call group 1 the Alcohol group,and group 2 the Placebo group. Then, because SEM = s/ n, we have s = SEM n. Hence, s 1 = = 0.5(5) = 0.25 and s 2 = = 0.03(5) = b) Using the conservative option 2, df = min{25 1, 25 1} = min{24, 24} = 24. c) Here, with n 1 = n 2 = 25, SE = s 2 1 n 1 + s2 2 n 2 = = With df = 24, we have t = 1.711, and a 90% confidence interval for the mean difference in proportions is given by ± 1.711(0.0583) = 0.13 ± , that is, from to
54 Stress and weight in rats In a study of the effects of stress on behavior in rats, 71 rats were randomly assigned to either a stressful environment or a control (non stressful) environment. After 21 days, the change in weight (in grams) was determined for each rat. The table below summarizes data on weight gain. (The study report gave the standard error of the mean s/ n, abbreviated as SEM, rather than the standard deviation s).
55 Stress and weight in rats Group n x SEM Stress No stress a) What are the standard deviations for the two groups? b) What degrees of freedom does the conservative Option 2 use for the twosample procedures for these data? c) Test the null hypothesis of no difference between the two group means against the twosided alternative at the 5% significance level. Use the degrees of freedom from part b).
56 Solution a) Call group 1 the Stress group,and group 2 the No stress group. Then, because SEM = s/ n, we have s = SEM n. Hence, s 1 = 3 20 = and s 2 = 2 51 = b) Using the conservative option 2, df = min{20 1, 51 1} = min{19, 50} = 19. c) 1. Set hypotheses. H 0 : µ 1 = µ 2 vs H a : µ 1 µ 2 2. Calculate test statistic. With n 1 = 20 and n 2 = 51, s 2 SE = 1 n 1 + s2 2 n 2 = = , t = x 1 x 2 SE = Pvalue. With df = 19, using Table C, 0.10 < Pvalue < Using TI84 (2SampTTest), Pvalue = Conclusion. Since Pvalue > 0.05 we CAN T reject H 0. There is little evidence in support of a conclusion that mean weights of rats in stressful environments differ from those of rats without stress.
57 CHAPTER 20 INFERENCE ABOUT A POPULATION PROPORTION
58 Problem A market research consultant hired by the PepsiCola Co. is interested in determining the proportion of UNM students who favor PepsiCola over Coke Classic. A random sample of 100 students shows that 40 students favor Pepsi over Coke. Use this information to construct a 95% confidence interval for the proportion of all students in this market who prefer Pepsi.
59 Bernoulli Distribution x i = { 1 ith person prefers Pepsi 0 ith person prefers Coke µ = E(x i ) = p σ 2 = V (x i ) = p(1 p), i.e. σ = p(1 p) Let ˆp be our estimate of p. Note that ˆp = n i=1 x i = x. If n is big, by the Central Limit Theorem, we know that: x is roughly ) ( ) σ N (µ, n, that is, ˆp is roughly N p, p(1 p) n
60 Sampling distribution of a sample proportion Draw an SRS of size n from a large population that contains proportion p of successes. Let ˆp be the sample proportion of successes, Then: ˆp = number of successes in the sample n The mean of the sampling distribution of ˆp is p. The standard deviation of the sampling distribution is p(1 p) n. As the sample size increases, the sampling distribution of ˆp becomes approximately Normal. That is, for large n, ˆp has approximately the N(p, p(1 p)/n).
61 Largesample confidence interval for a population proportion Draw an SRS of size n from a large population that contains an unknown proportion p of successes. An approximate level C confidence interval for p is ˆp ± z ˆp(1 ˆp) n where z is the critical value for the standard Normal density curve with area C between z and z. Use this interval only when the numbers of successes and failures in the sample are both 15 or greater.
62 Problem A survey of 611 office workers investigated telephone answering practices, including how often each office worker was able to answer incoming telephone calls and how often incoming telephone calls went directly to voice mail. A total of 281 office workers indicated that they never need voice mail and are able to take every telephone call. a. What is the point estimate of the proportion of the population of office workers who are able to take every telephone call? b. At 90% confidence, what is the margin of error? c. What is the 90% confidence interval for the proportion of the population of office workers who are able to take every telephone call?
63 Solution p = proportion of the population of office workers who are able to take every telephone call. a. ˆp = = 0.46 b. Margin of error = z 1.645(0.0201) = ˆp(1 ˆp) (0.46)(0.54) n = = c. ˆp ± z ˆp(1 ˆp) n 0.46 ±.0330 From to We are 90% confident that the proportion of the population of office workers who are able to take every phone call lies between 42.70% and 49.30%.
64 Computer/Internetbased crime With over 50% of adults spending more than an hour a day on the internet, the number experiencing computeror Internet based crime continues to rise. A survey in 2010 of a random sample of 1025 adults, aged 18 and older, reached by random digit dialing found 113 adults in the sample who said that they or a household member had been the victim of a computer or Internet crime on their home computer in the past year. Give the 99% confidence interval for the proportion p of all households that have experienced computer or Internet crime during the year before the survey was conducted.
65 Solution p = proportion of all households that have experienced computer or Internet crime during the year before the survey was conducted. Step 1. Find ˆp. ˆp = number of successes = sample size = Step 2. Find m = margin of error z ˆp(1 ˆp) ( ) n = = 2.576(0.0097) = Step 3. Find confidence interval, i.e. ˆp m and ˆp + m. ˆp m = = ˆp + m = = Based on this sample, we are 99% confident that between 8.53% and 13.51% of Internet users were victims of computer or Internet crime during the year before this survey was taken.
66 Sample Size for desired margin of error The level C confidence interval for a population proportion p will have margin of error approximately equal to a specified value m when the sample size is ( ) z 2 n = p (1 p ) m where p is a guessed value for the sample proportion. The margin of error will always be less than or equal to m if you take the guess p to be 0.5.
67 Problem The percentage of people not covered by health care insurance in 2003 was 15.6%. A congressional committee has been charged with conducting a sample survey to obtain more current information. a. What sample size would you recommend if the committee s goal is to estimate the current proportion of individuals without health care insurance with a margin of error of 0.03? Use a 95% confidence level. b. Repeat part a) assuming that you have no prior information (i.e. p = 0.5). c. Repeat part a) using a 99% confidence level. d. Repeat part c) assuming that you have no prior information (i.e. p = 0.5).
68 Solution a. n = ( ) z 2 m p (1 p ) = ( ) (0.156) ( ) = 563 b. n = ( ) z 2 m p (1 p ) = ( ) (0.5) (1 0.5) = 1068 ) 2 p (1 p ) = ( ) (0.156) ( ) = 971 c. n = ( z m c. n = ( z m 0.03 ) 2 p (1 p ) = ( ) 2 (0.5) (1 0.5) = 1844
69 Graph of p (1 p ) p*(1p*) p*= p*
70 Canadians and doctorassisted suicide A Gallup Poll asked a sample of Canadian adults if they thought the law should allow doctors to end the life of a patient who is in great pain and near death if the patient makes a request in writing. The poll included 270 people in Quebec, 221 of whom agreed that doctorassisted suicide should be allowed. a) What is the margin of error of the largesample 95% confidence interval for the proportion of all Quebec adults who would allow doctorassisted suicide? b) How large a sample is needed to get the common ±3 percentage point margin of error? Use the previous sample as a pilot study to get p.
71 Solution a) We find that ˆp = = and SEˆp ˆp(1 ˆp) = n = Hence, the margin of error for 95% confidence is (1.96)(0.0234) = b) For a ±0.03 margin of error, we need n = ( ) (0.8185)( ) = Round this up in order to obtain a required sample size of 635.
72 Significance tests for a proportion Draw an SRS of size n from a large population that contains an unknown proportion p of successes. To test the hypothesis H 0 : p = p 0, compute the z statistic z = ˆp p 0 p 0 (1 p 0 ) n Find Pvalues from the standard Normal distribution. Use this test when the sample size n is so large that both np 0 and n(1 p 0 ) are 10 or more.
73 Pvalues To test the hypothesis H 0 : p = p 0, compute the z statistic z = ˆp p 0 p 0 (1 p 0 ) n In terms of a variable Z having the standard Normal distribution, the approximate Pvalue for a test of H 0 against H a : p > p 0 is P(Z > z ) H a : p < p 0 is P(Z < z ) H a : p p 0 is 2P(Z > z )
74 Problem A study found that, in 2005, 12.5% of U.S. workers belonged to unions. Suppose a sample of 400 U.S. workers is collected in 2006 to determine whether union efforts to organize have increased union membership. a. Formulate the hypotheses that can be used to determine whether union membership increased in b. If the sample results show that 52 of the workers belonged to unions, what is the pvalue for your hypothesis test? c. At α = 0.05, what is your conclusion?
75 Solution p = proportion of all U.S. workers who belonged to a union in a.h 0 : p = vs H a : p > b. ˆp = = 0.13 z = ˆp p 0 p0 (1 p 0 ) n = (0.125)(0.875) 400 = 0.30 Using Table A, Pvalue = P(Z > z ) = P(Z > 0.30) = = c. Pvalue > 0.05, do not reject H 0. We cannot conclude that there was an increase in union membership.
76 Problem A study by Consumer Reports showed that 64% of supermarket shoppers believe supermarket brands to be as good as national name brands. To investigate whether this result applies to its own product, the manufacturer of a national namebrand ketchup asked a sample of shoppers whether they believed that supermarket ketchup was as good as the national brand ketchup. a. Formulate the hypotheses that could be used to determine whether the percentage of supermarket shoppers who believe that the supermarket ketchup was as good as the national brand ketchup differed from 64%. b. If a sample of 100 shoppers showed 52 stating that the supermarket brand was as good as the national brand, what is the pvalue? c. At α = 0.05, what is your conclusion?
77 Solution p = proportion of all shoppers who believe supermarket brands to be as good as this particular national brand ketchup. a.h 0 : p = 0.64 vs H a : p 0.64 b. ˆp = = 0.52 z = ˆp p 0 p0 (1 p 0 ) n = (0.64)(0.36) 100 = 2.50 Using Table A, Pvalue = 2P(Z > z ) = 2P(Z > 2.50 ) = 2P(Z > 2.50) = 2(0.0062) = c. Pvalue < 0.05, reject H 0. Proportion differs from the reported 0.64.
78 Problem The National Center for Health Statistics released a report that stated 70% of adults do not exercise regularly. A researcher decided to conduct a study to see whether the claim made by the National Center for Health Statistics differed on a statebystate basis. a. State the null and alternative hypotheses assuming the intent of the researcher is to identify states that differ from 70% reported by the National Center for Health Statistics. b. At α = 0.05, what is the research conclusion for the following states?: Wisconsin: 252 of 350 adults did not exercise regularly. California: 189 of 300 adults did not exercise regularly.
79 Solution (Wisconsin) p = proportion of all adults in the state of Wisconsin that do not exercise regularly. a.h 0 : p = 0.70 vs Ha : p 0.70 b. (Wisconsin) ˆp = = 0.72 z = ˆp p 0 p0 (1 p 0 ) n = (0.70)(0.30) 350 = 0.82 Using Table A, Pvalue = 2P(Z > z ) = 2P(Z > 0.82 ) = 2P(Z > 0.82) = 2(0.2061) = c. Pvalue > 0.05, do not reject H 0. We don t have evidence to claim that Wisconsin is one of those states where the proportion of adults that do not exercise regularly differs from 70%.
80 Solution (California) p = proportion of all adults in the state of California that do not exercise regularly. a.h 0 : p = 0.70 vs Ha : p 0.70 b. (California) ˆp = = 0.63 z = ˆp p 0 p0 (1 p 0 ) n = (0.70)(0.30) 300 = 2.65 Using Table A, Pvalue = 2P(Z > z ) = 2P(Z > 2.65 ) = 2P(Z > 2.65) = 2( ) = c. Pvalue < 0.05, we reject H 0. We have strong evidence to claim that California s proportion of adults that do not exercise regularly differs from 70%.
81 CHAPTER 23 TWO CATEGORICAL VARIABLES: THE CHISQUARE TEST
82 Problem A sample of 70 girls and boys was taken at random from a population of children (perhaps of a particular age of interest), and then the numbers of righthanded boys, righthanded girls, lefthanded boys, and lefthanded girls were recorded as shown below: Girls Boys Total Lefthanded Righthanded Total Is there evidence that, in the sampled population, handedness is independent of sex? (Use α = 0.05)
83 Step 1. State Hypotheses H 0 : In the sampled population, there is NO relationship between handedness and sex. H a : In the sampled population, there is a relationship between handedness and sex.
84 Step 2. Compute test statistic. Draw an SRS from a large population and make a twoway table of the sample counts for two categorical variables. To test the null hypotheses H 0 that there is no relationship between the row and column variables in the population, calculate the chisquare statistic X 2 = ( observed count  expected count ) 2 expected count The sum is over all cells in the table. The chisquare test rejects H 0 when χ 2 is large. Sofware finds Pvalues from a chisquare distribution.
85 Step 2. (cont.) Expected Counts. P(lefthanded) = P(righthanded) = Expected number of lefthanded girls = number of girls P(lefthanded) = (36)( ) = (36)(18) 70 = Expected number of righthanded girls = number of girls P(righthanded) = (36)( ) = (36)(52) 70 = Expected number of lefthanded boys = number of boys P(lefthanded) = (34)( ) = (34)(18) 70 = Expected number of righthanded boys = number of boys P(righthanded) = (34)( ) = (34)(52) 70 =
86 Expected Counts The expected counts in any cell of a twoway table when H 0 is true is: expected count = row total x column total table total
87 Table of Expected Counts Girls Boys Total Lefthanded Righthanded Total
88 X 2 = ( ) ( ) (6 8.74) ( ) X 2 = = (Using TI84, X 2 = ).
89 Step 3. Find Pvalue d.f. = (r 1)(c 1) = (2 1)(2 1) = 1 Using Table D, 2.07 < X 2 = < 2.71 which implies that 0.10 < Pvalue < (Using TI84,Pvalue = )
90 Step 4. Conclusion Since Pvalue > 0.10 > 0.05 = α, we CAN T reject H 0. We conclude that there is NO relationship between handedness and sex.
91 Example Suppose the Federal Correction Agency wants to investigate the following question: Does a male released from federal prison make better adjustment to civilian life if he returns to his hometown or if he goes elsewhere to live? To put it another way, is there a relationship between adjustment to civilian life and place of residence after release from prison? The agency s psychologists interviewed 200 randomly selected former prisoners. Using a series of questions, the psychologists classified the adjustment of each individual to civilian life as being outstanding, good, fair, or unsatisfactory. The counts are given in the following contingency table. Examine the hypothesis that there is no relationship between adjustment to civilian life and place of residence after release from prison. Test at the α = 0.01 level.
92 Table of observed counts Outstanding Good Fair Unsatisfactory Total Hometown Not hometown Total
93 Step 1. State hypotheses H 0 : There is NO relationship between adjustment to civilian life and where the individual lives after being released from prison. H a : There is a relationship between adjustment to civilian life and where the individual lives after being released from prison.
94 Table of EXPECTED counts Outstanding Good Fair Unsatisfactory Total Hometown Not hometown Total
95 Step 2. Compute test statistic. X 2 = (27 24) (35 30)2 (15 20) (27 24) (25 20) (33 36) (25 30) (13 16) X 2 = X 2 = (Using TI84, X 2 = ).
96 Step 3. Find Pvalue d.f. = (r 1)(c 1) = (2 1)(4 1) = 3 Using Table D, 5.32 < X 2 = < 6.25 which implies that 0.10 < Pvalue < (Using TI84,Pvalue = )
97 Step 4. Conclusion Since Pvalue > 0.10 > 0.01 = α, we CAN T reject H 0. We conclude that there is NO relationship between adjustment to civilian life and where the prisoner resides after being released from prison. For the Federal Correction Agency s advisement program, adjustment to civilian life is not related to where the exprisoner lives after being released.
98 Cell counts required for the chisquare test You can safely use the chisquare test with Pvalues from the chisquare distribution when no more than 20% of the expected counts are less than 5 and all individual expected counts are 1 or greater. In particular, all four expected counts in a 2 2 table should be 5 or greater.
99 Majors for men and women in business A study of the career plans of young women and men sent questionnaires to all 722 members of the senior class in the College of Business Administration at the University of Illinois. One question asked which major within the business program the student had chosen. Here are the data from the students who responded: Female Male Total Accounting Administration Economics Finance Total This is an example of a single sample classified according to two categorical variables (gender and major).
100 Majors for men and women in business (cont.) a) Test the null hypothesis that there is no relationship between the gender of students and their choice of major. Give a Pvalue and state your conclusion(use α = 0.05.) b) Verify that the expected cell counts satisfy the requirement for use of chisquare. c) Which two cells have the largest terms in the chisquare statistic? How do observed and expected counts differ in these cells?
101 Solution a) Step 1. State Hypotheses. H 0 : There is NO relationship between gender and choice of major. H a : There is a relationship between gender and choice of major.
102 Solution Expected counts. Female Male Total Accounting Administration Economics Finance Total
103 Solution Step 2. Compute test statistic. X 2 = ( ) ( ) ( )2 (5 6.41) (6 4.59) ( ) ( ) ( ) X 2 = X 2 =
104 Solution Step 3. Find Pvalue. d.f. = (r 1)(c 1) = (2 1)(4 1) = 3 Using Table D, 9.84 < X 2 = < which implies that 0.01 < Pvalue < (Using TI84,Pvalue = )
105 Solution Step 4. Conclusion. Since Pvalue < 0.05, we reject H 0. We have fairly strong evidence that gender and choice of major are related. b) One expected count is less than 5, which is only oneeight (12.5%) of the counts in the table, so the chisquare procedure is acceptable. c) The largest chisquare components are the two from the Administration row. Many more women than we expect (91 actual, expected) chose this major, whereas only 40 men chose this (54.64 expected).
106 The Chisquare distributions The chisquare distributions are a family of distributions that take only positive values and are skewed to the right. A specific chisquare distribution is specified by giving its degrees of freedom. The chisquare test for a twoway table with r and c columns uses critical values from the chisquare distribution with (r1)(c1) degrees of freedom. The Pvalue is the area under the density curve of this chisquare distribution to the right of the value of the test statistic.
107 Density curves df=1 df=4 df=
Name: Date: Use the following to answer questions 34:
Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin
More informationChapter 7 Section 7.1: Inference for the Mean of a Population
Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used
More informationRecall this chart that showed how most of our course would be organized:
Chapter 4 OneWay ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical
More informationMTH 140 Statistics Videos
MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative
More informationOnline 12  Sections 9.1 and 9.2Doug Ensley
Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics  Ensley Assignment: Online 12  Sections 9.1 and 9.2 1. Does a Pvalue of 0.001 give strong evidence or not especially strong
More informationSTATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4
STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate
More informationClass 19: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More information3.4 Statistical inference for 2 populations based on two samples
3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted
More informationIs it statistically significant? The chisquare test
UAS Conference Series 2013/14 Is it statistically significant? The chisquare test Dr Gosia Turner Student Data Management and Analysis 14 September 2010 Page 1 Why chisquare? Tests whether two categorical
More informationChapter 23 Inferences About Means
Chapter 23 Inferences About Means Chapter 23  Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300minute
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationChapter 23. Two Categorical Variables: The ChiSquare Test
Chapter 23. Two Categorical Variables: The ChiSquare Test 1 Chapter 23. Two Categorical Variables: The ChiSquare Test TwoWay Tables Note. We quickly review twoway tables with an example. Example. Exercise
More informationCHAPTER 14 NONPARAMETRIC TESTS
CHAPTER 14 NONPARAMETRIC TESTS Everything that we have done up until now in statistics has relied heavily on one major fact: that our data is normally distributed. We have been able to make inferences
More informationSTAT 350 Practice Final Exam Solution (Spring 2015)
PART 1: Multiple Choice Questions: 1) A study was conducted to compare five different training programs for improving endurance. Forty subjects were randomly divided into five groups of eight subjects
More informationChapter 23. Inferences for Regression
Chapter 23. Inferences for Regression Topics covered in this chapter: Simple Linear Regression Simple Linear Regression Example 23.1: Crying and IQ The Problem: Infants who cry easily may be more easily
More informationUnit 27: Comparing Two Means
Unit 27: Comparing Two Means Prerequisites Students should have experience with onesample tprocedures before they begin this unit. That material is covered in Unit 26, Small Sample Inference for One
More informationChapter 7 Section 1 Homework Set A
Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the
More informationUnit 26: Small Sample Inference for One Mean
Unit 26: Small Sample Inference for One Mean Prerequisites Students need the background on confidence intervals and significance tests covered in Units 24 and 25. Additional Topic Coverage Additional coverage
More informationGeneral Method: Difference of Means. 3. Calculate df: either WelchSatterthwaite formula or simpler df = min(n 1, n 2 ) 1.
General Method: Difference of Means 1. Calculate x 1, x 2, SE 1, SE 2. 2. Combined SE = SE1 2 + SE2 2. ASSUMES INDEPENDENT SAMPLES. 3. Calculate df: either WelchSatterthwaite formula or simpler df = min(n
More informationReview. March 21, 2011. 155S7.1 2_3 Estimating a Population Proportion. Chapter 7 Estimates and Sample Sizes. Test 2 (Chapters 4, 5, & 6) Results
MAT 155 Statistical Analysis Dr. Claude Moore Cape Fear Community College Chapter 7 Estimates and Sample Sizes 7 1 Review and Preview 7 2 Estimating a Population Proportion 7 3 Estimating a Population
More informationMind on Statistics. Chapter 13
Mind on Statistics Chapter 13 Sections 13.113.2 1. Which statement is not true about hypothesis tests? A. Hypothesis tests are only valid when the sample is representative of the population for the question
More information5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives
C H 8A P T E R Outline 8 1 Steps in Traditional Method 8 2 z Test for a Mean 8 3 t Test for a Mean 8 4 z Test for a Proportion 8 6 Confidence Intervals and Copyright 2013 The McGraw Hill Companies, Inc.
More informationMind on Statistics. Chapter 12
Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H 0 ) or alternative hypothesis (H a ). 1. There is no difference
More informationStatistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013
Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.11.6) Objectives
More informationTwosample ttests.  Independent samples  Pooled standard devation  The equal variance assumption
Twosample ttests.  Independent samples  Pooled standard devation  The equal variance assumption Last time, we used the mean of one sample to test against the hypothesis that the true mean was a particular
More informationC. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.
Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample
More informationChapter 8 Section 1. Homework A
Chapter 8 Section 1 Homework A 8.7 Can we use the largesample confidence interval? In each of the following circumstances state whether you would use the largesample confidence interval. The variable
More informationIntroduction. Chapter 14: Nonparametric Tests
2 Chapter 14: Nonparametric Tests Introduction robustness outliers transforming data other standard distributions nonparametric methods rank tests The most commonly used methods for inference about the
More informationNonparametric TwoSample Tests. Nonparametric Tests. Sign Test
Nonparametric TwoSample Tests Sign test MannWhitney Utest (a.k.a. Wilcoxon twosample test) KolmogorovSmirnov Test Wilcoxon SignedRank Test TukeyDuckworth Test 1 Nonparametric Tests Recall, nonparametric
More informationBiostatistics: Types of Data Analysis
Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS
More informationChi Square Tests. Chapter 10. 10.1 Introduction
Contents 10 Chi Square Tests 703 10.1 Introduction............................ 703 10.2 The Chi Square Distribution.................. 704 10.3 Goodness of Fit Test....................... 709 10.4 Chi Square
More informationChapter 19 The ChiSquare Test
Tutorial for the integration of the software R with introductory statistics Copyright c Grethe Hystad Chapter 19 The ChiSquare Test In this chapter, we will discuss the following topics: We will plot
More informationa) Find the five point summary for the home runs of the National League teams. b) What is the mean number of home runs by the American League teams?
1. Phone surveys are sometimes used to rate TV shows. Such a survey records several variables listed below. Which ones of them are categorical and which are quantitative?  the number of people watching
More informationHypothesis Testing. Steps for a hypothesis test:
Hypothesis Testing Steps for a hypothesis test: 1. State the claim H 0 and the alternative, H a 2. Choose a significance level or use the given one. 3. Draw the sampling distribution based on the assumption
More informationUnit 26 Estimation with Confidence Intervals
Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference
More informationMONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010
MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times
More informationLAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
More informationData Analysis Tools. Tools for Summarizing Data
Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool
More informationt Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon
ttests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com
More informationTwosample hypothesis testing, II 9.07 3/16/2004
Twosample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For twosample tests of the difference in mean, things get a little confusing, here,
More informationStatistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
More informationTHE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.
THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 14)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 14) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More informationAn Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10 TWOSAMPLE TESTS
The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10 TWOSAMPLE TESTS Practice
More informationInference for two Population Means
Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example
More informationFinal Exam Practice Problem Answers
Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal
More informationIn the past, the increase in the price of gasoline could be attributed to major national or global
Chapter 7 Testing Hypotheses Chapter Learning Objectives Understanding the assumptions of statistical hypothesis testing Defining and applying the components in hypothesis testing: the research and null
More informationBA 275 Review Problems  Week 5 (10/23/0610/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380394
BA 275 Review Problems  Week 5 (10/23/0610/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380394 1. Does vigorous exercise affect concentration? In general, the time needed for people to complete
More informationExperimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test
Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely
More information9 Testing the Difference
blu49076_ch09.qxd 5/1/2003 8:19 AM Page 431 c h a p t e r 9 9 Testing the Difference Between Two Means, Two Variances, and Two Proportions Outline 9 1 Introduction 9 2 Testing the Difference Between Two
More information3. There are three senior citizens in a room, ages 68, 70, and 72. If a seventyyearold person enters the room, the
TMTA Statistics Exam 2011 1. Last month, the mean and standard deviation of the paychecks of 10 employees of a small company were $1250 and $150, respectively. This month, each one of the 10 employees
More informationHYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationStats Review Chapters 910
Stats Review Chapters 910 Created by Teri Johnson Math Coordinator, Mary Stangler Center for Academic Success Examples are taken from Statistics 4 E by Michael Sullivan, III And the corresponding Test
More informationGood luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:
Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
STT315 Practice Ch 57 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The length of time a traffic signal stays green (nicknamed
More informationHYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationMath 58. Rumbos Fall 2008 1. Solutions to Review Problems for Exam 2
Math 58. Rumbos Fall 2008 1 Solutions to Review Problems for Exam 2 1. For each of the following scenarios, determine whether the binomial distribution is the appropriate distribution for the random variable
More informationSolutions to Homework 3 Statistics 302 Professor Larget
s to Homework 3 Statistics 302 Professor Larget Textbook Exercises 3.20 Customized Home Pages A random sample of n = 1675 Internet users in the US in January 2010 found that 469 of them have customized
More informationStats for Strategy Fall 2012 FirstDiscussion Handout: Stats Using Calculators and MINITAB
Stats for Strategy Fall 2012 FirstDiscussion Handout: Stats Using Calculators and MINITAB DIRECTIONS: Welcome! Your TA will help you apply your Calculator and MINITAB to review Business Stats, and will
More information11. Analysis of Casecontrol Studies Logistic Regression
Research methods II 113 11. Analysis of Casecontrol Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:
More information= $96 = $24. (b) The degrees of freedom are. s n. 7.3. For the mean monthly rent, the 95% confidence interval for µ is
Chapter 7 Solutions 71 (a) The standard error of the mean is df = n 1 = 15 s n = $96 = $24 (b) The degrees of freedom are 16 72 In each case, use df = n 1; if that number is not in Table D, drop to the
More information5.1 Identifying the Target Parameter
University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying
More informationNCSS Statistical Software
Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, twosample ttests, the ztest, the
More informationMath 108 Exam 3 Solutions Spring 00
Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the ph in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8
More informationDongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
More informationP(every one of the seven intervals covers the true mean yield at its location) = 3.
1 Let = number of locations at which the computed confidence interval for that location hits the true value of the mean yield at its location has a binomial(7,095) (a) P(every one of the seven intervals
More informationSection 1.1 Exercises (Solutions)
Section 1.1 Exercises (Solutions) HW: 1.14, 1.16, 1.19, 1.21, 1.24, 1.25*, 1.31*, 1.33, 1.34, 1.35, 1.38*, 1.39, 1.41* 1.14 Employee application data. The personnel department keeps records on all employees
More informationCHAPTER 13 SIMPLE LINEAR REGRESSION. Opening Example. Simple Regression. Linear Regression
Opening Example CHAPTER 13 SIMPLE LINEAR REGREION SIMPLE LINEAR REGREION! Simple Regression! Linear Regression Simple Regression Definition A regression model is a mathematical equation that descries the
More informationTwoSample TTests Allowing Unequal Variance (Enter Difference)
Chapter 45 TwoSample TTests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one or twosided twosample ttests when no assumption
More informationChicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011
Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this
More informationIntroduction. Hypothesis Testing. Hypothesis Testing. Significance Testing
Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters
More informationThe ChiSquare Test. STAT E50 Introduction to Statistics
STAT 50 Introduction to Statistics The ChiSquare Test The Chisquare test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed
More informationMind on Statistics. Chapter 15
Mind on Statistics Chapter 15 Section 15.1 1. A student survey was done to study the relationship between class standing (freshman, sophomore, junior, or senior) and major subject (English, Biology, French,
More information6: Introduction to Hypothesis Testing
6: Introduction to Hypothesis Testing Significance testing is used to help make a judgment about a claim by addressing the question, Can the observed difference be attributed to chance? We break up significance
More informationRegression Analysis: A Complete Example
Regression Analysis: A Complete Example This section works out an example that includes all the topics we have discussed so far in this chapter. A complete example of regression analysis. PhotoDisc, Inc./Getty
More informationSimulating ChiSquare Test Using Excel
Simulating ChiSquare Test Using Excel Leslie Chandrakantha John Jay College of Criminal Justice of CUNY Mathematics and Computer Science Department 524 West 59 th Street, New York, NY 10019 lchandra@jjay.cuny.edu
More informationAP STATISTICS (WarmUp Exercises)
AP STATISTICS (WarmUp Exercises) 1. Describe the distribution of ages in a city: 2. Graph a box plot on your calculator for the following test scores: {90, 80, 96, 54, 80, 95, 100, 75, 87, 62, 65, 85,
More informationInstitute of Actuaries of India Subject CT3 Probability and Mathematical Statistics
Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in
More informationTwoSample TTests Assuming Equal Variance (Enter Means)
Chapter 4 TwoSample TTests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one or twosided twosample ttests when the variances of
More informationCHAPTER 12. ChiSquare Tests and Nonparametric Tests LEARNING OBJECTIVES. USING STATISTICS @ T.C. Resort Properties
CHAPTER 1 ChiSquare Tests and Nonparametric Tests USING STATISTICS @ T.C. Resort Properties 1.1 CHISQUARE TEST FOR THE DIFFERENCE BETWEEN TWO PROPORTIONS (INDEPENDENT SAMPLES) 1. CHISQUARE TEST FOR
More information8 6 X 2 Test for a Variance or Standard Deviation
Section 8 6 x 2 Test for a Variance or Standard Deviation 437 This test uses the Pvalue method. Therefore, it is not necessary to enter a significance level. 1. Select MegaStat>Hypothesis Tests>Proportion
More informationThe Variability of PValues. Summary
The Variability of PValues Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 276958203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report
More informationBA 275 Review Problems  Week 6 (10/30/0611/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394398, 404408, 410420
BA 275 Review Problems  Week 6 (10/30/0611/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394398, 404408, 410420 1. Which of the following will increase the value of the power in a statistical test
More informationIntroduction to Regression and Data Analysis
Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it
More informationTests of Hypotheses Using Statistics
Tests of Hypotheses Using Statistics Adam Massey and Steven J. Miller Mathematics Department Brown University Providence, RI 0292 Abstract We present the various methods of hypothesis testing that one
More informationChi Square Distribution
17. Chi Square A. Chi Square Distribution B. OneWay Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes
More informationCOMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.
277 CHAPTER VI COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES. This chapter contains a full discussion of customer loyalty comparisons between private and public insurance companies
More informationThe Math. P (x) = 5! = 1 2 3 4 5 = 120.
The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct
More informationCONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont
CONTINGENCY TABLES ARE NOT ALL THE SAME David C. Howell University of Vermont To most people studying statistics a contingency table is a contingency table. We tend to forget, if we ever knew, that contingency
More informationHow To Check For Differences In The One Way Anova
MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. OneWay
More informationChapter 7. Oneway ANOVA
Chapter 7 Oneway ANOVA Oneway ANOVA examines equality of population means for a quantitative outcome and a single categorical explanatory variable with any number of levels. The ttest of Chapter 6 looks
More informationNeed for Sampling. Very large populations Destructive testing Continuous production process
Chapter 4 Sampling and Estimation Need for Sampling Very large populations Destructive testing Continuous production process The objective of sampling is to draw a valid inference about a population. 4
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Sample Practice problems  chapter 121 and 2 proportions for inference  Z Distributions Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide
More informationBivariate Statistics Session 2: Measuring Associations ChiSquare Test
Bivariate Statistics Session 2: Measuring Associations ChiSquare Test Features Of The ChiSquare Statistic The chisquare test is nonparametric. That is, it makes no assumptions about the distribution
More information