Statistics 641 - EXAM II - 1999 through 2003

EXAM II, December 1, 1999

I. (40 points) Place the letter of the best answer in the blank to the left of each question.

(1) In testing H0: µ ≤ 5 vs H1: µ > 5, the P-value of the data was [value missing]. If α = .01 and the true value of µ was µ = 7, then the decision based on the data
A. was a Type I error.
B. was a Type II error.
C. was correct.
D. cannot be determined.
E. all of the above

(2) The P-value of the computed value of a test statistic is
A. the weight of evidence in favor of H1.
B. the largest value of α for which the observed data will reject Ho.
C. the smallest value of α for which the observed data will reject Ho.
D. the probability of observing a less extreme value of the test statistic.

(3) A 95% confidence interval for µ was calculated to be (15, 27). Then 95% represents:
A. the probability of a Type I error.
B. the probability that µ is between 15 and 27.
C. the probability that we obtain a sample which will yield an interval containing µ.
D. the probability that µ [symbol missing] 10.

(4) The most crucial of the conditions imposed on the sampled data and the populations in order for the pooled t-test to be valid is
A. normality.
B. equal variance.
C. independence.
D. all three conditions are equally important.
E. none of the conditions are crucial

(5) In a hypothesis test of Ho: µ ≥ 5 vs H1: µ < 5, with σ known, if the sample size remains constant but the level α is increased from .01 to .05, then the probability of a Type II error at µ = 4
A. increases.
B. decreases.
C. remains the same.
D. may increase or decrease depending on the sample size.
E. cannot be determined with the given information.

(6) The reason that experimental units are paired in a study to compare the average responses of two treatments
A. is to reduce the degrees of freedom of the t-test.
B. is to reduce the variance of the difference in the two sample means.
C. is to increase the degrees of freedom of the t-test.
D. is to make the difference in the two sample means normally distributed.

(7) The Wilcoxon rank sum statistic is preferred to the pooled t-test
A. if the population distributions are normally distributed.
B. for all continuous distributions.
C. if the population distributions are symmetric.
D. if the population distributions have equal variance.
E. if the population distributions have very heavy tails.

(8) A 95/99 tolerance interval for a normal population
A. has a higher level of confidence than a 95% confidence interval on the population mean.
B. is a 95% estimate of µ and a 99% estimate of σ.
C. is an estimate of a region of values which will contain between 95% and 99% of the population values.
D. is a region of values for which we are 99% confident that the region contains 95% of the population values.
E. is based on the central limit theorem.

(9) An experimenter wants to test Ho: F = Fo, where F is a process cdf. Which one of the following statements is TRUE?
A. The Chi-squared GOF test is the preferred test statistic.
B. The most powerful test statistic depends on the shape of Fo.
C. The Anderson-Darling test has greater power than any other test.
D. The Shapiro-Wilk test has greater power than the Chi-squared test.
E. The Chi-squared GOF test can only be used when Fo is discrete.

(10) If f(y; θ) is a pdf which is symmetric about θ, then, amongst the three test statistics discussed in class, the test statistic having greatest power
A. is the Wilcoxon signed rank test.
B. is the sign test.
C. is the t-test.
D. depends on the form of f(y; θ).

II. (28 points) In the following problems, (A) state the null and alternative hypotheses, (B) give the formula for the test statistic but do not compute its value, (C) set up the rejection region by selecting the proper value from the appropriate table.

(1) Two different types of fabrics (A and B) are to be compared on a Martindale wear tester. The wear tester is known to be quite variable from run to run. Thus, on each run of the tester, the operator evaluates a sample of Fabric A and a sample of Fabric B. The weight losses (in milligrams) from seven runs are as follows:

Run:  X̄  S
Fabric A:  [data missing]
Fabric B:  [data missing]

Is there significant evidence (α = 0.01) that Fabric A has a smaller average weight loss than Fabric B?
A. Ho:    Ha:
B. Test statistic:
C. Rejection region:

(2) An experiment is run to study the effects of PCB, an industrial contaminant, on the reproductive ability of owls. The shell thickness of eggs produced by owls exposed to PCB is compared to the shell thickness of eggs produced by owls which did not have PCB exposure. In previous studies, the shell thickness of eggs has been normally distributed.

Owl:  X̄  S
PCB-Exposed:  [data missing]
Unexposed:  [data missing]

Is there significant (α = .05) evidence that the PCB-exposed owls have thinner egg shells than the unexposed owls?
A. Ho:    Ha:
B. Test statistic:
C. Rejection region:

(3) Scientists think that robots will play a crucial role in factories in the next 20 years. Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible, a robot was used to assemble 10 cables and an experienced worker also assembled 10 cables. The cables were examined and the number of defectives on each cable was recorded. The 10 data points for each method were plotted in a normal probability plot and the plot indicated a very heavy-tailed distribution. The data are given here:

Cable:  Mean  St.Dev.
Robot:  [data missing]
Human:  [data missing]

Do these data support the assertion that the average number of defectives per assembled cable is less for robots than for humans? Use α = [value missing].
A. Ho:    Ha:
B. Test statistic:
C. Rejection region:

III. (32 points) A company is developing a new cooling fan for diesel engines. In previous studies involving the old fan, the exponential distribution provided an adequate model for the time until failure of the fan. The existing fan has an average time to failure of 25,000 hours. The company developing the new cooling fan wants to determine if the new fan has a longer average time to failure than the old fan.

(1) Suppose 20 of the new fans are evaluated in an accelerated life testing experiment and their times to failure T1, ..., T20 are determined. Describe a graphical technique to evaluate whether the failure times from the accelerated life testing follow an exponential distribution. Be sure to define all your terms and explicitly label both axes on your graphs.

(2) Suppose we have determined that the times to failure from the accelerated life testing procedures satisfy an exponential distribution and the average time to failure of the 20 fans is T̄ = 32,381 hours. Construct a 95% confidence interval for the average time to failure for the new fans. Hint: If T1, ..., Tn are iid exponential random variables with mean λ, then 2nT̄/λ has a Chi-square distribution with d.f. = 2n.

(3) Is there sufficient evidence (α = .05), using the data from part (2) of this problem, that the new fan has a longer average time to failure than the fan currently in use? Compute the p-value of your test.

(4) If the average time to failure of the new fan is 30,000 hours, compute the probability that the test in part (3) of this problem will detect that the average time to failure is greater than 25,000 hours based on testing 20 new fans.
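The hint in part (2) can be turned into a small numerical sketch. Because 2nT̄/λ has a Chi-square distribution with 2n degrees of freedom, and a Chi-square with even degrees of freedom has a closed-form cdf (a finite Poisson sum), the quantiles can be found by bisection using only the standard library. The function names below (`chi2_cdf_even`, `chi2_quantile_even`, `exp_mean_ci`) are illustrative assumptions and not part of the exam; the numbers n = 20, T̄ = 32,381, 25,000 and 30,000 come from the problem statement.

```python
import math

def chi2_cdf_even(x, df):
    """CDF of a chi-square variable with even df, via the Poisson-sum identity."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    if x <= 0:
        return 0.0
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term of sum_{k<df/2} (x/2)^k / k!
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, df):
    """Quantile of the chi-square(df) distribution by bisection on the cdf."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, df) < p:   # expand until the quantile is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def exp_mean_ci(tbar, n, conf=0.95):
    """Two-sided CI for the exponential mean from the pivot 2*n*Tbar/lambda."""
    alpha = 1.0 - conf
    df = 2 * n
    lower = 2 * n * tbar / chi2_quantile_even(1 - alpha / 2, df)
    upper = 2 * n * tbar / chi2_quantile_even(alpha / 2, df)
    return lower, upper

n, tbar, lam0 = 20, 32381.0, 25000.0
low, high = exp_mean_ci(tbar, n)

# Part (3): one-sided p-value for Ho: lambda <= 25000 vs Ha: lambda > 25000.
p_value = 1.0 - chi2_cdf_even(2 * n * tbar / lam0, 2 * n)

# Part (4): power of the alpha = .05 test when the true mean is 30000.
crit = chi2_quantile_even(0.95, 2 * n)
power = 1.0 - chi2_cdf_even(crit * lam0 / 30000.0, 2 * n)

print(low, high, p_value, power)
```

The same pivot drives all three quantities: the interval inverts it, the p-value evaluates it at λ = 25,000, and the power rescales the critical value by 25,000/30,000.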

EXAM II, December 1, 2000

I. (40 points) One of the major sources of sulfur dioxide air pollution is coal powered utility plants. Currently only 10% of such power plants meet federal EPA air pollution standards. Several major changes have been made in the power plants, such as using low-sulfur coal, burning the coal at a higher temperature, and installing new scrubbers on the air stacks of the plants. One year after imposing the changes, the EPA randomly selected 25 power plants for a lengthy examination. The investigators found that 5 of the 25 power plants met the EPA air pollution standards, whereas 20 power plants did not meet the standards.

1. Place a 95% confidence interval on the proportion of power plants currently meeting the EPA air pollution standards.
2. Is there significant evidence at the α = .05 level that the proportion of power plants meeting the EPA air pollution standards has increased since the major changes were made in the power plants?
3. Compute the P-value for the test statistic obtained in part 2.
4. Evaluate the power function of the test constructed in part 2 for the proportions .05, .1, .2, .3, .4, .5, and .6. Use these values to sketch a rough plot of the power function.
5. The researchers are interested in a much larger study. How large must the sample size n be in order that the researchers can be 95% confident that the sample estimator of the proportion of power plants meeting the standard is within .05 of the true value?

II. (30 points) A company has designed a new type of braking system for sports utility vehicles. To evaluate the effectiveness of the new system, they place n of the braking systems on a test device and record the times to failure of the braking systems: Y1, ..., Yn. The pdf of the random variables, f(y; θ), depends on an unknown parameter θ. For each of the following situations, state whether the given statement is TRUE or FALSE. If the statement is FALSE, explain VERY BRIEFLY why the statement is FALSE.

1. In order to evaluate the reliability of the braking systems, a 95/95 tolerance interval for the population of times to failure for the brakes is to be constructed. In order to construct the interval, it is necessary that the functional form of f(y; θ) be known.
2. The researchers obtain a point estimator of θ. They state that the sampling distribution of the estimator can be adequately approximated by a normal distribution if the sample size, n, is large enough.
3. Under appropriate weather conditions, the pdf f(y; θ) is symmetric about θ. In this type of situation, the sample mean would have a smaller mean squared error (MSE) than the sample median as an estimator of θ.

III. (30 points) Place the letter of the BEST answer in the BLANK to the LEFT of each question.

(1) In testing H0: π ≥ .4 vs Ha: π < .4, the P-value of the data was [value missing]. If α = .05 and the true value of π was π = .6, then the decision based on the data
A) was a Type I error.
B) was a Type II error.
C) was correct.
D) cannot be determined.

(2) The P-value of the computed value of a test statistic is
A) the probability of observing a less extreme value of the test statistic.
B) the largest value of α for which the observed data will reject Ho.
C) the weight of evidence in favor of Ha.
D) the smallest value of α for which the observed data will reject Ho.

(3) An industrial process produces piston rings having a nominal diameter of 9 cm. A 95% confidence interval for the mean diameter of the piston rings produced during July was calculated and a 95%/95% tolerance interval was calculated for the diameters of the piston rings produced during July.
A) The probability is .95 that the mean diameter will fall within the confidence interval.
B) The width of the 95% confidence interval is generally narrower than the width of the 95%/95% tolerance interval and hence is a more precise estimator.
C) If the engineer wanted to set limits such that 95% of the output was within these limits, then the tolerance interval would be more informative than the confidence interval.
D) The tolerance interval for the piston diameters could be used to determine if the mean diameter for July's output was equal to 9 cm or not.

(4) Suppose we have two normal populations and we want to compare the means of the populations. Random samples of size 9 are selected from each population. The researcher is certain that the standard deviation of the first population is at least 8 times larger than the standard deviation of the second population. The most appropriate procedure for testing if the two population means differ is
A. the pooled variance t-test.
B. the Wilcoxon rank sum test.
C. to transform the data and use a pooled variance t-test.
D. the separate variance t-test using the Satterthwaite correction for the degrees of freedom.
E. none of the procedures are very useful in this situation since the sample sizes are too small.

(5) A biochemist is attempting to estimate the typical length of time it takes a drug to reach the kidney of a mature rat. There are several possible estimators of this parameter. In attempting to select the best estimator the biochemist should
A. select the estimator with the smallest average squared distance from the parameter.
B. select the estimator with the smallest variance since it would be the most consistent estimator.
C. select the estimator with the smallest bias or smallest variance.
D. always select the unbiased estimator since on the average it would equal the parameter.
E. ask a statistician for advice.

(6) In a study to compare the average responses of two drugs for the treatment of heart worms, the researchers were concerned that the available dogs for the study ranged from 1 year to 12 years of age and weighed from 10 kilograms to 55 kilograms. How can the effect of the difference in experimental units be reduced so as to not mask any significant difference in the treatments?
A. A procedure based on the ranks of the responses should be used as a test statistic.
B. A transformation of the data prior to using a pooled t-test would be effective in reducing variability.
C. The separate variance t-test would adjust the test statistic for the unequal variances.
D. The dogs should be grouped into similar pairs of dogs prior to assigning the two treatments.
E. The effect of the differences in the dogs is not a problem if the treatments are randomly assigned to the dogs.

(7) The Wilcoxon rank sum test statistic is called a distribution-free test statistic since
A. its sampling distribution under Ho does not depend on the shape of the population distributions.
B. it can be used even if the population distributions are non-normal.
C. its sampling distribution is non-normal for small sample sizes.
D. its sampling distribution does not require the variances to be equal.
E. it has greater power than the pooled t-test when the population distributions are non-normal.

(8) The random assignment of treatments to experimental units is crucial in designed experiments
A. since it eliminates the effects of nontreatment factors on the experimental responses.
B. since the effects of nontreatment factors are averaged over all experimental units, no matter the treatment.
C. since it allows us to estimate the amount that nontreatment factors affect the average response.
D. since the randomization prevents the experimenter from knowing which experimental units are assigned to which treatments.
E. [text missing] reasons are correct.

(9) An experimenter wants to test Ho: F = Fo, where F is a process cdf. Which ONE of the following statements is FALSE?
A. The Chi-squared GOF test statistic can be used even when F is an absolutely continuous cdf.
B. The Cramer-von Mises test statistic is generally a better test than the Kolmogorov-Smirnov test statistic.
C. The Anderson-Darling test has greater power than the Chi-squared GOF test for all choices of Fo.
D. The Shapiro-Wilk test has greater power than the Anderson-Darling test if F has a normal cdf.
E. The Kolmogorov-Smirnov test is not an appropriate test statistic when Fo is discrete.

(10) If f(y; θ) is a pdf which is symmetric about θ, then, amongst the three test statistics discussed in class, the test statistic having greatest power
A. is the Wilcoxon signed rank test.
B. is the sign test.
C. is the t-test.
D. depends on the form of f(y; θ).
E. the tests have essentially the same power.
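Parts 2 to 4 of Problem I of this exam (5 of 25 plants meeting the standard, against a baseline proportion of .10) can be explored numerically. The sketch below uses an exact upper-tailed binomial test, which is an assumption here; the exam may have intended the large-sample normal approximation instead. The helper names `binom_upper_tail` and `critical_value` are illustrative.

```python
from math import comb

def binom_upper_tail(c, n, p):
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

def critical_value(n, p0, alpha):
    """Smallest c such that rejecting when X >= c has size at most alpha."""
    for c in range(n + 2):
        if binom_upper_tail(c, n, p0) <= alpha:
            return c

n, p0, alpha = 25, 0.10, 0.05
c = critical_value(n, p0, alpha)          # reject Ho: pi <= .10 when X >= c
size = binom_upper_tail(c, n, p0)         # actual level of the exact test
p_value = binom_upper_tail(5, n, p0)      # observed X = 5 plants met the standard

# Power function evaluated at the proportions listed in part 4.
power = {p: binom_upper_tail(c, n, p) for p in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6)}
for p, pw in power.items():
    print(f"pi = {p:.2f}  power = {pw:.4f}")
```

Because the binomial is discrete, the attained level `size` is below the nominal .05; plotting `power` against the listed proportions gives the rough power curve that part 4 asks for.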

November 21, 2001

I. (35 points) A company that manufactures silicon wafers for computer chips is concerned with both the mean thickness of the chips and the fluctuation in the thickness of the chips. In order to monitor the thickness, a random sample of 20 chips is selected every hour and the thickness is measured on each of the chips. The process is considered to be in control provided the process mean, µ, is 200 mm and the process standard deviation, σ, is less than or equal to 2.5 mm. The company's statistician develops a test to evaluate whether the process standard deviation is greater than 2.5 mm. She plots the power curve of the test in order to evaluate its performance. The curve is given here:

[Figure: Power Curve for Standard Deviation; vertical axis: Power; horizontal axis: Process Standard Deviation]

Use the above graph to answer questions a.-d.
a. What is the level of significance of the test whose power curve is depicted above?
b. What is the probability of a Type I error if σ = 2.4?
c. What is the probability of a Type II error if σ = 2.4?
d. What is the probability of a Type II error if σ = 3?
e. What is the practical consequence to the company if the test commits a Type I error?
f. What is the practical consequence to the company if the test commits a Type II error?

g. The random sample of 20 chips yields a sample standard deviation of [value missing]. Compute the p-value of the test statistic and determine if the process standard deviation is outside of its specification.

II. (25 points) The company also needs to meet the specification that the mean thickness of the wafers must be at least 200 mm. The company wants to develop a test which has a level of significance of 5% to determine if the mean thickness is less than 200 mm.
a. Suppose the sample size is 20 units and we develop a level 0.05 test based on n = 20. What is the chance the test will detect that the wafer mean thickness is less than 200 mm when the process mean is 199 mm and [value missing] mm? (Assume that σ = 2.5 mm.)
b. Sketch the power curve for the test.
c. What sample size is needed so that the test in part a. will have an 80% chance to detect that the wafer mean thickness is less than 200 mm when the true mean thickness is at most [value missing] mm?

III. (4 points each) Place the letter of the best answer in the blank to the left of each question.

(1) Suppose that the company's process engineer informed you that in most samples of 20 wafers the distribution of wafer thicknesses is highly skewed to the right. In using the Chi-square test of whether σ is greater than 2.5 mm with α = 0.05,
A. the actual level of significance will be greater than [value missing].
B. the actual level of significance will be less than [value missing].
C. the actual level of significance will be very close to [value missing].
D. it is completely unknown what the effect will be.

(2) The p-value of the observed value of a test statistic is
A. the probability of making the correct decision.
B. the weight of evidence that the alternative hypothesis is true.
C. the smallest level of significance for which the observed data will reject the null hypothesis.
D. the probability of making a Type I error.

(3) The sample estimator θ̂ of a population parameter θ is unbiased. Unbiased means that the estimator θ̂
A. is 95% certain of being close to θ.
B. has a sampling distribution which is symmetric about θ.
C. has a smaller mean squared error than a biased estimator.
D. has a sampling distribution with mean value θ.
E. all the above

(4) A 95/95 tolerance interval is to be constructed for a population having pdf f( ).
A. The tolerance interval is always wider than a 95% confidence interval for the population parameter.
B. It is necessary to specify the family for f( ) in order to construct the tolerance interval.
C. The distribution-free tolerance interval will generally be wider than the tolerance interval based on a specified family for f( ).
D. The normal based tolerance interval will have approximately the correct probabilities provided the sample size is large enough for the central limit theorem to be valid.
E. All of the above.

(5) In an α = 0.05 test of Ho: µ ≤ 12 versus H1: µ > 12, the probability of a Type II error is
A. greater at µ = 13 than at µ = 14.
B. smaller at µ = 13 than at µ = 14.
C. the same at µ = 13 as at µ = 14.
D. always less than [value missing].

(6) In testing Ho: µ ≥ µo versus H1: µ < µo, where µ is the median of a population having a symmetric pdf, f( ),
A. the power of the t-test is greater than the power of the Wilcoxon signed-rank test.
B. the power of the t-test is greater than the power of the sign test.
C. the power of the t-test is less than the power of the Wilcoxon signed-rank test.
D. the power of the t-test is less than the power of the sign test.
E. None of the above

(7) The observations X1, ..., Xn are positively correlated with correlation greater than 0.8. An α = 0.05 t-test, t = (X̄ − µo)/(s/√n), of Ho: µ ≤ µo versus H1: µ > µo will have
A. maximum probability of Type I error equal to [value missing].
B. maximum probability of Type I error less than [value missing].
C. maximum probability of Type I error greater than [value missing].
D. maximum probability of Type I error cannot be computed.

(8) A study is to be conducted to estimate the mean conductivity (in ohms) of a new alloy. What sample size is needed to ensure that the sample mean will estimate the average conductivity to within 5 ohms with a reliability of 99%? Conductivity has a normal distribution with a standard deviation of approximately 15.
A. 538
B. 30
C. 60
D. 49
E. cannot be determined with the given information

(9) A 95/99 tolerance interval for a normal population is
A. a 95% C.I. for µ and a 99% C.I. for σ.
B. an estimate of the population mean in which we are 99% confident that the sample mean is within 95% of σ from the population mean.
C. an estimate of a region of values which will contain between 95% and 99% of the population values.
D. a region of values for which we are 99% confident that the region will contain at least 95% of the population values.

(10) In testing Ho: σ ≥ 2.3 versus H1: σ < 2.3, the p-value of the test statistic was computed to be [value missing]. If the level of significance was α = 0.05 and the true value of σ = 1.8, then the decision based on the data
A. was a Type I error.
B. was a Type II error.
C. was a correct decision.
D. depends on the power of the test.
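The sample-size calculation behind question (8) of this exam follows from requiring the half-width z·σ/√n of a normal-theory interval to be no larger than the target margin, which gives n ≥ (z·σ/E)². A minimal sketch, treating the stated σ ≈ 15 as known (an assumption; with σ estimated, a t-based iteration would be needed) and using the illustrative function name `sample_size_for_mean`:

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z * sigma / sqrt(n) <= margin for a two-sided interval."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Values from question (8): within 5 ohms, 99% reliability, sigma about 15.
n_needed = sample_size_for_mean(sigma=15, margin=5, confidence=0.99)
print(n_needed)
```

Only the ratio σ/E enters the formula, so doubling both the standard deviation and the margin leaves the required sample size unchanged.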

November 21, 2002

I. (36 points) Suppose X1, ..., Xn are n observations from a population.
(A) Suppose the Xi's are iid and the population distribution is normal with µ and σ unknown. Describe how to compute the power curve for testing the hypotheses Ho: µ ≤ µo versus Ha: µ > µo.
(B) Suppose the Xi's are iid but the population distribution is symmetric with very heavy tails. If n is relatively small, describe an alternative test statistic to the one used in part (A) for testing the hypotheses Ho: µ ≤ µo versus Ha: µ > µo.
(C) Suppose the Xi's are iid and the population distribution is normal. Determine the smallest sample size necessary for an α = 0.05 test of the hypotheses Ho: µ ≤ µo versus Ha: µ > µo to have power at least .8 whenever µ > µo + .75σ.
(D) Suppose X1, ..., Xn are iid and the population distribution is very skewed to the right. Describe an interval of values for which you are 95% confident that the interval contains 90% of the population values.
(E) Describe a procedure to determine if the Xi's are positively correlated.
(F) Suppose the population distribution is normal. What is the effect of the positive correlation on the standard test of the hypotheses Ho: µ ≤ µo versus Ha: µ > µo? Justify your answer.

II. (24 points) A company produces a product whose time to failure T has an exponential distribution with average time to failure λ = 25 (in thousands of hours). They make changes to the product and want to determine if the changes have an effect on the distribution of the average times to failure. An accelerated life test is performed on a random sample of 15 units produced after the product changes, yielding a sample mean T̄ = 40 (thousands of hours).
(A) Describe a graphical technique to evaluate whether the failure times from the accelerated life tests follow an exponential distribution. Make sure to label your axes.
(B) Construct a 90% confidence interval for the average time to failure. You may assume that T1, ..., T15 are iid exponential with T̄ = 40. Hint: If T1, ..., Tn are iid exponential with average value λ, then 2nT̄/λ has a Chi-squared distribution with d.f. = 2n.
(C) Use the confidence interval you constructed in part (B) to test the hypothesis that the average time to failure after the changes is greater than 25. What is the level of significance of your test?
(D) Construct a 95% prediction interval for the time to failure, T, of a single unit produced after the changes were made to the product. You may assume that T1, ..., T15 are iid exponential with T̄ = 40. Hint: T and T̄ are independent and T1, ..., Tn are iid exponential with average value λ, thus 2nT̄/λ has a Chi-squared distribution with d.f. = 2n.

III. (4 points each) Place the letter of the best answer in the blank to the left of each question.

(1) Suppose that X1, ..., Xn are highly positively correlated with a N(µ, σ²) distribution. A 95% confidence interval for µ was constructed using the formula X̄ ± (t(α/2, n−1))(s/√n). The true coverage probability of this confidence interval
A. is [value missing].
B. is much less than [value missing].
C. is very close to [value missing].
D. is much greater than [value missing].
E. may be greater or less than [value missing].

(2) The p-value of the observed value of a test statistic is
A. the probability of making the correct decision.
B. the smallest level of significance for which the observed data will reject the null hypothesis.
C. the largest level of significance for which the observed data will reject the null hypothesis.
D. the probability of making a Type I error.
E. the probability of making a Type II error.

(3) There are two sample estimators θ̂1 and θ̂2 of a population parameter θ. θ̂1 is unbiased and θ̂2 is biased.
A. θ̂1 is always preferred to θ̂2.
B. θ̂1 is preferred to θ̂2 because it is a more accurate estimator than θ̂2.
C. θ̂2 is preferred to θ̂1 if it has a smaller variance than θ̂1.
D. θ̂2 is preferred to θ̂1 if it has a smaller variance and smaller bias than θ̂1.
E. θ̂1 is preferred to θ̂2 if it has a smaller variance than θ̂2.

(4) A 95/95 tolerance interval is constructed for a population having pdf f( ) and mean µ.
A. The tolerance interval is wider than a 95% confidence interval for µ.
B. It is necessary to specify the family for f( ) in order to construct the tolerance interval.
C. If we know that the population distribution is normal but use a distribution-free tolerance interval, then the tolerance interval will contain more than 95% of the population values.
D. A normal based tolerance interval will have approximately the correct probabilities for any pdf provided the sample size is large enough for the central limit theorem to be valid.
E. All of the above.

(5) In an α = 0.05 test of Ho: µ ≥ 12 versus H1: µ < 12, the probability of a Type II error is
A. greater at µ = 13 than at µ = 14.
B. smaller at µ = 10 than at µ = 11.
C. the same at µ = 10 as at µ = 11.
D. always less than [value missing].
E. cannot be determined because my dog ate my noncentral t-tables

(6) In testing Ho: µ ≥ µo versus H1: µ < µo, where µ is the finite expected value for a population having a symmetric pdf, f( ),
A. the power of the t-test is greater than the power of the Wilcoxon signed-rank test.
B. the power of the t-test is greater than the power of the sign test.
C. the power of the t-test is less than the power of the Wilcoxon signed-rank test.
D. the power of the t-test is less than the power of the sign test.
E. None of the above

(7) The Anderson-Darling statistic is preferred to the Chi-square statistic in testing Ho: F = Fo based on a random sample X1, ..., Xn from a continuous cdf F because
A. the Anderson-Darling is easier to compute.
B. the number of degrees of freedom must be approximated for the Chi-squared test.
C. the probability of Type I error is greater for the Chi-squared test.
D. the Anderson-Darling test has greater power.
E. all of the above

(8) A study is to be conducted to estimate the mean conductivity (in ohms) of a new alloy. What sample size is needed to ensure that the sample mean will estimate the average conductivity to within 10 ohms with a reliability of 99%? Conductivity has a normal distribution with a standard deviation of approximately 30.
A. 35
B. 49
C. 60
D. 538
E. cannot be determined since σ is unknown

(9) The Kolmogorov-Smirnov, Cramer-von Mises, and Anderson-Darling statistics are referred to as distribution-free tests for testing Ho: F = Fo when the population cdf Fo is completely specified because
A. the three test statistics have the same level of significance.
B. any unknown parameters have very little effect on their null distribution.
C. the null distributions of the three tests do not depend on the particular form of Fo.
D. the power function is identical for the three tests.

(10) In testing Ho: µ ≤ 5 vs Ha: µ > 5, the P-value of the test statistic was computed to be [value missing]. If the level of significance was α = .10 and the true value of µ was µ = 4, the decision based on the data
A. was a Type I error.
B. was a Type II error.
C. was a Type III error.
D. was correct.
E. was either a Type I error or a correct decision.
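For part (C) of Problem I of this exam, a first approximation to the required sample size treats σ as known, so the one-sided test statistic is normal and n ≥ ((z_α + z_β)/δ)², where δ = .75 is the shift in units of σ. This is a hedged sketch under that assumption only: with σ unknown the t-test's power follows a noncentral t distribution, and the exact answer may be somewhat larger. The function names `n_for_power` and `z_test_power` are illustrative.

```python
import math
from statistics import NormalDist

def n_for_power(effect_in_sd, alpha, power):
    """Normal-approximation sample size for a one-sided test of a mean."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha)   # upper-alpha critical value
    z_beta = z.inv_cdf(power)        # quantile matching the target power
    return math.ceil(((z_alpha + z_beta) / effect_in_sd) ** 2)

def z_test_power(n, effect_in_sd, alpha):
    """Power of the one-sided z-test when the mean is shifted by effect_in_sd * sigma."""
    z = NormalDist()
    return 1 - z.cdf(z.inv_cdf(1 - alpha) - effect_in_sd * math.sqrt(n))

n_approx = n_for_power(effect_in_sd=0.75, alpha=0.05, power=0.80)
print(n_approx, z_test_power(n_approx, 0.75, 0.05), z_test_power(n_approx - 1, 0.75, 0.05))
```

Checking the power at `n_approx` and at `n_approx - 1` confirms that the approximate sample size is the smallest one meeting the target under the known-σ assumption.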

November 18, 2003

I. (25 points) Suppose Y1, ..., Y60 are the distances (in thousands of miles) to failure of 60 transmissions produced by a supplier of truck transmissions. The data are given on the next page.
(A) Determine a warranty value such that you are 90% confident that at least 75% of all transmissions produced from the production line will have miles to failure greater than the warranty value.
(B) Predict with 90% confidence the number of miles to failure of a transmission randomly selected from the production facility.
(C) Estimate with 90% confidence the proportion of transmissions from the production facility having miles to failure greater than 50,000 miles.
(D) Suppose the Yi's are iid but the population distribution is multimodal due to the supplier having several production facilities with varying levels of quality. Determine a warranty value such that you are 90% confident that at least 75% of all transmissions produced from the production facilities will have miles to failure greater than the warranty value.

The following data are the failure times of the 60 transmissions in thousands of miles: [data missing]

The following summary statistics were computed from the 60 failure times: Ȳ, S, Y(1), Q̂(.25), Q̂(.5), Q̂(.75), Y(60) [values missing]

[Figure: Normal Quantile Plot for Failure Times; vertical axis: Failure Times; horizontal axis: Standard Normal Quantiles]

II. (75 points) Answer each of the following questions in 20 words or less.

(1) Suppose that X1, ..., Xn is a sample from a N(µ, σ²) distribution but the Xi's are highly positively correlated. A 95% confidence interval for µ was constructed using the formula X̄ ± (t(α/2, n−1))(S/√n). What is the effect of the correlation on the true coverage probability of this confidence interval?

(2) There are two sample estimators θ̂1 and θ̂2 of a population parameter θ. θ̂1 is unbiased and θ̂2 is biased. Which of the two estimators is preferred? Justify your answer.

(3) A 95/95 tolerance interval is to be constructed for a population having pdf f( ). Is the following statement true: "A normal based tolerance interval will have approximately the correct probabilities provided the sample size is large enough for the central limit theorem to be valid."? Justify your answer.

(4) Why is the Anderson-Darling statistic preferred to the Chi-square statistic in testing whether a population cdf is Fo based on a random sample X1, ..., Xn from a continuous cdf F?

(5) A study is to be conducted to estimate the mean conductivity (in ohms) of a new alloy. What sample size is needed to ensure that the sample mean will estimate the average conductivity to within 10 ohms with a reliability of 99%? Conductivity has a normal distribution with a standard deviation of approximately [value missing].

(6) Explain why the Kolmogorov-Smirnov statistic is referred to as a distribution-free method for evaluating whether a population cdf $F$ is equal to $F_0$ when $F_0$ is completely specified.

(7) The Agresti-Coull C.I. for a proportion is somewhat more complex than the standard asymptotic C.I. for a proportion. Why would you recommend using the Agresti-Coull C.I.?

(8) What are the two major sources of error in using the bootstrap procedure to estimate the percentiles of a pivot?

(9) In order to construct a 95% C.I. for the mean $\mu$ of a population, a random sample $X_1, \ldots, X_{31}$ was selected from a process having pdf $f(\cdot)$. Because the sample size is relatively large, the biologist used $\bar{X} \pm t_{.025}\, S/\sqrt{n}$ as the C.I. for $\mu$. You plotted the data and noticed that the data was highly right skewed. What problems may exist in using this C.I.?

(10) Referring to question (9), the problem encountered with the interval is due to possible correlation between $\bar{X}$ and $S$. If the biologist cannot provide you with any information about the population distribution other than the data, describe a method to determine the degree of correlation between $\bar{X}$ and $S$ using the observed data from the study.

(11) Suppose $X_1, \ldots, X_{20}$ are iid with a highly right skewed distribution. The transformation $Y = g(X) = \sqrt{X}$ yields a Shapiro-Wilk p-value = …. A 95% C.I. for $\mu_Y$ is [2.86, 3.97]. Why is $[(2.86)^2, (3.97)^2]$ NOT an appropriate 95% C.I. for $\mu_X$?

(12) For each of the following sentences, state whether the sentence is true or false. If false, explain why.

(A). The sampling distribution of an estimator of a parameter $\theta$ will have approximately a normal distribution if the sample size is large enough.

(B). The bootstrap procedure for constructing a C.I. for a parameter $\theta$ is always preferred to using a distribution-based procedure in constructing the C.I. because the distribution-based procedure depends on various conditions being valid.

(C). If the population pdf $f(y; \theta)$ is symmetric about $\theta$, then the sample mean is a better estimator of $\theta$ than is the sample median.

(D). A 95% C.I. for a parameter $\theta$ is given by (2.75, 3.25). This means that there is a .95 probability that $\theta$ is between 2.75 and 3.25.
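
Questions (9) and (10) turn on how strongly the sample mean and the sample standard deviation move together when only the observed data are available. One possible approach, offered here as a minimal sketch rather than as the exam's official solution, is a nonparametric bootstrap: resample the data with replacement many times, record the mean and standard deviation of each resample, and compute the correlation between the two sets of values. The function name `bootstrap_mean_sd_correlation`, the default of 2000 resamples, and the simulated exponential data are assumptions made purely for illustration; they are not taken from the exam.

```python
import random
import statistics


def bootstrap_mean_sd_correlation(data, n_boot=2000, seed=0):
    """Bootstrap estimate of the correlation between the sample mean
    and the sample standard deviation (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(data)
    means, sds = [], []
    for _ in range(n_boot):
        # Draw n observations with replacement from the observed data.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
        sds.append(statistics.stdev(resample))

    # Pearson correlation between the bootstrap means and bootstrap SDs.
    m_bar = statistics.fmean(means)
    s_bar = statistics.fmean(sds)
    cov = sum((m - m_bar) * (s - s_bar) for m, s in zip(means, sds))
    var_m = sum((m - m_bar) ** 2 for m in means)
    var_s = sum((s - s_bar) ** 2 for s in sds)
    return cov / (var_m * var_s) ** 0.5


# Illustrative data only: 31 simulated right-skewed values (not the exam's data).
data_rng = random.Random(1)
skewed_sample = [data_rng.expovariate(1.0) for _ in range(31)]
print(bootstrap_mean_sd_correlation(skewed_sample))
```

For a normal population the sample mean and standard deviation are independent, so a bootstrap correlation well away from zero suggests that the t-based interval in question (9) may not have its nominal coverage.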

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

More information

Dongfeng Li. Autumn 2010

Dongfeng Li. Autumn 2010 Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Standard Deviation Estimator

Standard Deviation Estimator CSS.com Chapter 905 Standard Deviation Estimator Introduction Even though it is not of primary interest, an estimate of the standard deviation (SD) is needed when calculating the power or sample size of

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

Non-Parametric Tests (I)

Non-Parametric Tests (I) Lecture 5: Non-Parametric Tests (I) KimHuat LIM lim@stats.ox.ac.uk http://www.stats.ox.ac.uk/~lim/teaching.html Slide 1 5.1 Outline (i) Overview of Distribution-Free Tests (ii) Median Test for Two Independent

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

How To Test For Significance On A Data Set

How To Test For Significance On A Data Set Non-Parametric Univariate Tests: 1 Sample Sign Test 1 1 SAMPLE SIGN TEST A non-parametric equivalent of the 1 SAMPLE T-TEST. ASSUMPTIONS: Data is non-normally distributed, even after log transforming.

More information

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394-398, 404-408, 410-420

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394-398, 404-408, 410-420 BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp. 394-398, 404-408, 410-420 1. Which of the following will increase the value of the power in a statistical test

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Comparing Means in Two Populations

Comparing Means in Two Populations Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one?

SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one? SIMULATION STUDIES IN STATISTICS WHAT IS A SIMULATION STUDY, AND WHY DO ONE? What is a (Monte Carlo) simulation study, and why do one? Simulations for properties of estimators Simulations for properties

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394 BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394 1. Does vigorous exercise affect concentration? In general, the time needed for people to complete

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Variables Control Charts

Variables Control Charts MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. Variables

More information

Lecture Notes Module 1

Lecture Notes Module 1 Lecture Notes Module 1 Study Populations A study population is a clearly defined collection of people, animals, plants, or objects. In psychological research, a study population usually consists of a specific

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Confidence Intervals for One Standard Deviation Using Standard Deviation

Confidence Intervals for One Standard Deviation Using Standard Deviation Chapter 640 Confidence Intervals for One Standard Deviation Using Standard Deviation Introduction This routine calculates the sample size necessary to achieve a specified interval width or distance from

More information

Confidence Intervals for Cp

Confidence Intervals for Cp Chapter 296 Confidence Intervals for Cp Introduction This routine calculates the sample size needed to obtain a specified width of a Cp confidence interval at a stated confidence level. Cp is a process

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

More information

The Standard Normal distribution

The Standard Normal distribution The Standard Normal distribution 21.2 Introduction Mass-produced items should conform to a specification. Usually, a mean is aimed for but due to random errors in the production process we set a tolerance

More information

Hypothesis Testing: Two Means, Paired Data, Two Proportions

Hypothesis Testing: Two Means, Paired Data, Two Proportions Chapter 10 Hypothesis Testing: Two Means, Paired Data, Two Proportions 10.1 Hypothesis Testing: Two Population Means and Two Population Proportions 1 10.1.1 Student Learning Objectives By the end of this

More information

Week 3&4: Z tables and the Sampling Distribution of X

Week 3&4: Z tables and the Sampling Distribution of X Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal

More information

Exact Confidence Intervals

Exact Confidence Intervals Math 541: Statistical Theory II Instructor: Songfeng Zheng Exact Confidence Intervals Confidence intervals provide an alternative to using an estimator ˆθ when we wish to estimate an unknown parameter

More information

Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents 1 Introduction................................ 1 2 Testing distributional assumptions....................

More information

2 Sample t-test (unequal sample sizes and unequal variances)

2 Sample t-test (unequal sample sizes and unequal variances) Variations of the t-test: Sample tail Sample t-test (unequal sample sizes and unequal variances) Like the last example, below we have ceramic sherd thickness measurements (in cm) of two samples representing

More information

Difference of Means and ANOVA Problems

Difference of Means and ANOVA Problems Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

More information

Chapter 7 Section 1 Homework Set A

Chapter 7 Section 1 Homework Set A Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples

Comparing Two Groups. Standard Error of ȳ 1 ȳ 2. Setting. Two Independent Samples Comparing Two Groups Chapter 7 describes two ways to compare two populations on the basis of independent samples: a confidence interval for the difference in population means and a hypothesis test. The

More information

Hypothesis testing - Steps

Hypothesis testing - Steps Hypothesis testing - Steps Steps to do a two-tailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test

Nonparametric Two-Sample Tests. Nonparametric Tests. Sign Test Nonparametric Two-Sample Tests Sign test Mann-Whitney U-test (a.k.a. Wilcoxon two-sample test) Kolmogorov-Smirnov Test Wilcoxon Signed-Rank Test Tukey-Duckworth Test 1 Nonparametric Tests Recall, nonparametric

More information

START Selected Topics in Assurance

START Selected Topics in Assurance START Selected Topics in Assurance Related Technologies Table of Contents Introduction Some Statistical Background Fitting a Normal Using the Anderson Darling GoF Test Fitting a Weibull Using the Anderson

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Non-Inferiority Tests for One Mean

Non-Inferiority Tests for One Mean Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

More information

Nonparametric tests these test hypotheses that are not statements about population parameters (e.g.,

Nonparametric tests these test hypotheses that are not statements about population parameters (e.g., CHAPTER 13 Nonparametric and Distribution-Free Statistics Nonparametric tests these test hypotheses that are not statements about population parameters (e.g., 2 tests for goodness of fit and independence).

More information

CHAPTER 13. Experimental Design and Analysis of Variance

CHAPTER 13. Experimental Design and Analysis of Variance CHAPTER 13 Experimental Design and Analysis of Variance CONTENTS STATISTICS IN PRACTICE: BURKE MARKETING SERVICES, INC. 13.1 AN INTRODUCTION TO EXPERIMENTAL DESIGN AND ANALYSIS OF VARIANCE Data Collection

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

More information

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES

SCHOOL OF HEALTH AND HUMAN SCIENCES DON T FORGET TO RECODE YOUR MISSING VALUES SCHOOL OF HEALTH AND HUMAN SCIENCES Using SPSS Topics addressed today: 1. Differences between groups 2. Graphing Use the s4data.sav file for the first part of this session. DON T FORGET TO RECODE YOUR

More information

Lecture 8. Confidence intervals and the central limit theorem

Lecture 8. Confidence intervals and the central limit theorem Lecture 8. Confidence intervals and the central limit theorem Mathematical Statistics and Discrete Mathematics November 25th, 2015 1 / 15 Central limit theorem Let X 1, X 2,... X n be a random sample of

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

2 ESTIMATION. Objectives. 2.0 Introduction

2 ESTIMATION. Objectives. 2.0 Introduction 2 ESTIMATION Chapter 2 Estimation Objectives After studying this chapter you should be able to calculate confidence intervals for the mean of a normal distribution with unknown variance; be able to calculate

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

1.5 Oneway Analysis of Variance

1.5 Oneway Analysis of Variance Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters. Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

More information

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem) NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

More information

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10- TWO-SAMPLE TESTS Practice

More information

The Variability of P-Values. Summary

The Variability of P-Values. Summary The Variability of P-Values Dennis D. Boos Department of Statistics North Carolina State University Raleigh, NC 27695-8203 boos@stat.ncsu.edu August 15, 2009 NC State Statistics Departement Tech Report

More information

DATA INTERPRETATION AND STATISTICS

DATA INTERPRETATION AND STATISTICS PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

More information

Analysis of Variance ANOVA

Analysis of Variance ANOVA Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

More information

Final Exam Practice Problem Answers

Final Exam Practice Problem Answers Final Exam Practice Problem Answers The following data set consists of data gathered from 77 popular breakfast cereals. The variables in the data set are as follows: Brand: The brand name of the cereal

More information

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS

CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHAPTER 11 CHI-SQUARE AND F DISTRIBUTIONS CHI-SQUARE TESTS OF INDEPENDENCE (SECTION 11.1 OF UNDERSTANDABLE STATISTICS) In chi-square tests of independence we use the hypotheses. H0: The variables are independent

More information

Estimation and Confidence Intervals

Estimation and Confidence Intervals Estimation and Confidence Intervals Fall 2001 Professor Paul Glasserman B6014: Managerial Statistics 403 Uris Hall Properties of Point Estimates 1 We have already encountered two point estimators: th e

More information

NCSS Statistical Software

NCSS Statistical Software Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

More information

Skewed Data and Non-parametric Methods

Skewed Data and Non-parametric Methods 0 2 4 6 8 10 12 14 Skewed Data and Non-parametric Methods Comparing two groups: t-test assumes data are: 1. Normally distributed, and 2. both samples have the same SD (i.e. one sample is simply shifted

More information

The Normal distribution

The Normal distribution The Normal distribution The normal probability distribution is the most common model for relative frequencies of a quantitative variable. Bell-shaped and described by the function f(y) = 1 2σ π e{ 1 2σ

More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics References Some good references for the topics in this course are 1. Higgins, James (2004), Introduction to Nonparametric Statistics 2. Hollander and Wolfe, (1999), Nonparametric

More information

Introduction to Hypothesis Testing OPRE 6301

Introduction to Hypothesis Testing OPRE 6301 Introduction to Hypothesis Testing OPRE 6301 Motivation... The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief, or hypothesis, about

More information

A) 0.1554 B) 0.0557 C) 0.0750 D) 0.0777

A) 0.1554 B) 0.0557 C) 0.0750 D) 0.0777 Math 210 - Exam 4 - Sample Exam 1) What is the p-value for testing H1: µ < 90 if the test statistic is t=-1.592 and n=8? A) 0.1554 B) 0.0557 C) 0.0750 D) 0.0777 2) The owner of a football team claims that

More information

NCSS Statistical Software. One-Sample T-Test

NCSS Statistical Software. One-Sample T-Test Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

Unit 26 Estimation with Confidence Intervals

Unit 26 Estimation with Confidence Intervals Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference

More information

Math 251, Review Questions for Test 3 Rough Answers

Math 251, Review Questions for Test 3 Rough Answers Math 251, Review Questions for Test 3 Rough Answers 1. (Review of some terminology from Section 7.1) In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate,

More information

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: CHI-SQUARE GOODNESS OF FIT TESTS 125: Chi-Square Goodness of Fit Tests CD12-1 125: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability

More information

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010 MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. STT315 Practice Ch 5-7 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The length of time a traffic signal stays green (nicknamed

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST

UNDERSTANDING THE INDEPENDENT-SAMPLES t TEST UNDERSTANDING The independent-samples t test evaluates the difference between the means of two independent or unrelated groups. That is, we evaluate whether the means for two independent groups are significantly

More information

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as...

HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1. used confidence intervals to answer questions such as... HYPOTHESIS TESTING (ONE SAMPLE) - CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men

More information

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance

Statistics courses often teach the two-sample t-test, linear regression, and analysis of variance 2 Making Connections: The Two-Sample t-test, Regression, and ANOVA In theory, there s no difference between theory and practice. In practice, there is. Yogi Berra 1 Statistics courses often teach the two-sample

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information
