The NeymanPearson lemma. The NeymanPearson lemma


 Adele Bryant
 2 years ago
 Views:
Transcription
1 The NeymanPearson lemma In practical hypothesis testing situations, there are typically many tests possible with significance level α for a null hypothesis versus alternative hypothesis. This leads to some important questions, such as (1) How to decide on the test statistic? () How to know that we selected the best rejection region? The NeymanPearson lemma Definition 7..1 Suppose that W is the test statistic and RR is the rejection region for a test of hypothesis concerning the value of a parameter θ. Then the power of the test is the probability that the test rejects H 0 when the alternative is true. That is, π = Power(θ) = P(W in RR when the parameter value is an alternative θ). If H 0 : θ = θ 0 and H a : θ θ 0, then the power of the test at some θ = θ 1 θ 0 is Power(θ) = P(reject H 0 θ = θ 1 ). But, β(θ 1 )=P(accept H 0 θ = θ 1 ). Therefore, Power(θ 1 ) = 1β(θ 1 ).
2 The NeymanPearson lemma Example 7..1 Let X 1,, X n be a random sample from a Poisson distribution with parameter λ, that is, the pdf is given by f (x) = e λ λ x / (x!). Then the hypothesis H 0 : λ=1 uniquely specifies the distribution, because f (x) = e 1 / (x!) and hence is a simple hypothesis. The hypothesis H a : λ>1 is composite, because f(x) is not uniquely determined. The NeymanPearson lemma Definition 7.. A test at a given α of a simple hypothesis H 0 versus the simple alternative H a that has the largest power among tests with the probability of type I error no larger than the given α is called the most powerful test.
3 The NeymanPearson lemma Theorem 7..1 (NeymanPearson Lemma) Suppose that one wants to test a simple hypothesis H 0 : θ = θ 0 versus the simple alternative hypothesis H a : θ = θ 1 based on a random sample X 1,, X n from a distribution with parameter θ. Let L(θ) L(θ; X 1,, X n )>0 denote the likelihood of the sample when the value of the parameter is θ. If there exist a positive constant K and a subset C of the sample space R n (the Euclidean nspace) such that 1. L θ 0 L θ 1. L θ 0 L θ 1 ( ) ( ) K for ( x, x,, x 1 n ) C ( ) ( ) K for ( x 1, x,, x n ) C ' ( ) C; θ 0 = α 3.P x 1, x,, x n Then the test with critical region C will be the most powerful test for H 0 versus H a. We call α the size of the test and C the best critical region of size α. The NeymanPearson lemma Example 7.. Let X 1,, X n denote an independent random sample from a population with a Poisson distribution with mean λ. Derive the most powerful test for testing H 0 : λ= versus H a : λ=1/. Solution Recall that the pdf of Poisson variable is e λ λ x, λ > 0, x = 0,1,, p(x) = x! 0, otherwise Thus, the likelihood function is L= n ( x i ) i= 1 λn [ λ e n i= 1 x! i ]
4 The NeymanPearson lemma Example 7.. n Solution (cont.) ( xi For λ=, L(θ 0 ) = L(λ = ) = [ ) i=1 e n ] n x i! and for λ=1/, i=1 ( xi L(θ 1 ) = L(λ = 1/ ) = [(1/ ) ) i=1 e n/ ] ( ) Thus, L(θ 0 ) x i e n L(θ 1 ) = ( 1 < K ) x i e n/ Taking natural logarithm, ( x i )ln 4 3n < ln K. Solving for x i and letting {[ln K + (3n / )] / ln 4} = K ', we will reject H 0 whenever x i < K '. n i=1 n x i! The NeymanPearson lemma Procedure for applying the NeymanPearson lemma 1. Determine the likelihood functions under both null and alternative hypotheses.. Take the ratio of the two likelihood functions to be less than a constant K. 3. Simplify the inequality in step to obtain a rejection region.
5 The NeymanPearson lemma Example 7..3 Suppose X 1,, X n is a random sample from a normal distribution with a known mean of µ and an unknown variance of σ. Find the most powerful αlevel test for testing H 0 : σ = σ 0 versus H a : σ = σ 1 (σ 1 >σ 0 ) Show that this test is equivalent to the χ test. Is the test uniformly most powerful for H a : σ > σ 0? The NeymanPearson lemma Example 7..3 Solution To test H 0 : (σ 1 > σ 0 ) versus H a : σ > σ 1. We have n 1 L(σ 0 ) = e (x i µ) σ 0 1 = e i=1 πσ 0 π for some K. (x i µ) ( ) n σ 0 n Similarly, 1 L(σ 1 ) = e σ 1. ( π ) n n σ 1 Therefore, the most powerful test is, reject H 0 if L(σ 0 ) L(σ 1 ) = σ 1 σ 0 n (x i µ) σ 0 exp{ (σ 1 σ 0 ) (x i µ) } K σ 1 σ 0.
6 The NeymanPearson lemma Example 7..3 Solution (cont.) Taking the natural logarithms, we have or nln σ 1 σ 1 σ 0 σ x i µ 1 σ 0 σ 0 ( ) ( ) ( x i µ ) nln σ 1 σ 0 ln K σ 1σ 0 σ 1 σ 0 ln K ( ) = C To find the rejection region for a fixed value of α, write the region as ( x i µ ) C σ = C ' 0 ( x i µ ) σ 0 Note that σ has a χ 0 distribution with n degrees of freedom. Under the H 0 because the same rejection region (does not depend upon the specific value of σ 1 in the alternative) would be used for any (σ 1 > σ 0 ), the test is uniformly most powerful. Likelihood ratio tests In this section, we shall study a general procedure that is applicable when one or both H 0 and H a are composite. We assume that the pdf or pmf of the random variable X is f(x,θ), where θ can be one or more known parameters. Consider the hypotheses H 0 : µ = µ 0 vs. H a : µ µ 0 Where θ is the known population parameter(s) with values in Θ, and Θ 0 is a subset of Θ. Let Θ represent the total parameter space that is the set of all possible values of the parameter θ given by either H 0 or H a. Maximum likelihood Definition max L θ; x 1,, x n θ Θ The likelihood ratio λ is the ratio λ = 0 ( ) L ( θ; x,, x 1 n ) = L * 0 L * max θ Θ Maximum likelihood We note that 0 λ 1. Because λ is the ratio of nonnegative functions, λ 0. Because Θ 0 is a subset of Θ,we know that max L( θ ) max L( θ ) θ Θ Hence, λ 1. 0 θ Θ
7 Likelihood ratio tests Likelihood ratio tests (LRTs) To test H 0 :θ Θ 0 vs. H a :θ Θ a λ = max L θ; x 1,, x n θ Θ 0 L θ; x 1,, x n max θ Θ will be used as the test statistic. ( ) ( ) = L * 0 L * The rejection region for the likelihood ratio test is given by Reject H 0 if λ K. K is selected such that the test has the given significance level α. Likelihood ratio tests Example Let X 1,, X n be a random sample from an N(µ,σ ). Assume that σ is known. We wish to test, at level α, H 0 : µ = µ 0 vs. H a : µ µ 0. Find an appropriate likelihood ratio test. Solution We have seen that to test H 0 : µ = µ 0 vs. H a : µ µ 0 there is no uniformly most powerful test for this case. The likelihood function is 1 L(µ) = πσ Here, Θ ={µ 0 } and Θ =R{µ 0 }. 0 a n exp{ n i=1 (x i µ) σ }.
8 Likelihood ratio tests Example Solution (cont.) Hence, L * 0 = 1 πσ Similarly, L * = max <µ< n exp{ 1 πσ (x i µ 0 ) Because the only unknown parameter in the parameter space Θ is µ,  <µ<, the maximum of the likelihood function is achieved when µ equals its maximum likelihood estimator, that is, ˆµ ml. = X. Therefore, with a simple calculation we have n exp{ (x i µ 0 ) / σ } i=1 λ = n = e n(x µ 0 ) /σ exp{ (x i x) / σ } i=1 n n i=1 σ }. exp{ n i=1 (x i µ) σ }. Likelihood ratio tests Example Solution (cont.) Thus, the likelihood ratio test has the rejection region Reject H 0 if λ K. which is equivalent to n σ (X µ 0 ) ln K (X µ 0 ) σ / n ln K X µ 0 σ / n ln K = c 1. We now compute c 1. Under H 0, [(X µ 0 ) / (σ / n)] ~ N(0,1). Hence, LRT for the given hypothesis is Reject H 0 if Xµ 0 σ / n > z a/. Thus, in this case, the likelihood ratio test is equivalent to the ztest for large random samples.
9 Likelihood ratio tests Procedure for the likelihood ratio test (LRT) 1. Find the largest value of the likelihood L(θ) for any θ 0 Θ 0 by finding the maximum likelihood estimate within Θ 0 and substituting back into the likelihood function.. Find the largest value of the likelihood L(θ) for any θ Θ by finding the maximum likelihood estimate within Θ and substituting back into the likelihood function. 3. Form the ratio λ = λ ( x 1,, x n ) = L( θ )inθ 0 L( θ )inθ 4. Determine a K so that the test has the desired probability of type I error, α. 5. Reject H 0 if λ K. Likelihood ratio tests Example 7.3. Machine I produces 5% defectives. Machine produces 10% defectives. Ten items produced by each of the machines are sampled randomly; X = number of defectives. Let θ be the true proportion of defectives. Test H 0 : θ = 0.05 versus H a : θ = 0.1. Use α = Solution We need to test H 0 : θ = 0.05 vs. H a : θ = 0.1. Let 10 x (0.05)x (0.95) 10 x, if θ = L(θ) = 10 x (0.1)x (0.90) 10 x, if θ = 0.10.
10 Likelihood ratio tests Example 7.3. Solution (cont.) And L 1 = L(0.05) = 10 x (0.05)x (0.95) 10 x and L = L(0.1) = 10 x (0.1)x (0.90) 10 x Thus, we have L 1 = (0.05)x (0.95) 10 x = 1 L (0.1) x (0.9) 10 x x x. L The ratio λ = 1 max(l 1, L ). Likelihood ratio tests Example 7.3. Solution (cont.) Note that if max(l 1,L )=L 1, then λ=1. Because we want to reject for small values of λ, max(l 1,L )=L, and we reject H 0 if (L 1 /L ) K or (L /L 1 ) > K (note that L /L 1 = x (18/19) 10x ). That is, reject H 0 if 18 x x > K (19 / 9) x > K. Hence, reject H 0 if X>C; P(X>C H 0 : θ=0.05) Using the binomial tables, we have P(X > θ = 0.05) = P(X θ = 0.05) = => Reject H 0 if X>. and P.S. The likelihood ratio tests do not always produce a test statistic with a known probability distribution.
11 Take a break. Hypotheses for a single parameter Definition Corresponding to an observed value of a test statistic, the pvalue (or attained significance level) is the lowest level of significance at which the null hypothesis would have been rejected. Pvalue: the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. [from wiki]
12 Hypotheses for a single parameter Steps to find the pvalue 1. Let TS be the test statistic.. Compute the value of TS using the sample X 1,, X n. Say it is a. 3. The pvalue is given by P(TS < a H 0 ), if lower tail test p value = P(TS > a H 0 ), if upper tail test P( TS > a H 0 ), if two tail test. pvalue depends on alternative hypothesis!! Hypotheses for a single parameter Example To test H 0 : µ = 0 vs. H a : µ 0, suppose that the test statistic Z results in a computed value of Then, the pvalue=p( Z >1.58)= *0.0571= That is, we must have a type I error of (or higher) in order to reject H 0. Also, if H a : µ > 0, then the pvalue would be P(Z>1.58)= In this case we must have an α of in order to reject H 0.
13 Hypotheses for a single parameter Reporting test result as pvalues 1. Choose the maximum value of α that you are willing to tolerate.. If the pvalue of the test is less than maximum value of α, reject H 0. Hypotheses for a single parameter Example 7.4. The management of a local health club claims that its members lose on the average 15 pounds or more within the first 3 months after joining the club. To check this claim, a consumer agency took a random sample of 45 members of this health club and found that they lost an average of 13.8 pounds within the first 3 months of membership, with a sample deviation of 4. pounds. (a) Find the pvalue for this test. (b) Based on the pvalue in (a), would you reject the null hypothesis at α =0.01?
14 Hypotheses for a single parameter Example 7.4. Solution (a) Let µ be the true mean weight loss in pounds within the first 3 months of membership in this club. Then we have to test the hypothesis H 0 : µ = 15 vs. H a : µ < 15 Here n = 45, x =13.8, and s = 4.. Because n = 45>30, we can use normal approximation. Hence, the test statistic is z = 4. / 45 = and pvalue=p(z< ) P(Z<1.9)= Thus, we can use an α as small as and still reject H 0. (b) No. Hypotheses for a single parameter Steps in any hypothesis testing problem 1. State the alternative hypothesis, H a (what is believed to be true).. State the null hypothesis, H 0 (what is doubted to be true). 3. Decide on a level of significance α. 4. Choose an appropriate TS and compute the observed test statistic. 5. Using the distribution of TS and α, determine the rejection region(s) (RR). 6. Conclusion: If the observed test statistic falls in the RR, reject H 0 and conclude that based on the sample information, we are (1 α)100% confident that H a is true. Otherwise, conclude that there is not sufficient evidence to reject H 0. In all the applied problems, interpret the meaning of your decision. 7. State any assumptions you made in testing the given hypothesis. 8. Compute the pvalue from the null distribution of the test statistic and interpret it.
15 Hypotheses for a single parameter Summary of hypothesis tests for µ Large Sample ( n 30) To test H H a 0 : µ = µ versus µ > µ, upper tail 0 : µ < µ, lower tail 0 0 test test µ µ, two  tailed test X µ 0 Test statistic : Z = σ / n Replace σ by S, if σ is unknown. 0 z > zα, upper tail RR Rejection region : z < zα, lower tail RR z > zα /, two tail RR Assumption : n 30 Small Sample ( n < 30) To test H H a : µ = µ versus µ > µ, upper tail test : µ < µ, lower tail test X µ 0 Test statistic : T = S / n t > tα, n 1, upper tail RR Rejection region : t < tα, n 1, lower tail RR t > tα /, n 1, two tail RR Assumption : Random sample comes from a normal population µ µ, two  tailed test 0 0 Hypotheses for a single parameter Summary of hypothesis tests for µ (cont.) Decision: Reject H 0, if the observed test statistic falls in the RR and conclude that H a is true with (1α)100% confidence. Otherwise, keep H 0 so that there is no enough evidence to conclude that H a is true for the given α and more experiments may be needed.
16 Hypotheses for a single parameter Example In a frequently traveled stretch of the I75 highway, where the posted speed is 70 mph, it is thought that people travel on the average of at least 75 mph. To check this claim, the following radar measurements of the speeds (in mph) is obtained for 10 vehicles traveling on this stretch of the interstate highway Do the data provide sufficient evidence to indicate that the mean speed at which people travel on this stretch of highway is at least 75 mph? Test the appropriate hypothesis using α = Draw a box plot and normal plot for this data, and comment. Hypotheses for a single parameter
17 Hypotheses for a single parameter Example Solution We need to test H 0 : µ = 75 vs. H a : µ > 75. For this sample, the sample mean is x =74.8 mph and the standard deviation is σ = mph. Hence, the observed test statistic is t = x µ = σ / n / 10 = From the ttable, t 0.01 =.81. Hence, the rejection region is {t >.81}. Because t = does not fall in the rejection region, we do not reject the null hypothesis at α = Hypotheses for a single parameter Example Solution (cont.) The box plot suggests that there are no outliers present. However, the normal plot indicates that the normality assumption for this data set is not justified. Hence, it may be more appropriate to do a nonparametric test.
18 Hypotheses for a single parameter Example A machine is considered to be unsatisfactory if it produces more than 8% defectives. It is suspected that the machine is unsatisfactory. A random sample of 10 items produced by the machine contains 14 defectives. Does the sample evidence support the claim that the machine is unsatisfactory? Use α = Hypotheses for a single parameter Example Solution Let Y be the number of observed defectives. This follows a binomial distribution. However, because np 0 and nq 0 are greater than 5, we can use a normal approximation to the binomial to test the hypothesis. So we need to test H 0 : p = 0.08 versus H a : p > Let the point estimate of p be ˆp = (Y / n) = 0.117, the sample proportion. Then the value of the TS is z = ˆp p = = p 0 q (0.9) n 10 For α = 0.01, z 0.01 =.33. Hence, the rejection region is {z>.33}. Decision: Because is not greater than.33, we do not reject H 0. We conclude that the evidence does not support the claim that the machine is unsatisfactory.
19 Hypotheses for a single parameter Summary of hypothesis test for the proportion p To test H 0 : p = p 0 Versus p > p 0, upper tail test H a : p < p 0, lower tail test Test statistic: Z = ˆp p 0, where σ ˆp = p (1 p ) 0 0 σ ˆp n Rejection region: Z > Z α, upper tail RR Z < Z α, lower tail RR Z > Z α /, two tail RR Hypotheses for a single parameter Summary of hypothesis test for the proportion p (cont.) Assumption: n is large. A good rule of thumb is to use the normal approximation to the binomial distribution only when np 0 and n(1p 0 ) are both greater than 5 Decision: Reject H 0, if the observed test statistic falls in the RR and conclude that H a is true with (1α)100% confidence. Otherwise, do not reject H 0 because there is not enough evidence to conclude that H a is true for given α and more data are needed.
20 Hypotheses for a single parameter Summary of hypothesis test for the variance σ To test H 0 :σ = σ 0 Versus σ > σ 0, upper tail test H a :σ < σ 0, lower tail test σ σ 0, twotailed test. Test statistic: where S is the sample variance. Observed value of test statistic: Rejection region: (n 1)S χ = σ 0 χ > χ α,n 1, upper tail RR χ < χ 1 α,n 1, lower tail RR χ > χ α /,n 1 or χ > χ 1 α /,n 1, two tail RR (n 1)s σ 0 Hypotheses for a single parameter Summary of hypothesis test for the variance σ (cont.) Assumption: Sample comes from a normal population Decision: Reject H 0, if the observed test statistic falls in the RR and conclude that H a is true with (1α)100% confidence. Otherwise, do not reject H 0 because there is not enough evidence to conclude that H a is true for given α and more data are needed.
21 Hypotheses for a single parameter Example A physician claims that the variance in cholesterol levels of adult men in a certain laboratory is at most 100. A random sample of 5 adult males from this laboratory produced a sample standard deviation of cholesterol levels as 1. Test the physician s claim at 5% level of significance. Solution To test H 0 : σ =100 versus H a : σ <100 For α = 0.05, and 4 degrees of freedom, the rejection region is RR = {χ < χ 1 α,n 1 } = {χ < 13.84}. The observed value of the TS is χ = (4)(144) 100 = Because the value of the test statistic does not fall in the rejection region, we cannot reject H 0 at 5% level of significance. Testing of hypotheses for two samples Independent Samples Two random samples are drawn independently of each other from two populations, and the sample information is obtained. We are interested in testing a hypothesis about the difference of the true means. Let X 11,, X 1n be a random sample from population 1 with mean µ 1 and variance σ 1, and X 1,, X n be a random sample from population with mean µ and variance σ. Let X represent the respective i,i = 1, sample means and S i,i = 1, represent the respective sample variances. In testing hypotheses about µ 1 and µ : (i) σ 1 and σ are known; (ii) σ 1 and σ are unknown and n 1 30 and n 30 ; (iii) σ 1 and σ are unknown and n and ; (a) 1 < 30 n < 30 σ 1 = σ (b) σ 1 σ Most common; most complicated in computation
22 Testing of hypotheses for two samples Hypothesis test for µ 1  µ for large samples (n 1 & n 30) To test H 0 : µ 1  µ = D 0 versus µ 1 µ > D 0, upper tailed test H a :{ µ 1 µ < D 0, µ 1 µ D 0, lower tailed test twotailed test. The test statistic is Z = X 1 X D 0 σ 1 n 1 + σ Replace σ i by S i, if σ i, i = 1, are not known. n. Testing of hypotheses for two samples Hypothesis test for µ 1  µ for large samples (n 1 & n 30) (cont.) Rejection region is RR :{ z > z α, z < z α, z > z α /, upper tailed RR lower tailed RR twotailed RR, where z is the observed test statistic given by z = x 1 x D 0 σ 1 n 1 + σ n. Assumption: the samples are independent and n 1 and n 30. Decision: Reject H 0, if test statistic falls in the RR and conclude that H a is true with (1 α)100% confidence. Otherwise, do not reject H 0 because there is not enough evidence to conclude that H a is true for a given α and more experiments are needed.
23 Testing of hypotheses for two samples Example In a salary study of faculty at a center university, sample salaries of 50 male assistant professors and 50 female assistant professors yielded the following basic statistics. Sample mean salary Sample standard deviation Male assistant professors $ 36, Female assistant professors $34,00 0 Test the hypothesis that the mean salary of male assistant professors is more than the mean salary of female assistant professors at this university. Use α = 0.05 Testing of hypotheses for two samples From wiki
24 Testing of hypotheses for two samples Example Solution Let µ 1 be the mean salary for male assistant professors and µ for the mean salary for female assistant professors at this university. To test H 0 : µ 1  µ = 0 vs. H a : µ 1 µ > 0 The test statistics is z = x x D , , 00 σ 1 + σ = = (360) + (0) n The rejection region for α = 0.05 is {z > 1.645}. n Testing of hypotheses for two samples Example Solution (cont.) Because z = > 1.645, we reject the null hypothesis at α = We conclude that the salary of male assistant professors at this university is higher than that of female assistant professors for α = Note that even though σ 1 and σ are unknown, because n 1 30 and n 30, we could replace σ 1 and σ by the respective sample variances. We are assuming that the salaries of male and female are sampled independently of each other.
25 Testing of hypotheses for two samples Comparison of two population means, small sample case (pooled t test); assume variances are equal To test H 0 : µ 1  µ = D 0 versus µ 1 µ > D 0, upper tailed test H a :{ µ 1 µ < D 0, µ 1 µ D 0, lower tailed test twotailed test. The test statistic is T = X X D S p + 1 n 1 n Here the pooled sample variance is S p = (n 1 1)S 1 + (n 1)S n 1 + n. Testing of hypotheses for two samples Comparison of two population means, small sample case (pooled t test) (cont.) Then the rejection region is t > t α, upper tailed RR RR :{ t < t α, lower tailed RR t > t α /, twotailed RR where t is the observed test statistic and t α is based on (n 1 +n ) degrees of freedom, and such that P(T > t α ) = α. Decision: Reject H 0, if test statistic falls in the RR and conclude that H a is true with (1 α)100% confidence. Otherwise, do not reject H 0 because there is not enough evidence to conclude that H a is true for a given α. Assumption: The samples are independent and come from normal populations with mean µ 1 and µ, and with the (unknown) but equal variances, that is, =. σ 1 σ
26 Testing of hypotheses for two samples Now we shall consider the case where and are unknown and cannot be assumed to be equal. In such case the following test is often used. For the hypothesis µ 1 µ > D 0, H 0 : µ 1  µ = D 0 vs. define the test statistic T v as H a :{ T v = x 1 x D 0 σ 1 σ µ 1 µ < D 0, µ 1 µ D 0, + S n 1 n where T v has a tdistribution with v degrees of freedom, and S 1 v = [(S 1 / n 1 ) + (S / n )] (S 1 / n 1 ) n (S / n ) n 1 Testing of hypotheses for two samples The value of v will not necessarily be an integer. In that case, we will round it down to the nearest integer. This method of hypothesis testing with unequal variances is called the SmithSatterthwaite procedure.
27 Testing of hypotheses for two samples Example 7.5. The intelligence quotients (IQs) of 17 students from one area of a city showed a sample mean of 106 with a sample standard deviation of 10, whereas the IQs of 14 students for another area chosen independently showed a sampled mean of 109 with a standard deviation of 7. Is there a significant difference between the IQs of the two groups at α = 0.0? Assume that the population variance are equal. Testing of hypotheses for two samples Example 7.5. Solution We test H 0 : µ 1 µ = 0 vs. H a : µ 1 µ 0 Here n 1 = 17, x 1 = 106, and s 1 = 10. Also, n = 14, x = 109, and s = 7. We have S p = (n 1)S (n 1)S n 1 + n The test statistic is = (16)(10) + (13)(7) 9 T = X 1 X D 0 S p 1 n n = = 1 ( ) =
28 Testing of hypotheses for two samples Example 7.5. Solution (cont.) For α = 0.0, t 0.01, 9 =.46. Hence, the rejection region is t < .46 or t >.46. Because the observed value of the test statistic, T = , does not fall in the rejection region, there is not enough evidence to conclude that the mean IQs are different for the two groups. Here we assume that the two samples are independent and are taken from normal populations. Testing of hypotheses for two samples Example Infrequent or suspended menstruation can be a symptom of serious metabolic disorder in women. In a study to compare the effect of jogging and running on the number of menses, two independent subgroups were chosen from a large group of women, who were similar in physical activity (aside from running), heights, occupations, distribution of ages, and the type of birth control methods being used. The first group consisted of a random sample of 6 women joggers who jogged slow and easy 5 to 30 miles per week, and the second group consisted of a random sample of 6 women runners who ran more than 30 miles per week and combined with long distance, slow speed walk. The following summary statistics were obtained (E. Dale, D. H. Gerlach, and A. L. Wilhite, Menstrual Dysfunction in Distance Runner, obstet. Gynecol. 54, 4753, 1979). Joggers x 1 = 10.1, S 1 =.1 Runners = 9.1, S =.4 x
29 Testing of hypotheses for two samples Example (cont.) Using α = 0.05, (a) test for differences in mean number of menses for each group assuming equality of population variances, and (b) test for differences in mean number of menses for each group assuming inequality of population variances. Solution Here we need to test H 0 : µ 1 µ = 0 vs. H a : µ 1 µ 0 Here n 1 = 6, x 1 = 10.1, and s 1 =.1. Also, n = 6, x = 9.1, and s =.4. (a) Under the assumption =, we have σ 1 σ S p = (n 1)S (n 1)S n 1 + n = (5)(.1) + (5)(.4) 50 = Testing of hypotheses for two samples Example Solution (cont.) The test statistic is T = X 1 X D 0 S p 1 n n = 1 ( 5.085) = For α = 0.05, t 0.05, Hence, the rejection region is t < or t > Because T = does not fall in the rejection region, we do not reject the null hypothesis. At α = 0.05, there is not enough evidence to conclude that the population mean number of menses for the joggers and runners are different.
30 Testing of hypotheses for two samples Example Solution (cont.) (b) Under the assumption, we have σ 1 σ v = [(S 1 / n 1 ) + (S / n )] (S 1 / n 1 ) n (S / n ) n 1 = ( (.1) 6 + (.4) 6 ) ( (.1) 6 ) + 5 ( (.4) 6 ) 5 = Hence, we have v = 49 degrees of freedom. Because this value is large, the rejection region is still approximately t < and t > Hence, the conclusion is the same as that of part (a). In both parts (a) and (b), we assumed that the samples are independent and came from two normal populations. Testing of hypotheses for two samples Hypothesis test for (p 1 p ) for large samples (n i p i > 5 and n i (1p i ) > 5, for i = 1, ) Assume binomial distribution is approximated by normal distribution. To test H0: p 1 p = D 0 versus p 1 p > D 0, upper tailed test H a :{ p 1 p < D 0, lower tailed test p 1 p D 0, twotailed test at significant level α, the test statistic is Z = ˆp ˆp D 1 0 ˆp 1 ˆq 1 + ˆp ˆq n 1 n where z is the observed value of Z.
31 Testing of hypotheses for two samples Hypothesis test for (p 1 p ) for large samples (n i p i > 5 and n i q i > 5, for i = 1, ) (cont.) The rejection region is z > z α, upper tailed RR RR :{ z < z α, lower tailed RR z > z α /, twotailed RR Assumption: The samples are independent and n i p i > 5 and n i (1p i ) > 5, for i = 1,. Decision: Reject H 0 if test statistic falls in the RR and conclude that H a is true with (1 α)100% confidence. Otherwise, do not reject H 0 because there is not enough evidence to conclude that H a is true for given α and more experiments are needed. Testing of hypotheses for two samples Example Because of the impact of the global economy on a highwage country such as United States, it is claimed that the domestic content in manufacturing industries fell between 1977 and A survey of 36 randomly picked U.S. companies gave the proportion of domestic content total manufacturing in 1977 as 0.37 and in 1997 as At the 1% level of significance, test the claim that the domestic content really fell during the period
32 Testing of hypotheses for two samples Example Solution Let p 1 be the domestic content in 1977 and p be the domestic content in Given n 1 = n = 36, ˆp 1 = 0.37 and ˆp = We need to test H 0 : p 1 p = 0 vs. H a : p 1 p > 0. The test statistic is Z = ˆp ˆp D 1 0 ˆp 1 ˆq 1 + ˆp ˆq n 1 n = (0.37)(0.63) + (0.36)(0.64) = Testing of hypotheses for two samples Example Solution (cont.) For α = 0.01, z 0.01 =.35. Hence, the rejection region is z >.35. Because the observed value of the test statistic does not fall in the rejection region, at α = 0.01, there is not enough evidence to conclude that the domestic content in manufacturing industries fell between 1977 and 1997.
33 Testing of hypotheses for two samples We have already seen in Ch.4 that F = S 1 /σ 1 S /σ follows the Fdistribution with v 1 = n 1 1 numerator and v = n 1 degrees of freedom. Under the assumption H 0 : =, we have F = S 1 which has an Fdistribution with (v 1, v ) degrees of freedom. S σ 1 σ Testing of hypotheses for two samples Testing for the equality of variances To test H 0 : σ 1 = σ versus σ 1 > σ, upper tailed test H a :{ σ 1 < σ, lower tailed test σ 1 σ, twotailed test at significance level α, the test statistic is F = S 1 S The rejection region is f > F α (v 1, v ), RR :{ f < F 1 α (v 1, v ), f > F α / (v 1, v ) or f < F 1 α / (v 1, v ), upper tailed RR lower tailed RR twotailed RR
34 Testing of hypotheses for two samples Testing for the equality of variances (cont.) s1 where f is the observed test statistic given by f =. s Decision: Reject H 0 if test statistic falls in the RR and conclude that H a is true with (1 α)100% confidence. Otherwise, keep H 0, because there is not enough evidence to conclude that H a is true for given α and more experiments are needed. Assumption: (i) The two random samples are independent. (ii) Both populations are normal. In order to find F 1α (v 1,v ), we use the identity F 1α (v 1,v ) = (1/F α (v,v 1 )) Testing of hypotheses for two samples Example Consider two independent random samples X 1,, X n from an N(µ 1,σ 1 ) distribution and Y 1,, Y n from an N(µ,σ ) distribution. σ 1 Test H 0 : = versus H a : for the following basic statistics: n 1 = 5, x 1 = 410, s 1 = 95, and n = 16, x = 390, s = 300 Use α = 0.0. σ σ 1 σ Solution Test H 0 : σ 1 = σ versus H a : σ 1 σ. This is a twotailed test. Here the degrees of freedom are v 1 = 4 and v = 15. The test statistic is F = S 1 S = =
35 Testing of hypotheses for two samples Example Solution (cont.) From the Ftable, F 0.10 (4,15) = 1.9 and F 0.90 (4,15) = (1/F 0.10 (15,4)) = Hence, the rejection region is F > 1.90 and F < Because the observed value of the test statistic, 0.317, is less than 0.56, we reject the null hypothesis. There is evidence that the population variances are not equal. Take a break.
36 Testing of hypotheses for two samples Dependent Samples Two samples are dependent: each data point in one sample can be coupled in some natural, nonrandom fashion with each data point in the second sample. The pairing may be the result of the individual observations in the two samples: (1) represent before and after program, () sharing the same characteristics, (3) being matched by location, (4) being matched by time, (5) control and experimental, and so forth. Testing of hypotheses for two samples Dependent Samples Let (X 1i,X i ) for i = 1,,, n, be a random sample. X 1i and X j (i j) are independent. To test the significant of the difference between two population means when the samples are dependent, we first calculate for each pair of scores the difference, D i = X 1i X i, i = 1,,, n, between the two scores. Let µ D = E(D i ). Because pairs of observations form a random sample D 1,, D n are i.i.d random variables, if d 1,, d n are the observed values of D 1,, D n, the we define d = 1 n d i and s d = 1 n d i 1 (d i d) i=1 n ( d i) i=1 = n n 1 n 1 i=1 i=1 n n
37 Testing of hypotheses for two samples Testing for matched pairs experiment To test H 0 : µ D = d 0 versus H a :{ µ D > d 0, µ D < d 0, µ D d 0, upper tailed test lower tailed test twotailed test the test statistic: T = D d 0 (this approximately follows a Student t S D / n distribution with (n1) degrees of freedom). The rejection region is RR :{ t > t α,n 1, t < t α,n 1, t > t α /,n 1, upper tailed RR lower tailed RR twotailed RR where t is the observed test statistic. Testing of hypotheses for two samples Testing for matched pairs experiment (cont.) Assumption: The differences are approximately normally distributed. Decision: Reject H 0 if test statistic falls in the RR and conclude that H a is true with (1 α)100% confidence. Otherwise, do not reject H 0, because there is not enough evidence to conclude that H a is true for a given α and more data are needed.
38 Testing of hypotheses for two samples Example A new diet and exercise program has been advertised as remarkable way to reduce blood glucose levels in diabetic patients. Ten randomly selected diabetic patients are put on the program, and the results after 1 month are given by the following table: Before After Do the date provide sufficient evidence to support the claim that the new program reduces blood glucose level in diabetic patients? Use α = Testing of hypotheses for two samples Example Solution We need to test the hypothesis H 0 : µ D = 0 vs. H a : µ D < 0. First we calculate the difference if each pair given in the following table. Before After Diff. (afterbefore) From the table, the mean of the differences is d = 71.9 deviation s d = 56.. The test statistic is T = D d 0 S D / n = 71.9 = / 10 and the standard
39 Testing of hypotheses for two samples Example Solution (cont.) From the ttable, t 0.05,9 = Because the observed value of t = < t 0.05,9 = 1.833, we reject the null hypothesis and conclude that the sample evidence suggests that the new diet and exercise program is effective. Testing of hypotheses for two samples Why must we take paired differences and then calculate the mean and standard deviation for the differences why can t we just take the means of each sample, as we did for independent samples? à σ need not be equal to σ (X1 X ) D Assume that E(X ji ) = µ j, Var(X ji ) = σ j, for j = 1, and Cov(X 1i, X i ) = ρσ 1 σ where ρ denotes the assumed common correlation coefficient of the pair (X 1i, X i ) for i = 1,,, n. Because the value of D i, i = 1,,, n, are i.i.d., µ D = E(D i ) = E(X 1i ) E(X i ) = µ 1 µ and σ D = Var(D i ) = Var(X 1i ) + Var(X i ) Cov(X!i, X i ) = σ 1 + σ ρσ 1 σ
40 Testing of hypotheses for two samples From these calculations, and E(D) = µ D = µ 1 µ σ D = Var(D) = σ D n = 1 n (σ 1 + σ ρσ 1 σ ) Now, if the samples were independent with n 1 = n = n, and E(X 1 X ) = µ 1 µ σ (X1 X = 1 ) n (σ 1 + σ ) Hence, if ρ > 0, then σ D < σ (X1 X. ) ChiSquare Tests for Count Data Suppose that we have outcomes of a multinomial experiment that consists of k mutually exclusive and exhaustive events A 1,.A k. Let P(A i )=p i, i = 1,,,k k p i = 1 Let the experiment be repeated n times, and X i (i=1,,,n) represent the number of times the event A i occurs, then (X 1,,X n ) have a multinomial distribution with parameters k, p 1,,p k. Let Q = k i=1 i=1 (X i np i ) np i It can be shown that for large n, the random variable Q is approximately  distributed with (k1) degrees of freedom. It is usual to demand np i 5 (i=1,,, k) for the approximation to be valid, although the approximation generally works well if for only a few values of i (~0%), np i 1 and the rest (~80%) satisfy the condition np i 5. (Karl Pearson 1900) χ
41 ChiSquare Tests for Count Data Example A plant geneticist grows 00 progeny from a cross that is hypothesized to result in a 3:1 phenotypic ratio of redflowered to whiteflowered plants. Suppose the cross produces 170 red to 30 white flowered plants. Calculate the value of Q for this experiment. ChiSquare Tests for Count Data Example Solution n=00, k=, Let i=1 represent redflowered and i= represent whiteflowered plants. Then X 1 =170, and X =30. Here, H 0 : The flower color population ratio is not different from 3:1, and the alternate is H a : The flower color population sampled has a flower color ratio that is not 3 red: 1 white. Under the null hypothesis, the expected frequencies are np 1 =(00)(3/4)=150, and np =(00)(1/4)=50. Hence, Q = k i=1 (X i np i ) np i ( ) (30 50) = = Often X i is called the observed frequency and np i is called expected frequency. This example gives a measure of how close our observed frequencies come to the expected frequencies and is referred to as a measure of goodness of fit. Smaller values of Q values indicate better fit.
42 The GoodnessofFit Test Summary Let an experiment have k mutually exclusive and exhaustive outcomes A 1, A,, A k. We would like to test the null hypothesis that all the p i =p(a i ), i = 1,,, k are equal to known numbers p i0, i = 1,,, k. That is, to test H 0 :p 1 =p 10, p k =p k0 vs. H a : At least one of the probabilities is different from the hypothesized value. The test is always a onesided upper tail test. Let O i be the observed frequency, E i = np i0 be the expected frequency (frequency under the null hypothesis), and k be the number of classes. The test statistic is Q = k i=1 (O i E i ) E i The test statistic Q has an approximate Chisquare distribution with k1 degrees of freedom. The rejection region is Q χ α,k 1 Assumption: E i 5: Exact methods are available. Computing the power of this test is difficult. The GoodnessofFit Test Summary This test implies that if the observed data are very close to the expected data, we have a very good fit and we accept the null hypothesis. That is, for small Q values, we accept H 0.
43 The GoodnessofFit Test Example A die is rolled 60 times and the face values are recorded. The results are as follows. Up face Frequency Is the die balanced? Test using α = 0.05 The GoodnessofFit Test Example Solution If the die is balanced, we must have p 1 = p = = p 6 = 1/6 where p i =P(face value on the die is i), i = 1,,, 6. This has the discrete uniform distribution. Hence, vs. H 0 : p 1 = p = = p 6 = 1/6 H a : At least one of the probabilities is different from the hypothesized value of 1/6 E 1 =n 1 p 1 = (60)(1/6)=10,, E 6 =10. We summarize the calculation in the following table: Face value Frequency, O i Expected value, E i
44 The GoodnessofFit Test Example Solution (cont.) The test statistic is given by 6 (O Q = i E i ) = 6 i=1 E i From the chisquare table with 5 d.f., χ 0.05,5 = Because the value of the test statistic does not fall in the rejection region, we do not reject H 0. Therefore, we conclude that the die is balanced. Take a break.
45 Contingency Table: Test for Independence χ One of the uses of the statistic is in contingency (dependence) testing where n randomly selected items are classified according to two different criteria, such as when data are classified on the basis of two factors (row factor and column factor) where the row factor has r levels and the column factor has c levels. Our interest is to test for independence of two methods of classification of observed events. For example, we might classify a sample of students by sex and by their grade on a statistics course in order to test the hypothesis that the grades are dependent on sex. More generally the problem is to investigate a dependency between two classification criteria. The obtained data are displayed as shown in the following table, where n ij represents the number of data values under row i and column j. Levels of column factor 1 c Row total Row levels 1 n 11 n 1 n 1c n 1 n 1 n n c n r n r1 n r n rc n r Column total n.1 n. n.c N Contingency Table: Test for Independence where c N = n. j = n i. = n ij j=1 r i=1 r c i=1 j=1 We wish to test that the hypothesis that the two factors are independent.
46 Test for the Independence of Two Factors To test H 0 : The factors are independent vs. H a : The factors are dependent The test statistic is, Q = r c i=1 j=1 (O ij E ij ) E ij where Q ij = n ij and E ij = n i n j N The under the null hypothesis the test statistic Q has an approximate chisquare distribution with (r1)(c1) degrees of freedom. Hence, the rejection region is Q > χ α,(r 1)(c 1) Assumption: E ij 5 Test for the Independence of Two Factors Example The following table gives a classification according to religious affiliation and marital status for 500 randomly selected individuals. Religious affiliation A B C D None Total Marital status Single With spouse Total For α = 0.01, test the null hypothesis that marital status and religious affiliation are independent.
47 Test for the Independence of Two Factors Example Solution We need to test the hypothesis vs. H 0 : Marital status and religious affiliation are independent H a : Marital status and religious affiliation are dependent. Here, c = 5, and r =. For α = 0.01, and for (c1)(r1)=4 degrees of freedom, we have = χ 0.01,4 Hence, the rejection region is Q > We have E ij = n in j N. Thus, E 11 = (116)(11) 500 E 13 = (116)(56) 500 E 15 = (116)(55) 500 E = (384)(80) 500 E 4 = (384)(98) 500 = 48.95;E 1 = (116)(80) = 18.5; 500 = 1.99;E 14 = (116)(98) =.736; 500 = 1.76;E 1 = (384)(11) = 16.05; 500 = 61.44;E 3 = (384)(56) = ; 500 = 75.64;E 5 = (384)(55) = 4.4; 500 Test for the Independence of Two Factors Example Solution (cont.) The value of the test statistic is Q = r c i=1 j=1 = (O ij E ij ) E ij Because the observed value of Q does not fall in the rejection region, we do not reject the null hypothesis at α =0.01. Therefore, based on the observed data, the marital status and religious affiliation are independent.
48 Testing to Identify the Probability Distribution In hypothesis testing problems we often assume that the form of the population distribution is known. For example, in a χ test for variance, we assume that the population is normal. The goodnessoffit tests examine the validity of such an assumption if we have a large enough sample. This is another application of the chisquare statistic used for goodnessof fit tests. GoodnessofFit Test Procedures for Probability Distributions Let X 1,,X n be a sample from a population with cdf F(x), which may depend on the set of unknown parameters θ. We wish to test H 0 : F(x)=F 0 (x), where F 0 (x) is completely specified. 1. Divide the range of values of the random variables X 1 into K nonoverlapping intervals I 1, I,,I k. Let O j be the number of sample values that fall in the interval I j (j = 1,,, K). Assuming the distribution of X to be F 0 (x), find P(X I j ). Let P(X I j ) = π j. Let E j =nπ j be the expected frequency. 3. Compute the test statistic Q given by Q = k i=1 (O i E i ) E i The test statistic Q has an approximate of freedom. 4. Reject of H 0 if Q > χ α,(k 1) 5. Assumptions: E j 5, j = 1,,., k χ distribution with (K1) degrees
49 GoodnessofFit Test Procedures for Probability Distributions If the null hypothesis does not specify F 0 (x) completely, that is, if F 0 (x) contains some unknown parameters θ 1,θ,,θ p, we estimate these parameters by the method of maximum likelihood. Using these estimated values we specify F 0 (x) completely. Denote the estimated F 0 (x) by ˆ 0 ( ). Let The test statistic is ˆπ i = P{X I i ˆ 0 (x)} Q = k i=1 (O i Êi ) Ê i and Ê i = n ˆπ i The statistic Q has an approximate chisquare distribution with (K1p) degrees of freedom. We reject H 0 if Q > χ α,(k 1 p) GoodnessofFit Test Procedures for Probability Distributions Example The grades of students in a class of 00 are given in the following table. Test the hypothesis that the grades are normally distributed with a mean of 75 and a standard deviation of 8. Use α = Range Number of students
50 GoodnessofFit Test Procedures for Probability Distributions Example Solution We have O 1 = 1, O = 36, O 3 = 90, O 4 = 44, O 5 = 18. We now compute π i (i=1,,,5), using the continuity correction factor, And, π 1 = P{X 59.5 H 0 } = P{z } = 0.06, 8 π = 0.189, π 3 = 0.47, π 4 = 0.476, π 5 = , E 1 = 5.4, E = 43.78, E 3 = 94.44, E 4 = 49.5, E 5 = 7.0 The test statistic results in n (O Q = i E i ) = 6. i=1 E i Q has a chisquare distribution with (51)=4 degrees of freedom. The critical value is χ 0.05,4 = Hence the rejection region is Q >7.11. Because the observed value of Q =6.>7.11, we reject H 0 at α =0.05. Thus, we conclude that the population is not normal.
3.4 Statistical inference for 2 populations based on two samples
3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted
More information7 Hypothesis testing  one sample tests
7 Hypothesis testing  one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X
More informationIntroduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.
Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.
More informationHypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...
Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................
More information[Chapter 10. Hypothesis Testing]
[Chapter 10. Hypothesis Testing] 10.1 Introduction 10.2 Elements of a Statistical Test 10.3 Common LargeSample Tests 10.4 Calculating Type II Error Probabilities and Finding the Sample Size for Z Tests
More informationSummary of Formulas and Concepts. Descriptive Statistics (Ch. 14)
Summary of Formulas and Concepts Descriptive Statistics (Ch. 14) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume
More information4. Continuous Random Variables, the Pareto and Normal Distributions
4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random
More informationChiSquare Test. Contingency Tables. Contingency Tables. ChiSquare Test for Independence. ChiSquare Tests for GoodnessofFit
ChiSquare Tests 15 Chapter ChiSquare Test for Independence ChiSquare Tests for Goodness Uniform Goodness Poisson Goodness Goodness Test ECDF Tests (Optional) McGrawHill/Irwin Copyright 2009 by The
More informationHYPOTHESIS TESTING: POWER OF THE TEST
HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,
More information93.4 Likelihood ratio test. NeymanPearson lemma
93.4 Likelihood ratio test NeymanPearson lemma 91 Hypothesis Testing 91.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental
More informationTesting on proportions
Testing on proportions Textbook Section 5.4 April 7, 2011 Example 1. X 1,, X n Bernolli(p). Wish to test H 0 : p p 0 H 1 : p > p 0 (1) Consider a related problem The likelihood ratio test is where c is
More informationChapter 8. Hypothesis Testing
Chapter 8 Hypothesis Testing Hypothesis In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing
More informationNotes for STA 437/1005 Methods for Multivariate Data
Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.
More informationMATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...
MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 20092016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................
More informationJoint Exam 1/P Sample Exam 1
Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question
More informationAn Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10 TWOSAMPLE TESTS
The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 130) Spring Semester 011 Chapter 10 TWOSAMPLE TESTS Practice
More informationInferential Statistics
Inferential Statistics Sampling and the normal distribution Zscores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are
More informationMT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo. 3 MT426 Notebook 3 3. 3.1 Definitions... 3. 3.2 Joint Discrete Distributions...
MT426 Notebook 3 Fall 2012 prepared by Professor Jenny Baglivo c Copyright 20042012 by Jenny A. Baglivo. All Rights Reserved. Contents 3 MT426 Notebook 3 3 3.1 Definitions............................................
More informationChapter 2. Hypothesis testing in one population
Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance
More informationSHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Ch. 10 Chi SquareTests and the FDistribution 10.1 Goodness of Fit 1 Find Expected Frequencies Provide an appropriate response. 1) The frequency distribution shows the ages for a sample of 100 employees.
More informationMaximum Likelihood Estimation
Math 541: Statistical Theory II Lecturer: Songfeng Zheng Maximum Likelihood Estimation 1 Maximum Likelihood Estimation Maximum likelihood is a relatively simple method of constructing an estimator for
More informationMATH4427 Notebook 3 Spring MATH4427 Notebook Hypotheses: Definitions and Examples... 3
MATH4427 Notebook 3 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 20092016 by Jenny A. Baglivo. All Rights Reserved. Contents 3 MATH4427 Notebook 3 3 3.1 Hypotheses: Definitions and Examples............................
More informationModule 5 Hypotheses Tests: Comparing Two Groups
Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this
More informationHYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationTest of Hypotheses. Since the NeymanPearson approach involves two statistical hypotheses, one has to decide which one
Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win $1 if head and lose $1 if tail. After 10 plays, you lost $2 in net (4 heads
More informationHYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1. used confidence intervals to answer questions such as...
HYPOTHESIS TESTING (ONE SAMPLE)  CHAPTER 7 1 PREVIOUSLY used confidence intervals to answer questions such as... You know that 0.25% of women have red/green color blindness. You conduct a study of men
More informationHypothesis Testing Introduction
Hypothesis Testing Introduction Hypothesis: A conjecture about the distribution of some random variables. For example, a claim about the value of a parameter of the statistical model. A hypothesis can
More informationPrinciple of Data Reduction
Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then
More informationE205 Final: Version B
Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random
More informationLecture 8. Confidence intervals and the central limit theorem
Lecture 8. Confidence intervals and the central limit theorem Mathematical Statistics and Discrete Mathematics November 25th, 2015 1 / 15 Central limit theorem Let X 1, X 2,... X n be a random sample of
More informationLecture 8 Hypothesis Testing
Lecture 8 Hypothesis Testing Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Midterm 1 Score 46 students Highest score: 98 Lowest
More informationStatistics 641  EXAM II  1999 through 2003
Statistics 641  EXAM II  1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the
More informationCATEGORICAL DATA ChiSquare Tests for Univariate Data
CATEGORICAL DATA ChiSquare Tests For Univariate Data 1 CATEGORICAL DATA ChiSquare Tests for Univariate Data Recall that a categorical variable is one in which the possible values are categories or groupings.
More informationTHE FIRST SET OF EXAMPLES USE SUMMARY DATA... EXAMPLE 7.2, PAGE 227 DESCRIBES A PROBLEM AND A HYPOTHESIS TEST IS PERFORMED IN EXAMPLE 7.
THERE ARE TWO WAYS TO DO HYPOTHESIS TESTING WITH STATCRUNCH: WITH SUMMARY DATA (AS IN EXAMPLE 7.17, PAGE 236, IN ROSNER); WITH THE ORIGINAL DATA (AS IN EXAMPLE 8.5, PAGE 301 IN ROSNER THAT USES DATA FROM
More information5/31/2013. Chapter 8 Hypothesis Testing. Hypothesis Testing. Hypothesis Testing. Outline. Objectives. Objectives
C H 8A P T E R Outline 8 1 Steps in Traditional Method 8 2 z Test for a Mean 8 3 t Test for a Mean 8 4 z Test for a Proportion 8 6 Confidence Intervals and Copyright 2013 The McGraw Hill Companies, Inc.
More informationPractice problems for Homework 11  Point Estimation
Practice problems for Homework 11  Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:
More informationSampling and Hypothesis Testing
Population and sample Sampling and Hypothesis Testing Allin Cottrell Population : an entire set of objects or units of observation of one sort or another. Sample : subset of a population. Parameter versus
More information1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96
1 Final Review 2 Review 2.1 CI 1propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years
More informationHypothesis testing for µ:
University of California, Los Angeles Department of Statistics Statistics 13 Elements of a hypothesis test: Hypothesis testing Instructor: Nicolas Christou 1. Null hypothesis, H 0 (always =). 2. Alternative
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections  we are still here Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 TwoSided Alternatives 9.5 The t Test 9.6 Comparing the
More information15.0 More Hypothesis Testing
15.0 More Hypothesis Testing 1 Answer Questions Type I and Type II Error Power Calculation Bayesian Hypothesis Testing 15.1 Type I and Type II Error In the philosophy of hypothesis testing, the null hypothesis
More informationChapter 7 Notes  Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:
Chapter 7 Notes  Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a
More information3.6: General Hypothesis Tests
3.6: General Hypothesis Tests The χ 2 goodness of fit tests which we introduced in the previous section were an example of a hypothesis test. In this section we now consider hypothesis tests more generally.
More informationChapter Five: Paired Samples Methods 1/38
Chapter Five: Paired Samples Methods 1/38 5.1 Introduction 2/38 Introduction Paired data arise with some frequency in a variety of research contexts. Patients might have a particular type of laser surgery
More informationLecture 13 More on hypothesis testing
Lecture 13 More on hypothesis testing Thais Paiva STA 111  Summer 2013 Term II July 22, 2013 1 / 27 Thais Paiva STA 111  Summer 2013 Term II Lecture 13, 07/22/2013 Lecture Plan 1 Type I and type II error
More information12.5: CHISQUARE GOODNESS OF FIT TESTS
125: ChiSquare Goodness of Fit Tests CD121 125: CHISQUARE GOODNESS OF FIT TESTS In this section, the χ 2 distribution is used for testing the goodness of fit of a set of data to a specific probability
More information1. ChiSquared Tests
1. ChiSquared Tests We'll now look at how to test statistical hypotheses concerning nominal data, and specifically when nominal data are summarized as tables of frequencies. The tests we will considered
More informationUnit 29 ChiSquare GoodnessofFit Test
Unit 29 ChiSquare GoodnessofFit Test Objectives: To perform the chisquare hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni
More informationHypothesis Testing. Bluman Chapter 8
CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 81 Steps in Traditional Method 82 z Test for a Mean 83 t Test for a Mean 84 z Test for a Proportion 85 2 Test for
More informationChapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 81 Overview 82 Basics of Hypothesis Testing
Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 81 Overview 82 Basics of Hypothesis Testing 83 Testing a Claim About a Proportion 85 Testing a Claim About a Mean: s Not Known 86 Testing
More informationTechnology StepbyStep Using StatCrunch
Technology StepbyStep Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate
More informationNull Hypothesis Significance Testing Signifcance Level, Power, ttests. 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom
Null Hypothesis Significance Testing Signifcance Level, Power, ttests 18.05 Spring 2014 Jeremy Orloff and Jonathan Bloom Simple and composite hypotheses Simple hypothesis: the sampling distribution is
More informationMATH 214 (NOTES) Math 214 Al Nosedal. Department of Mathematics Indiana University of Pennsylvania. MATH 214 (NOTES) p. 1/6
MATH 214 (NOTES) Math 214 Al Nosedal Department of Mathematics Indiana University of Pennsylvania MATH 214 (NOTES) p. 1/6 "Pepsi" problem A market research consultant hired by the PepsiCola Co. is interested
More informationTopic 19: Goodness of Fit
Topic 19: November 24, 2009 A goodness of fit test examine the case of a sequence if independent experiments each of which can have 1 of k possible outcomes. In terms of hypothesis testing, let π = (π
More informationAn Introduction to Basic Statistics and Probability
An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random
More informationp1^ = 0.18 p2^ = 0.12 A) 0.150 B) 0.387 C) 0.300 D) 0.188 3) n 1 = 570 n 2 = 1992 x 1 = 143 x 2 = 550 A) 0.270 B) 0.541 C) 0.520 D) 0.
Practice for chapter 9 and 10 Disclaimer: the actual exam does not mirror this. This is meant for practicing questions only. The actual exam in not multiple choice. Find the number of successes x suggested
More informationHypothesis testing  Steps
Hypothesis testing  Steps Steps to do a twotailed test of the hypothesis that β 1 0: 1. Set up the hypotheses: H 0 : β 1 = 0 H a : β 1 0. 2. Compute the test statistic: t = b 1 0 Std. error of b 1 =
More informationHypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam
Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests
More informationHypothesis Testing  II
3σ 2σ +σ +2σ +3σ Hypothesis Testing  II Lecture 9 0909.400.01 / 0909.400.02 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S 3σ 2σ +σ +2σ +3σ Review: Hypothesis
More informationThe GoodnessofFit Test
on the Lecture 49 Section 14.3 HampdenSydney College Tue, Apr 21, 2009 Outline 1 on the 2 3 on the 4 5 Hypotheses on the (Steps 1 and 2) (1) H 0 : H 1 : H 0 is false. (2) α = 0.05. p 1 = 0.24 p 2 = 0.20
More informationChapter 9, Part A Hypothesis Tests. Learning objectives
Chapter 9, Part A Hypothesis Tests Slide 1 Learning objectives 1. Understand how to develop Null and Alternative Hypotheses 2. Understand Type I and Type II Errors 3. Able to do hypothesis test about population
More informationChiSquare Tests. In This Chapter BONUS CHAPTER
BONUS CHAPTER ChiSquare Tests In the previous chapters, we explored the wonderful world of hypothesis testing as we compared means and proportions of one, two, three, and more populations, making an educated
More informationTopic 8. Chi Square Tests
BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test
More informationChiSquare Tests and the FDistribution. Goodness of Fit Multinomial Experiments. Chapter 10
Chapter 0 ChiSquare Tests and the FDistribution 0 Goodness of Fit Multinomial xperiments A multinomial experiment is a probability experiment consisting of a fixed number of trials in which there are
More informationHow to Conduct a Hypothesis Test
How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some
More informationHypoTesting. Name: Class: Date: Multiple Choice Identify the choice that best completes the statement or answers the question.
Name: Class: Date: HypoTesting Multiple Choice Identify the choice that best completes the statement or answers the question. 1. A Type II error is committed if we make: a. a correct decision when the
More informationTests for Two Proportions
Chapter 200 Tests for Two Proportions Introduction This module computes power and sample size for hypothesis tests of the difference, ratio, or odds ratio of two independent proportions. The test statistics
More informationGeneral Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test
Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics
More informationChapter 3 RANDOM VARIATE GENERATION
Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.
More informationSimple Linear Regression Inference
Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation
More informationUnit 26 Estimation with Confidence Intervals
Unit 26 Estimation with Confidence Intervals Objectives: To see how confidence intervals are used to estimate a population proportion, a population mean, a difference in population proportions, or a difference
More informationIntroduction to Hypothesis Testing
I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters  they must be estimated. However, we do have hypotheses about what the true
More informationSection 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)
Section 7.1 Introduction to Hypothesis Testing Schrodinger s cat quantum mechanics thought experiment (1935) Statistical Hypotheses A statistical hypothesis is a claim about a population. Null hypothesis
More informationLesson 1: Comparison of Population Means Part c: Comparison of Two Means
Lesson : Comparison of Population Means Part c: Comparison of Two Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis
More informationChi Square for Contingency Tables
2 x 2 Case Chi Square for Contingency Tables A test for p 1 = p 2 We have learned a confidence interval for p 1 p 2, the difference in the population proportions. We want a hypothesis testing procedure
More informationChapter 4 Statistical Inference in Quality Control and Improvement. Statistical Quality Control (D. C. Montgomery)
Chapter 4 Statistical Inference in Quality Control and Improvement 許 湘 伶 Statistical Quality Control (D. C. Montgomery) Sampling distribution I a random sample of size n: if it is selected so that the
More informationModule 9: Nonparametric Tests. The Applied Research Center
Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } OneSample ChiSquare Test
More informationChapter 9 Testing Hypothesis and Assesing Goodness of Fit
Chapter 9 Testing Hypothesis and Assesing Goodness of Fit 9.19.3 The NeymanPearson Paradigm and Neyman Pearson Lemma We begin by reviewing some basic terms in hypothesis testing. Let H 0 denote the null
More informationDongfeng Li. Autumn 2010
Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis
More informationBasic concepts and introduction to statistical inference
Basic concepts and introduction to statistical inference Anna Helga Jonsdottir Gunnar Stefansson Sigrun Helga Lund University of Iceland (UI) Basic concepts 1 / 19 A review of concepts Basic concepts Confidence
More informationLAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING
LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.
More informationIntroduction to Hypothesis Testing. Copyright 2014 Pearson Education, Inc. 91
Introduction to Hypothesis Testing 91 Learning Outcomes Outcome 1. Formulate null and alternative hypotheses for applications involving a single population mean or proportion. Outcome 2. Know what Type
More informationPrep Course in Statistics Prof. Massimo Guidolin
MSc. Finance & CLEFIN A.A. 3/ ep Course in Statistics of. Massimo Guidolin A Few Review Questions and oblems concerning, Hypothesis Testing and Confidence Intervals SUGGESTION: try to approach the questions
More informationIntroduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses
Introduction to Hypothesis Testing 1 Hypothesis Testing A hypothesis test is a statistical procedure that uses sample data to evaluate a hypothesis about a population Hypothesis is stated in terms of the
More informationMath 251, Review Questions for Test 3 Rough Answers
Math 251, Review Questions for Test 3 Rough Answers 1. (Review of some terminology from Section 7.1) In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate,
More informationThe ChiSquare Test. STAT E50 Introduction to Statistics
STAT 50 Introduction to Statistics The ChiSquare Test The Chisquare test is a nonparametric test that is used to compare experimental results with theoretical models. That is, we will be comparing observed
More informationCalculating PValues. Parkland College. Isela Guerra Parkland College. Recommended Citation
Parkland College A with Honors Projects Honors Program 2014 Calculating PValues Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating PValues" (2014). A with Honors Projects.
More informationTests of Hypotheses Using Statistics
Tests of Hypotheses Using Statistics Adam Massey and Steven J. Miller Mathematics Department Brown University Providence, RI 0292 Abstract We present the various methods of hypothesis testing that one
More informationChapter 9: Hypothesis Testing Sections
Chapter 9: Hypothesis Testing Sections 9.1 Problems of Testing Hypotheses Skip: 9.2 Testing Simple Hypotheses Skip: 9.3 Uniformly Most Powerful Tests Skip: 9.4 TwoSided Alternatives 9.6 Comparing the
More informationstatistics Chisquare tests and nonparametric Summary sheet from last time: Hypothesis testing Summary sheet from last time: Confidence intervals
Summary sheet from last time: Confidence intervals Confidence intervals take on the usual form: parameter = statistic ± t crit SE(statistic) parameter SE a s e sqrt(1/n + m x 2 /ss xx ) b s e /sqrt(ss
More informationStatistics for Management IISTAT 362Final Review
Statistics for Management IISTAT 362Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to
More informationHypothesis Testing. 1 Introduction. 2 Hypotheses. 2.1 Null and Alternative Hypotheses. 2.2 Simple vs. Composite. 2.3 OneSided and TwoSided Tests
Hypothesis Testing 1 Introduction This document is a simple tutorial on hypothesis testing. It presents the basic concepts and definitions as well as some frequently asked questions associated with hypothesis
More informationComparing Multiple Proportions, Test of Independence and Goodness of Fit
Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2
More informationModule 2 Probability and Statistics
Module 2 Probability and Statistics BASIC CONCEPTS Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The standard deviation of a standard normal distribution
More informationLecture Outline. Hypothesis Testing. Simple vs. Composite Testing. Stat 111. Hypothesis Testing Framework
Stat 111 Lecture Outline Lecture 14: Intro to Hypothesis Testing Sections 9.19.3 in DeGroot 1 Hypothesis Testing Consider a statistical problem involving a parameter θ whose value is unknown but must
More informationACTM State ExamStatistics
ACTM State ExamStatistics For the 25 multiplechoice questions, make your answer choice and record it on the answer sheet provided. Once you have completed that section of the test, proceed to the tiebreaker
More informationINTRODUCTORY STATISTICS
INTRODUCTORY STATISTICS Questions 290 Field Statistics Target Audience Science Students Outline Target Level First or Secondyear Undergraduate Topics Introduction to Statistics Descriptive Statistics
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Open book and note Calculator OK Multiple Choice 1 point each MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Find the mean for the given sample data.
More information1) What is the probability that the random variable has a value greater than 2? A) 0.750 B) 0.625 C) 0.875 D) 0.700
Practice for Chapter 6 & 7 Math 227 This is merely an aid to help you study. The actual exam is not multiple choice nor is it limited to these types of questions. Using the following uniform density curve,
More informationClass 19: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.1)
Spring 204 Class 9: Two Way Tables, Conditional Distributions, ChiSquare (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the
More information