The Neyman-Pearson lemma


In practical hypothesis-testing situations there are typically many possible tests with significance level α for a given null hypothesis versus an alternative. This raises two important questions: (1) How do we decide on the test statistic? (2) How do we know that we have selected the best rejection region?

Definition 7.2.1 Suppose that W is the test statistic and RR is the rejection region for a test of hypothesis concerning the value of a parameter θ. Then the power of the test is the probability that the test rejects H0 when the alternative is true. That is,

π = Power(θ) = P(W in RR when the parameter value is an alternative θ).

If H0: θ = θ0 and Ha: θ ≠ θ0, then the power of the test at some θ = θ1 ≠ θ0 is Power(θ1) = P(reject H0 | θ = θ1). But β(θ1) = P(accept H0 | θ = θ1). Therefore, Power(θ1) = 1 − β(θ1).
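As a concrete illustration of the power calculation (the numbers here are made up, not from the text), a minimal Python sketch for an upper-tail z-test of H0: µ = µ0 versus Ha: µ > µ0 with known σ:

```python
from math import sqrt
from statistics import NormalDist

def ztest_power(mu0, mu1, sigma, n, alpha):
    """Power of the upper-tail z-test: P(reject H0 | mu = mu1).

    H0 is rejected when Z = (Xbar - mu0)/(sigma/sqrt(n)) > z_alpha,
    i.e. when Xbar exceeds mu0 + z_alpha * se.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # upper-tail critical value
    se = sigma / sqrt(n)                        # standard error of Xbar
    # Probability of landing in the rejection region under the alternative mean mu1.
    return 1 - NormalDist(mu=mu1, sigma=se).cdf(mu0 + z_alpha * se)

power = ztest_power(mu0=50, mu1=52, sigma=5, n=25, alpha=0.05)
beta = 1 - power   # type II error probability at mu = mu1
```

Moving the alternative µ1 further from µ0 (or increasing n) raises the power, which matches the definition Power(θ1) = 1 − β(θ1).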

Example 7.2.1 Let X1, …, Xn be a random sample from a Poisson distribution with parameter λ, that is, with pmf f(x) = e^(−λ) λ^x / x!. Then the hypothesis H0: λ = 1 uniquely specifies the distribution, because f(x) = e^(−1)/x!, and hence is a simple hypothesis. The hypothesis Ha: λ > 1 is composite, because f(x) is not uniquely determined.

Definition 7.2.2 A test at a given level α of a simple hypothesis H0 versus a simple alternative Ha that has the largest power among all tests with probability of type I error no larger than α is called the most powerful test.

Theorem 7.2.1 (Neyman-Pearson lemma) Suppose that one wants to test a simple hypothesis H0: θ = θ0 versus the simple alternative hypothesis Ha: θ = θ1 based on a random sample X1, …, Xn from a distribution with parameter θ. Let L(θ) ≡ L(θ; x1, …, xn) > 0 denote the likelihood of the sample when the value of the parameter is θ. If there exist a positive constant K and a subset C of the sample space Rⁿ (the Euclidean n-space) such that

1. L(θ0)/L(θ1) ≤ K for (x1, …, xn) in C,
2. L(θ0)/L(θ1) ≥ K for (x1, …, xn) in C′ (the complement of C), and
3. P((X1, …, Xn) in C; θ0) = α,

then the test with critical region C will be the most powerful test for H0 versus Ha. We call α the size of the test and C the best critical region of size α.

Example 7.2.2 Let X1, …, Xn denote an independent random sample from a population with a Poisson distribution with mean λ. Derive the most powerful test for testing H0: λ = 2 versus Ha: λ = 1/2.

Solution Recall that the pmf of a Poisson variable is p(x) = e^(−λ) λ^x / x! for λ > 0, x = 0, 1, 2, …, and 0 otherwise. Thus, the likelihood function is

L(λ) = λ^(Σxᵢ) e^(−nλ) / ∏ᵢ xᵢ!.

Example 7.2.2 Solution (cont.) For λ = 2,

L(θ0) = L(λ = 2) = 2^(Σxᵢ) e^(−2n) / ∏ xᵢ!,

and for λ = 1/2,

L(θ1) = L(λ = 1/2) = (1/2)^(Σxᵢ) e^(−n/2) / ∏ xᵢ!.

Thus,

L(θ0)/L(θ1) = 2^(Σxᵢ) e^(−2n) / [(1/2)^(Σxᵢ) e^(−n/2)] = 4^(Σxᵢ) e^(−3n/2) < K.

Taking natural logarithms, (Σxᵢ) ln 4 − 3n/2 < ln K. Solving for Σxᵢ and letting K′ = [ln K + (3n/2)] / ln 4, we will reject H0 whenever Σxᵢ < K′.

Procedure for applying the Neyman-Pearson lemma
1. Determine the likelihood functions under both the null and alternative hypotheses.
2. Set the ratio of the two likelihood functions to be less than a constant K.
3. Simplify the inequality in step 2 to obtain a rejection region.
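The rejection rule just derived can be checked numerically. This sketch (illustrative sample values only) computes the likelihood ratio L(λ=2)/L(λ=1/2) for a Poisson sample and applies the equivalent Σxᵢ < K′ rule:

```python
from math import exp, log

def poisson_lr(xs, lam0=2.0, lam1=0.5):
    """Likelihood ratio L(lam0)/L(lam1) for a Poisson sample; the x_i! terms cancel."""
    s, n = sum(xs), len(xs)
    # (lam0/lam1)^(sum x) * exp(-n*(lam0 - lam1)) = 4^s * e^(-3n/2) here.
    return (lam0 / lam1) ** s * exp(-n * (lam0 - lam1))

def reject_h0(xs, K):
    """NP test: reject H0 (lam=2) for Ha (lam=1/2) when the ratio < K,
    equivalently when sum(xs) < K' = (ln K + 1.5 n) / ln 4."""
    n = len(xs)
    K_prime = (log(K) + 1.5 * n) / log(4)
    return sum(xs) < K_prime

sample_small = [0, 1, 0, 1, 0]   # small counts favour lam = 1/2
sample_large = [3, 2, 4, 2, 3]   # large counts favour lam = 2
```

The ratio is increasing in Σxᵢ, so thresholding the ratio at K is the same as thresholding Σxᵢ at K′.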

Example 7.2.3 Suppose X1, …, Xn is a random sample from a normal distribution with a known mean µ and an unknown variance σ². Find the most powerful α-level test for testing H0: σ² = σ0² versus Ha: σ² = σ1² (σ1² > σ0²). Show that this test is equivalent to the χ²-test. Is the test uniformly most powerful for Ha: σ² > σ0²?

Solution We test H0: σ² = σ0² versus Ha: σ² = σ1² (σ1² > σ0²). We have

L(σ0²) = (1/(√(2π) σ0))ⁿ exp{−Σ(xᵢ − µ)²/(2σ0²)},

and similarly

L(σ1²) = (1/(√(2π) σ1))ⁿ exp{−Σ(xᵢ − µ)²/(2σ1²)}.

Therefore, the most powerful test is: reject H0 if

L(σ0²)/L(σ1²) = (σ1/σ0)ⁿ exp{−[(σ1² − σ0²)/(2σ1²σ0²)] Σ(xᵢ − µ)²} ≤ K

for some K.

Example 7.2.3 Solution (cont.) Taking natural logarithms, we have

n ln(σ1/σ0) − [(σ1² − σ0²)/(2σ1²σ0²)] Σ(xᵢ − µ)² ≤ ln K,

or

Σ(xᵢ − µ)² ≥ [n ln(σ1/σ0) − ln K] · 2σ1²σ0²/(σ1² − σ0²) = C.

To find the rejection region for a fixed value of α, write the region as

Σ(xᵢ − µ)²/σ0² ≥ C/σ0² = C′.

Note that Σ(xᵢ − µ)²/σ0² has a χ²-distribution with n degrees of freedom under H0. Because the same rejection region would be used for any σ1² > σ0² (it does not depend upon the specific value of σ1² in the alternative), the test is uniformly most powerful for Ha: σ² > σ0².

Likelihood ratio tests

In this section, we shall study a general procedure that is applicable when one or both of H0 and Ha are composite. We assume that the pdf or pmf of the random variable X is f(x; θ), where θ denotes one or more unknown parameters. Let Θ represent the total parameter space, that is, the set of all possible values of the parameter θ given by either H0 or Ha, and let Θ0 be the subset of Θ specified by H0.

Definition The likelihood ratio λ is the ratio

λ = max over θ in Θ0 of L(θ; x1, …, xn) / max over θ in Θ of L(θ; x1, …, xn) = L0*/L*.

We note that 0 ≤ λ ≤ 1. Because λ is the ratio of nonnegative functions, λ ≥ 0. Because Θ0 is a subset of Θ, we know that max over Θ0 of L(θ) ≤ max over Θ of L(θ); hence λ ≤ 1.

Likelihood ratio tests (LRTs) To test H0: θ in Θ0 versus Ha: θ in Θa,

λ = max over θ in Θ0 of L(θ; x1, …, xn) / max over θ in Θ of L(θ; x1, …, xn) = L0*/L*

will be used as the test statistic. The rejection region for the likelihood ratio test is given by: reject H0 if λ ≤ K, where K is selected such that the test has the given significance level α.

Example Let X1, …, Xn be a random sample from an N(µ, σ²). Assume that σ² is known. We wish to test, at level α, H0: µ = µ0 versus Ha: µ ≠ µ0. Find an appropriate likelihood ratio test.

Solution We have seen that there is no uniformly most powerful test for H0: µ = µ0 versus Ha: µ ≠ µ0. The likelihood function is

L(µ) = (1/(√(2π)σ))ⁿ exp{−Σ(xᵢ − µ)²/(2σ²)}.

Here, Θ0 = {µ0} and Θa = R − {µ0}.

Solution (cont.) Hence,

L0* = (1/(√(2π)σ))ⁿ exp{−Σ(xᵢ − µ0)²/(2σ²)}.

Similarly,

L* = max over −∞ < µ < ∞ of (1/(√(2π)σ))ⁿ exp{−Σ(xᵢ − µ)²/(2σ²)}.

Because the only unknown parameter in the parameter space Θ is µ, −∞ < µ < ∞, the maximum of the likelihood function is achieved when µ equals its maximum likelihood estimator, that is, µ̂ = x̄. Therefore, with a simple calculation we have

λ = exp{−Σ(xᵢ − µ0)²/(2σ²)} / exp{−Σ(xᵢ − x̄)²/(2σ²)} = e^(−n(x̄ − µ0)²/(2σ²)).

Solution (cont.) Thus, the likelihood ratio test has the rejection region: reject H0 if λ ≤ K, which is equivalent to

(n/(2σ²))(x̄ − µ0)² ≥ −ln K  ⟺  (x̄ − µ0)²/(σ²/n) ≥ −2 ln K  ⟺  |x̄ − µ0|/(σ/√n) ≥ √(−2 ln K) = c1.

We now compute c1. Under H0, (X̄ − µ0)/(σ/√n) ~ N(0, 1). Hence, the LRT for the given hypothesis is: reject H0 if |x̄ − µ0|/(σ/√n) > z(α/2). Thus, in this case, the likelihood ratio test is equivalent to the z-test.
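The identity λ = e^(−n(x̄ − µ0)²/(2σ²)), and hence the equivalence between the λ-threshold and a z-threshold, can be verified numerically. A sketch with made-up data:

```python
from math import exp, log, pi, sqrt

def normal_loglik(xs, mu, sigma):
    """Log-likelihood of an i.i.d. N(mu, sigma^2) sample."""
    n = len(xs)
    return -n / 2 * log(2 * pi * sigma**2) - sum((x - mu) ** 2 for x in xs) / (2 * sigma**2)

def lrt_lambda(xs, mu0, sigma):
    """Likelihood ratio L(mu0) / L(mu_hat), where mu_hat is the sample mean."""
    xbar = sum(xs) / len(xs)
    return exp(normal_loglik(xs, mu0, sigma) - normal_loglik(xs, xbar, sigma))

xs = [4.8, 5.6, 5.1, 4.9, 5.7, 5.3]   # illustrative data
mu0, sigma = 5.0, 0.5
lam = lrt_lambda(xs, mu0, sigma)

# Closed form from the derivation: lambda = exp(-n (xbar - mu0)^2 / (2 sigma^2)),
# so lambda <= K  <=>  |z| >= sqrt(-2 ln K), with z = (xbar - mu0) / (sigma / sqrt(n)).
n = len(xs)
xbar = sum(xs) / n
lam_closed = exp(-n * (xbar - mu0) ** 2 / (2 * sigma**2))
z = (xbar - mu0) / (sigma / sqrt(n))
```

Here −2 ln λ equals z² exactly, which is the algebra behind the c1 step above.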

Procedure for the likelihood ratio test (LRT)
1. Find the largest value of the likelihood L(θ) for any θ in Θ0, by finding the maximum likelihood estimate within Θ0 and substituting it back into the likelihood function.
2. Find the largest value of the likelihood L(θ) for any θ in Θ, by finding the maximum likelihood estimate within Θ and substituting it back into the likelihood function.
3. Form the ratio λ = λ(x1, …, xn) = L(θ̂ in Θ0)/L(θ̂ in Θ).
4. Determine a K so that the test has the desired probability of type I error, α.
5. Reject H0 if λ ≤ K.

Example 7.3.2 Machine 1 produces 5% defectives. Machine 2 produces 10% defectives. Ten items produced by one of the machines are sampled randomly; let X be the number of defectives and let θ be the true proportion of defectives. Test H0: θ = 0.05 versus Ha: θ = 0.10. Use α = 0.05.

Solution We need to test H0: θ = 0.05 versus Ha: θ = 0.10. Let

L(θ) = C(10, x) (0.05)^x (0.95)^(10−x), if θ = 0.05,
L(θ) = C(10, x) (0.10)^x (0.90)^(10−x), if θ = 0.10.

Example 7.3.2 Solution (cont.) And

L1 = L(0.05) = C(10, x) (0.05)^x (0.95)^(10−x) and L2 = L(0.10) = C(10, x) (0.10)^x (0.90)^(10−x).

Thus, we have

L1/L2 = (0.05)^x (0.95)^(10−x) / [(0.10)^x (0.90)^(10−x)] = (1/2)^x (19/18)^(10−x).

The likelihood ratio is λ = L1 / max(L1, L2).

Note that if max(L1, L2) = L1, then λ = 1. Because we want to reject for small values of λ, we need max(L1, L2) = L2, and we reject H0 if L1/L2 < K, or equivalently L2/L1 > K′ (note that L2/L1 = 2^x (18/19)^(10−x)). That is, reject H0 if

2^x (18/19)^(10−x) = (18/19)^10 (19/9)^x > K′  ⟺  (19/9)^x > K″.

Because (19/9)^x is increasing in x, this says: reject H0 if X > C, where C is chosen so that P(X > C | H0: θ = 0.05) ≤ α. Using the binomial tables, P(X > 1 | θ = 0.05) = 0.0861 and P(X > 2 | θ = 0.05) = 0.0115 ≤ 0.05. Hence, reject H0 if X > 2.

P.S. The likelihood ratio tests do not always produce a test statistic with a known probability distribution.
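The binomial-table step can be reproduced directly; this sketch computes the upper-tail probabilities P(X > c | θ = 0.05) for n = 10 to locate the cutoff:

```python
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(n, c, p):
    """P(X > c) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(n, k, p) for k in range(c + 1, n + 1))

n, p0 = 10, 0.05
tails = {c: upper_tail(n, c, p0) for c in range(0, 4)}
# tails[1] ~ 0.0861 exceeds 0.05, while tails[2] ~ 0.0115 does not,
# so the smallest cutoff with P(X > c | H0) <= 0.05 is c = 2: reject when X > 2.
```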

Take a break.

Hypotheses for a single parameter

Definition Corresponding to an observed value of a test statistic, the p-value (or attained significance level) is the lowest level of significance at which the null hypothesis would have been rejected.

p-value: the probability of obtaining a test statistic at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. [from wiki]

Steps to find the p-value
1. Let TS be the test statistic.
2. Compute the value of TS using the sample X1, …, Xn; say it is a.
3. The p-value is given by

p-value = P(TS < a | H0), for a lower-tail test;
P(TS > a | H0), for an upper-tail test;
P(|TS| > |a| | H0), for a two-tail test.

The p-value depends on the alternative hypothesis!

Example To test H0: µ = 0 versus Ha: µ ≠ 0, suppose that the test statistic Z results in a computed value of 1.58. Then the p-value = P(|Z| > 1.58) = 2 × 0.0571 = 0.1142. That is, we must tolerate a type I error of 0.1142 (or higher) in order to reject H0. Also, if Ha: µ > 0, then the p-value would be P(Z > 1.58) = 0.0571. In this case we must have an α of 0.0571 (or higher) in order to reject H0.
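The three cases reduce to standard-normal tail probabilities when the test statistic is a z. A sketch reproducing the example's numbers with Python's statistics.NormalDist:

```python
from statistics import NormalDist

def z_p_value(z_obs, tail):
    """p-value for an observed z statistic; tail is 'lower', 'upper', or 'two'."""
    Phi = NormalDist().cdf
    if tail == "lower":
        return Phi(z_obs)                    # P(Z < z_obs)
    if tail == "upper":
        return 1 - Phi(z_obs)                # P(Z > z_obs)
    return 2 * (1 - Phi(abs(z_obs)))         # P(|Z| > |z_obs|)

p_two = z_p_value(1.58, "two")       # two-tailed, ~0.1142
p_upper = z_p_value(1.58, "upper")   # upper-tailed, ~0.0571
```

The two-tailed p-value is exactly twice the upper-tailed one, as in the example.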

Reporting test results as p-values
1. Choose the maximum value of α that you are willing to tolerate.
2. If the p-value of the test is less than that maximum α, reject H0.

Example 7.4.2 The management of a local health club claims that its members lose on average 15 pounds or more within the first 3 months after joining the club. To check this claim, a consumer agency took a random sample of 45 members of this health club and found that they lost an average of 13.8 pounds within the first 3 months of membership, with a sample standard deviation of 4.2 pounds. (a) Find the p-value for this test. (b) Based on the p-value in (a), would you reject the null hypothesis at α = 0.01?

Example 7.4.2 Solution (a) Let µ be the true mean weight loss in pounds within the first 3 months of membership in this club. Then we have to test the hypothesis H0: µ = 15 versus Ha: µ < 15. Here n = 45, x̄ = 13.8, and s = 4.2. Because n = 45 > 30, we can use the normal approximation. Hence, the test statistic is

z = (13.8 − 15)/(4.2/√45) = −1.92,

and p-value = P(Z < −1.92) = 0.0274. Thus, we can use an α as small as 0.0274 and still reject H0. (b) No, because 0.0274 > 0.01.

Steps in any hypothesis testing problem
1. State the alternative hypothesis, Ha (what is believed to be true).
2. State the null hypothesis, H0 (what is doubted to be true).
3. Decide on a level of significance α.
4. Choose an appropriate TS and compute the observed test statistic.
5. Using the distribution of TS and α, determine the rejection region(s) (RR).
6. Conclusion: If the observed test statistic falls in the RR, reject H0 and conclude that, based on the sample information, we are (1 − α)100% confident that Ha is true. Otherwise, conclude that there is not sufficient evidence to reject H0. In all applied problems, interpret the meaning of your decision.
7. State any assumptions you made in testing the given hypothesis.
8. Compute the p-value from the null distribution of the test statistic and interpret it.
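The computation in Example 7.4.2(a) takes only a few lines:

```python
from math import sqrt
from statistics import NormalDist

# Example 7.4.2(a): H0: mu = 15 vs Ha: mu < 15, with n = 45, xbar = 13.8, s = 4.2.
n, xbar, mu0, s = 45, 13.8, 15.0, 4.2
z = (xbar - mu0) / (s / sqrt(n))    # observed test statistic, ~ -1.92
p_value = NormalDist().cdf(z)       # lower-tail p-value, ~ 0.028
# Since p_value > 0.01, H0 is not rejected at alpha = 0.01, matching part (b).
```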

Summary of hypothesis tests for µ

Large sample (n ≥ 30). To test H0: µ = µ0 versus
Ha: µ > µ0 (upper-tail test), µ < µ0 (lower-tail test), µ ≠ µ0 (two-tailed test).
Test statistic: Z = (X̄ − µ0)/(σ/√n). Replace σ by S if σ is unknown.
Rejection region: z > z(α) (upper-tail RR), z < −z(α) (lower-tail RR), |z| > z(α/2) (two-tail RR).
Assumption: n ≥ 30.

Small sample (n < 30). To test H0: µ = µ0 versus
Ha: µ > µ0 (upper-tail test), µ < µ0 (lower-tail test), µ ≠ µ0 (two-tailed test).
Test statistic: T = (X̄ − µ0)/(S/√n).
Rejection region: t > t(α, n−1) (upper-tail RR), t < −t(α, n−1) (lower-tail RR), |t| > t(α/2, n−1) (two-tail RR).
Assumption: the random sample comes from a normal population.

Decision: Reject H0 if the observed test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, keep H0: there is not enough evidence to conclude that Ha is true for the given α, and more experiments may be needed.

Example In a frequently traveled stretch of the I-75 highway, where the posted speed is 70 mph, it is thought that people travel on average at least 75 mph. To check this claim, radar measurements of the speeds (in mph) were obtained for 10 vehicles traveling on this stretch of the interstate highway. Do the data provide sufficient evidence to indicate that the mean speed at which people travel on this stretch of highway is at least 75 mph? Test the appropriate hypothesis using α = 0.01. Draw a box plot and a normal plot for these data, and comment.

[Box plot and normal plot of the speed data.]

Example Solution We need to test H0: µ = 75 versus Ha: µ > 75. For this sample, the sample mean is x̄ = 74.8 mph. The observed test statistic is t = (x̄ − 75)/(s/√10). From the t-table, t(0.01, 9) = 2.821, so the rejection region is {t > 2.821}. Because x̄ = 74.8 < 75, the observed t is negative and does not fall in the rejection region, so we do not reject the null hypothesis at α = 0.01.

Example Solution (cont.) The box plot suggests that there are no outliers present. However, the normal plot indicates that the normality assumption for this data set is not justified. Hence, it may be more appropriate to do a nonparametric test.

Example A machine is considered to be unsatisfactory if it produces more than 8% defectives. It is suspected that the machine is unsatisfactory. A random sample of 120 items produced by the machine contains 14 defectives. Does the sample evidence support the claim that the machine is unsatisfactory? Use α = 0.01.

Example Solution Let Y be the number of observed defectives. This follows a binomial distribution. However, because np0 and nq0 are greater than 5, we can use a normal approximation to the binomial to test the hypothesis. So we need to test H0: p = 0.08 versus Ha: p > 0.08. Let the point estimate of p be p̂ = Y/n = 14/120 = 0.117, the sample proportion. Then the value of the TS is

z = (p̂ − p0)/√(p0 q0 / n) = (0.117 − 0.08)/√((0.08)(0.92)/120) = 1.49.

For α = 0.01, z(0.01) = 2.33. Hence, the rejection region is {z > 2.33}. Decision: Because 1.49 is not greater than 2.33, we do not reject H0. We conclude that the evidence does not support the claim that the machine is unsatisfactory.
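A sketch of the large-sample proportion test applied to the defectives example (using the exact fraction 14/120 rather than the rounded 0.117, which shifts z slightly):

```python
from math import sqrt

def prop_ztest(successes, n, p0):
    """Large-sample z statistic for H0: p = p0, using the null variance p0*q0/n."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Defectives example: 14 defectives in 120 items, H0: p = 0.08 vs Ha: p > 0.08.
z = prop_ztest(14, 120, 0.08)
# z ~ 1.48 with the exact p_hat (~1.49 when p_hat is rounded to 0.117);
# either way z < z_0.01 = 2.33, so H0 is not rejected at alpha = 0.01.
```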

Summary of hypothesis test for the proportion p

To test H0: p = p0 versus
Ha: p > p0 (upper-tail test), p < p0 (lower-tail test), p ≠ p0 (two-tail test).
Test statistic: Z = (p̂ − p0)/σ(p̂), where σ(p̂) = √(p0(1 − p0)/n).
Rejection region: z > z(α) (upper-tail RR), z < −z(α) (lower-tail RR), |z| > z(α/2) (two-tail RR).

Summary of hypothesis test for the proportion p (cont.)
Assumption: n is large. A good rule of thumb is to use the normal approximation to the binomial distribution only when np0 and n(1 − p0) are both greater than 5.
Decision: Reject H0 if the observed test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for the given α; more data are needed.

Summary of hypothesis test for the variance σ²

To test H0: σ² = σ0² versus
Ha: σ² > σ0² (upper-tail test), σ² < σ0² (lower-tail test), σ² ≠ σ0² (two-tailed test).
Test statistic: χ² = (n − 1)S²/σ0², where S² is the sample variance. Observed value of the test statistic: (n − 1)s²/σ0².
Rejection region: χ² > χ²(α, n−1) (upper-tail RR), χ² < χ²(1−α, n−1) (lower-tail RR), χ² > χ²(α/2, n−1) or χ² < χ²(1−α/2, n−1) (two-tail RR).

Summary of hypothesis test for the variance σ² (cont.)
Assumption: the sample comes from a normal population.
Decision: Reject H0 if the observed test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for the given α; more data are needed.

Example A physician claims that the variance in cholesterol levels of adult men in a certain laboratory is at most 100. A random sample of 25 adult males from this laboratory produced a sample standard deviation of cholesterol levels of 12. Test the physician's claim at the 5% level of significance.

Solution We test H0: σ² = 100 versus Ha: σ² < 100. For α = 0.05 and 24 degrees of freedom, the rejection region is RR = {χ² < χ²(0.95, 24)} = {χ² < 13.848}. The observed value of the TS is

χ² = (24)(144)/100 = 34.56.

Because the value of the test statistic does not fall in the rejection region, we cannot reject H0 at the 5% level of significance.

Testing of hypotheses for two samples

Independent samples Two random samples are drawn independently of each other from two populations, and the sample information is obtained. We are interested in testing hypotheses about the difference of the true means. Let X11, …, X1n1 be a random sample from population 1 with mean µ1 and variance σ1², and X21, …, X2n2 be a random sample from population 2 with mean µ2 and variance σ2². Let X̄i, i = 1, 2, represent the respective sample means and Si², i = 1, 2, the respective sample variances. In testing hypotheses about µ1 and µ2, there are three cases:
(i) σ1² and σ2² are known;
(ii) σ1² and σ2² are unknown, and n1 ≥ 30 and n2 ≥ 30;
(iii) σ1² and σ2² are unknown, and n1 < 30 and n2 < 30, with (a) σ1² = σ2² or (b) σ1² ≠ σ2².
Case (iii) is the most common and the most complicated in computation.
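The χ² statistic in the cholesterol example is a one-liner; the critical value χ²(0.95, 24) = 13.848 is taken from a table here (assumed, not computed):

```python
# Chi-square test for H0: sigma^2 = 100 vs Ha: sigma^2 < 100, n = 25, s = 12.
n, s, sigma0_sq = 25, 12, 100.0
chi_sq = (n - 1) * s**2 / sigma0_sq   # observed statistic, (24)(144)/100
crit_lower = 13.848                   # chi^2_{0.95, 24}, lower-tail table value
reject = chi_sq < crit_lower          # lower-tail rejection region
# chi_sq = 34.56, which is not below 13.848, so H0 is not rejected.
```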

Hypothesis test for µ1 − µ2 for large samples (n1 and n2 ≥ 30)

To test H0: µ1 − µ2 = D0 versus
Ha: µ1 − µ2 > D0 (upper-tailed test), µ1 − µ2 < D0 (lower-tailed test), µ1 − µ2 ≠ D0 (two-tailed test).
The test statistic is

Z = (X̄1 − X̄2 − D0)/√(σ1²/n1 + σ2²/n2).

Replace σi² by Si², i = 1, 2, if the σi² are not known.

Hypothesis test for µ1 − µ2 for large samples (cont.) The rejection region is z > z(α) (upper-tailed RR), z < −z(α) (lower-tailed RR), |z| > z(α/2) (two-tailed RR), where z is the observed test statistic given by

z = (x̄1 − x̄2 − D0)/√(σ1²/n1 + σ2²/n2).

Assumption: the samples are independent and n1 and n2 ≥ 30.
Decision: Reject H0 if the test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for the given α; more experiments are needed.

Example In a salary study of faculty at a certain university, sample salaries of 50 male assistant professors and 50 female assistant professors yielded the following basic statistics.

                               Sample mean salary    Sample standard deviation
Male assistant professors      $36,…                 …
Female assistant professors    $34,…                 …

Test the hypothesis that the mean salary of male assistant professors is higher than the mean salary of female assistant professors at this university. Use α = 0.05.

Example Solution Let µ1 be the mean salary for male assistant professors and µ2 the mean salary for female assistant professors at this university. We test H0: µ1 − µ2 = 0 versus Ha: µ1 − µ2 > 0. The test statistic is

z = (x̄1 − x̄2 − D0)/√(σ1²/n1 + σ2²/n2).

The rejection region for α = 0.05 is {z > 1.645}.

Example Solution (cont.) Because the observed value of z exceeds 1.645, we reject the null hypothesis at α = 0.05. We conclude that the mean salary of male assistant professors at this university is higher than that of female assistant professors at α = 0.05. Note that even though σ1² and σ2² are unknown, because n1 ≥ 30 and n2 ≥ 30 we could replace σ1² and σ2² by the respective sample variances. We are assuming that the salaries of males and females are sampled independently of each other.

Comparison of two population means, small-sample case (pooled t-test); assume the variances are equal

To test H0: µ1 − µ2 = D0 versus
Ha: µ1 − µ2 > D0 (upper-tailed test), µ1 − µ2 < D0 (lower-tailed test), µ1 − µ2 ≠ D0 (two-tailed test).
The test statistic is

T = (X̄1 − X̄2 − D0)/(Sp √(1/n1 + 1/n2)),

where the pooled sample variance is

Sp² = [(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2).

Comparison of two population means, small-sample case (pooled t-test) (cont.) Then the rejection region is t > t(α) (upper-tailed RR), t < −t(α) (lower-tailed RR), |t| > t(α/2) (two-tailed RR), where t is the observed test statistic and t(α) is based on (n1 + n2 − 2) degrees of freedom, such that P(T > t(α)) = α.

Decision: Reject H0 if the test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for the given α.
Assumption: The samples are independent and come from normal populations with means µ1 and µ2 and unknown but equal variances, that is, σ1² = σ2².

Now we shall consider the case where σ1² and σ2² are unknown and cannot be assumed to be equal. In such cases the following test is often used. For the hypotheses H0: µ1 − µ2 = D0 versus Ha: µ1 − µ2 > D0 (or < D0, or ≠ D0), define the test statistic

Tν = (x̄1 − x̄2 − D0)/√(S1²/n1 + S2²/n2),

where Tν has an approximate t-distribution with ν degrees of freedom, and

ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1) ].

The value of ν will not necessarily be an integer. In that case, we round it down to the nearest integer. This method of hypothesis testing with unequal variances is called the Smith-Satterthwaite procedure.

Example 7.5.2 The intelligence quotients (IQs) of 17 students from one area of a city showed a sample mean of 106 with a sample standard deviation of 10, whereas the IQs of 14 students from another area chosen independently showed a sample mean of 109 with a standard deviation of 7. Is there a significant difference between the IQs of the two groups at α = 0.02? Assume that the population variances are equal.

Example 7.5.2 Solution We test H0: µ1 − µ2 = 0 versus Ha: µ1 − µ2 ≠ 0. Here n1 = 17, x̄1 = 106, and s1 = 10. Also, n2 = 14, x̄2 = 109, and s2 = 7. We have

Sp² = [(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2) = [(16)(10)² + (13)(7)²]/29 = 77.14.

The test statistic is

T = (X̄1 − X̄2 − D0)/(Sp √(1/n1 + 1/n2)) = (106 − 109)/√(77.14 (1/17 + 1/14)) = −0.946.
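A sketch reproducing Example 7.5.2's pooled computation:

```python
from math import sqrt

def pooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Pooled-variance two-sample t statistic (assumes equal population variances).

    Returns (t, sp_sq), where sp_sq is the pooled sample variance.
    """
    sp_sq = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    t = (xbar1 - xbar2 - d0) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, sp_sq

# IQ example: n1=17, xbar1=106, s1=10; n2=14, xbar2=109, s2=7.
t, sp_sq = pooled_t(17, 106, 10, 14, 109, 7)
# sp_sq ~ 77.14 and t ~ -0.946; |t| < t_{0.01,29} = 2.462, so H0 is not rejected.
```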

Example 7.5.2 Solution (cont.) For α = 0.02, t(0.01, 29) = 2.462. Hence, the rejection region is t < −2.462 or t > 2.462. Because the observed value of the test statistic, T = −0.946, does not fall in the rejection region, there is not enough evidence to conclude that the mean IQs are different for the two groups. Here we assume that the two samples are independent and are taken from normal populations.

Example 7.5.3 Infrequent or suspended menstruation can be a symptom of serious metabolic disorders in women. In a study to compare the effect of jogging and running on the number of menses, two independent subgroups were chosen from a large group of women who were similar in physical activity (aside from running), height, occupation, distribution of ages, and type of birth control method being used. The first group consisted of a random sample of 26 women joggers who jogged slow and easy 5 to 30 miles per week, and the second group consisted of a random sample of 26 women runners who ran more than 30 miles per week combined with long-distance, slow-speed walking. The following summary statistics were obtained (E. Dale, D. H. Gerlach, and A. L. Wilhite, "Menstrual Dysfunction in Distance Runners," Obstet. Gynecol. 54, 47-53, 1979).

Joggers: x̄1 = 10.1, S1 = 2.1. Runners: x̄2 = 9.1, S2 = 2.4.

Example 7.5.3 (cont.) Using α = 0.05, (a) test for differences in mean number of menses for each group assuming equality of population variances, and (b) test for differences in mean number of menses for each group assuming inequality of population variances.

Solution We need to test H0: µ1 − µ2 = 0 versus Ha: µ1 − µ2 ≠ 0. Here n1 = 26, x̄1 = 10.1, s1 = 2.1; n2 = 26, x̄2 = 9.1, s2 = 2.4.

(a) Under the assumption σ1² = σ2², we have

Sp² = [(n1 − 1)S1² + (n2 − 1)S2²]/(n1 + n2 − 2) = [(25)(2.1)² + (25)(2.4)²]/50 = 5.085.

Example 7.5.3 Solution (cont.) The test statistic is

T = (X̄1 − X̄2 − D0)/(Sp √(1/n1 + 1/n2)) = (10.1 − 9.1)/√(5.085 (1/26 + 1/26)) = 1.599.

For α = 0.05, t(0.025, 50) = 2.009. Hence, the rejection region is t < −2.009 or t > 2.009. Because T = 1.599 does not fall in the rejection region, we do not reject the null hypothesis. At α = 0.05, there is not enough evidence to conclude that the population mean numbers of menses for the joggers and runners are different.

Example 7.5.3 Solution (cont.) (b) Under the assumption σ1² ≠ σ2², we have

ν = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1) ]
  = ((2.1)²/26 + (2.4)²/26)² / [ ((2.1)²/26)²/25 + ((2.4)²/26)²/25 ] = 49.1.

Hence, we have ν = 49 degrees of freedom. Because this value is large, the rejection region is still approximately t < −2.01 and t > 2.01, and the conclusion is the same as that of part (a). In both parts (a) and (b), we assumed that the samples are independent and came from two normal populations.

Hypothesis test for (p1 − p2) for large samples (ni pi > 5 and ni(1 − pi) > 5, for i = 1, 2)

Assume the binomial distribution is approximated by the normal distribution. To test H0: p1 − p2 = D0 versus Ha: p1 − p2 > D0 (upper-tailed test), p1 − p2 < D0 (lower-tailed test), p1 − p2 ≠ D0 (two-tailed test) at significance level α, the test statistic is

Z = (p̂1 − p̂2 − D0)/√(p̂1 q̂1/n1 + p̂2 q̂2/n2),

where z is the observed value of Z.
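The Smith-Satterthwaite degrees-of-freedom formula applied to the jogger/runner data:

```python
def satterthwaite_df(s1, n1, s2, n2):
    """Smith-Satterthwaite approximate degrees of freedom, rounded down."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    v = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return int(v)   # round down to the nearest integer

# Jogger/runner example: s1 = 2.1, s2 = 2.4, n1 = n2 = 26.
df = satterthwaite_df(2.1, 26, 2.4, 26)
# df = 49, close to the pooled test's n1 + n2 - 2 = 50 because the
# two sample variances are similar.
```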

Hypothesis test for (p1 − p2) for large samples (cont.) The rejection region is z > z(α) (upper-tailed RR), z < −z(α) (lower-tailed RR), |z| > z(α/2) (two-tailed RR).

Assumption: The samples are independent, and ni pi > 5 and ni(1 − pi) > 5 for i = 1, 2.
Decision: Reject H0 if the test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0 because there is not enough evidence to conclude that Ha is true for the given α; more experiments are needed.

Example Because of the impact of the global economy on a high-wage country such as the United States, it is claimed that the domestic content in manufacturing industries fell between 1977 and 1997. A survey of 36 randomly picked U.S. companies gave the proportion of domestic content in total manufacturing in 1977 as 0.37 and in 1997 as 0.36. At the 1% level of significance, test the claim that the domestic content really fell during the period 1977-1997.

Example Solution Let p1 be the domestic content in 1977 and p2 the domestic content in 1997. Given n1 = n2 = 36, p̂1 = 0.37 and p̂2 = 0.36. We need to test H0: p1 − p2 = 0 versus Ha: p1 − p2 > 0. The test statistic is

Z = (p̂1 − p̂2 − D0)/√(p̂1 q̂1/n1 + p̂2 q̂2/n2) = (0.37 − 0.36)/√((0.37)(0.63)/36 + (0.36)(0.64)/36) = 0.088.

Example Solution (cont.) For α = 0.01, z(0.01) = 2.33. Hence, the rejection region is z > 2.33. Because the observed value of the test statistic does not fall in the rejection region, at α = 0.01 there is not enough evidence to conclude that the domestic content in manufacturing industries fell between 1977 and 1997.
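A sketch of the two-proportion z statistic applied to the domestic-content example:

```python
from math import sqrt

def two_prop_z(p1_hat, n1, p2_hat, n2, d0=0.0):
    """Large-sample z statistic for H0: p1 - p2 = d0, with unpooled variance."""
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat - d0) / se

# Domestic-content example: p1_hat = 0.37, p2_hat = 0.36, n1 = n2 = 36.
z = two_prop_z(0.37, 36, 0.36, 36)
# z ~ 0.088, far below z_0.01 = 2.33, so H0 is not rejected.
```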

We have already seen in Chapter 4 that

F = (S1²/σ1²)/(S2²/σ2²)

follows the F-distribution with ν1 = n1 − 1 numerator and ν2 = n2 − 1 denominator degrees of freedom. Under the assumption H0: σ1² = σ2², we have F = S1²/S2², which has an F-distribution with (ν1, ν2) degrees of freedom.

Testing for the equality of variances To test H0: σ1² = σ2² versus Ha: σ1² > σ2² (upper-tailed test), σ1² < σ2² (lower-tailed test), σ1² ≠ σ2² (two-tailed test) at significance level α, the test statistic is F = S1²/S2². The rejection region is f > F(α)(ν1, ν2) (upper-tailed RR), f < F(1−α)(ν1, ν2) (lower-tailed RR), f > F(α/2)(ν1, ν2) or f < F(1−α/2)(ν1, ν2) (two-tailed RR).

Testing for the equality of variances (cont.) Here f is the observed test statistic, given by f = s1²/s2².

Decision: Reject H0 if the test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, keep H0, because there is not enough evidence to conclude that Ha is true for the given α; more experiments are needed.
Assumption: (i) The two random samples are independent. (ii) Both populations are normal.
In order to find F(1−α)(ν1, ν2), we use the identity F(1−α)(ν1, ν2) = 1/F(α)(ν2, ν1).

Example Consider two independent random samples X1, …, Xn1 from an N(µ1, σ1²) distribution and Y1, …, Yn2 from an N(µ2, σ2²) distribution. Test H0: σ1² = σ2² versus Ha: σ1² ≠ σ2² for the following basic statistics: n1 = 25, x̄1 = 410, s1² = 95, and n2 = 16, x̄2 = 390, s2² = 300. Use α = 0.20.

Solution We test H0: σ1² = σ2² versus Ha: σ1² ≠ σ2². This is a two-tailed test. Here the degrees of freedom are ν1 = 24 and ν2 = 15. The test statistic is

F = s1²/s2² = 95/300 = 0.317.

Example Solution (cont.) From the F-table, F(0.10)(24, 15) = 1.90 and F(0.90)(24, 15) = 1/F(0.10)(15, 24) = 0.56. Hence, the rejection region is F > 1.90 or F < 0.56. Because the observed value of the test statistic, 0.317, is less than 0.56, we reject the null hypothesis. There is evidence that the population variances are not equal.

Take a break.
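A sketch of the two-tailed F test for the example above; the critical values are F-table entries (assumed here, not computed):

```python
# F test for equality of variances: n1 = 25, s1^2 = 95; n2 = 16, s2^2 = 300.
s1_sq, s2_sq = 95.0, 300.0
f = s1_sq / s2_sq                 # observed F with (24, 15) degrees of freedom

# Two-tailed test at alpha = 0.20: table values (assumed) F_0.10(24,15) = 1.90
# and F_0.90(24,15) = 1/F_0.10(15,24) with F_0.10(15,24) ~ 1.78.
upper = 1.90
lower = 1 / 1.78                  # ~ 0.56
reject = f > upper or f < lower
# f ~ 0.317 < 0.56, so H0 is rejected: the variances appear unequal.
```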

Dependent samples Two samples are dependent when each data point in one sample can be coupled in some natural, nonrandom fashion with a data point in the second sample. The pairing may be the result of the individual observations in the two samples (1) representing before and after a program, (2) sharing the same characteristics, (3) being matched by location, (4) being matched by time, (5) being control and experimental, and so forth.

Dependent samples Let (X1i, X2i), for i = 1, 2, …, n, be a random sample; X1i and X2j (i ≠ j) are independent. To test the significance of the difference between two population means when the samples are dependent, we first calculate for each pair of scores the difference D i = X1i − X2i, i = 1, 2, …, n, between the two scores. Let µD = E(Di). Because the pairs of observations form a random sample, D1, …, Dn are i.i.d. random variables. If d1, …, dn are the observed values of D1, …, Dn, then we define

d̄ = (1/n) Σ dᵢ and s_d² = (1/(n − 1)) Σ (dᵢ − d̄)² = [Σ dᵢ² − (Σ dᵢ)²/n]/(n − 1).

Testing for a matched-pairs experiment To test H0: µD = d0 versus Ha: µD > d0 (upper-tailed test), µD < d0 (lower-tailed test), µD ≠ d0 (two-tailed test), the test statistic is

T = (D̄ − d0)/(S_D/√n),

which approximately follows a Student t-distribution with (n − 1) degrees of freedom. The rejection region is t > t(α, n−1) (upper-tailed RR), t < −t(α, n−1) (lower-tailed RR), |t| > t(α/2, n−1) (two-tailed RR), where t is the observed test statistic.

Testing for a matched-pairs experiment (cont.)
Assumption: The differences are approximately normally distributed.
Decision: Reject H0 if the test statistic falls in the RR and conclude that Ha is true with (1 − α)100% confidence. Otherwise, do not reject H0, because there is not enough evidence to conclude that Ha is true for the given α; more data are needed.

38 Testing of hypotheses for two samples Example A new diet and exercise program has been advertised as a remarkable way to reduce blood glucose levels in diabetic patients. Ten randomly selected diabetic patients are put on the program, and their before and after blood glucose readings after 1 month are recorded in a table. Do the data provide sufficient evidence to support the claim that the new program reduces blood glucose level in diabetic patients? Use α = 0.05. Testing of hypotheses for two samples Example Solution We need to test the hypothesis H 0 : µ D = 0 vs. H a : µ D < 0. First we calculate the difference for each pair, Diff. = after − before. From the table, the mean of the differences is d̄ = −71.9 and the standard deviation is s d = 56.2. The test statistic is T = (d̄ − d 0)/(s d/√n) = −71.9/(56.2/√10) ≈ −4.05.

39 Testing of hypotheses for two samples Example Solution (cont.) From the t-table, t 0.05,9 = 1.833. Because the observed value t ≈ −4.05 < −t 0.05,9 = −1.833, we reject the null hypothesis and conclude that the sample evidence suggests that the new diet and exercise program is effective. Testing of hypotheses for two samples Why must we take paired differences and then calculate the mean and standard deviation of the differences? Why can't we just take the means of each sample, as we did for independent samples? The reason: Var(D̄) need not be equal to Var(X̄ 1 − X̄ 2). Assume that E(X ji) = µ j and Var(X ji) = σ j², for j = 1, 2, and Cov(X 1i, X 2i) = ρσ 1 σ 2, where ρ denotes the assumed common correlation coefficient of the pair (X 1i, X 2i) for i = 1, 2,..., n. Because the values D i, i = 1, 2,..., n, are i.i.d., µ D = E(D i) = E(X 1i) − E(X 2i) = µ 1 − µ 2 and σ D² = Var(D i) = Var(X 1i) + Var(X 2i) − 2Cov(X 1i, X 2i) = σ 1² + σ 2² − 2ρσ 1 σ 2.

40 Testing of hypotheses for two samples From these calculations, E(D̄) = µ D = µ 1 − µ 2 and σ² D̄ = Var(D̄) = σ D²/n = (1/n)(σ 1² + σ 2² − 2ρσ 1 σ 2). Now, if the samples were independent with n 1 = n 2 = n, then E(X̄ 1 − X̄ 2) = µ 1 − µ 2 and σ² (X̄ 1 − X̄ 2) = (1/n)(σ 1² + σ 2²). Hence, if ρ > 0, then σ² D̄ < σ² (X̄ 1 − X̄ 2). Chi-Square Tests for Count Data Suppose that we have outcomes of a multinomial experiment that consists of k mutually exclusive and exhaustive events A 1,..., A k. Let P(A i) = p i, i = 1, 2,..., k, with Σ_{i=1}^{k} p i = 1. Let the experiment be repeated n times, and let X i (i = 1, 2,..., k) represent the number of times the event A i occurs; then (X 1,..., X k) have a multinomial distribution with parameters n, p 1,..., p k. Let Q = Σ_{i=1}^{k} (X i − np i)²/(np i). It can be shown that for large n, the random variable Q is approximately χ²-distributed with (k−1) degrees of freedom. It is usual to demand np i ≥ 5 (i = 1, 2,..., k) for the approximation to be valid, although the approximation generally works well if for only a few values of i (~20%), np i ≥ 1 and the rest (~80%) satisfy the condition np i ≥ 5. (Karl Pearson 1900)
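Before moving on, the claim that σ² D̄ < σ² (X̄ 1 − X̄ 2) when ρ > 0 can be checked numerically. A minimal sketch with illustrative (made-up) parameter values:

```python
def var_dbar(n, sigma1, sigma2, rho):
    """Var(D-bar) = (sigma1^2 + sigma2^2 - 2*rho*sigma1*sigma2)/n, paired design."""
    return (sigma1 ** 2 + sigma2 ** 2 - 2 * rho * sigma1 * sigma2) / n

def var_indep(n, sigma1, sigma2):
    """Var(X1-bar - X2-bar) = (sigma1^2 + sigma2^2)/n, independent samples of size n."""
    return (sigma1 ** 2 + sigma2 ** 2) / n

# Positive within-pair correlation shrinks the variance of the comparison.
print(round(var_dbar(10, 3.0, 4.0, 0.6), 2))  # prints 1.06
print(round(var_indep(10, 3.0, 4.0), 2))      # prints 2.5
```

This is why a paired design can detect a mean difference that an independent-samples design of the same size would miss.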

41 Chi-Square Tests for Count Data Example A plant geneticist grows 200 progeny from a cross that is hypothesized to result in a 3:1 phenotypic ratio of red-flowered to white-flowered plants. Suppose the cross produces 170 red- to 30 white-flowered plants. Calculate the value of Q for this experiment. Chi-Square Tests for Count Data Example Solution Here n = 200 and k = 2. Let i = 1 represent red-flowered and i = 2 represent white-flowered plants. Then X 1 = 170 and X 2 = 30. Here, H 0 : The flower color population ratio is not different from 3:1, and the alternative is H a : The flower color population sampled has a flower color ratio that is not 3 red : 1 white. Under the null hypothesis, the expected frequencies are np 1 = (200)(3/4) = 150 and np 2 = (200)(1/4) = 50. Hence, Q = Σ_{i=1}^{k} (X i − np i)²/(np i) = (170 − 150)²/150 + (30 − 50)²/50 = 2.67 + 8 = 10.67. Often X i is called the observed frequency and np i is called the expected frequency. This example gives a measure of how close our observed frequencies come to the expected frequencies and is referred to as a measure of goodness of fit. Smaller values of Q indicate a better fit.
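The computation in this example takes only a few lines of Python; a sketch, using the counts and the 3:1 ratio from the example above:

```python
# Observed counts and 3:1 expected frequencies from the flower-cross example.
observed = [170, 30]                   # red, white
expected = [200 * 3 / 4, 200 * 1 / 4]  # 150, 50 under H0
q = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(q, 2))  # prints 10.67
```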

42 The Goodness-of-Fit Test Summary Let an experiment have k mutually exclusive and exhaustive outcomes A 1, A 2,..., A k. We would like to test the null hypothesis that all the p i = P(A i), i = 1, 2,..., k, are equal to known numbers p i0, i = 1, 2,..., k. That is, we test H 0 : p 1 = p 10,..., p k = p k0 vs. H a : At least one of the probabilities is different from the hypothesized value. The test is always a one-sided upper tail test. Let O i be the observed frequency, E i = np i0 be the expected frequency (frequency under the null hypothesis), and k be the number of classes. The test statistic is Q = Σ_{i=1}^{k} (O i − E i)²/E i. The test statistic Q has an approximate chi-square distribution with (k−1) degrees of freedom. The rejection region is Q ≥ χ² α,k−1. Assumption: E i ≥ 5; exact methods are available otherwise. Computing the power of this test is difficult. The Goodness-of-Fit Test Summary This test implies that if the observed data are very close to the expected data, we have a very good fit and we accept the null hypothesis. That is, for small Q values, we accept H 0.

43 The Goodness-of-Fit Test Example A die is rolled 60 times and the face values are recorded in a frequency table (up face vs. frequency). Is the die balanced? Test using α = 0.05. The Goodness-of-Fit Test Example Solution If the die is balanced, we must have p 1 = p 2 = ... = p 6 = 1/6, where p i = P(face value on the die is i), i = 1, 2,..., 6. This is the discrete uniform distribution. Hence, we test H 0 : p 1 = p 2 = ... = p 6 = 1/6 vs. H a : At least one of the probabilities is different from the hypothesized value of 1/6. Here E 1 = np 1 = (60)(1/6) = 10,..., E 6 = 10. We summarize the calculation in a table of face value, observed frequency O i, and expected value E i.

44 The Goodness-of-Fit Test Example Solution (cont.) The test statistic is given by Q = Σ_{i=1}^{6} (O i − E i)²/E i = 6. From the chi-square table with 5 d.f., χ² 0.05,5 = 11.07. Because the value of the test statistic does not fall in the rejection region (6 < 11.07), we do not reject H 0. Therefore, we conclude that the die is balanced. Take a break.
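The die test can be sketched as follows. The roll counts below are hypothetical (the slide's frequency table did not survive transcription); with 60 rolls the expected count is 60·(1/6) = 10 per face.

```python
# Hypothetical counts for faces 1..6 in 60 rolls (illustration only).
observed = [8, 11, 5, 12, 15, 9]
expected = [sum(observed) / 6] * 6  # 10 per face under H0
q = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(q, 2))  # prints 6.0
```

The observed Q is then compared with the chi-square critical value with 5 degrees of freedom; for these illustrative counts Q = 6.0 < χ² 0.05,5 = 11.07, so H 0 would not be rejected.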

45 Contingency Table: Test for Independence One of the uses of the χ² statistic is in contingency (dependence) testing, where n randomly selected items are classified according to two different criteria, such as when data are classified on the basis of two factors (a row factor and a column factor), where the row factor has r levels and the column factor has c levels. Our interest is to test for independence of the two methods of classification of observed events. For example, we might classify a sample of students by sex and by their grade in a statistics course in order to test the hypothesis that grade depends on sex. More generally, the problem is to investigate a dependency between two classification criteria. The obtained data are displayed in an r × c table, where n ij represents the number of data values under row i and column j, with row totals n 1.,..., n r., column totals n .1,..., n .c, and grand total N. Contingency Table: Test for Independence Here N = Σ_{j=1}^{c} n .j = Σ_{i=1}^{r} n i. = Σ_{i=1}^{r} Σ_{j=1}^{c} n ij. We wish to test the hypothesis that the two factors are independent.

46 Test for the Independence of Two Factors To test H 0 : The factors are independent vs. H a : The factors are dependent, the test statistic is Q = Σ_{i=1}^{r} Σ_{j=1}^{c} (O ij − E ij)²/E ij, where O ij = n ij and E ij = n i. n .j / N. Under the null hypothesis the test statistic Q has an approximate chi-square distribution with (r−1)(c−1) degrees of freedom. Hence, the rejection region is Q > χ² α,(r−1)(c−1). Assumption: E ij ≥ 5. Test for the Independence of Two Factors Example The following table gives a classification according to religious affiliation (A, B, C, D, None) and marital status (single, with spouse) for 500 randomly selected individuals; the row totals are 116 (single) and 384 (with spouse), and the column totals are 211, 80, 56, 98, and 55. For α = 0.01, test the null hypothesis that marital status and religious affiliation are independent.
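The expected counts E ij = n i. n .j / N and the Q statistic can be sketched as below. This is a minimal illustration; the 2 × 2 table at the end is made up for demonstration.

```python
def expected_counts(table):
    """E_ij = (row total)(column total)/N for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def independence_q(table):
    """Pearson's Q = sum over cells of (n_ij - E_ij)^2 / E_ij."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for row_o, row_e in zip(table, exp)
               for o, e in zip(row_o, row_e))

# Hypothetical 2x2 table; compare Q with chi-square, (r-1)(c-1) = 1 d.f.
print(independence_q([[30, 20], [20, 30]]))  # prints 4.0
```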

47 Test for the Independence of Two Factors Example Solution We need to test the hypothesis H 0 : Marital status and religious affiliation are independent vs. H a : Marital status and religious affiliation are dependent. Here, c = 5 and r = 2. For α = 0.01 and (c−1)(r−1) = 4 degrees of freedom, we have χ² 0.01,4 = 13.277. Hence, the rejection region is Q > 13.277. We have E ij = n i. n .j / N. Thus, E 11 = (116)(211)/500 = 48.95; E 12 = (116)(80)/500 = 18.56; E 13 = (116)(56)/500 = 12.99; E 14 = (116)(98)/500 = 22.74; E 15 = (116)(55)/500 = 12.76; E 21 = (384)(211)/500 = 162.05; E 22 = (384)(80)/500 = 61.44; E 23 = (384)(56)/500 = 43.01; E 24 = (384)(98)/500 = 75.26; E 25 = (384)(55)/500 = 42.24. Test for the Independence of Two Factors Example Solution (cont.) The value of the test statistic is Q = Σ_{i=1}^{r} Σ_{j=1}^{c} (O ij − E ij)²/E ij. Because the observed value of Q does not fall in the rejection region (Q < 13.277), we do not reject the null hypothesis at α = 0.01. Therefore, based on the observed data, marital status and religious affiliation are independent.
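The expected counts in this solution depend only on the marginal totals, so they can be verified with a quick stdlib check. One assumption here: the first column total is read as 211, which is what makes the five column totals sum to N = 500 and matches E 11 = 48.95.

```python
# Marginal totals from the example: rows = (single, with spouse),
# columns = religious affiliations A, B, C, D, None; N = 500.
rows, cols, n = [116, 384], [211, 80, 56, 98, 55], 500
expected = [[r * c / n for c in cols] for r in rows]
print([round(e, 2) for e in expected[0]])  # prints [48.95, 18.56, 12.99, 22.74, 12.76]
```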

48 Testing to Identify the Probability Distribution In hypothesis testing problems we often assume that the form of the population distribution is known. For example, in a χ²-test for variance, we assume that the population is normal. The goodness-of-fit tests examine the validity of such an assumption if we have a large enough sample. This is another application of the chi-square statistic used for goodness-of-fit tests. Goodness-of-Fit Test Procedures for Probability Distributions Let X 1,..., X n be a sample from a population with cdf F(x), which may depend on a set of unknown parameters θ. We wish to test H 0 : F(x) = F 0 (x), where F 0 (x) is completely specified. 1. Divide the range of values of the random variable X into K nonoverlapping intervals I 1, I 2,..., I K. Let O j be the number of sample values that fall in the interval I j (j = 1, 2,..., K). 2. Assuming the distribution of X to be F 0 (x), find P(X ∈ I j). Let P(X ∈ I j) = π j and let E j = nπ j be the expected frequency. 3. Compute the test statistic Q given by Q = Σ_{j=1}^{K} (O j − E j)²/E j. The test statistic Q has an approximate χ²-distribution with (K−1) degrees of freedom. 4. Reject H 0 if Q > χ² α,K−1. 5. Assumptions: E j ≥ 5, j = 1, 2,..., K.

49 Goodness-of-Fit Test Procedures for Probability Distributions If the null hypothesis does not specify F 0 (x) completely, that is, if F 0 (x) contains some unknown parameters θ 1, θ 2,..., θ p, we estimate these parameters by the method of maximum likelihood. Using these estimated values we specify F 0 (x) completely; denote the estimated cdf by F̂ 0 (x). Let π̂ i = P{X ∈ I i | F̂ 0 (x)} and Ê i = n π̂ i. The test statistic is Q = Σ_{i=1}^{K} (O i − Ê i)²/Ê i. The statistic Q has an approximate chi-square distribution with (K−1−p) degrees of freedom. We reject H 0 if Q > χ² α,K−1−p. Goodness-of-Fit Test Procedures for Probability Distributions Example The grades of students in a class of 200 are given in a frequency table (grade range vs. number of students). Test the hypothesis that the grades are normally distributed with a mean of 75 and a standard deviation of 8. Use α = 0.05.

50 Goodness-of-Fit Test Procedures for Probability Distributions Example Solution We have O 1 = 12, O 2 = 36, O 3 = 90, O 4 = 44, O 5 = 18. We now compute π i (i = 1, 2,..., 5) using the continuity correction factor: π 1 = P{X ≤ 59.5 | H 0} = P{Z ≤ (59.5 − 75)/8} = 0.026, π 2 = 0.2189, π 3 = 0.4722, π 4 = 0.2476, π 5 = 0.0351, and E 1 = 5.24, E 2 = 43.78, E 3 = 94.44, E 4 = 49.52, E 5 = 7.02. The test statistic results in Q = Σ_{i=1}^{5} (O i − E i)²/E i ≈ 28.1. Q has a chi-square distribution with (5−1) = 4 degrees of freedom. The critical value is χ² 0.05,4 = 9.488, hence the rejection region is Q > 9.488. Because the observed value Q ≈ 28.1 > 9.488, we reject H 0 at α = 0.05. Thus, we conclude that the population is not normal.
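The whole calculation can be reproduced with the standard library. Two caveats: the upper class boundaries 69.5, 79.5, and 89.5 are inferred from the printed interval probabilities (only 59.5 appears explicitly), and the exact normal cdf via math.erf gives slightly different expected counts than z-table rounding, so the resulting Q is near 28 rather than matching the table-based value digit for digit.

```python
import math

def norm_cdf(x, mu=75.0, sigma=8.0):
    """Normal cdf, computed with the standard library's error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Observed class counts and continuity-corrected cutpoints (boundaries
# 69.5/79.5/89.5 are inferred, as noted above).
observed = [12, 36, 90, 44, 18]
cuts = [-math.inf, 59.5, 69.5, 79.5, 89.5, math.inf]
n = sum(observed)
probs = [norm_cdf(cuts[j + 1]) - norm_cdf(cuts[j]) for j in range(5)]
expected = [n * p for p in probs]
q = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(q, 1))  # well above the chi-square critical value with 4 d.f.
```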


Multivariate normal distribution and testing for means (see MKB Ch 3) Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Statistical tests for SPSS

Statistical tests for SPSS Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

More information

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015.

Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment 2-3, Probability and Statistics, March 2015. Due:-March 25, 2015. Department of Mathematics, Indian Institute of Technology, Kharagpur Assignment -3, Probability and Statistics, March 05. Due:-March 5, 05.. Show that the function 0 for x < x+ F (x) = 4 for x < for x

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Non-Inferiority Tests for Two Means using Differences

Non-Inferiority Tests for Two Means using Differences Chapter 450 on-inferiority Tests for Two Means using Differences Introduction This procedure computes power and sample size for non-inferiority tests in two-sample designs in which the outcome is a continuous

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test

Introduction to Analysis of Variance (ANOVA) Limitations of the t-test Introduction to Analysis of Variance (ANOVA) The Structural Model, The Summary Table, and the One- Way ANOVA Limitations of the t-test Although the t-test is commonly used, it has limitations Can only

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Exact Confidence Intervals

Exact Confidence Intervals Math 541: Statistical Theory II Instructor: Songfeng Zheng Exact Confidence Intervals Confidence intervals provide an alternative to using an estimator ˆθ when we wish to estimate an unknown parameter

More information

Chapter 7 Review. Confidence Intervals. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chapter 7 Review. Confidence Intervals. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Chapter 7 Review Confidence Intervals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) Suppose that you wish to obtain a confidence interval for

More information

Chapter 4 Lecture Notes

Chapter 4 Lecture Notes Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a real-valued function defined on the sample space of some experiment. For instance,

More information

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

CHI-SQUARE: TESTING FOR GOODNESS OF FIT CHI-SQUARE: TESTING FOR GOODNESS OF FIT In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

More information

Review #2. Statistics

Review #2. Statistics Review #2 Statistics Find the mean of the given probability distribution. 1) x P(x) 0 0.19 1 0.37 2 0.16 3 0.26 4 0.02 A) 1.64 B) 1.45 C) 1.55 D) 1.74 2) The number of golf balls ordered by customers of

More information

Section 12 Part 2. Chi-square test

Section 12 Part 2. Chi-square test Section 12 Part 2 Chi-square test McNemar s Test Section 12 Part 2 Overview Section 12, Part 1 covered two inference methods for categorical data from 2 groups Confidence Intervals for the difference of

More information

For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i )

For a partition B 1,..., B n, where B i B j = for i. A = (A B 1 ) (A B 2 ),..., (A B n ) and thus. P (A) = P (A B i ) = P (A B i )P (B i ) Probability Review 15.075 Cynthia Rudin A probability space, defined by Kolmogorov (1903-1987) consists of: A set of outcomes S, e.g., for the roll of a die, S = {1, 2, 3, 4, 5, 6}, 1 1 2 1 6 for the roll

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

1 Prior Probability and Posterior Probability

1 Prior Probability and Posterior Probability Math 541: Statistical Theory II Bayesian Approach to Parameter Estimation Lecturer: Songfeng Zheng 1 Prior Probability and Posterior Probability Consider now a problem of statistical inference in which

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information

Two-sample hypothesis testing, II 9.07 3/16/2004

Two-sample hypothesis testing, II 9.07 3/16/2004 Two-sample hypothesis testing, II 9.07 3/16/004 Small sample tests for the difference between two independent means For two-sample tests of the difference in mean, things get a little confusing, here,

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck!

Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Psychology 60 Fall 2013 Practice Exam Actual Exam: Next Monday. Good luck! Name: 1. The basic idea behind hypothesis testing: A. is important only if you want to compare two populations. B. depends on

More information

Math 461 Fall 2006 Test 2 Solutions

Math 461 Fall 2006 Test 2 Solutions Math 461 Fall 2006 Test 2 Solutions Total points: 100. Do all questions. Explain all answers. No notes, books, or electronic devices. 1. [105+5 points] Assume X Exponential(λ). Justify the following two

More information

Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

Normality Testing in Excel

Normality Testing in Excel Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

More information

Hypothesis Testing --- One Mean

Hypothesis Testing --- One Mean Hypothesis Testing --- One Mean A hypothesis is simply a statement that something is true. Typically, there are two hypotheses in a hypothesis test: the null, and the alternative. Null Hypothesis The hypothesis

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

1.1 Introduction, and Review of Probability Theory... 3. 1.1.1 Random Variable, Range, Types of Random Variables... 3. 1.1.2 CDF, PDF, Quantiles...

1.1 Introduction, and Review of Probability Theory... 3. 1.1.1 Random Variable, Range, Types of Random Variables... 3. 1.1.2 CDF, PDF, Quantiles... MATH4427 Notebook 1 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 1 MATH4427 Notebook 1 3 1.1 Introduction, and Review of Probability

More information

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8.

Random variables P(X = 3) = P(X = 3) = 1 8, P(X = 1) = P(X = 1) = 3 8. Random variables Remark on Notations 1. When X is a number chosen uniformly from a data set, What I call P(X = k) is called Freq[k, X] in the courseware. 2. When X is a random variable, what I call F ()

More information

Point Biserial Correlation Tests

Point Biserial Correlation Tests Chapter 807 Point Biserial Correlation Tests Introduction The point biserial correlation coefficient (ρ in this chapter) is the product-moment correlation calculated between a continuous random variable

More information

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4% Hypothesis Testing for a Proportion Example: We are interested in the probability of developing asthma over a given one-year period for children 0 to 4 years of age whose mothers smoke in the home In the

More information

Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information