
Week 10 notes: Hypothesis testing

Tests concerning one mean: Let us start by discussing EXAMPLE 1: problem 5 c) from EXAM 2 (an example of a hypothesis testing problem).

5. The bolt diameter X is normally distributed with mean 1.00 cm and standard deviation 0.003 cm. The washer diameter Y is normally distributed with mean 1.04 cm and standard deviation 0.004 cm.

c) If all we know is that the bolt diameter is normally distributed and that a sample of size n = 9 bolts has (sample) mean diameter 1.004 cm and (sample) standard deviation 0.003 cm, should we believe that the population mean is 1.00? It may be useful to know the t-critical values (8 degrees of freedom) t_.005 = 3.355 and t_.001 = 4.501. [Hint: what kind of random variable is (X̄ − μ)/(S/√n)?] Explain.

t = (X̄ − μ)/(S/√n) is a random variable having a t-distribution with parameter ν = n − 1 = 8 degrees of freedom. For the given sample this gives a value of

  t = (1.004 − 1.000)/(0.003/√9) = 0.004/0.001 = 4,

which lies between the two t-critical values given. This says that at significance level α = .01 (α/2 = .005) we would reject the null hypothesis H_0: μ = 1.00 that the population mean is 1.00 cm, since the observed t value exceeds the t-critical value t_{α/2} = t_.005 = 3.355, and instead conclude that the (2-sided) alternative hypothesis H_a: μ ≠ 1.00 is true. But for a more stringent test with a higher standard, having significance level α = .002 and t-critical value t_{α/2} = t_.001 = 4.501, we would accept the null hypothesis that the mean is 1.00.

It is not really right to mix up the language of confidence intervals with that of rejection regions, since confidence intervals involve the random quantity X̄ (the sample mean), whereas the rejection region is fixed once we fix the type I error probability (significance level). Thus we should not say that we are 99% confident that the null hypothesis that the mean is 1.00 is false when the significance level is α = .01 (α/2 = .005).
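The same computation can be scripted. Below is a minimal sketch in Python (scipy is assumed to be available; the notes themselves use tables, and the numbers are those of EXAMPLE 1 as given above):

```python
from math import sqrt
from scipy import stats

# Summary statistics from EXAMPLE 1
n, xbar, s, mu0 = 9, 1.004, 0.003, 1.000

t = (xbar - mu0) / (s / sqrt(n))          # observed t statistic (here t = 4)
df = n - 1

for alpha in (0.01, 0.002):               # the two 2-sided significance levels discussed
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    print(f"alpha={alpha}: critical value {t_crit:.3f}, reject H0: {abs(t) > t_crit}")

# 2-sided p-value; it lands between .002 and .01, as argued next
print("p-value:", 2 * stats.t.sf(abs(t), df))
```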

Note that the p-value, which is the probability of seeing data as extreme as the observed sample given that the null hypothesis is true, would lie somewhere between .002 and .01. (The p-value would be twice the tail probability α/2 cut off by a t-critical value equal to the observed t = 4, which we don't know exactly here; we only know that it lies between twice the probabilities α/2 for the two given t-critical values, with α/2 = .005 and α/2 = .001.) The factor of two arises since for a two-sided test like this one we would also have seen equally strong evidence in favor of rejecting the null hypothesis had we seen a negative value of t less than or equal to −4 instead of a t value greater than or equal to 4. It is true that if the null hypothesis holds, the probability of a sample giving a t value as extreme as this one is smaller than .01, and in this sense there is a fair amount of evidence against the null hypothesis (level .01 is usually considered statistically significant). But it depends on how high we set our standards.

EXAMPLE 2: Derivation of the (large) sample size formulas used in problems 7.34 and 7.35 of the text.

The problem is to determine, for a given null hypothesis H_0: μ = μ_0 about the population mean, how large a sample size n we need in order to insure that both the type I error probability is bounded by a specified α and the type II error probability is bounded by a specified β when the actual mean is μ = μ_1 and not μ_0.

For a large sample one-sided test, say with alternative hypothesis H_a: μ > μ_0, the acceptance region for the null hypothesis is determined once we specify α and is given by the condition that

  (X̄ − μ_0)/(σ/√n) ≤ z_α.

Note that the statistic on the left is approximately standard normal only when the null hypothesis holds. The acceptance or rejection region is determined by acting as if the null hypothesis does hold. If the actual population mean is μ = μ_1, then we can re-write the above acceptance region as

  (X̄ − μ_0)/(σ/√n) = (X̄ − μ_1)/(σ/√n) + (μ_1 − μ_0)/(σ/√n) ≤ z_α

or

  Z = (X̄ − μ_1)/(σ/√n) ≤ z_α − (μ_1 − μ_0)/(σ/√n),

where now Z really is (approximately) standard normal. Thus the type II error probability of accepting H_0: μ = μ_0 when in fact μ = μ_1, in other words the probability of satisfying the above acceptance region condition, is just

  P( Z ≤ z_α − (μ_1 − μ_0)/(σ/√n) ),

and we want this to be at most β = P(Z ≤ −z_β); the displayed probability is just the type II error probability of accepting H_0: μ = μ_0 when μ = μ_1, and the second equality is just the definition of the z-critical value z_β.

Thus for the type II error probability to be less than or equal to β we want the sample size n to be large enough that

  z_α − (μ_1 − μ_0)√n/σ ≤ −z_β,   or   z_α + z_β ≤ (μ_1 − μ_0)√n/σ,

so that

  n ≥ ( (z_α + z_β) σ / (μ_1 − μ_0) )²

as stated in problem 7.34. Similar reasoning leads to the same result when the alternative hypothesis is H_a: μ < μ_0, so that the acceptance region (for accepting the null hypothesis) is (X̄ − μ_0)/(σ/√n) ≥ −z_α.

For a large sample two-sided test (as in problem 7.35) the acceptance region for accepting the null hypothesis is

  −z_{α/2} ≤ (X̄ − μ_0)/(σ/√n) ≤ z_{α/2}.

Again the above statistic is only standard normal when the null hypothesis is true, but when μ = μ_1 we re-write this as

  −z_{α/2} ≤ (X̄ − μ_1)/(σ/√n) + (μ_1 − μ_0)/(σ/√n) ≤ z_{α/2}

or

  −z_{α/2} − (μ_1 − μ_0)/(σ/√n) ≤ Z = (X̄ − μ_1)/(σ/√n) ≤ z_{α/2} − (μ_1 − μ_0)/(σ/√n).

That the type II error probability is bounded by β means

  P( −z_{α/2} − (μ_1 − μ_0)√n/σ ≤ Z ≤ z_{α/2} − (μ_1 − μ_0)√n/σ ) ≤ β,

or in other words the sample size n must be large enough so that the above bound holds. Now for small α, say .01 as is customary in statistical tests, the difference between the left- and right-hand sides of the above inequality for the standard normal variable Z is 2z_{α/2} ≥ 2z_{.025} = 3.92, which is larger than 3 (which for a standard normal is 3 standard deviations). When the actual value μ_1 of the population mean is smaller than the null hypothesis value μ_0, then for large enough n the leftmost value −z_{α/2} − (μ_1 − μ_0)√n/σ = −z_{α/2} + (μ_0 − μ_1)√n/σ gets large, and the required bound on the probability will hold provided

  z_β ≤ −z_{α/2} + (μ_0 − μ_1)√n/σ,   so that   n ≥ ( (z_{α/2} + z_β) σ / (μ_0 − μ_1) )²

as stated in problem 7.35. For small β we can safely ignore the right-hand bound z_{α/2} + (μ_0 − μ_1)√n/σ on Z, since it is nearly 4 standard deviations to the right of the left-hand bound, which is already at least z_β (large before adding anything to it) when β is small. Similar considerations lead to the same sample size result when μ_1 > μ_0: then the right-hand bound on Z in the inequality, z_{α/2} − (μ_1 − μ_0)√n/σ, gets very negative as n gets large, and in this case it is the left-hand bound −z_{α/2} − (μ_1 − μ_0)√n/σ that we can ignore, since it is nearly 4 standard deviations to the left of that already large negative value on the right (so it will not appreciably alter the probability if we simply drop this bound on the left).
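The two sample-size formulas just derived are easy to evaluate. Here is a small sketch (scipy is assumed only for the normal quantiles; the example values of α, β, σ and δ = μ_1 − μ_0 are illustrative, not from the text):

```python
from math import ceil
from scipy.stats import norm

def n_one_sided(alpha, beta, sigma, delta):
    # n >= ((z_alpha + z_beta) * sigma / delta)^2 for the one-sided large-sample Z test
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    return ceil(((z_a + z_b) * sigma / abs(delta)) ** 2)

def n_two_sided(alpha, beta, sigma, delta):
    # same formula with z_{alpha/2} in place of z_alpha (two-sided test)
    z_a2, z_b = norm.ppf(1 - alpha / 2), norm.ppf(1 - beta)
    return ceil(((z_a2 + z_b) * sigma / abs(delta)) ** 2)

# a shift of half a standard deviation, alpha = .05, beta = .01
print(n_one_sided(0.05, 0.01, sigma=1.0, delta=0.5))   # 64
print(n_two_sided(0.05, 0.01, sigma=1.0, delta=0.5))   # 74
```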

EXAMPLE 3: Problem 7.43 of the text. In a labor-management discussion it was brought up that workers at a certain large plant take on average 32.6 minutes to get to work. If a random sample of 60 workers took on average 33.8 minutes with a standard deviation of 6.1 minutes, can we reject the null hypothesis H_0: μ = 32.6 in favor of the alternative hypothesis H_a: μ > 32.6 at the .05 level of significance?

We have a one-sided test with sample size n = 60, sample mean x̄ = 33.8, α = .05, and sample standard deviation s = 6.1 minutes. We would (falsely) reject the null hypothesis with type I error probability α = .05 if (assuming it is really true) we see

  Z = (X̄ − μ_0)/(s/√60) ≥ z_.05 = 1.645.

For the above sample we get

  z = (33.8 − 32.6)/(6.1/√60) = 1.52 < z_.05 = 1.645,

so based on this sample we would not reject the null hypothesis at level .05.

EXAMPLE 4: Problem 7.45 (like HW problem 7.44). Given a random sample of 5 pints from different production lots, we want to test whether the fat content of a certain kind of ice cream exceeds 14%. What can we conclude at the .01 level of significance about the null hypothesis H_0: μ = .14 (μ = 14%) if the sample has mean x̄ = .149 (= 14.9%) and (sample) standard deviation s = .0042 (= .42%)? (The book gives σ, the population standard deviation, instead of the sample standard deviation s, but judging from the answer in the back involving a t statistic this is a typo.)

We must assume that we are sampling from a normal or approximately normal population of fat contents, or else we cannot say much. If fat contents are normally distributed, then with n = 5 the t-statistic with n − 1 = 4 degrees of freedom,

  t = (x̄ − μ_0)/(s/√5) = (.149 − .140)/(.0042/√5) = 4.79 > t_.01 = 3.747,

exceeds the t-critical value, so we reject the null hypothesis at significance level .01.
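A quick numerical check of EXAMPLES 3 and 4, sketched in Python (scipy assumed; the summary statistics are those reconstructed above):

```python
from math import sqrt
from scipy import stats

# EXAMPLE 3: large-sample one-sided Z test
n, xbar, s, mu0 = 60, 33.8, 6.1, 32.6
z = (xbar - mu0) / (s / sqrt(n))
print(z, stats.norm.ppf(0.95))          # 1.52 < 1.645 -> do not reject H0

# EXAMPLE 4: small-sample one-sided t test on fat content
n, xbar, s, mu0 = 5, 0.149, 0.0042, 0.140
t = (xbar - mu0) / (s / sqrt(n))
print(t, stats.t.ppf(0.99, n - 1))      # 4.79 > 3.747 -> reject H0
```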

EXAMPLE 5: Problem 7.47 (like HW 7.46). A random sample from a company's very extensive files shows that orders for a certain piece of machinery were filled respectively in 10, 12, 19, 14, 15, 18, 11, and 13 days. Use the level of significance α = .01 to test the claim that orders are filled in 10.5 days. Choose the alternative hypothesis so that rejection of the null hypothesis implies that it takes longer than indicated. Assume normality.

We have a sample of size n = 8 times whose sample mean is

  x̄ = (10 + 12 + 19 + 14 + 15 + 18 + 11 + 13)/8 = 112/8 = 14,

and whose sample variance is

  s² = [ (−4)² + (−2)² + 5² + 0² + 1² + 4² + (−3)² + (−1)² ]/7 = 72/7,

or standard deviation s = 3.207. We test H_0: μ = 10.5 against H_a: μ > 10.5. Using the t-critical value for a t random variable with parameter ν = n − 1 = 7 degrees of freedom,

  t = (x̄ − μ_0)/(s/√n) = (14 − 10.5)/(3.207/√8) = 3.087 > t_.01 = 2.998

exceeds the critical value, so we reject the null hypothesis at significance level .01. We conclude that the evidence favors H_a: μ > 10.5.
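The same test from the raw data, as a sketch in Python (scipy assumed; the data digits are as reconstructed above, and the one-sided `alternative=` keyword needs a reasonably recent scipy):

```python
from math import sqrt
from statistics import mean, stdev
from scipy import stats

days = [10, 12, 19, 14, 15, 18, 11, 13]
mu0 = 10.5

n, xbar, s = len(days), mean(days), stdev(days)
t = (xbar - mu0) / (s / sqrt(n))
print(t, stats.t.ppf(0.99, n - 1))            # 3.09 > 2.998 -> reject H0 at alpha = .01

# Same thing in one call, one-sided alternative mu > 10.5
print(stats.ttest_1samp(days, mu0, alternative='greater'))
```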

The relation between confidence intervals and (2-sided) hypothesis testing: Johnson remarks that statisticians usually prefer a confidence interval statement (when one is available) about a population parameter such as the population mean μ, rather than a test of a null hypothesis H_0: μ = μ_0 for a single value μ_0, since (for two-sided tests) exactly those null hypotheses whose value μ_0 falls inside a 100(1 − α)% confidence interval are the ones that are accepted at significance level α. Thus we are not limited to the particular null value μ_0 used in a given test, but rather learn about every value falling inside the confidence interval.

In EXAMPLE 6 below we do Problem 7.53 (somewhat like HW 7.54 and 7.55). The book asks you to do these type II error probability calculations using the operating characteristic curves given in Table 8 in Appendix B, assuming samples from a normal population. This is a plot, for a specified α (with separate curves for different sample sizes), of the acceptance probability

  L(μ_1) = P(accept H_0 when the actual mean is μ_1),

where one plots the probability not against the mean μ_1 directly but rather against its departure d = |μ_1 − μ_0|/σ from the null hypothesis value μ_0 relative to the standard deviation σ. I would prefer that you understand how to calculate these type II error probabilities directly.

Calculation of the type II error probability for a 2-sided test, given actual population mean μ = μ_1 and significance level α, assuming either a large sample size or sampling from a normal population: The acceptance region is determined by pretending the null hypothesis holds. The statistic used there will not be standard normal unless the actual mean μ = μ_1 is employed. Thus

  β = P(accept H_0: μ = μ_0 when really μ = μ_1)
    = P( −z_{α/2} ≤ (X̄ − μ_0)/(σ/√n) = (X̄ − μ_1)/(σ/√n) + (μ_1 − μ_0)/(σ/√n) ≤ z_{α/2} )
    = P( −z_{α/2} − (μ_1 − μ_0)/(σ/√n) ≤ Z = (X̄ − μ_1)/(σ/√n) ≤ z_{α/2} − (μ_1 − μ_0)/(σ/√n) ).

If μ_1 > μ_0 and n is sufficiently large, the right-hand bound on Z, z_{α/2} − (μ_1 − μ_0)√n/σ, will be negative, and we will be able to ignore the left-hand inequality, since its bound lies 2z_{α/2} (more than 3) standard deviations to the left of that negative value on the right.

EXAMPLE 6: Problem 7.53 of the text refers to the following example from the text. Suppose that the length of certain machine parts may be looked upon as a random variable having a normal distribution with a mean of 2.000 cm and a standard deviation of .050 cm. We want to test the null hypothesis H_0: μ = 2.000 against the two-sided alternative H_a: μ ≠ 2.000 at significance level α = .05 for a sample of size n = 30. Calculate the type II error probability when:

a) μ_1 = 2.020. By the above discussion this is

  β = P( Z ≤ z_{α/2} − (μ_1 − μ_0)√n/σ ) = P( Z ≤ 1.96 − (0.020)√30/.050 ) = P(Z ≤ −0.231) = .4087,

or about 41%. We can safely ignore the left-hand portion of the inequality since it is 2(1.96) = 3.92, or almost 4, standard deviations to the left of the right-hand value −0.231, so dropping it will not change the probability to 4 decimal places of accuracy.

b) μ_1 = 2.030. The same reasoning now gives β = P( Z ≤ 1.96 − (0.030)√30/.050 ) = P(Z ≤ −1.33) = .092, or about 9%.

c) μ_1 = 2.040. Now β = P( Z ≤ 1.96 − (0.040)√30/.050 ) = P(Z ≤ −2.42) = .0078, or .78% (a fraction of a percent).
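A sketch of the same β calculation in Python (scipy assumed), evaluating the full two-sided expression so that the "ignored" left-hand tail is included automatically:

```python
from math import sqrt
from scipy.stats import norm

def beta_two_sided(mu0, mu1, sigma, n, alpha=0.05):
    # beta = P(-z_{a/2} - d <= Z <= z_{a/2} - d) with d = (mu1 - mu0) * sqrt(n) / sigma
    z = norm.ppf(1 - alpha / 2)
    d = (mu1 - mu0) * sqrt(n) / sigma
    return norm.cdf(z - d) - norm.cdf(-z - d)

for mu1 in (2.020, 2.030, 2.040):          # parts a), b), c) of EXAMPLE 6
    print(mu1, round(beta_two_sided(2.000, mu1, 0.050, 30), 4))
# about .41, .09 and .008, in line with the hand calculations above
```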

Tests comparing the means from two populations: If we consider two independent small samples from (approximately) normal populations with population means μ_1 and μ_2, or two large samples of sizes n_1 ≥ 30 and n_2 ≥ 30 so that the central limit theorem (assuming only finite variances) applies, then the two sample means X̄_1 and X̄_2 are each approximately normal (exactly normal when the populations are exactly normal), and their difference, being a linear combination of normals, is also (approximately) normal. Hence, under the assumption of the null hypothesis

  H_0: μ_1 − μ_2 = δ,

the standardized (approximately) normal variable

  Z = (X̄_1 − X̄_2 − δ) / √( σ_1²/n_1 + σ_2²/n_2 )

may be used to test the null hypothesis, with the usual critical (rejection) regions for one-sided or two-sided tests. When n_1 ≥ 30 and n_2 ≥ 30 it is usually safe to replace the population variances by the sample variances, so that we may use instead

  Z = (X̄_1 − X̄_2 − δ) / √( S_1²/n_1 + S_2²/n_2 )   when n_1 ≥ 30 and n_2 ≥ 30.

Smith-Satterthwaite t-test: For small samples from two independent normal populations with possibly different variances, the same statistic given above has (approximately) a t-distribution. That is,

  t = (X̄_1 − X̄_2 − δ) / √( S_1²/n_1 + S_2²/n_2 ),

where in the Smith-Satterthwaite test the number of degrees of freedom is estimated by the formula (see HW problem 7.70 a) in the text)

  ν = ( s_1²/n_1 + s_2²/n_2 )² / [ (s_1²/n_1)²/(n_1 − 1) + (s_2²/n_2)²/(n_2 − 1) ].

Note that the number of degrees of freedom is a random variable in this test.

Two-sample t-test: For small samples from normal populations having approximately equal variances (usually safe to assume if the sample variances differ by less than a factor of 4), it is customary to pool the sample variances, thus utilizing the sum of squared deviations from the mean for each of the two samples. Using the pooled sample variance S_p² to estimate the variance σ² common to both populations, the random variable

  t = (X̄_1 − X̄_2 − δ) / ( S_p √( 1/n_1 + 1/n_2 ) ),

where

  S_p² = [ (n_1 − 1)S_1² + (n_2 − 1)S_2² ] / (n_1 + n_2 − 2) = [ Σ_i (X_{1i} − X̄_1)² + Σ_i (X_{2i} − X̄_2)² ] / (n_1 + n_2 − 2),

has a t-distribution with ν = n_1 + n_2 − 2 degrees of freedom. The usual rejection regions for one- and two-sided t-tests then apply.

Note: aside from the different formulas for the number of degrees of freedom, the above Smith-Satterthwaite t turns into this t random variable when we replace both sample variances by the pooled variance S_p².
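Both statistics and both degrees-of-freedom formulas are short to code. A sketch in plain Python (no library calls; scipy's `stats.ttest_ind(x, y, equal_var=...)` performs the same two tests):

```python
from math import sqrt

def two_sample_t(x, y, pooled=True):
    """Two-sample t statistic and its degrees of freedom.
    pooled=True  -> pooled-variance two-sample t test (df = n1 + n2 - 2)
    pooled=False -> Smith-Satterthwaite test (estimated df)"""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    v1 = sum((xi - m1) ** 2 for xi in x) / (n1 - 1)
    v2 = sum((yi - m2) ** 2 for yi in y) / (n2 - 1)
    if pooled:
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        return (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2
    se2 = v1 / n1 + v2 / n2
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return (m1 - m2) / sqrt(se2), df
```

(Here δ = 0 is assumed, i.e. H_0: μ_1 = μ_2, which is the case used in the examples below.)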

In the case of unequal variances it may be possible to transform both of the data samples so that, under some function of the old data, the two (transformed) data samples have approximately equal variances, so that the above pooled-variance two-sample t-test method then applies. (The transformation must be applied to both sets of data.)

EXAMPLE 7: Problem 7.64 (similar to HW problem 7.65). Two-sample Z test example (large samples). Note: the answer in the back of the book to HW 7.65 is incorrect!

An investigation of two kinds of photocopying equipment showed that 71 failures of the 1st kind of equipment took on average 83.2 minutes to repair with a standard deviation of 19.3 minutes, while 75 failures of the 2nd kind of equipment took on average 90.8 minutes to repair with standard deviation 21.4 minutes.

a) Test the null hypothesis H_0: μ_1 − μ_2 = 0 against H_a: μ_1 − μ_2 ≠ 0 at significance level α = .05. Since we are dealing with large sample sizes we may use (with δ = 0 here)

  Z = (X̄_1 − X̄_2 − δ) / √( S_1²/n_1 + S_2²/n_2 ) = (83.2 − 90.8) / √( (19.3)²/71 + (21.4)²/75 ) = −7.6/3.37 = −2.26.

Since z_{α/2} = z_.025 = 1.96 and the observed value falls outside the interval from −1.96 to 1.96, we reject the null hypothesis and conclude there is a difference in mean repair times for the two kinds of equipment.

b) Find the type II error probability when μ_1 − μ_2 = −12. This is the probability of accepting H_0: μ_1 − μ_2 = 0 when μ_1 − μ_2 = −12, or

  β = P( −1.96 ≤ (X̄_1 − X̄_2)/√(S_1²/n_1 + S_2²/n_2) ≤ 1.96  when μ_1 − μ_2 = −12 )
    = P( −1.96 + 12/3.37 ≤ Z ≤ 1.96 + 12/3.37 )
    = P( 1.60 ≤ Z ≤ 5.52 ) ≈ P(Z ≥ 1.60) = .0548.

We ignored the value on the right of the inequality, which is 2(1.96) = 3.92 standard deviations to the right of the value 1.60 on the left, so more than 5.5 standard deviations to the right of 0, since the probability that Z is greater than 5.5 is negligible.
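The computations of EXAMPLE 7 in Python (scipy assumed; summary statistics as reconstructed above):

```python
from math import sqrt
from scipy.stats import norm

n1, xbar1, s1 = 71, 83.2, 19.3
n2, xbar2, s2 = 75, 90.8, 21.4

se = sqrt(s1**2 / n1 + s2**2 / n2)
z = (xbar1 - xbar2) / se
print(z)                                    # about -2.26, outside (-1.96, 1.96): reject H0

# b) type II error probability when the true difference is -12
delta, z_crit = -12.0, norm.ppf(0.975)
beta = norm.cdf(z_crit - delta / se) - norm.cdf(-z_crit - delta / se)
print(beta)                                 # about 0.055
```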

EXAMPLE 8: Problem 7.68 (like HW 7.69). Two-sample (one-sided) t-test example.

As part of an industrial training program, some trainees are instructed by Method A, which is straight computer-based instruction, and some are instructed by Method B, which also involves the personal attention of an instructor. Random samples of size 10 are taken from large groups of trainees instructed by each of these two methods, and the scores which they obtained in an appropriate achievement test are:

Method A: 71, 75, 65, 69, 73, 66, 68, 71, 74, 68
Method B: 72, 77, 84, 78, 69, 70, 77, 73, 65, 75

Use the .05 level of significance to test the claim that Method B is more effective. Assume that the populations sampled can be approximated closely with normal distributions having the same variance.

Here we test the null hypothesis H_0: μ_A − μ_B = 0, that there is no difference in the means, against the alternative hypothesis H_a: μ_A − μ_B < 0, that Method B is more effective. Direct calculation gives the sample means and variances:

  x̄_A = 700/10 = 70,   x̄_B = 740/10 = 74,   s_A² = 102/9,   s_B² = 262/9.

Since we have small samples from (approximately) normally distributed populations, we use the two-sample t-test with pooled variance

  S_p² = [ (n_A − 1)s_A² + (n_B − 1)s_B² ] / (n_A + n_B − 2) = (9 s_A² + 9 s_B²)/18 = (102 + 262)/18 = 364/18,

  t = (x̄_A − x̄_B) / ( S_p √(1/n_A + 1/n_B) ) = (70 − 74) / √( (364/18)(2/10) ) = −4/√4.044 = −1.99.

With ν = n_A + n_B − 2 = 18 degrees of freedom, the t-critical value for this one-sided test at level .05 is 1.734. That is, we will reject the null hypothesis, since the test statistic satisfies t ≤ −t_.05 = −1.734, i.e. it is less than minus this value. We conclude at significance level .05 that the alternative hypothesis (the claim we are trying to establish) holds, namely that Method B is more effective (as evidenced by a large negative t-value).
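The same test in one scipy call (the `alternative=` keyword needs a reasonably recent scipy):

```python
from scipy import stats

method_a = [71, 75, 65, 69, 73, 66, 68, 71, 74, 68]
method_b = [72, 77, 84, 78, 69, 70, 77, 73, 65, 75]

# Pooled-variance two-sample t test, one-sided alternative mu_A < mu_B
t, p = stats.ttest_ind(method_a, method_b, equal_var=True, alternative='less')
print(t, p)        # t about -1.99, p about .03 < .05 -> reject H0
```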

EXAMPLE 9: Problem 7.70 b) (like HW 7.70 a)). Smith-Satterthwaite t-test example.

To compare two kinds of bumper guards, 6 of each kind were mounted on a certain kind of compact car. Then each car was run into a concrete wall at 5 miles per hour, and the following are the costs of repairs (in dollars):

Bumper guard 1: 107, 148, 123, 165, 102, 119
Bumper guard 2: 134, 115, 112, 151, 133, 129

Use the .01 level of significance to test whether the difference between the two sample means is significant.

We first compute the sample means and variances:

  x̄_1 = 764/6 = 127 1/3,   x̄_2 = 774/6 = 129,
  s_1² = [ (−61/3)² + (62/3)² + (−13/3)² + (113/3)² + (−76/3)² + (−25/3)² ] / 5 = (26904/9)/5 ≈ 597.9,
  s_2² = [ 5² + (−14)² + (−17)² + 22² + 4² + 0² ] / 5 = 1010/5 = 202.

Using the null hypothesis value δ = μ_1 − μ_2 = 0 for the difference between the means gives

  t = (x̄_1 − x̄_2 − δ) / √( s_1²/n_1 + s_2²/n_2 ) = (−5/3) / √( 597.9/6 + 202/6 ) = −1.667/11.55 ≈ −0.14.

We estimate the number of degrees of freedom in the Smith-Satterthwaite test by

  ν = ( s_1²/n_1 + s_2²/n_2 )² / [ (s_1²/n_1)²/(n_1 − 1) + (s_2²/n_2)²/(n_2 − 1) ] = (133.3)² / [ (99.6)²/5 + (33.7)²/5 ] ≈ 8,

or approximately 8 degrees of freedom. We clearly cannot reject the null hypothesis at significance level .01 with such a small t-value, since here the sample means are almost equal but there is a lot of variability: the t-critical value t_{α/2} = t_.005 with 8 degrees of freedom is 3.355.
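scipy's Welch option implements exactly this test (data as reconstructed above):

```python
from scipy import stats

guard1 = [107, 148, 123, 165, 102, 119]
guard2 = [134, 115, 112, 151, 133, 129]

# equal_var=False selects the Smith-Satterthwaite (Welch) test with estimated df
t, p = stats.ttest_ind(guard1, guard2, equal_var=False)
print(t, p)        # t about -0.14, p about 0.9 -> nowhere near rejection at alpha = .01
```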

Matched pair comparisons: Unlike the two-sample t-test, in which the two samples are assumed to be independent and one is comparing population means for two separate populations, in the matched pairs t-test (or just paired t-test for short) the variables one is comparing are typically highly dependent, as in a before-and-after-treatment situation, or as in two different waterproofing treatments, one applied to the first and the other to the second shoe of a pair of shoes. Here, rather than compare two population means, one looks at the population of differences D_i = X_i − Y_i of the two variables. For large samples the central limit theorem lets us test the null hypothesis that the population mean difference is some value μ_{D,0} (often taken to be 0) versus a one- or two-sided alternative using the large-sample matched pairs Z test statistic:

  Z = (D̄ − μ_{D,0}) / (S_D/√n)   for n large, say n ≥ 40.

(Of course, if we know the population standard deviation σ_D we would prefer to use it in the above statistic in place of the sample standard deviation S_D.) For small samples one needs to assume that these differences are normally distributed. Then the small-sample matched pairs t test statistic is

  t = (D̄ − μ_{D,0}) / (S_D/√n)   with ν = n − 1 degrees of freedom.

For such matched pairs one has a single population of differences.

EXAMPLE 10: Paired t-test example. In a study of the effectiveness of physical exercise in weight reduction, a group of 16 persons engaged in a prescribed program of physical exercise for one month showed the following results (weights in pounds):

Weights before: 209, 178, 169, 212, 180, 192, 158, 180, 170, 153, 183, 165, 201, 179, 243, 144
Weights after: 196, 171, 170, 207, 177, 190, 159, 180, 164, 152, 179, 162, 199, 173, 231, 140

Use the .01 level of significance to test whether the exercise program is effective.

The 16 differences (weight after minus weight before) are:

  −13, −7, +1, −5, −3, −2, +1, 0, −6, −1, −4, −3, −2, −6, −12, −4.

The sample mean difference is

  D̄ = −66/16 = −33/8 = −4 1/8,

and the sample variance is

  s_D² = [ Σ_i D_i² − n D̄² ] / (n − 1) = [ 520 − 16(4.125)² ] / 15 = 247.75/15 ≈ 16.52,

or s_D ≈ 4.06. Then for a t random variable with n − 1 = 15 degrees of freedom we find (with μ_{D,0} = 0)

  t = D̄ / (s_D/√16) = −4.125/(4.06/4) ≈ −4.06 ≤ −t_.01 = −2.602,

which says we should, at significance level .01, reject the null hypothesis H_0: μ_D = μ_{D,0} = 0 that there is no difference, and conclude instead that the exercise program results in a mean decrease in weight.
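The paired test in one scipy call (data digits as reconstructed above; `alternative=` needs a reasonably recent scipy):

```python
from scipy import stats

before = [209, 178, 169, 212, 180, 192, 158, 180, 170, 153, 183, 165, 201, 179, 243, 144]
after  = [196, 171, 170, 207, 177, 190, 159, 180, 164, 152, 179, 162, 199, 173, 231, 140]

# Matched-pairs t test on the differences (after - before), one-sided alternative: mean < 0
t, p = stats.ttest_rel(after, before, alternative='less')
print(t, p)        # t about -4.1, p well below .01 -> reject H0
```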

Randomization and Pairing: An experimenter testing the efficacy of two drugs in lowering blood pressure might want to assign one group of n_1 patients out of n to treatment with drug 1 and treat the remaining n_2 = n − n_1 patients with drug 2. Randomization of treatments here means that each of the C(n, n_1) possible selections of patients for treatment 1 (drug 1 here) is equally likely to be chosen. Random assignment of treatments helps to prevent uncontrolled sources of variation from biasing the response. Practically, this selection can be accomplished by randomly selecting n_1 integers between 1 and n; if a repeat occurs we ignore it and choose another. Alternately we could number the C(n, n_1) subsets in some fashion and then choose a random integer between 1 and C(n, n_1).

Matched Pairing (blocking): Additionally, knowing that blood pressure is influenced by age and weight, the experimenter might decide to pair off patients so that within each pair, age and weight are approximately equal. One patient within the pair is randomly assigned drug 1 while the other gets treated with drug 2. The purpose of pairing according to some variable(s) thought to influence the response is to remove the effect of that variable from the analysis. Without this matching (or blocking), one drug might appear to outperform the other just because patients in one sample were lighter and younger, and thus more prone to a decrease in blood pressure, than the heavier and older patients in the second sample. Randomization within the pair (meaning that we randomly assign which of the first or second individuals in the pair gets which treatment) helps to prevent other uncontrolled variables from biasing the response to the treatments. Randomization within the pair is easily achieved by flipping a fair coin.

One disadvantage to pairing is the smaller number of degrees of freedom which results, leading to a wider t-distribution and so possibly an increase in variance. This is often more than compensated for, however, by the reduction in variance that comes about because the paired variables are highly correlated (dependent). Recall from the definitions of variance and of covariance that in the dependent case one has

  V[X − Y] = E[ ( (X − Y) − (μ_X − μ_Y) )² ] = V[X] + V[Y] − 2 Cov[X, Y],

or, in terms of the correlation ρ = ρ[X, Y] = Cov[X, Y]/(σ_X σ_Y),

  V[X − Y] = V[X] + V[Y] − 2 ρ σ_X σ_Y.

In matched pairs the dependence is within the same pair; different pairs are assumed to be independent. Thus, assuming the covariance within each pair is the same, this gives

  V[X̄ − Ȳ] = V[D̄] = V[ (1/n) Σ_i D_i ] = (1/n²) Σ_i V[D_i] = ( σ_X² + σ_Y² − 2 ρ σ_X σ_Y ) / n.

Note: If X and Y are both normal but dependent, it may or may not be the case that their difference X − Y is normal. When the sample size is small, we need to assume that the differences are at least approximately normal to be able to employ the t-distribution. For large samples, by the Central Limit Theorem, this is not an issue, since then D̄ will be approximately normal automatically.
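A small simulation sketch of the last formula, with made-up numbers (a shared per-subject effect induces the positive correlation that pairing exploits):

```python
import random
from statistics import variance

random.seed(0)
n = 100_000
subject = [random.gauss(0, 3) for _ in range(n)]            # shared within each pair
x = [s + random.gauss(0, 1) for s in subject]               # measurement 1 on the subject
y = [s + random.gauss(0, 1) for s in subject]               # measurement 2 on the same subject
x_indep = [random.gauss(0, 3) + random.gauss(0, 1) for _ in range(n)]   # unrelated subject

print(variance([a - b for a, b in zip(x, y)]))        # about 2: the shared part cancels
print(variance([a - b for a, b in zip(x_indep, y)]))  # about 20: no cancellation without pairing
```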

EXAMPLE 11 (like HW problem 7.74): Describe how the experimenter who wants to be able to compare the effectiveness of the two drugs in lowering blood pressure would randomly select 6 out of 12 patients to be treated with drug 1 (the other 6 would then get drug 2).

We number the patients 0 through 11, then choose 6 random numbers between 0 and 11 representing the 6 patients chosen. We could, for example, pick two random digits from the random digits table in the book to yield an integer between 0 and 99, which we would throw out if it is greater than or equal to 96. Then we could take the remainder upon division by 12 of this number lying between 0 and 95; the remainder would be between 0 and 11. (This is more efficient than throwing out a two-digit number whenever it is larger than 11.) If any of the numbers (remainders) thus found are the same, we ignore the repeats and try again until we have 6 different numbers between 0 and 11.

EXAMPLE 12 (like HW 7.75): Alternately the experimenter, realizing that both drugs leave no measurable after-effects after a 1-week period, could decide to test the same patient with the two different drugs administered 1 week apart. Describe how to conduct this paired comparison and how to randomize within the pair.

Assuming the patients are not in mortal danger of dying from a heart attack once taken off a drug, one would want to randomly choose which drug to give first to each patient, say by flipping a fair coin (to randomize within the pair). Then one would look at the difference in blood pressure response of a given patient to the two drugs and conduct a paired t-test with these differences (assumed to be normally distributed).

EXAMPLE 13: In order to test two fitness programs with 50 weight lifters available, to see if the new diet and exercise regime will increase lifting capacity, the instructor asks for volunteers to receive the new treatment. Why is this a bad idea?

Such self-selected rather than randomly selected methods can have hidden biases. For example, perhaps it is more likely that the better, stronger, more motivated or ambitious weight lifters will be the ones who want to volunteer.
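In software, the random selection and the within-pair coin flips of EXAMPLES 11 and 12 are one-liners. A sketch using Python's standard library (in place of a random digits table):

```python
import random

# EXAMPLE 11: randomly assign 6 of the 12 patients (numbered 0-11) to drug 1
patients = list(range(12))
drug1_group = sorted(random.sample(patients, 6))       # every 6-patient subset equally likely
drug2_group = [p for p in patients if p not in drug1_group]
print(drug1_group, drug2_group)

# EXAMPLE 12: randomize the order of the two drugs within each patient (a coin flip per pair)
order = {p: random.choice(["drug 1 first", "drug 2 first"]) for p in patients}
print(order)
```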
