# UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

Save this PDF as:

Size: px
Start display at page:

Download "UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates"

## Transcription

1 UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally distributed. (ii) Central limit theorem.. (a) (i) µ p (ii) P ˆ p(1 p) σ Pˆ n For large samples Pˆ is approximately Normally distributed. 3. A parameter is a numerical characteristic of a population. 4. An estimate is a known quantity calculated from data in order to estimate an unknown parameter. 5. (4) 6. (a) µ 80 seconds σ seconds 16 ~ approximately Normal (µ 80s, σ 15s) pr( > 40) 1 pr( < 40) σ (a) (i) µ litres σ 1. litres n 1 σ 1. (ii) µ litres σ 0. 6 litres n 4 σ 1. (iii) µ litres σ 0. 3 litres n 16 The standard deviation differs. This is because as the sample size increases there is a decrease in the variability of the sample mean. STAT 13 Statistical Methods - Final Exam Review 1

2 8. (a) The proportion of university students who belong to the student loan scheme. µ ˆ P σ ( 0.65) ˆ P Pˆ ~ approx Normal (µ 0.65, σ ) (c) pr( Pˆ > 0.7) (d) pr(0.45 < Pˆ < 0.55) pr( Pˆ < 0.55) - pr( Pˆ < 0.45) (a) x 10.15, s x ± s ± x n 8 (8.75, 11.50) (c) (i) wider (ii) nothing (iii) narrower p ˆ pˆ ± pˆ(1 pˆ) ± n 10 (0.16, 0.384) 11. () UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 8 Confidence Intervals Section A: Confidence intervals for a mean, proportion and difference between means 1. (a) θ µ, the population mean score for the old Stat 13 exam. θˆ x 38.0, the mean mark of the sample of 30 marks. s (c) se(θˆ ) se( x ) n 30 (d) df t-multiplier % c.i. is: x ± t se(x) 38.0 ±.045 x ± (34.15, 4.5) (g) There are many ways of interpreting a confidence interval. Three different ways follow. (1) With 95% confidence, we estimate that the population mean mark is somewhere between and 4.5 marks. () We estimate that the population mean mark is somewhere between and 4.5 marks. A statement such as this is correct, on average, 19 times out of every 0 times we take such a sample. (3) We estimate the population mean mark to be 38.0 with a margin or error of A statement such as this is correct, on average, 19 times out of every 0 times such a sample is taken. (h) We don t know. The population mean mark is not known so we don t know whether this particular 95% confidence interval contains the population mean. However, in the long run, the population mean will be contained in 95% of the 95% confidence intervals calculated from such samples.. (a) θ p, the proportion of female Spanish prisoners in 1995 who had tuberculosis. 36 θ ˆ pˆ , the proportion in the sample of female Spanish prisoners who had tuberculosis. pˆ(1 pˆ) (c) se(θˆ ) se( pˆ ) n 90 (d) z-multiplier % c.i. is: p ˆ ± t se( pˆ ) 0.4 ± 1.96 x ± (0.99, 0.501) We estimate that the proportion of female Spanish prisoners in 1995 with tuberculosis is somewhere between 9.9% and 50.1%. A statement such as this is correct, on average, 19 times out of every 0 times we take such a sample. (g) We don t know. The population proportion is not known so we don t know whether this particular 95% confidence interval contains the population proportion. However, in the long run, the population proportion will be contained in 95% of the 95% confidence intervals calculated from such samples. STAT 13 Statistical Methods - Final Exam Review STAT 13 Statistical Methods - Final Exam Review 1

3 3. (a) We estimate that the mean thiol level for people suffering from rheumatoid arthritis is somewhere between 1.08 and.01 greater than the mean thiol level for non-sufferers. A statement such as this is correct, on average, 19 times out of every 0 times we take such a sample. 4. (4) 5. () 6. (3) 7. (3) 8. (5) 9. (4) We don t know. The true difference in the thiol levels between the two populations is not known so we don t know whether this particular 95% confidence interval contains the true difference. However, in the long run, the true difference will be contained in 19 out of each batch of 0 confidence intervals calculated from such samples. Section B: Confidence interval for a difference in proportions 1. (a) Situation : Single sample, several response categories (c) (d) Situation (a): Two independent samples Situation (c): Single sample, two or more Yes/No items Situation (a): Two independent samples. (a) Situation : Single sample, several response categories (c) (d) Situation (a): Two independent samples Situation (c): Single sample, two or more Yes/No items Situation (a): Two independent samples Situation (c): Single sample, two or more Yes/No items Situation : Single sample, several response categories 3. () Note: This was Situation (c): Single sample, two or more Yes/No items. STAT 13 Statistical Methods - Final Exam Review 4. (a) Let p 1 represent the proportion of white prisoners who were infected with TB and p represent the proportion of Gypsy prisoners who were infected with TB. θ p 1 p, the true difference in the above proportions θ ˆ pˆ 1 pˆ , the estimated difference in the above proportions ( ) ( ) (c) se( p ˆ1 pˆ ) (d) z % c.i. is: pˆ pˆ ) ± z se( pˆ ˆ ) ± 1.96 x ± ( 1 1 p ( 0.019, ) ( 1.3%, 15.9%) With 95% confidence, we estimate the proportion of white prisoners who were infected with TB to be somewhere between 1.3% lower and 15.9% higher than the proportion of Gypsy prisoners who were infected with TB. 5. (a) θ p W p G, the difference in the proportion of prisoners infected with TB who were white and the proportion of prisoners infected with TB who were Gypsy θ ˆ pˆ ˆ G W p, the estimated difference in the above proportions. (c) se( p ˆ W pˆ G ) ( ) (d) z % c.i. is: pˆ pˆ ) ± z se( pˆ pˆ ) ± 1.96 x ± ( W G W G (0.6585, ) (66%, 77%) We estimate that proportion of prisoners infected with TB who were white is somewhere between 66% and 77% greater than the proportion of prisoners infected with TB who were Gypsy. A statement such as this is correct, on average, 19 times out of every 0 times such a sample is taken. 6. (a) Situation (a). There are two independent samples a sample of intravenous drug user prisoners and a sample of non-intravenous drug user prisoners. (c) 7. (3) 8. (4) 9. (1) 10. (5) With 95% confidence, we estimate that proportion of intravenous drug user prisoners who were infected with TB is somewhere between 0.6% less than and 11.5% greater than the proportion of non-intravenous drug user prisoners who were infected with TB. Yes. Since zero is contained within the 95% confidence interval, zero is a plausible value for true difference between the population proportions. STAT 13 Statistical Methods - Final Exam Review 3

4 UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 9 Significance Testing: Using Data to Test Hypotheses 11. If the P-value is less than or equal to a specified value (usually 5% or 1%), the effect that was tested is said to be significant at that specified level (usually 5% or 1%). Therefore a significant test reveals that there is sufficient evidence against the null hypothesis. Section A: 1. The null hypothesis is the hypothesis tested by the statistical test. The alternative hypothesis specifies the type of departure from the null hypothesis we expect to detect.. (a) H 0 : θ 0 H 0 : > θ 0 (c) H 0 : < θ0 3. A one-tailed test is used when the investigators have good grounds for believing the true value of θ was on one particular side of θ 0 before the study began. Otherwise, or if in doubt, a two-tailed test is 4. (a) used. Good grounds mean that there is prior information or there is a theory to tell the investigators which way the study will go. θ θ θ estimate hypothsised value std error ˆ θ 0 se( θˆ) 5. The P-value is the probability that, if the null hypothesis was true, sampling variation would produce an estimate that is at least as far away from the hypothesised value than our data estimate. θ 6. The P-value measures the strength of evidence against the null hypothesis. 7. P-value Evidence against H 0 > 0.1 none 0.10 weak 0.05 some 0.01 strong < 0.01 very strong 8. Nothing. 9. A confidence interval. 10. One possible value for the parameter, called the hypothesised value, is tested. The test determines the strength of evidence provided by the data against the proposition that the hypothesised value is the true value. STAT 13 Statistical Methods - Final Exam Review 1 STAT 13 Statistical Methods - Final Exam Review

5 Section B: 1. (a) Let p W be the true proportion of white prisoners who were infected with TB and p G be the true proportion of Gypsy prisoners who were infected with TB. Thus θ p p. H 0 : p W - p G 0 vs : p W - p G (c) p ˆ W pˆ G ( ) ( ) (d) se ( p ˆ W pˆ G ) z P-value x pr(z > 1.665) between 0.05 and 0.1 (in fact it is just less than 0.1) We have weak evidence against H 0. (g) There is weak evidence that there is a difference between the proportion of White prisoners who had TB and the proportion of Gypsy prisoners who had TB. (h) 95% confidence interval for pw pg : (i) ± 1.96 x (-0.013, 0.159) With 95% confidence, we estimate that the proportion of White prisoners who had TB is somewhere between 1.3% lower than and 15.9% higher than the proportion of Gypsy prisoners who had TB. W G 3. (a) Let µ 1 be the true mean daily revenue for laundry 1 and µ be the true mean daily revenue for laundry. Thus the parameter used is µ 1 µ, the difference in the mean daily revenue for the two laundries. H 0 : µ 1 - µ 0 vs : µ 1 - µ 0 (c) x x (d) t (g) (h) Section C: 1. (1). (1) 3. (4) P-value We have some evidence that the mean daily revenue of the first laundry is greater than the mean daily revenue of the second laundry. With 95% confidence, we estimate that the mean daily revenue of the first laundry is somewhere between \$1 less than and \$69 more than the mean daily revenue of the second laundry. There are no reasons for doubting the validity of the results of this analysis because neither stem-and-leaf plot shows any non-normal features. The computer uses a different formula for calculating df. This formula gives a larger value of df than the hand calculation based on the minimum of one less than each sample size.. (a) p, the population proportion (ie the proportion of residents of New York City who would have said that they would move somewhere else). H 0 : p 0.5 vs : p (c) p ˆ (d) z P-value There is very strong evidence that the true proportion of New York City residents who would have said that they would move somewhere else is greater than 50%. We estimate that the proportion of New York City residents who would have said that they would move somewhere else is somewhere between 55.9% and 6.0%. A statement such as this is correct, on average, 19 out of every 0 times such a sample is taken. STAT 13 Statistical Methods - Final Exam Review 3 STAT 13 Statistical Methods - Final Exam Review

6 UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 10 Data on a Continuous Variable 1. (a) A t-test on the differences. A pair of observations is made on the same subject so this is paired comparison data. A t-test on the differences is more appropriate. H 0 : µ Diff 0 vs : µ Diff > 0. P-value We have strong evidence against the null hypothesis of no difference between current and previous spending. It appears that access to the cable network increases spending in viewers. With 95% confidence, we estimate that viewers spent, on average, between \$3.10 and \$6.50 more when they had access to the cable network. Note: A one-tailed test was used because if departure from H 0 does occur we expect an increase in sales. (c) Since a t-test on the differences is used we look at the dot plot, Normal probability plot and W- test on the differences. The dot plot shows moderate positive skewness but the t-test is robust to such departures from Normality. The W-test for Normality has a P-value greater than 10% so it is believable that the data has an underlying Normal distribution. The results of the t-test should be valid in this situation. (d) Hypotheses: H : ~ µ 0 vs H : ~ µ 0 0 Difff 1 Diff > Signs of the differences: positive signs, 5 negative signs, 1 zero. Since this is a one-tailed (>) test, evidence against the null hypothesis will be provided by a large number of observations above the hypothesised median of zero (and therefore a small number of observations below zero). P-value pr(y 10) by considering positive signs, or pr(y 5) by considering negative signs, where Y ~ Binomial (n 15, p 0.5) Interpretation: We have no evidence against the null hypothesis. We have no evidence to suggest that the median spending increases when the viewers have access to the cable network. The t-test is more appropriate. When the assumptions of the t-test are met reasonably well choose a t-test in preference to a nonparametric test. This is because a t-test will more readily detect departures from the null hypothesis when these departures do exist (That is, the t-test is more powerful than the Sign test). Note that the t-test gave a P-value of 1.6% while the Sign test gave a P-value of 15.1%.. (a) The dot plot shows the presence of a possible outlier. The t-test is sensitive to the presence of outliers, especially with a small sample size of 11. Because the Normality assumption is not satisfied a nonparametric test should be used. H : ~ µ 1. 8 vs H : ~ µ < (c) P-value (d) We have no evidence against the null hypothesis. We have no evidence that the median operating time for hedge trimmers was less than 1.8 hours. A high number of observations below the hypothesised median (and therefore a small number of observations above the hypothesised median) provide evidence against the null hypothesis. P-value pr(y 3) by considering positive signs, or pr(y 7) by considering negative signs, where Y ~ Binomial (n 10, p 0.5). Section B: More Than Two Independent Samples 1. (a) Analysis of variance (ANOVA). When we want to investigate differences between the underlying means of more than two groups. (c) H 0 : µ 1 µ µ 3 µ 4 (The population means are all equal.) : At least one of the population means is different from the other three.. (a) (d) (i) Assumptions: (I) The samples are independent of each other. (II) The underlying distribution of each group is Normally distributed. (III) The population standard deviations of each group are equal. (ii) Checking assumptions: (I) Ensure in the design of the experiment or study. (II) By plotting the data. The choice of plot will depend on the sample sizes. (III) By plotting the data and/or looking at the sample standard deviations. B W s f 0 s s B is called the between mean sum of squares and it measures the variability between the sample means. s W is called the within mean sum of squares and it measures the average variability within the samples (that is, the internal variability within the samples themselves). 3. (a) k 6, n tot 16 k 9, n tot 153 (c) k 5, n tot 30 STAT 13 Statistical Methods - Final Exam Review STAT 13 Statistical Methods - Final Exam Review

7 4. (a) There are three independent random samples. There are signs of moderate positive skewness in all three of the samples. The Neither group also shows signs of three clusters, indicating possible multi-modality. However with the equal sample sizes and the moderate size of the three samples this should not cause any concern with the validity of the F-test. The assumption of equality of the standard deviations of each underlying distribution is suspect because the sample standard deviation of the Neither group is larger than the sample standard deviations of the Drug and Placebo groups. However this difference is not large enough to cause concern with the assumption because the F-test is quite robust with respect to this assumption. Note that the ratio of the largest sample standard deviation to the smallest standard 3.14 deviation is less than ( 1. 5 ). 15. (c) (d) ANOVA table: DF SS MS F P Group Error Total s B s W H 0 : µ Drug µ Neither µ Placebo, where µ Drug, µ Neither and µ Placebo are the mean number of minutes for patients to fall asleep in the Drug, Neither and Placebo groups respectively. That is, the mean number of minutes to fall asleep is the same for each of the three groups. : At least one of the three underlying means is different from the other two. The P-value of means that we have strong evidence against the null hypothesis. We have strong evidence that at least one of the groups has an different underlying mean number of minutes for patients to fall asleep. The confidence interval for the difference between the means of the Drug group and the Neither group does not contain zero. At the 95% confidence level, it is believable that there is a real difference between the means of these two groups. (i) The family error rate of 5% means that in 5% of the times the test is done, at least one of the difference confidence intervals will not contain the true mean difference. (ii) The family error rate of 5% means that in 95% of the times the test is done, each of the difference confidence intervals will contain its true mean difference. (g) The individual error rate of 1.95% means that in 98.05% of the times the test is done, the confidence interval for µ Drug - µ Placebo will contain the true mean difference. Section C: Experimental Design 1. We should not use a paired design. A paired design would require measuring the number of heart attacks over two different time periods. In this study the time periods are long (10 years), which would make a paired comparison design impractical. The most appropriate design for this experiment would be a two independent sample design, with one of the samples acting as a control group by receiving aspirin. A sample of 9500 subjects should be randomly selected from the 19,000 available and these should be given a regular dosage of aspirin. The remaining subjects should be given a regular dosage of the new drug. The subjects are chosen randomly, in an attempt to make the two groups differ only by the type of drug given to the subjects. The experiment should also be double blind to prevent any possible biases by the subjects or the people administering the treatments. The hypotheses we would be interested in testing would be: H 0 : p A p N 0 versus : p A p N 0 where p A is the proportion of deaths due to heart attacks for the subjects given a regular dosage of aspirin and p N is the proportion of deaths due to heart attacks for the subjects given a regular dosage of the new drug. With such large samples there will be no concern about the Normality assumption for this t-test for two proportions.. The most appropriate design would be to use five samples, one for each spray and a control group, and use one-way analysis of variance. The trees should be randomly assigned into five groups of 16. Each of the four sprays should be randomly allocated to one of these groups, with the remaining group being left untreated. The hypotheses we would be interested in testing would be: H 0 : The underlying mean number of bug infested leaves is the same for each group. : At least one group of trees has a different underlying mean number of bug infested leaves. The assumptions are: The five samples are independent. The underlying distributions are Normally distributed. The population standard deviations of each group are all equal. STAT 13 Statistical Methods - Final Exam Review 3 STAT 13 Statistical Methods - Final Exam Review 4

8 3. The most appropriate design for this experiment would be a two independent sample design, with one sample filling in the first questionnaire and the other filling in the second questionnaire. The 50 subjects should be randomly allocated to the samples in an attempt to make the two samples differ only by the type of questionnaire given to the subjects. Interviewers should each interview 5 subjects from each group to allow for any possible bias due to interviewer effect. Also, subjects should be randomly allocated to interviewers. The hypotheses we would be interested in testing would be: H µ µ 0 versus H µ µ 0, where µ 1 is the mean time taken to fill in questionnaire 1 0 : 1 1 : 1 and µ is the mean time taken to fill in questionnaire. The assumption that needs to be satisfied is that the underlying distributions are Normally distributed. 4. The most appropriate design for this experiment would be a paired design in order to attempt to reduce the variability between rat hearts due to age and condition. If a two independent sample design was used the variability within in each sample might overshadow any difference between the means. We have 0 hearts available (each sliced in two). For each heart, half of the heart can be used for one dosage of the digitalis treatment, and the remaining half used for the other dosage of the digitalis treatment. The hypotheses we would be interested in testing would be: H : µ 0 versus H : µ 0, where µ Diff is the mean difference in the strength of muscle 0 Diff 1 Diff contractions between the pieces of heart treated with differing dosages of digitalis. The assumption that needs to be satisfied is that the distribution of the differences is Normally distributed. Section D: Quiz 1. (a) Plot the data. It is the quickest way to see what the data say. It often reveals interesting features that were not expected. It helps prevent inappropriate analyses and unfounded conclusions. Plots also have a central role in checking the assumptions made by formal methods.. All observations in the sample are independent and are from the same distribution. 3. (a) (i) and (iv) (iv) (c) (i) [in that the underlying distribution of the differences is assumed to be Normal] and (iii) (d) (i), (ii) and (iv) (i), (ii), (iv) and (v). 4. Try to check the original sources of the data. Any observations that you know to be mistakes should be corrected or removed. If in doubt, do the analyses with and without the outliers to see whether you come to the same conclusions. 5. They are less sensitive to outliers and do not assume any particular underlying distribution for the observations. 6. Nonparametric tests tend to be less effective at detecting departures from a null hypothesis and tend to give wider confidence intervals than parametric tests. STAT 13 Statistical Methods - Final Exam Review 5 UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Section A: One-way Tables Chapter 11 Tables of Counts 1. (a) H 0 : Relative market shares are the same as they were prior to Channel A altering its programming, ie p A 0.1, p B 0.4, p C 0.5. : Relative market shares differ since Channel A altered its programming, ie the proportions are not p A 0.1, p B 0.4, p C 0.5. Expected count for Channel A 0.1 x Expected count for Channel B 0.4 x Expected count for Channel C 0.5 x (15 10) (c) Cell contribution (d) (i) P-value pr( ) (ii) There is weak evidence that the altered programming for Channel A has affected relative market shares.. (a) H 0 : The proportions of the types are p BC 16 9, pbc 16 3, pbc 16 3, pbc : The proportions of the types are not p BC 16 9, pbc 16 3, pbc 16 3, pbc Expected count for type BC 16 9 x (16 30) (c) Cell contribution (d) (i) P-value pr( ) (ii) There is strong evidence that the observed frequencies are different from the expected frequencies. STAT 13 Statistical Methods - Final Exam Review 1

9 Section B: Two-way Tables 1. (a) H 0 : A person s primary source of news is independent of their age. : There is a relationship between a person s primary source of news and their age (i) Expected count for the (Under 30, Radio) cell (ii) Expected count for the (30-49, TV) cell ( ) (c) Cell contribution (d) The P-value to 3 decimal places. We have very strong evidence to suggest that there is a relationship between a person s age and their primary news source. Yes. We could consider the samples of people under 30, people in the age group and the people in the 50 and over age group as three independent sub-samples and carry out a Chisquare test of homogeneity with the primary news source as the response factor.. (a) H 0 : The distribution of the opinions is the same for each group. : The distribution of the opinions is different for at least one group. 3. () 4. (4) 5. (3) (i) Expected count for the (Increase government expenditure, Economists) cell (ii) Expected count for the (Other business incentives, Business executives) cell (c) (i) Cell contribution (ii) (Increase government expenditure, Government officials) more than expected (Increase government expenditure, Business executives) fewer than expected (Other business incentives, Business executives) more than expected (iii) The P-value to 3 decimal places. There is extremely strong evidence that the opinions of the three groups differ. (d) No. This data has been collected as three independent samples and a Chi-square test for independence requires that the data is collected as one random sample. UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 1 Relationships between Quantitative Variables: Section A: The Straight Line Graph 1. (a) β 0 5, β 1 3 β 0 10, β (a) y 3 + x (i) (ii) 1 Section B: Regression 1. (a) yˆ x Regression and Correlation Predicted lung capacity x (c) Predicted lung capacity x Residual Observed value predicted value (d) Years smoking is used to predict lung capacity. Years smoking is a quantitative variable and Lung capacity is continuous and random. There is a possible linear trend but the observations (8, 30) and (33, 35) are possible outliers which cause concern with the appropriateness of the model. The residuals versus Years smoking plot along with the P-value for the W-test for Normality indicates some concern with the assumption that the errors are Normally distributed. H 0 : β 1 0 : β 1 0 P-value There is strong evidence of a linear relationship between years of smoking and lung capacity. With 95% confidence, we estimate that for every additional year of smoking an emphysema patient s lung capacity increases by between 0.44 and.18 units. (i) r (ii) Excel calls it Multiple R.. (a) For x 1.46, ŷ x Residual Observed value predicted value The lactic acid concentration is used to predict the taste score. The lactic acid concentration is quantitative, and the taste score is continuous and random. The scatter plot shows a linear trend with scatter about that trend. From the plot of residuals versus lactic acid concentration there is no concern with the assumption that the errors are Normally distributed with mean 0 and with the same standard deviation for each value of. STAT 13 Statistical Methods - Final Exam Review STAT 13 Statistical Methods - Final Exam Review 1

10 (c) H 0 : β 1 0 : β 1 0 P-value There is strong evidence of a linear relationship between lactic acid concentration and taste score. 95% confidence interval for β 1 is: ±.048 x (3.0, 5.4) With 95% confidence, we estimate that for every increase of one unit in the lactic acid concentration the taste score increases by between 3.0 and 5.4 units. (d) (i) We predict that, on average, cheddar cheese with a lactic acid concentration of 1.8 will have a taste score of (ii) With 95% confidence, we estimate that the mean taste score for cheddar cheese with a lactic acid concentration of 1.8 will be somewhere between 31. and (iii) With 95% confidence, we predict that the next piece of cheddar cheese with a lactic acid concentration of 1.8 will be somewhere between 13.0 and Estimated slope 37.7 Estimated increase in taste score for a 1 unit change in lactic acid concentration is Estimated increase in taste score for a 0.05 unit change in lactic acid concentration is 0.05 x () (1) is the correct response. Section C: 1. (4). () 3. (5) 4. (5) 5. () 6. (3) 7. (1) 8. (3) 9. (5) 10. (3) 11. (3) 1. (5) Dinov's STAT 13 Statistical Methods - Final Exam Review

### Chapter 11: Two Variable Regression Analysis

Department of Mathematics Izmir University of Economics Week 14-15 2014-2015 In this chapter, we will focus on linear models and extend our analysis to relationships between variables, the definitions

### Math 62 Statistics Sample Exam Questions

Math 62 Statistics Sample Exam Questions 1. (10) Explain the difference between the distribution of a population and the sampling distribution of a statistic, such as the mean, of a sample randomly selected

### E205 Final: Version B

Name: Class: Date: E205 Final: Version B Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The owner of a local nightclub has recently surveyed a random

### Simple Linear Regression Chapter 11

Simple Linear Regression Chapter 11 Rationale Frequently decision-making situations require modeling of relationships among business variables. For instance, the amount of sale of a product may be related

### An example ANOVA situation. 1-Way ANOVA. Some notation for ANOVA. Are these differences significant? Example (Treating Blisters)

An example ANOVA situation Example (Treating Blisters) 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

### ANOVA MULTIPLE CHOICE QUESTIONS. In the following multiple-choice questions, select the best answer.

ANOVA MULTIPLE CHOICE QUESTIONS In the following multiple-choice questions, select the best answer. 1. Analysis of variance is a statistical method of comparing the of several populations. a. standard

### Simple Linear Regression

Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression Statistical model for linear regression Estimating

### Versions 1a Page 1 of 17

Note to Students: This practice exam is intended to give you an idea of the type of questions the instructor asks and the approximate length of the exam. It does NOT indicate the exact questions or the

### Multiple Regression in SPSS STAT 314

Multiple Regression in SPSS STAT 314 I. The accompanying data is on y = profit margin of savings and loan companies in a given year, x 1 = net revenues in that year, and x 2 = number of savings and loan

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### ACTM State Exam-Statistics

ACTM State Exam-Statistics For the 25 multiple-choice questions, make your answer choice and record it on the answer sheet provided. Once you have completed that section of the test, proceed to the tie-breaker

BIOSTATISTICS QUIZ ANSWERS 1. When you read scientific literature, do you know whether the statistical tests that were used were appropriate and why they were used? a. Always b. Mostly c. Rarely d. Never

### 0.1 Multiple Regression Models

0.1 Multiple Regression Models We will introduce the multiple Regression model as a mean of relating one numerical response variable y to two or more independent (or predictor variables. We will see different

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### SELF-TEST: SIMPLE REGRESSION

ECO 22000 McRAE SELF-TEST: SIMPLE REGRESSION Note: Those questions indicated with an (N) are unlikely to appear in this form on an in-class examination, but you should be able to describe the procedures

### Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

### In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a

Math 143 Inference on Regression 1 Review of Linear Regression In Chapter 2, we used linear regression to describe linear relationships. The setting for this is a bivariate data set (i.e., a list of cases/subjects

### AP Statistics 2001 Solutions and Scoring Guidelines

AP Statistics 2001 Solutions and Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use

### The scatterplot indicates a positive linear relationship between waist size and body fat percentage:

STAT E-150 Statistical Methods Multiple Regression Three percent of a man's body is essential fat, which is necessary for a healthy body. However, too much body fat can be dangerous. For men between the

### AP Statistics 2002 Scoring Guidelines

AP Statistics 2002 Scoring Guidelines The materials included in these files are intended for use by AP teachers for course and exam preparation in the classroom; permission for any other use must be sought

### c. The factor is the type of TV program that was watched. The treatment is the embedded commercials in the TV programs.

STAT E-150 - Statistical Methods Assignment 9 Solutions Exercises 12.8, 12.13, 12.75 For each test: Include appropriate graphs to see that the conditions are met. Use Tukey's Honestly Significant Difference

### Testing Hypotheses using SPSS

Is the mean hourly rate of male workers \$2.00? T-Test One-Sample Statistics Std. Error N Mean Std. Deviation Mean 2997 2.0522 6.6282.2 One-Sample Test Test Value = 2 95% Confidence Interval Mean of the

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### Chapter 16 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 16 Multiple Choice Questions (The answers are provided after the last question.) 1. Which of the following symbols represents a population parameter? a. SD b. σ c. r d. 0 2. If you drew all possible

### Describe what is meant by a placebo Contrast the double-blind procedure with the single-blind procedure Review the structure for organizing a memo

Readings: Ha and Ha Textbook - Chapters 1 8 Appendix D & E (online) Plous - Chapters 10, 11, 12 and 14 Chapter 10: The Representativeness Heuristic Chapter 11: The Availability Heuristic Chapter 12: Probability

### Statistical Inference and t-tests

1 Statistical Inference and t-tests Objectives Evaluate the difference between a sample mean and a target value using a one-sample t-test. Evaluate the difference between a sample mean and a target value

### RNR / ENTO Assumptions for Simple Linear Regression

74 RNR / ENTO 63 --Assumptions for Simple Linear Regression Statistical statements (hypothesis tests and CI estimation) with least squares estimates depends on 4 assumptions:. Linearity of the mean responses

### Sociology 6Z03 Topic 15: Statistical Inference for Means

Sociology 6Z03 Topic 15: Statistical Inference for Means John Fox McMaster University Fall 2016 John Fox (McMaster University) Soc 6Z03: Statistical Inference for Means Fall 2016 1 / 41 Outline: Statistical

### Statistics for Management II-STAT 362-Final Review

Statistics for Management II-STAT 362-Final Review Multiple Choice Identify the letter of the choice that best completes the statement or answers the question. 1. The ability of an interval estimate to

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### Chapter 7. Estimates and Sample Size

Chapter 7. Estimates and Sample Size Chapter Problem: How do we interpret a poll about global warming? Pew Research Center Poll: From what you ve read and heard, is there a solid evidence that the average

### Technology Step-by-Step Using StatCrunch

Technology Step-by-Step Using StatCrunch Section 1.3 Simple Random Sampling 1. Select Data, highlight Simulate Data, then highlight Discrete Uniform. 2. Fill in the following window with the appropriate

### For example, enter the following data in three COLUMNS in a new View window.

Statistics with Statview - 18 Paired t-test A paired t-test compares two groups of measurements when the data in the two groups are in some way paired between the groups (e.g., before and after on the

### MTH 140 Statistics Videos

MTH 140 Statistics Videos Chapter 1 Picturing Distributions with Graphs Individuals and Variables Categorical Variables: Pie Charts and Bar Graphs Categorical Variables: Pie Charts and Bar Graphs Quantitative

### Analysis of numerical data S4

Basic medical statistics for clinical and experimental research Analysis of numerical data S4 Katarzyna Jóźwiak k.jozwiak@nki.nl 3rd November 2015 1/42 Hypothesis tests: numerical and ordinal data 1 group:

### AP Statistics 2000 Scoring Guidelines

AP Statistics 000 Scoring Guidelines The materials included in these files are intended for non-commercial use by AP teachers for course and exam preparation; permission for any other use must be sought

### Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

### Testing: is my coin fair?

Testing: is my coin fair? Formally: we want to make some inference about P(head) Try it: toss coin several times (say 7 times) Assume that it is fair ( P(head)= ), and see if this assumption is compatible

### T adult = 96 T child = 114.

Homework Solutions Do all tests at the 5% level and quote p-values when possible. When answering each question uses sentences and include the relevant JMP output and plots (do not include the data in your

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### Sample Exam #1 Elementary Statistics

Sample Exam #1 Elementary Statistics Instructions. No books, notes, or calculators are allowed. 1. Some variables that were recorded while studying diets of sharks are given below. Which of the variables

### Hypothesis Testing. Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University

Hypothesis Testing Dr. Bob Gee Dean Scott Bonney Professor William G. Journigan American Meridian University 1 AMU / Bon-Tech, LLC, Journi-Tech Corporation Copyright 2015 Learning Objectives Upon successful

### Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. John Nerbonne 1/35

Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://wwwletrugnl/nerbonne/teach/statistiek-i/ John Nerbonne 1/35 t-tests To test an average or pair of averages when σ is known, we

### Comparing Means in Two Populations

Comparing Means in Two Populations Overview The previous section discussed hypothesis testing when sampling from a single population (either a single mean or two means from the same population). Now we

### Statistics for Sports Medicine

Statistics for Sports Medicine Suzanne Hecht, MD University of Minnesota (suzanne.hecht@gmail.com) Fellow s Research Conference July 2012: Philadelphia GOALS Try not to bore you to death!! Try to teach

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Paired Differences and Regression

Paired Differences and Regression Students sometimes have difficulty distinguishing between paired data and independent samples when comparing two means. One can return to this topic after covering simple

### AP Statistics 2004 Scoring Guidelines

AP Statistics 2004 Scoring Guidelines The materials included in these files are intended for noncommercial use by AP teachers for course and exam preparation; permission for any other use must be sought

### where b is the slope of the line and a is the intercept i.e. where the line cuts the y axis.

Least Squares Introduction We have mentioned that one should not always conclude that because two variables are correlated that one variable is causing the other to behave a certain way. However, sometimes

### AP Statistics 2013 Scoring Guidelines

AP Statistics 2013 Scoring Guidelines The College Board The College Board is a mission-driven not-for-profit organization that connects students to college success and opportunity. Founded in 1900, the

### e = random error, assumed to be normally distributed with mean 0 and standard deviation σ

1 Linear Regression 1.1 Simple Linear Regression Model The linear regression model is applied if we want to model a numeric response variable and its dependency on at least one numeric factor variable.

### How to choose a statistical test. Francisco J. Candido dos Reis DGO-FMRP University of São Paulo

How to choose a statistical test Francisco J. Candido dos Reis DGO-FMRP University of São Paulo Choosing the right test One of the most common queries in stats support is Which analysis should I use There

### Example for testing one population mean:

Today: Sections 13.1 to 13.3 ANNOUNCEMENTS: We will finish hypothesis testing for the 5 situations today. See pages 586-587 (end of Chapter 13) for a summary table. Quiz for week 8 starts Wed, ends Monday

### Statistics II Final Exam - January Use the University stationery to give your answers to the following questions.

Statistics II Final Exam - January 2012 Use the University stationery to give your answers to the following questions. Do not forget to write down your name and class group in each page. Indicate clearly

### AP Statistics 2011 Scoring Guidelines

AP Statistics 2011 Scoring Guidelines The College Board The College Board is a not-for-profit membership association whose mission is to connect students to college success and opportunity. Founded in

### Hypothesis Testing. Bluman Chapter 8

CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 2 Test for

### N Mean Std. Deviation Std. Error of Mean

DEPARTMENT OF POLITICAL SCIENCE AND INTERNATIONAL RELATIONS Posc/Uapp 815 SAMPLE QUESTIONS FOR THE FINAL EXAMINATION I. READING: A. Read Agresti and Finalay, Chapters 6, 7, and 8 carefully. 1. Ignore the

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### UCLA STAT 13 Introduction to Statistical Methods for the Life and Health Sciences. Chapter 10 Chi-Square Test Relative Risk/Odds Ratios

UCLA STAT 3 Introduction to Statistical Methods for the Life and Health Sciences Instructor: Ivo Dinov, Asst. Prof. of Statistics and Neurology Chapter 0 Chi-Square Test Relative Risk/Odds Ratios Teaching

### Chapter 14: 1-6, 9, 12; Chapter 15: 8 Solutions When is it appropriate to use the normal approximation to the binomial distribution?

Chapter 14: 1-6, 9, 1; Chapter 15: 8 Solutions 14-1 When is it appropriate to use the normal approximation to the binomial distribution? The usual recommendation is that the approximation is good if np

### Statistics 2014 Scoring Guidelines

AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

### Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk

Analysing Questionnaires using Minitab (for SPSS queries contact -) Graham.Currell@uwe.ac.uk Structure As a starting point it is useful to consider a basic questionnaire as containing three main sections:

### Regression III: Dummy Variable Regression

Regression III: Dummy Variable Regression Tom Ilvento FREC 408 Linear Regression Assumptions about the error term Mean of Probability Distribution of the Error term is zero Probability Distribution of

### 9.1 (a) The standard deviation of the four sample differences is given as.68. The standard error is SE (ȳ1 - ȳ 2 ) = SE d - = s d n d

CHAPTER 9 Comparison of Paired Samples 9.1 (a) The standard deviation of the four sample differences is given as.68. The standard error is SE (ȳ1 - ȳ 2 ) = SE d - = s d n d =.68 4 =.34. (b) H 0 : The mean

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

### Name: Student ID#: Serial #:

STAT 22 Business Statistics II- Term3 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS Department Of Mathematics & Statistics DHAHRAN, SAUDI ARABIA STAT 22: BUSINESS STATISTICS II Third Exam July, 202 9:20

### Confidence Intervals and Hypothesis Testing

Name: Class: Date: Confidence Intervals and Hypothesis Testing Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The librarian at the Library of Congress

### ANOVA Analysis of Variance

ANOVA Analysis of Variance What is ANOVA and why do we use it? Can test hypotheses about mean differences between more than 2 samples. Can also make inferences about the effects of several different IVs,

### Null Hypothesis H 0. The null hypothesis (denoted by H 0

Hypothesis test In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing a claim about a property

### AP Statistics 2007 Scoring Guidelines

AP Statistics 2007 Scoring Guidelines The College Board: Connecting Students to College Success The College Board is a not-for-profit membership association whose mission is to connect students to college

### Regression in SPSS. Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology

Regression in SPSS Workshop offered by the Mississippi Center for Supercomputing Research and the UM Office of Information Technology John P. Bentley Department of Pharmacy Administration University of

### Statistics 104: Section 7

Statistics 104: Section 7 Section Overview Reminders Comments on Midterm Common Mistakes on Problem Set 6 Statistical Week in Review Comments on Midterm Overall, the midterms were good with one notable

### Stat 411/511 ANOVA & REGRESSION. Charlotte Wickham. stat511.cwick.co.nz. Nov 31st 2015

Stat 411/511 ANOVA & REGRESSION Nov 31st 2015 Charlotte Wickham stat511.cwick.co.nz This week Today: Lack of fit F-test Weds: Review email me topics, otherwise I ll go over some of last year s final exam

### SIMPLE REGRESSION ANALYSIS

SIMPLE REGRESSION ANALYSIS Introduction. Regression analysis is used when two or more variables are thought to be systematically connected by a linear relationship. In simple regression, we have only two

### Difference of Means and ANOVA Problems

Difference of Means and Problems Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firm It is interested in evaluating its fee structure,particularly

### 12.1 Inference for Linear Regression

12.1 Inference for Linear Regression Least Squares Regression Line y = a + bx You might want to refresh your memory of LSR lines by reviewing Chapter 3! 1 Sample Distribution of b p740 Shape Center Spread

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### 1.5 Oneway Analysis of Variance

Statistics: Rosie Cornish. 200. 1.5 Oneway Analysis of Variance 1 Introduction Oneway analysis of variance (ANOVA) is used to compare several means. This method is often used in scientific or medical experiments

### I. Basics of Hypothesis Testing

Introduction to Hypothesis Testing This deals with an issue highly similar to what we did in the previous chapter. In that chapter we used sample information to make inferences about the range of possibilities

### Analysis of Variance ANOVA

Analysis of Variance ANOVA Overview We ve used the t -test to compare the means from two independent groups. Now we ve come to the final topic of the course: how to compare means from more than two populations.

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### Principles of Hypothesis

Principles of Hypothesis Testing for Public Health Laura Lee Johnson, Ph.D. Statistician National Center for Complementary and Alternative Medicine johnslau@mail.nih.gov Fall 2011 Answers to Questions

### Statistics: revision

NST 1B Experimental Psychology Statistics practical 5 Statistics: revision Rudolf Cardinal & Mike Aitken 3 / 4 May 2005 Department of Experimental Psychology University of Cambridge Slides at pobox.com/~rudolf/psychology

### Example: Boats and Manatees

Figure 9-6 Example: Boats and Manatees Slide 1 Given the sample data in Table 9-1, find the value of the linear correlation coefficient r, then refer to Table A-6 to determine whether there is a significant

### Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2

Chapter 11: Linear Regression - Inference in Regression Analysis - Part 2 Note: Whether we calculate confidence intervals or perform hypothesis tests we need the distribution of the statistic we will use.

### Chapter 9. Section Correlation

Chapter 9 Section 9.1 - Correlation Objectives: Introduce linear correlation, independent and dependent variables, and the types of correlation Find a correlation coefficient Test a population correlation

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### PASS Sample Size Software. Linear Regression

Chapter 855 Introduction Linear regression is a commonly used procedure in statistical analysis. One of the main objectives in linear regression analysis is to test hypotheses about the slope (sometimes

### Inference for two Population Means

Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

### STATISTICS 151 SECTION 1 FINAL EXAM MAY

STATISTICS 151 SECTION 1 FINAL EXAM MAY 2 2009 This is an open book exam. Course text, personal notes and calculator are permitted. You have 3 hours to complete the test. Personal computers and cellphones

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### ECO220Y1Y Duration - 3 hours Examination Aids: Calculator

Page 1 of 16 UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2011 EXAMINATIONS ECO220Y1Y Duration - 3 hours Examination Aids: Calculator Last Name: First Name: Student #: ENTER YOUR NAME AND STUDENT

### Hypothesis Testing - II

-3σ -2σ +σ +2σ +3σ Hypothesis Testing - II Lecture 9 0909.400.01 / 0909.400.02 Dr. P. s Clinic Consultant Module in Probability & Statistics in Engineering Today in P&S -3σ -2σ +σ +2σ +3σ Review: Hypothesis