STAB22 Week 10 problems: solutions

7.2 The easiest approach is to type these numbers into Minitab and let it do its thing. Select Stat, Basic Statistics, 1-sample t. Select your column of numbers and let it do a 95% interval:

    T Confidence Intervals

    Variable    N    Mean   StDev  SE Mean         95.0 % CI
    C1         10   531.0    82.8     26.2   ( 471.8, 590.2)

If you prefer, you can use Minitab (or some other means, like your calculator, if you know how to do it) to find the sample mean, x̄ = 531.0, and the sample SD, s = 82.8, and go the rest of the way by hand. With n = 10, there are 9 degrees of freedom (df), and the number from the table is 2.262, giving an interval of 531.0 ± 2.262 × 82.8/√10, which is 531.0 ± 59.2, or from 471.8 to 590.2 as above.

Either way, we have a small sample from a very variable population, so we shouldn't expect to learn very much from the confidence interval, and indeed we don't: it is nearly $120 wide.

7.4 The only way we know how to do this (unless you have read the textbook) is to use Minitab for the whole thing. The alternative hypothesis is the thing we want "good reason for believing": that is, that the population mean is greater than 500. The null hypothesis says that the population mean is equal to 500 (less than or equal to, if you prefer). Enter the data into Minitab (or re-use the data you just entered for problem 7.2), select Stat, Basic Statistics and 1-sample t again. Select your column of data if it isn't selected already, click "Test Mean" and fill in 500, change the alternative hypothesis to "Greater Than", and click OK. I got this:

    T-Test of the Mean

    Test of mu = 500.0 vs mu > 500.0

    Variable    N    Mean   StDev  SE Mean     T     P
    C1         10   531.0    82.8     26.2  1.18  0.13

The P-value is 0.13. Compared to α = 0.05, say (pick any α you like), this is not small, so we do not have enough evidence to reject the null hypothesis. There is no good reason to believe that the mean rent is greater than $500 per month. (As with the confidence interval above, the data probably don't contain much information, and with this small a sample size we might not even be able to reject a null hypothesis that's a fair way wrong. Here, the sample mean was over $30 higher than $500, but that wasn't high enough to be good evidence.)

7.20 Enter the data into a column in Minitab, and select Graph and Probability Plot to get the normal quantile plot, as we did earlier. The data don't look very normal-like. The comments in the question suggest that the t confidence interval will nonetheless be reasonably accurate. Select Stat, Basic Statistics and 1-sample t, and let it give you a 95% interval:

    T Confidence Intervals

    Variable    N    Mean   StDev  SE Mean         95.0 % CI
    C1         50   20.90    7.65     1.08   ( 18.73, 23.07)

We are 95% confident that the mean cost of Internet access (as of August 2000, at least) is between $18.73 and $23.07.

7.60 We are trying to find evidence that bread loses vitamin C, so the alternative hypothesis states that the mean vitamin C content is greater immediately after baking than it is after 3 days. The null hypothesis says that there is no change over time. For Minitab, enter the two "immediate" values in one column and the two "3 days" values into another. Select Stat, Basic Statistics, 2-sample t. Select "Samples in different columns", and select the two columns. Change the alternative to "greater than". If you're really sneaky, you can do part (b) at the same time by changing the confidence level to 90.

    Two Sample T-Test and Confidence Interval

    Two sample T for immed vs 3days

             N     Mean   StDev  SE Mean
    immed    2    48.70    1.53      1.1
    3days    2   21.795   0.771     0.54

    90% CI for mu immed - mu 3days: ( 19.2, 34.58)
    T-Test mu immed = mu 3days (vs >): T = 22.16  P = 0.014  DF = 1

The P-value is 0.014. If you chose α = 0.05, this is small enough to reject the null hypothesis and conclude that vitamin C is lost. Having rejected the null hypothesis, we might wish to know how much vitamin C is lost on average. This is given by the 90% confidence interval, which goes from 19.2 to 34.6 mg per 100 grams. This doesn't pin the vitamin C loss down very precisely, but with only a total of 4 observations, it's pretty unreasonable to expect better.

7.61 The analysis of the previous question assumed 4 loaves altogether, randomly divided into the two groups. But if the same two loaves are measured twice, that scenario no longer applies: we have a matched pairs design. The right analysis requires us to use the differences between the two measurements within each pair. (It's not only a question of what the data are, but also of where they came from.) For this analysis in Minitab, select Stat, Basic Statistics and Paired t. Select the two columns. Click Options, select the Confidence Level (90), and the alternative ("Greater Than"). Click OK a couple of times. I got this:

    Paired T-Test and Confidence Interval

    Paired T for immed - 3days

                  N     Mean   StDev  SE Mean
    immed         2    48.71    1.53     1.09
    3days         2    21.80    0.77     0.54
    Difference    2   26.910   0.764    0.540

    90% CI for mean difference: (23.501, 30.319)
    T-Test of mean difference = 0 (vs > 0): T-Value = 49.83  P-Value = 0.006

The P-value for this test is 0.006 (smaller than before), and we would reject the null hypothesis of no vitamin loss at any reasonable α value. There definitely is a loss of vitamin C. The 90% confidence interval goes from 23.5 to 30.3 mg per 100 g. Both the test and confidence interval give stronger results than before, because we are taking advantage of the pairing. Notice that the first loaf had less vitamin C at both times, but the drop in vitamin C is very nearly the same for both loaves.

7.69 Type the data into two columns, one for low fitness and one for high fitness. We're looking for any difference, so the alternative hypothesis says that the mean scores for the low-fitness and high-fitness groups are different, while the null hypothesis says that they're the same. In Minitab, select Stat, Basic Statistics and 2-sample t. Select "Samples in different columns", and select your two columns of data. The alternative hypothesis should read "not equal", and the confidence level can stay at 95. I got this:

    Two sample T for low vs high

            N    Mean   StDev  SE Mean
    low    14   4.640   0.690     0.18
    high   14   6.429   0.430     0.12

    95% CI for mu low - mu high: ( -2.24, -1.34)
    T-Test mu low = mu high (vs not =): T = -8.23  P = 0.0000  DF = 21
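As a check on the 7.69 output above, the test statistic, degrees of freedom and interval can be reproduced from the printed means and SDs alone. This is only a minimal Python sketch, not part of the Minitab solution: the function name welch_from_summary is made up for illustration, and the multiplier 2.080 is the t-table value for 21 df at 95% confidence, typed in by hand.

```python
import math

def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled 2-sample t from summary statistics (hypothetical helper).

    Returns the t statistic, the Welch-Satterthwaite df (fractional),
    and the standard error of the difference in means.
    """
    v1 = sd1 ** 2 / n1          # squared SE of the first mean
    v2 = sd2 ** 2 / n2          # squared SE of the second mean
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se

# Summary values copied from the Minitab output for low vs high fitness.
t_stat, df, se = welch_from_summary(4.640, 0.690, 14, 6.429, 0.430, 14)

T_STAR_21 = 2.080               # assumption: t table, 21 df, 95% two-sided
diff = 4.640 - 6.429
ci = (diff - T_STAR_21 * se, diff + T_STAR_21 * se)

print(f"T = {t_stat:.2f}, fractional df = {df:.2f}")
print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The Welch-Satterthwaite formula gives a fractional df a little under 22, which is consistent with the DF = 21 in the output if Minitab rounds down.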

The P-value is very small indeed; there is no doubt that the null hypothesis should be rejected. There is strong evidence that mean ego strength scores are higher for high-fitness college professors than for low-fitness ones. However, college professors are probably not typical of all middle-aged men. They will have certain character traits not possessed by, for example, businessmen. In particular, you might expect ego strength to be smaller for college professors generally than for entrepreneurs, and we don't know how the fitness level will relate to ego strength for entrepreneurs.

7.99 Ignore the word "pooled" in this question and the next one. (We didn't actually do the so-called pooled version of 2-sample t.) Fire up Minitab, and select Power and Sample Size, then 2-sample t. You can use either of the first two alternatives; fill in 80 for the sample size (this is 80 per group), 300 for the difference you hope to detect, and 650 for sigma. If you go back to Exercise 7.70, you'll see that the test has to be one-sided ("does cocaine use cause low birth weight?"), so click Options, click on "Greater Than" (since our hoped-for difference is positive), and check that alpha is set to 0.05. I got this:

    2-Sample t Test

    Testing mean 1 = mean 2 (versus >)
    Calculating power for mean 1 = mean 2 + 300
    Alpha = 0.05  Sigma = 650

    Sample
      Size   Power
        80  0.8965

The power is 0.8965. In nearly 90% of all possible samples, we will correctly reject the null hypothesis of equal means in favour of the alternative that the mean birth weight is less among cocaine users. (The answers give 0.8267 for the power, which appears to come from a two-sided test. I don't see how you justify that.)
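The gap between 0.8965 and the 0.8267 in the answers for 7.99 can be explored without Minitab. The sketch below is a rough Python check that uses a normal approximation (sigma treated as known) rather than Minitab's t-based calculation, so its values should land close to, but not exactly on, the figures above. The function name approx_power is invented for this illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1

def approx_power(diff, sigma, n_per_group, alpha=0.05, two_sided=False):
    """Normal-approximation power for comparing two means, equal group sizes."""
    se = sigma * math.sqrt(2.0 / n_per_group)   # SE of the difference in means
    shift = diff / se                           # true difference in SE units
    if two_sided:
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        # Reject in either tail; the far tail contributes almost nothing here.
        return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)
    return STD_NORMAL.cdf(shift - z_crit)

# Values from the problem: 80 per group, difference 300, sigma 650.
one_sided = approx_power(300, 650, 80)
two_sided = approx_power(300, 650, 80, two_sided=True)

print(f"one-sided power (approx): {one_sided:.3f}")
print(f"two-sided power (approx): {two_sided:.3f}")
```

Under this approximation the one-sided power comes out near Minitab's 0.8965 and the two-sided power near the 0.8267 quoted in the answers, which supports the reading that the answers used a two-sided test.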

7.102 There are a couple of ways to tackle this question: exactly as it is posed, or by directly finding the sample size needed to get 80% power. To start with the latter: in Minitab, select Power and Sample Size, 2-sample t. Click "calculate sample size", and fill in 0.80 for the power, 0.5 for the difference and 0.7 for sigma. Click Options, and make sure the alternative hypothesis says "not equal" and alpha is 0.05. I got these results:

    2-Sample t Test

    Testing mean 1 = mean 2 (versus not =)
    Calculating power for mean 1 = mean 2 + 0.5
    Alpha = 0.05  Sigma = 0.7

    Sample  Target  Actual
      Size   Power   Power
        32  0.8000  0.8031

So 32 players are needed in each group.

To answer the question as asked: go back to Power and Sample Size, 2-sample t, and click on "power for each sample size". Choose some sample sizes (such as 10, 20, 30, 40, 50), and fill in the difference we want to detect (0.5). Enter sigma at the bottom (0.7), and click Options to check that the alternative is "not equal" and α = 0.05. Also, you can save the sample sizes and power values in columns (say c1 and c2), which will make it easier to plot them. (If you don't do this, you can always type in the values that appear in the output.)

To plot the power, go to Graph and Plot; select the power values for y and the sample sizes for x. To make the graph easier to read, you can connect the points on the graph by lines; to do this, look in the Data Display box and click on the arrow next to Display. Select "connect" here. The word "symbol" below it changes to "connect". My graph is shown below:

[Graph: power (y) against sample size per group (x), points connected by lines]

The power goes up, but at a progressively slower rate as the sample size increases. (The gain in power between n = 40 and n = 50 is less than between n = 10 and n = 20, say.) The power appears to hit 80% for a sample size just over 30, which agrees with Minitab's direct calculation above.
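The shape of the power curve in 7.102 can also be sketched outside Minitab. The Python below is a rough illustration using a normal approximation with sigma treated as known, so it is not Minitab's t-based method and its numbers will differ slightly; the function name approx_power_two_sided is made up for this example.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def approx_power_two_sided(diff, sigma, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided comparison of two means."""
    se = sigma * math.sqrt(2.0 / n_per_group)   # SE of the difference in means
    shift = diff / se
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(shift - z_crit) + Z.cdf(-shift - z_crit)

# Same sample sizes, difference and sigma as in the Minitab walkthrough.
sizes = [10, 20, 30, 40, 50]
powers = [approx_power_two_sided(0.5, 0.7, n) for n in sizes]
for n, p in zip(sizes, powers):
    print(f"n = {n:2d} per group: approximate power {p:.3f}")

# Smallest per-group sample size whose approximate power reaches 80%.
n_needed = next(n for n in range(2, 1000)
                if approx_power_two_sided(0.5, 0.7, n) >= 0.80)
print("smallest n per group with approximate power >= 0.80:", n_needed)
```

The approximate powers rise with n by shrinking steps and cross 80% between 30 and 40 per group, matching the description of the graph. Because the approximation ignores the extra uncertainty from estimating sigma, its required sample size can come out slightly below Minitab's 32.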