Testing on proportions


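Textbook Section 5.4, April 7, 2011

Example 1. Let $X_1, \dots, X_n \sim \text{Bernoulli}(p)$ and write $q = 1 - p$. We wish to test

$$H_0: p \le p_0 \quad \text{versus} \quad H_1: p > p_0. \tag{1}$$

Consider the related problem

$$H_0: p = p_0 \quad \text{versus} \quad H_1: p = p_1 \ (> p_0).$$

The likelihood ratio test rejects $H_0$ when (with $c$ denoting a generic constant that may change from line to line)

$$\frac{\prod_{i=1}^n p_1^{x_i} q_1^{1-x_i}}{\prod_{i=1}^n p_0^{x_i} q_0^{1-x_i}} \ge c
\iff \left(\frac{p_1 q_0}{q_1 p_0}\right)^{\sum_{i=1}^n x_i} \left(\frac{q_1}{q_0}\right)^{n} \ge c
\iff \sum_{i=1}^n x_i \, \log\frac{p_1 q_0}{q_1 p_0} \ge c
\iff \sum_{i=1}^n x_i \ge c,$$

where the last step uses $\log\frac{p_1 q_0}{q_1 p_0} > 0$ for $p_1 > p_0$, and $c$ is determined by

$$P\left\{ \sum_{i=1}^n X_i \ge c \,\middle|\, p = p_0 \right\} = \alpha.$$

Note that the test does not depend on $p_1$ and is a level $\alpha$ test for (1) (to be shown). Hence it is a UMP test for (1).

For example, let $S = \sum_{i=1}^n X_i$, $n = 900$, $p_0 = 0.5$, and let the observed statistic be $s_{\text{obs}} = \sum_{i=1}^n x_i = 500$. Then $S \sim \text{Bin}(n, p) \approx N(np, npq)$. From

$$P\left\{ \frac{S - np_0}{\sqrt{np_0 q_0}} \ge \frac{c - np_0}{\sqrt{np_0 q_0}} \,\middle|\, p = p_0 \right\} = \alpha,$$

we get

$$\frac{c - np_0}{\sqrt{np_0 q_0}} \approx z_\alpha, \quad \text{or} \quad c = np_0 + z_\alpha \sqrt{np_0 q_0}.$$

Since $s_{\text{obs}} = 500$ exceeds the critical value 474.5, we reject $H_0$ and conclude that the candidate will win.

This decision rule does not quantify the strength of the evidence against $H_0$. A more common way is to compute the p-value, which is given by

$$\max_{p \le 0.5} P\{T(X) > T(x) \mid p\} = P(S > 500 \mid p = 0.5) \approx 0.04\%,$$

very strong evidence against $H_0$.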
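The critical value and p-value in Example 1 can be checked numerically. The sketch below uses the normal approximation described above; the function name is illustrative, and the level `alpha=0.05` is an assumption, since the value of $\alpha$ is not recoverable from the notes.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_proportion_test(s_obs, n, p0, alpha):
    """Normal-approximation test of H0: p <= p0 against H1: p > p0.

    Returns the critical value c = n*p0 + z_alpha*sqrt(n*p0*q0) and the
    approximate p-value P(S > s_obs | p = p0).
    """
    mean = n * p0
    sd = sqrt(n * p0 * (1 - p0))
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha normal quantile
    critical = mean + z_alpha * sd
    p_value = 1 - NormalDist().cdf((s_obs - mean) / sd)
    return critical, p_value


# n, p0 and s_obs come from Example 1; alpha = 0.05 is assumed.
critical, p_value = one_sided_proportion_test(s_obs=500, n=900, p0=0.5, alpha=0.05)
print(f"critical value c = {critical:.2f}")
print(f"approximate p-value = {p_value:.5f}")
```

With these inputs the observed count of 500 lies well above the critical value, and the p-value is of the order of 0.04%, in line with the figure quoted in the example.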

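Example 2. (Approximate pivotal method) Bernoulli trials: $X_1, \dots, X_n$ i.i.d. $\text{Bernoulli}(p)$, with $\hat p = \bar X$ and $\hat q = 1 - \hat p$. Approximate pivots:

$$Z_1 = \frac{\bar X - p}{\sqrt{pq/n}} \overset{a}{\sim} N(0, 1), \qquad Z_2 = \frac{\bar X - p}{\sqrt{\hat p \hat q/n}} \overset{a}{\sim} N(0, 1).$$

Using $Z_2$,

$$P\left( -z_{\alpha/2} \le \frac{\bar X - p}{\sqrt{\hat p \hat q/n}} \le z_{\alpha/2} \right) \approx 1 - \alpha,$$

so an approximate $(1 - \alpha)$ CI is

$$\left[ \hat p - z_{\alpha/2} \sqrt{\hat p \hat q/n}, \ \hat p + z_{\alpha/2} \sqrt{\hat p \hat q/n} \right].$$

Using $Z_1$,

$$P\left( (\hat p - p)^2 \le z_{\alpha/2}^2 \, pq/n \right) \approx 1 - \alpha,$$

so the confidence set is $\{p : g(p) \le 0\}$, where

$$g(p) = (\hat p - p)^2 - c\,pq, \qquad c = n^{-1} z_{\alpha/2}^2.$$

The function $g(p)$ has two roots:

$$\hat p_1 = \frac{\hat p + c/2 - \sqrt{c^2/4 + c\,\hat p \hat q}}{1 + c}, \qquad
\hat p_2 = \frac{\hat p + c/2 + \sqrt{c^2/4 + c\,\hat p \hat q}}{1 + c}.$$

Thus the $1 - \alpha$ CI is $[\hat p_1, \hat p_2]$.

As an illustration, assume that we wish to estimate the probability of a die showing 6. Roll a die 100 times and record the proportion of sixes; here $\hat p = 0.2$. Using the first approximation, the 95% confidence interval is

$$0.2 \pm 1.96 \sqrt{0.2 \times 0.8/100} \approx 0.2 \pm 0.078.$$

Using the second method, the 95% confidence interval is approximately $0.211 \pm 0.078$. Both intervals contain $1/6$. Hence the data are consistent with a fair die.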
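The two intervals of Example 2 can be computed side by side. This is a minimal sketch; the function names are illustrative, and the labels "Wald" (for the $Z_2$ interval) and "score" (for the $Z_1$ interval, often called the Wilson score interval) are standard terminology rather than names used in the notes.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(p_hat, n, level=0.95):
    """Interval from Z_2: p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def score_interval(p_hat, n, level=0.95):
    """Interval from Z_1: the two roots of g(p) = (p_hat - p)^2 - c*p*(1 - p)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    c = z * z / n
    centre = (p_hat + c / 2) / (1 + c)
    half = sqrt(c * c / 4 + c * p_hat * (1 - p_hat)) / (1 + c)
    return centre - half, centre + half


# Die illustration: n = 100 rolls, observed proportion of sixes 0.2.
wald = wald_interval(0.2, 100)
score = score_interval(0.2, 100)
print("Wald :", wald)
print("Score:", score)
print("1/6 inside both:", wald[0] < 1 / 6 < wald[1] and score[0] < 1 / 6 < score[1])
```

The score interval is centred slightly above $\hat p$ because its midpoint $(\hat p + c/2)/(1 + c)$ is pulled towards $1/2$.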

Goodness of Fit

Textbook Section 5.7, April 6, 2010

One well-known real-life example occurred when the University of California at Berkeley was sued for bias against women who had applied for admission to graduate schools there. The admission figures for the fall of 1973 showed that men applying were more likely than women to be admitted, and the difference was so large that it was unlikely to be due to chance.

            Admitted    Not admitted    Applicants
Men
Women

A goodness of fit test examines the case of a sequence of independent experiments, each of which can have 1 of $k$ possible outcomes. In terms of hypothesis testing, let $\pi = (\pi_1, \dots, \pi_k)$ be postulated values of the probabilities

$$P_\pi\{\text{experiment takes on the } i\text{-th outcome}\} = \pi_i,$$

and let $p = (p_1, \dots, p_k)$ denote the actual state of nature. Then the parameter space is the $(k-1)$-simplex

$$\Theta = \left\{ p = (p_1, \dots, p_k) : p_i \ge 0 \text{ for all } i = 1, \dots, k, \ \sum_{i=1}^k p_i = 1 \right\}.$$

The hypothesis test is

$$H_0: p_i = \pi_i \text{ for all } i = 1, \dots, k \quad \text{versus} \quad H_1: p_i \ne \pi_i \text{ for some } i = 1, \dots, k.$$

The data $x$ are the outcomes of the $n$ experiments. A sufficient statistic is $\mathbf{n} = (n_1, \dots, n_k)$, where $n_i$ is the number of times that outcome $i$ occurs in the $n$ experiments. Thus $n = \sum_{i=1}^k n_i$. The likelihood function is

$$L(p \mid \mathbf{n}) = p_1^{n_1} \cdots p_k^{n_k}.$$

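Its logarithm is

$$\ln L(p \mid \mathbf{n}) = \sum_{i=1}^k n_i \ln p_i.$$

We maximize this using the method of Lagrange multipliers with the constraint $s(p) = \sum_{i=1}^k p_i = 1$. Thus, at the maximum likelihood estimator $(\hat p_1, \dots, \hat p_k)$,

$$\nabla_p \ln L(\hat p \mid \mathbf{n}) = \lambda \nabla_p s(\hat p), \qquad
\left( \frac{n_1}{\hat p_1}, \dots, \frac{n_k}{\hat p_k} \right) = \lambda (1, \dots, 1).$$

So

$$\frac{n_i}{\hat p_i} = \lambda, \qquad n_i = \lambda \hat p_i.$$

Now sum on $i$ to obtain

$$\sum_{i=1}^k n_i = \lambda \sum_{i=1}^k \hat p_i \quad \text{and} \quad n = \lambda.$$

Consequently,

$$n_i = n \hat p_i \quad \text{and} \quad \hat p_i = \frac{n_i}{n}.$$

The likelihood ratio is

$$\Lambda_n(\mathbf{n}) = \frac{L(\hat p \mid \mathbf{n})}{L(\pi \mid \mathbf{n})} = \left( \frac{n_1}{n\pi_1} \right)^{n_1} \cdots \left( \frac{n_k}{n\pi_k} \right)^{n_k}.$$

Recall that as the number of experiments $n \to \infty$,

$$2 \ln \Lambda_n(N) = 2 \sum_{i=1}^k N_i \ln \frac{N_i}{n\pi_i} = -2 \sum_{i=1}^k N_i \ln \frac{n\pi_i}{N_i}$$

converges in distribution to a $\chi^2_{k-1}$ random variable. Here $N = (N_1, \dots, N_k)$, where $N_i$ is the observed number of occurrences of outcome $i$.

The traditional method was introduced between 1895 and 1900 by Karl Pearson and consequently has been in use for longer than the idea of likelihood ratio tests. To show the connection between the two tests, recall that

$$\ln a \approx (a - 1) - \frac{1}{2}(a - 1)^2$$

is the quadratic Taylor polynomial approximation of $\ln a$ about $a = 1$. Applying this with $a = n\pi_i/N_i$ to the logarithm of the likelihood ratio, we find that

$$2 \ln \Lambda_n(N) = -2 \sum_{i=1}^k N_i \ln \frac{n\pi_i}{N_i}
\approx -2 \sum_{i=1}^k N_i \left( \frac{n\pi_i}{N_i} - 1 \right) + \sum_{i=1}^k N_i \left( \frac{n\pi_i}{N_i} - 1 \right)^2
= 0 + \sum_{i=1}^k \frac{(N_i - n\pi_i)^2}{N_i},$$

where the first sum vanishes because $\sum_i n\pi_i = \sum_i N_i = n$.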
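The closeness of the likelihood ratio statistic and Pearson's statistic can be seen numerically. The sketch below is illustrative only: the function names are not from the notes, and the counts are hypothetical, chosen so that the observed frequencies are near their expected values (the regime in which the Taylor approximation is accurate).

```python
from math import log


def likelihood_ratio_statistic(observed, probs):
    """2 ln(Lambda) = 2 * sum of N_i * ln(N_i / (n * pi_i))."""
    n = sum(observed)
    # Cells with a zero count contribute nothing to the sum.
    return 2 * sum(o * log(o / (n * p)) for o, p in zip(observed, probs) if o > 0)


def pearson_statistic(observed, probs):
    """Sum of (N_i - n * pi_i)^2 / (n * pi_i)."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


# Hypothetical counts for k = 4 equally likely outcomes (n = 200).
counts = [48, 55, 47, 50]
probs = [0.25, 0.25, 0.25, 0.25]

g = likelihood_ratio_statistic(counts, probs)
q = pearson_statistic(counts, probs)
print(f"likelihood ratio statistic: {g:.4f}")
print(f"Pearson statistic:          {q:.4f}")
```

For counts far from their expectations the two statistics can differ noticeably, although both are referred to the same $\chi^2_{k-1}$ distribution.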

This is generally rewritten by writing $O_i = N_i$ for the number of observed occurrences of $i$ and $E_i = n\pi_i$ for the number of expected occurrences of $i$ as given by $H_0$. The data can be stored in a table:

i           1      2      ...    k
observed    O_1    O_2    ...    O_k
expected    E_1    E_2    ...    E_k

Then

$$\sum_{i=1}^k \frac{(N_i - n\pi_i)^2}{N_i} \approx \sum_{i=1}^k \frac{(N_i - n\pi_i)^2}{n\pi_i} = \sum_{i=1}^k \frac{(O_i - E_i)^2}{E_i} = Q_{k-1}.$$

Example 1. (Textbook Example 5.7.1) Roll a die and let $A_i = \{x : x = i\}$, $i = 1, \dots, 6$. The hypothesis

$$H_0: P(A_i) = \pi_i = 1/6 \quad \text{versus} \quad H_1: \text{all the alternatives}$$

will be tested at significance level 5%, with $n = 60$ and $k = 6$. Let $X_i$ denote the frequency with which the random experiment terminates with the outcome in $A_i$. The data are as follows:

Outcome    A_1    A_2    A_3    A_4    A_5    A_6
Freq

Applying the above idea,

$$Q_5 = \sum_{i=1}^6 \frac{(O_i - 60 \cdot 1/6)^2}{60 \cdot 1/6} = \frac{(13 - 10)^2}{10} + \cdots + \frac{(4 - 10)^2}{10} = 15.6.$$

Hence the p-value $= P(Q_5 > 15.6 \mid H_0) < 0.05$, and we reject the null hypothesis.

1 Contingency tables

For an $r \times c$ contingency table, we consider two classifications for an experiment. Thus, we can partition the outcome of each experiment in two ways: $A_1, \dots, A_c$ and $B_1, \dots, B_r$. Here, we write $O_{ij}$ to denote the number of occurrences of the outcome $B_i \cap A_j$ and organize the results in a two-way table, with row totals $O_{i\cdot}$ and column totals $O_{\cdot j}$.

         A_1      A_2      ...    A_c      total
B_1      O_11     O_12     ...    O_1c     O_1.
B_2      O_21     O_22     ...    O_2c     O_2.
...
B_r      O_r1     O_r2     ...    O_rc     O_r.
total    O_.1     O_.2     ...    O_.c     n
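The die example can be reproduced with a few lines of code. This is a sketch under a stated assumption: the frequency table is not fully recoverable from the notes, so the counts below are an assumed set chosen to agree with the surviving terms (13 and 4), with $n = 60$, and with $Q_5 = 15.6$. The helper names are illustrative, and the tail formula used is a closed form valid only for 5 degrees of freedom.

```python
from math import erfc, exp, pi, sqrt


def pearson_statistic(observed, probs):
    """Pearson's Q = sum of (O_i - E_i)^2 / E_i with E_i = n * pi_i."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


def chi2_tail_df5(x):
    """P(chi-square with 5 degrees of freedom > x), closed form for df = 5."""
    root = sqrt(x)
    normal_density = exp(-x / 2) / sqrt(2 * pi)  # standard normal pdf at sqrt(x)
    return erfc(root / sqrt(2)) + 2 * normal_density * (root + root ** 3 / 3)


# ASSUMED frequencies (see lead-in): consistent with n = 60 and Q_5 = 15.6.
freq = [13, 19, 11, 8, 5, 4]
q5 = pearson_statistic(freq, [1 / 6] * 6)
p_value = chi2_tail_df5(q5)

print(f"Q_5 = {q5:.1f}")
print(f"p-value = {p_value:.4f}")
print("reject H0 at 5%:", p_value < 0.05)
```

Comparing $Q_5$ with the tabulated 5% critical value of $\chi^2_5$ (about 11.07) leads to the same decision as comparing the p-value with 0.05.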

The null hypothesis is that the classifications $A$ and $B$ are independent. To set the parameter space for this model, we have the $(rc - 1)$-simplex

$$\Theta = \left\{ p = (p_{ij},\ 1 \le i \le r,\ 1 \le j \le c) : p_{ij} \ge 0 \text{ for all } i, j, \ \sum_{i=1}^r \sum_{j=1}^c p_{ij} = 1 \right\}.$$

Write

$$p_{i\cdot} = \sum_{j=1}^c p_{ij} \quad \text{and} \quad p_{\cdot j} = \sum_{i=1}^r p_{ij}.$$

The hypothesis test is

$$H_0: p_{ij} = p_{i\cdot}\, p_{\cdot j} \text{ for all } i, j \quad \text{versus} \quad H_1: p_{ij} \ne p_{i\cdot}\, p_{\cdot j} \text{ for some } i, j.$$

Following the same procedure as for the goodness of fit test, we end with the test statistic

$$-2 \sum_{i=1}^r \sum_{j=1}^c O_{ij} \ln \frac{E_{ij}}{O_{ij}} \approx \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = Q_{(r-1)(c-1)}, \tag{1}$$

which is asymptotically distributed as $\chi^2((r-1)(c-1))$, where $E_{ij} = O_{i\cdot}\, O_{\cdot j}/n$. For a $2 \times 2$ contingency table, the above formula can be simplified to

$$Q_1 = \frac{n (O_{11} O_{22} - O_{21} O_{12})^2}{O_{1\cdot}\, O_{2\cdot}\, O_{\cdot 1}\, O_{\cdot 2}}.$$

Example 2. (Association test) In a cancer study, we would like to know whether smoking is associated with lung cancer. We collected the following (hypothetical) data, with $n = 105$.

              No cancer    Cancer    Total
Non-smoker
Smoker

Using the simplified formula above with $n = 105$, we get $Q_1 = 3.42$. Alternatively, use equation (1): first find the expected counts $E_{ij}$ (one of which is 20.43), then sum the four terms $(O_{ij} - E_{ij})^2 / E_{ij}$, which gives the same value. Hence the p-value is $P(\chi^2(1) > 3.42) \approx 0.064$, and we fail to reject the independence hypothesis at significance level 5%.

Example 3. (Sex bias in graduate admission at Berkeley) Back to the question at the beginning: is there really sex bias in graduate admission? Applying the test just discussed, we find that the p-value is quite small, so we reject the null hypothesis that sex is independent of admission to graduate school. However, when the individual departments were examined, it was found that no department was significantly biased against women. In fact, most departments had a small but statistically significant bias in favor of women.
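The agreement between the general statistic (1) and the $2 \times 2$ shortcut can be checked directly. The sketch below uses a hypothetical table, not the data of Example 2 (which are not recoverable from the notes); the function names are illustrative, and the p-value uses the identity $P(\chi^2(1) > x) = \operatorname{erfc}(\sqrt{x/2})$.

```python
from math import erfc, sqrt


def pearson_independence(table):
    """Q = sum of (O_ij - E_ij)^2 / E_ij with E_ij = (row total * column total) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    q = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            q += (observed - expected) ** 2 / expected
    return q


def shortcut_2x2(table):
    """Q_1 = n * (O11*O22 - O21*O12)^2 / (product of the four marginal totals)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return n * (a * d - c * b) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# HYPOTHETICAL 2 x 2 counts, used only to illustrate the formulas.
table = [[40, 10],
         [30, 20]]

q_general = pearson_independence(table)
q_short = shortcut_2x2(table)
p_value = erfc(sqrt(q_short / 2))  # upper tail of chi-square with 1 df

print(f"general formula: {q_general:.4f}")
print(f"2x2 shortcut:    {q_short:.4f}")
print(f"p-value:         {p_value:.4f}")
```

The general function works for any $r \times c$ table, while the shortcut applies only to the $2 \times 2$ case.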

              Men                         Women
Department    Applicants    % admitted    Applicants    % admitted
A
B                                         25            68%
C
D
E
F             272           6%            341           7%

The research paper by Peter Bickel et al. (1975) concluded that women tended to apply to competitive departments with low rates of admission even among qualified applicants (such as the English Department), whereas men tended to apply to less competitive departments with high rates of admission among qualified applicants (such as engineering and chemistry).

Topic 19: Goodness of Fit

Topic 19: Goodness of Fit Topic 19: November 24, 2009 A goodness of fit test examine the case of a sequence if independent experiments each of which can have 1 of k possible outcomes. In terms of hypothesis testing, let π = (π

More information

Topic 21: Goodness of Fit

Topic 21: Goodness of Fit Topic 21: December 5, 2011 A goodness of fit tests examine the case of a sequence of independent observations each of which can have 1 of k possible categories. For example, each of us has one of 4 possible

More information

Topic 21 Goodness of Fit

Topic 21 Goodness of Fit Topic 21 Goodness of Fit Fit of a Distribution 1 / 14 Outline Fit of a Distribution Blood Bank Likelihood Function Likelihood Ratio Lagrange Multipliers Hanging Chi-Gram 2 / 14 Fit of a Distribution Goodness

More information

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators... MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

More information

O, A, B, and AB. 0 for all i =1,...,k,

O, A, B, and AB. 0 for all i =1,...,k, Topic 1 1.1 Fit of a Distribution Goodness of fit tests examine the case of a sequence of independent observations each of which can have 1 of k possible categories. For example, each of us has one of

More information

7 Hypothesis testing - one sample tests

7 Hypothesis testing - one sample tests 7 Hypothesis testing - one sample tests 7.1 Introduction Definition 7.1 A hypothesis is a statement about a population parameter. Example A hypothesis might be that the mean age of students taking MAS113X

More information

Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics

Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics Statistical Testing of Randomness Masaryk University in Brno Faculty of Informatics Jan Krhovják Basic Idea Behind the Statistical Tests Generated random sequences properties as sample drawn from uniform/rectangular

More information

3.4 Statistical inference for 2 populations based on two samples

3.4 Statistical inference for 2 populations based on two samples 3.4 Statistical inference for 2 populations based on two samples Tests for a difference between two population means The first sample will be denoted as X 1, X 2,..., X m. The second sample will be denoted

More information

9-3.4 Likelihood ratio test. Neyman-Pearson lemma

9-3.4 Likelihood ratio test. Neyman-Pearson lemma 9-3.4 Likelihood ratio test Neyman-Pearson lemma 9-1 Hypothesis Testing 9-1.1 Statistical Hypotheses Statistical hypothesis testing and confidence interval estimation of parameters are the fundamental

More information

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures. Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

More information

The Neyman-Pearson lemma. The Neyman-Pearson lemma

The Neyman-Pearson lemma. The Neyman-Pearson lemma The Neyman-Pearson lemma In practical hypothesis testing situations, there are typically many tests possible with significance level α for a null hypothesis versus alternative hypothesis. This leads to

More information

Odds ratio, Odds ratio test for independence, chi-squared statistic.

Odds ratio, Odds ratio test for independence, chi-squared statistic. Odds ratio, Odds ratio test for independence, chi-squared statistic. Announcements: Assignment 5 is live on webpage. Due Wed Aug 1 at 4:30pm. (9 days, 1 hour, 58.5 minutes ) Final exam is Aug 9. Review

More information

Chi-square test Testing for independeny The r x c contingency tables square test

Chi-square test Testing for independeny The r x c contingency tables square test Chi-square test Testing for independeny The r x c contingency tables square test 1 The chi-square distribution HUSRB/0901/1/088 Teaching Mathematics and Statistics in Sciences: Modeling and Computer-aided

More information

Chapter 8. Hypothesis Testing

Chapter 8. Hypothesis Testing Chapter 8 Hypothesis Testing Hypothesis In statistics, a hypothesis is a claim or statement about a property of a population. A hypothesis test (or test of significance) is a standard procedure for testing

More information

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%

In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4% Hypothesis Testing for a Proportion Example: We are interested in the probability of developing asthma over a given one-year period for children 0 to 4 years of age whose mothers smoke in the home In the

More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

6. Duality between confidence intervals and statistical tests

6. Duality between confidence intervals and statistical tests 6. Duality between confidence intervals and statistical tests Suppose we carry out the following test at a significance level of 100α%. H 0 :µ = µ 0 H A :µ µ 0 Then we reject H 0 if and only if µ 0 does

More information

Practice problems for Homework 11 - Point Estimation

Practice problems for Homework 11 - Point Estimation Practice problems for Homework 11 - Point Estimation 1. (10 marks) Suppose we want to select a random sample of size 5 from the current CS 3341 students. Which of the following strategies is the best:

More information

Topic 8. Chi Square Tests

Topic 8. Chi Square Tests BE540W Chi Square Tests Page 1 of 5 Topic 8 Chi Square Tests Topics 1. Introduction to Contingency Tables. Introduction to the Contingency Table Hypothesis Test of No Association.. 3. The Chi Square Test

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

Estimating the Frequency Distribution of the. Numbers Bet on the California Lottery

Estimating the Frequency Distribution of the. Numbers Bet on the California Lottery Estimating the Frequency Distribution of the Numbers Bet on the California Lottery Mark Finkelstein November 15, 1993 Department of Mathematics, University of California, Irvine, CA 92717. Running head:

More information

Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one

Test of Hypotheses. Since the Neyman-Pearson approach involves two statistical hypotheses, one has to decide which one Test of Hypotheses Hypothesis, Test Statistic, and Rejection Region Imagine that you play a repeated Bernoulli game: you win $1 if head and lose $1 if tail. After 10 plays, you lost $2 in net (4 heads

More information

Social Studies 201 Notes for November 19, 2003

Social Studies 201 Notes for November 19, 2003 1 Social Studies 201 Notes for November 19, 2003 Determining sample size for estimation of a population proportion Section 8.6.2, p. 541. As indicated in the notes for November 17, when sample size is

More information

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems. Practice problems for Homework 1 - confidence intervals and hypothesis testing. Read sections 10..3 and 10.3 of the text. Solve the practice problems below. Open the Homework Assignment 1 and solve the

More information

NPTEL STRUCTURAL RELIABILITY

NPTEL STRUCTURAL RELIABILITY NPTEL Course On STRUCTURAL RELIABILITY Module # 02 Lecture 6 Course Format: Web Instructor: Dr. Arunasis Chakraborty Department of Civil Engineering Indian Institute of Technology Guwahati 6. Lecture 06:

More information

Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

More information

[Chapter 10. Hypothesis Testing]

[Chapter 10. Hypothesis Testing] [Chapter 10. Hypothesis Testing] 10.1 Introduction 10.2 Elements of a Statistical Test 10.3 Common Large-Sample Tests 10.4 Calculating Type II Error Probabilities and Finding the Sample Size for Z Tests

More information

Unit 29 Chi-Square Goodness-of-Fit Test

Unit 29 Chi-Square Goodness-of-Fit Test Unit 29 Chi-Square Goodness-of-Fit Test Objectives: To perform the chi-square hypothesis test concerning proportions corresponding to more than two categories of a qualitative variable To perform the Bonferroni

More information

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

Chapter 4: Statistical Hypothesis Testing

Chapter 4: Statistical Hypothesis Testing Chapter 4: Statistical Hypothesis Testing Christophe Hurlin November 20, 2015 Christophe Hurlin () Advanced Econometrics - Master ESA November 20, 2015 1 / 225 Section 1 Introduction Christophe Hurlin

More information

Hypothesis testing for µ:

Hypothesis testing for µ: University of California, Los Angeles Department of Statistics Statistics 13 Elements of a hypothesis test: Hypothesis testing Instructor: Nicolas Christou 1. Null hypothesis, H 0 (always =). 2. Alternative

More information

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96 1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

More information

Likelihood Approaches for Trial Designs in Early Phase Oncology

Likelihood Approaches for Trial Designs in Early Phase Oncology Likelihood Approaches for Trial Designs in Early Phase Oncology Clinical Trials Elizabeth Garrett-Mayer, PhD Cody Chiuzan, PhD Hollings Cancer Center Department of Public Health Sciences Medical University

More information

Comparing Multiple Proportions, Test of Independence and Goodness of Fit

Comparing Multiple Proportions, Test of Independence and Goodness of Fit Comparing Multiple Proportions, Test of Independence and Goodness of Fit Content Testing the Equality of Population Proportions for Three or More Populations Test of Independence Goodness of Fit Test 2

More information

Multiple random variables

Multiple random variables Multiple random variables Multiple random variables We essentially always consider multiple random variables at once. The key concepts: Joint, conditional and marginal distributions, and independence of

More information

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test Experimental Design Power and Sample Size Determination Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 3 8, 2011 To this point in the semester, we have largely

More information

Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.)

Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.) Unit 12 Logistic Regression Supplementary Chapter 14 in IPS On CD (Chap 16, 5th ed.) Logistic regression generalizes methods for 2-way tables Adds capability studying several predictors, but Limited to

More information

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

We know from STAT.1030 that the relevant test statistic for equality of proportions is: 2. Chi 2 -tests for equality of proportions Introduction: Two Samples Consider comparing the sample proportions p 1 and p 2 in independent random samples of size n 1 and n 2 out of two populations which

More information

Math 115 Spring 2011 Written Homework 5 Solutions

Math 115 Spring 2011 Written Homework 5 Solutions . Evaluate each series. a) 4 7 0... 55 Math 5 Spring 0 Written Homework 5 Solutions Solution: We note that the associated sequence, 4, 7, 0,..., 55 appears to be an arithmetic sequence. If the sequence

More information

Sampling and Hypothesis Testing

Sampling and Hypothesis Testing Population and sample Sampling and Hypothesis Testing Allin Cottrell Population : an entire set of objects or units of observation of one sort or another. Sample : subset of a population. Parameter versus

More information

Simple Linear Regression Inference

Simple Linear Regression Inference Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

More information

Math 251, Review Questions for Test 3 Rough Answers

Math 251, Review Questions for Test 3 Rough Answers Math 251, Review Questions for Test 3 Rough Answers 1. (Review of some terminology from Section 7.1) In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate,

More information

Chi Square Distribution

Chi Square Distribution 17. Chi Square A. Chi Square Distribution B. One-Way Tables C. Contingency Tables D. Exercises Chi Square is a distribution that has proven to be particularly useful in statistics. The first section describes

More information

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression Objectives: To perform a hypothesis test concerning the slope of a least squares line To recognize that testing for a

More information

Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

More information

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

 Y. Notation and Equations for Regression Lecture 11/4. Notation: Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

More information

Categorical Data Analysis

Categorical Data Analysis Richard L. Scheaffer University of Florida The reference material and many examples for this section are based on Chapter 8, Analyzing Association Between Categorical Variables, from Statistical Methods

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations.

13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. 13.2 The Chi Square Test for Homogeneity of Populations The setting: Used to compare distribution of proportions in two or more populations. Data is organized in a two way table Explanatory variable (Treatments)

More information

11-2 Goodness of Fit Test

11-2 Goodness of Fit Test 11-2 Goodness of Fit Test In This section we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table). We will use a hypothesis

More information

Chapter 4 Statistical Inference in Quality Control and Improvement. Statistical Quality Control (D. C. Montgomery)

Chapter 4 Statistical Inference in Quality Control and Improvement. Statistical Quality Control (D. C. Montgomery) Chapter 4 Statistical Inference in Quality Control and Improvement 許 湘 伶 Statistical Quality Control (D. C. Montgomery) Sampling distribution I a random sample of size n: if it is selected so that the

More information

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory

Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory LA-UR-12-24572 Approved for public release; distribution is unlimited Statistical Impact of Slip Simulator Training at Los Alamos National Laboratory Alicia Garcia-Lopez Steven R. Booth September 2012

More information

Two Correlated Proportions (McNemar Test)

Two Correlated Proportions (McNemar Test) Chapter 50 Two Correlated Proportions (Mcemar Test) Introduction This procedure computes confidence intervals and hypothesis tests for the comparison of the marginal frequencies of two factors (each with

More information

Effect estimation versus hypothesis testing

Effect estimation versus hypothesis testing Department of Epidemiology and Public Health Unit of Biostatistics and Computational Sciences Effect estimation versus hypothesis testing PD Dr. C. Schindler Swiss Tropical and Public Health Institute

More information

Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012.

Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG820). December 15, 2012. Solutions for the exam for Matematisk statistik och diskret matematik (MVE050/MSG810). Statistik för fysiker (MSG8). December 15, 12. 1. (3p) The joint distribution of the discrete random variables X and

More information

MATH 10: Elementary Statistics and Probability Chapter 11: The Chi-Square Distribution

MATH 10: Elementary Statistics and Probability Chapter 11: The Chi-Square Distribution MATH 10: Elementary Statistics and Probability Chapter 11: The Chi-Square Distribution Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of slides,

More information

Algebra. Exponents. Absolute Value. Simplify each of the following as much as possible. 2x y x + y y. xxx 3. x x x xx x. 1. Evaluate 5 and 123

Algebra. Exponents. Absolute Value. Simplify each of the following as much as possible. 2x y x + y y. xxx 3. x x x xx x. 1. Evaluate 5 and 123 Algebra Eponents Simplify each of the following as much as possible. 1 4 9 4 y + y y. 1 5. 1 5 4. y + y 4 5 6 5. + 1 4 9 10 1 7 9 0 Absolute Value Evaluate 5 and 1. Eliminate the absolute value bars from

More information

11. Analysis of Case-control Studies Logistic Regression

11. Analysis of Case-control Studies Logistic Regression Research methods II 113 11. Analysis of Case-control Studies Logistic Regression This chapter builds upon and further develops the concepts and strategies described in Ch.6 of Mother and Child Health:

More information

Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS

Bivariate Analysis. Comparisons of proportions: Chi Square Test (X 2 test) Variable 1. Variable 2 2 LEVELS >2 LEVELS CONTINUOUS Bivariate Analysis Variable 1 2 LEVELS >2 LEVELS CONTINUOUS Variable 2 2 LEVELS X 2 chi square test >2 LEVELS X 2 chi square test CONTINUOUS t-test X 2 chi square test X 2 chi square test ANOVA (F-test)

More information

Hypothesis. Testing Examples and Case Studies. Chapter 23. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Hypothesis. Testing Examples and Case Studies. Chapter 23. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc. Hypothesis Chapter 23 Testing Examples and Case Studies Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc. 23.1 How Hypothesis Tests Are Reported in the News 1. Determine the null hypothesis

More information

Logistic regression modeling the probability of success

Logistic regression modeling the probability of success Logistic regression modeling the probability of success Regression models are usually thought of as only being appropriate for target variables that are continuous Is there any situation where we might

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Module 7: Hypothesis Testing I Statistics (OA3102)

Module 7: Hypothesis Testing I Statistics (OA3102) Module 7: Hypothesis Testing I Statistics (OA3102) Professor Ron Fricker Naval Postgraduate School Monterey, California Reading assignment: WM&S chapter 10.1-10.5 Revision: 2-12 1 Goals for this Module

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test...

Hypothesis Testing COMP 245 STATISTICS. Dr N A Heard. 1 Hypothesis Testing 2 1.1 Introduction... 2 1.2 Error Rates and Power of a Test... Hypothesis Testing COMP 45 STATISTICS Dr N A Heard Contents 1 Hypothesis Testing 1.1 Introduction........................................ 1. Error Rates and Power of a Test.............................

More information

Statistics 641 - EXAM II - 1999 through 2003

December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

Analysis and Interpretation of Clinical Trials. How to conclude?

www.eurordis.org Statistical Issues Dr Ferran Torres Unitat de Suport en Estadística i Metodología - USEM Statistics and Methodology Support

Goodness of fit - 2 classes

Observed counts: A = 78, B = 22. Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p_A = 0.75: exact p-value, exact confidence interval, normal approximation

Financial Risk Forecasting Chapter 8 Backtesting and stresstesting

Jon Danielsson London School of Economics 2015 To accompany Financial Risk Forecasting http://www.financialriskforecasting.com/ Published

General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

Five types of statistical analysis: Descriptive, Inferential, Differences, Associative, Predictive. What are the characteristics of the respondents? What are the characteristics

Stats Review Chapters 9-10

Created by Teri Johnson Math Coordinator, Mary Stangler Center for Academic Success Examples are taken from Statistics 4 E by Michael Sullivan, III And the corresponding Test

Common Univariate and Bivariate Applications of the Chi-square Distribution

The probability density function defining the chi-square distribution is given in the chapter on Chi-square in Howell's text.

Name: ID: Discussion Section:

Math 28 Midterm 3 Spring 2009 Name: ID: Discussion Section: This exam consists of 6 questions: 4 multiple choice questions worth 5 points each 2 hand-graded questions worth a total of 30 points. INSTRUCTIONS:

WHERE DOES THE 10% CONDITION COME FROM?

The text has mentioned The 10% Condition (at least) twice so far: p. 407 Bernoulli trials must be independent. If that assumption is violated, it is still okay

Poisson Models for Count Data

Chapter 4 Poisson Models for Count Data In this chapter we study log-linear models for count data under the assumption of a Poisson error structure. These models have many applications, not only to the

Dongfeng Li. Autumn 2010

Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

SIMPLE LINEAR CORRELATION. r can range from -1 to 1, and is independent of units of measurement. Correlation can be done on two dependent variables.

Simple linear correlation is a measure of the degree to which two variables vary together, or a measure of the intensity of the association between two variables. Correlation

Chi-square (χ 2 ) Tests

Math 442 - Mathematical Statistics II May 5, 2008 Common Uses of the χ² test. 1. Testing Goodness-of-fit. 2. Testing Equality of Several Proportions. 3. Homogeneity Test. 4. Testing

Size of a study. Chapter 15

15.1 Introduction It is important to ensure at the design stage that the proposed number of subjects to be recruited into any study will be appropriate to answer the main objective(s) of

CHAPTER 11. GOODNESS OF FIT AND CONTINGENCY TABLES

The chi-square distribution was discussed in Chapter 4. We now turn to some applications of this distribution. As previously discussed, chi-square is

Homework 5 Solutions

Math 130 Assignment Chapter 18: 6, 10, 38 Chapter 19: 4, 6, 8, 10, 14, 16, 40 Chapter 20: 2, 4, 9 Chapter 18 18.6] M&M's. The candy company claims that 10% of the M&M's it produces

Goodness of Fit Goodness of fit - 2 classes

Observed counts: A = 78, B = 22. Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p_A = 0.75: exact p-value, exact confidence interval

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

In the previous chapter we discussed procedures for fitting a hypothesized function to a set of experimental data points. Such procedures involve minimizing a quantity

Extending Hypothesis Testing. p-values & confidence intervals

So far: how to state a question in the form of two hypotheses (null and alternative), how to assess the data, how to answer the question by

Chapter 3 RANDOM VARIATE GENERATION

In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

CHAPTER NINE. Key Concepts. McNemar s test for matched pairs combining p-values likelihood ratio criterion

Key Concepts: chi-square distribution, chi-square test, degrees of freedom, observed and expected values, goodness-of-fit tests, contingency table, dependent (response) variables, independent

Chi-square test Fisher s Exact test

Lesson 1 Chi-square test Fisher's Exact test McNemar's Test Lesson 1 Overview Lesson 11 covered two inference methods for categorical data from groups Confidence Intervals for the difference of two proportions

1. Comparing Two Means: Dependent Samples

In the preceding lectures we've considered how to test a difference of two means for independent samples. Now we look at how to do the same thing with dependent

CHAPTER 2 Estimating Probabilities

Machine Learning Copyright © 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR'S PERMISSION* This is a

Multinomial and Ordinal Logistic Regression

ME104: Linear Regression Analysis Kenneth Benoit August 22, 2012 Regression with categorical dependent variables When the dependent variable is categorical,

Study Guide for the Final Exam

When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

Lecture Outline. Hypothesis Testing. Simple vs. Composite Testing. Stat 111. Hypothesis Testing Framework

Stat 111 Lecture Outline Lecture 14: Intro to Hypothesis Testing Sections 9.1-9.3 in DeGroot 1 Hypothesis Testing Consider a statistical problem involving a parameter θ whose value is unknown but must

III. INTRODUCTION TO LOGISTIC REGRESSION. a) Example: APACHE II Score and Mortality in Sepsis

III. INTRODUCTION TO LOGISTIC REGRESSION 1. Simple Logistic Regression a) Example: APACHE II Score and Mortality in Sepsis The following figure shows 30 day mortality in a sample of septic patients as

Difference of Means and ANOVA Problems

Dr. Tom Ilvento FREC 408 Accounting Firm Study An accounting firm specializes in auditing the financial records of large firms. It is interested in evaluating its fee structure, particularly

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

In this section, the χ² distribution is used for testing the goodness of fit of a set of data to a specific probability

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp. 380-394

1. Does vigorous exercise affect concentration? In general, the time needed for people to complete

Senior Secondary Australian Curriculum

Mathematical Methods Glossary Unit 1 Functions and graphs Asymptote A line is an asymptote to a curve if the distance between the line and the curve approaches zero

MA107 Precalculus Algebra Exam 2 Review Solutions

February 24, 2008 1. The following demand equation models the number of units sold, x, of a product as a function of price, p. x = 4p + 200 a. Please write