Lecture 2: Statistical Estimation and Testing

1 Bioinformatics: In-depth PROBABILITY AND STATISTICS, Spring Semester 2012. Lecture 2: Statistical Estimation and Testing. Stefanie Muff

2 Problems in statistics

3 The three main questions in statistics are:
Estimation: estimate the unknown value of θ, given observations of X. Question: what is the most likely value for θ?
Testing: test a hypothesis about the unknown value of θ. Base acceptance/rejection upon the observation of X. Question: is my hypothesis compatible with the observed data?
Confidence intervals: give an interval of parameter values that explain the data reasonably well. Question: which parameters would be compatible with my data?
We will concentrate on the first two questions.

5 Given: a probability model $X \sim P_\theta$. For example: $X \sim \mathrm{Bin}(100, p)$, but the probability $p$ is unknown. How to obtain a guess of $p$? => Estimation! The collection $x_1, x_2, \ldots, x_n$ is called the (observed) sample of $X_1, X_2, \ldots, X_n$.

6 Estimator, estimate

7 Examples of Estimators

8 Desirable properties of estimators

10 Likelihood function for discrete RVs

11 Likelihood function for continuous RVs
The likelihood function for continuous random variables can be set equal to the density function
$$ L(x_1, x_2, \ldots, x_n; \hat{\theta}) = f_X(x_1, x_2, \ldots, x_n; \hat{\theta}), $$
where $f_X$ is the joint density of $(X_1, X_2, \ldots, X_n)$. If $X_1, X_2, \ldots, X_n$ are independent, then
$$ L(x_1, x_2, \ldots, x_n; \hat{\theta}) = f_{X_1}(x_1; \hat{\theta})\, f_{X_2}(x_2; \hat{\theta}) \cdots f_{X_n}(x_n; \hat{\theta}). $$

12 Maximum likelihood estimator

13 Maximum likelihood estimate

14 Properties of MLEs

15 Example: ML for the binomial distribution

17 Compare this to the estimators on Slide 7: the ML estimator is [formula missing].

18 The Log Likelihood

19 Likelihoods are not just for independent observations!

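20 Example: Log likelihood for the binomial distribution
Instead of optimizing the likelihood itself, the log likelihood
$$ x_1 \log(\theta) + (100 - x_1)\log(1-\theta) + \ldots + x_n \log(\theta) + (100 - x_n)\log(1-\theta) = \log(\theta) \sum_i x_i + \log(1-\theta) \sum_i (100 - x_i) $$
has to be optimized to obtain the ML estimator (the additive constant coming from the binomial coefficients does not depend on θ and is omitted). The result is exactly the same as in the non-log case (check as an exercise).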
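The log likelihood above can be maximized numerically as a sanity check. The following is a minimal sketch, assuming a hypothetical sample of four counts out of 100 trials each; the function names and the data are illustrative and not part of the lecture. It compares a simple grid search with the closed-form maximizer $\sum_i x_i / (100\, n)$ that follows from setting the derivative to zero.

```python
import math

def binomial_log_likelihood(theta, counts, trials=100):
    """Log likelihood from the slide, without the constant binomial coefficients."""
    successes = sum(counts)
    failures = sum(trials - x for x in counts)
    return math.log(theta) * successes + math.log(1 - theta) * failures

def grid_mle(counts, trials=100, steps=10000):
    """Return the grid point in (0, 1) with the largest log likelihood."""
    grid = [k / steps for k in range(1, steps)]
    return max(grid, key=lambda t: binomial_log_likelihood(t, counts, trials))

# Hypothetical observations x_1..x_4 (assumption, not data from the lecture).
counts = [22, 27, 31, 25]
closed_form = sum(counts) / (100 * len(counts))
numeric = grid_mle(counts)
print(closed_form, numeric)
```

A grid search is used here only because it is easy to read; it is not how the optimization would normally be done in practice.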

21 Example: MLE for a normal distribution
Remember:
$$ f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} $$
Given a set of n independent observations $x_1, x_2, \ldots, x_n$, the log likelihood then is
$$ \log f(x_1, \ldots, x_n; \mu, \sigma^2) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log(\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2 $$
This expression has to be differentiated with respect to $\sigma^2$ and $\mu$ separately, and each derivative set to 0. => Obtain two equations to estimate two parameters. See example in the exercises.
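Solving the two equations gives the sample mean for $\mu$ and the average squared deviation (divisor $n$, not $n-1$) for $\sigma^2$. The sketch below evaluates the log likelihood from the slide and checks numerically that these values are not beaten by nearby parameter values. The observations and function names are hypothetical.

```python
import math
from statistics import fmean

def normal_log_likelihood(xs, mu, sigma2):
    """Log likelihood of independent N(mu, sigma2) observations."""
    n = len(xs)
    squared_deviations = sum((x - mu) ** 2 for x in xs)
    return (-0.5 * n * math.log(2 * math.pi)
            - 0.5 * n * math.log(sigma2)
            - squared_deviations / (2 * sigma2))

def normal_mle(xs):
    """Closed-form ML estimates: sample mean and variance with divisor n."""
    mu_hat = fmean(xs)
    sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / len(xs)
    return mu_hat, sigma2_hat

# Hypothetical measurements (assumption, not data from the lecture).
observations = [4.1, 5.3, 4.8, 5.9, 5.0]
mu_hat, sigma2_hat = normal_mle(observations)
best = normal_log_likelihood(observations, mu_hat, sigma2_hat)
print(mu_hat, sigma2_hat, best)
```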

22 MLE in practice
Analytical formulas for the ML estimator can be found only in relatively simple models. In other cases, approximate ML estimators can be found by
iterative numerical optimization (Expectation-Maximization algorithm, Newton-Raphson algorithm)
second-order Taylor approximations.
These calculations are left to the computer (R).
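To illustrate what iterative numerical optimization means, here is a minimal Newton-Raphson sketch for the binomial log likelihood of slide 20, where the answer is known in closed form and can be used as a check. The function name, the starting value and the data are assumptions for illustration; real problems would use a library optimizer.

```python
def newton_raphson_binomial_mle(counts, trials=100, start=0.5, tol=1e-12, max_iter=50):
    """Maximize log(theta)*S + log(1-theta)*F by Newton-Raphson on its derivative."""
    successes = sum(counts)
    failures = sum(trials - x for x in counts)
    theta = start
    for _ in range(max_iter):
        score = successes / theta - failures / (1 - theta)                # first derivative
        curvature = -successes / theta ** 2 - failures / (1 - theta) ** 2  # second derivative
        step = score / curvature
        theta -= step
        # Keep the iterate strictly inside (0, 1) so the next step is defined.
        theta = min(max(theta, 1e-9), 1 - 1e-9)
        if abs(step) < tol:
            break
    return theta

# Hypothetical observations (assumption, not data from the lecture).
counts = [22, 27, 31, 25]
theta_hat = newton_raphson_binomial_mle(counts, start=0.4)
print(theta_hat, sum(counts) / (100 * len(counts)))
```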

23 Statistical Testing

25 Introductory example revisited
gaggattacggtactagattcataaa
cactgacacatcactgcactcgctaa
Two DNA sequences of length 26. Matches at 11 of 26 positions. Is this sufficient to conclude that the two sequences are evolutionarily related?
In order to answer this question, we have to find out how unlikely it would be to see 11 out of 26 matches by chance. We need to know the probability distribution of the random variable describing this experiment. We can then calculate the probability of the event. This is the essence of statistical testing.
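The number of matches can be checked directly. This sketch assumes that the 52 letters on the slide split into two sequences of 26 letters each, in the order shown; that split is a reconstruction, supported by the fact that it reproduces the 11 matches stated on the slide.

```python
def count_matches(first, second):
    """Count positions at which two equally long sequences carry the same letter."""
    if len(first) != len(second):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(first, second))

# Sequences as read from the slide (the split into two rows is an assumption).
seq1 = "gaggattacggtactagattcataaa"
seq2 = "cactgacacatcactgcactcgctaa"
matches = count_matches(seq1, seq2)
print(len(seq1), matches)
```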

26 Steps in a statistical test
1. Formulate null and alternative hypotheses H0 and H1.
2. Determine a test statistic T.
3. Determine the distribution of T under H0.
4. Choose the significance level α.
5. Calculate the critical value C.
6. Obtain the data and decide.
For illustration, we now go through steps 1-6 for the binomial test.

27 1. Formulate the hypotheses
A hypothesis typically specifies the value of a parameter of a distribution. Here: $X \sim \mathrm{Bin}(26, p)$, but $p$ is not known.
The null hypothesis H0 is the default hypothesis:
$$ H_0: X \sim \mathrm{Bin}(26, p),\quad p = 0.25 $$
The alternative hypothesis H1 is the controversial hypothesis. Strong evidence is needed to accept it over H0:
$$ H_1: X \sim \mathrm{Bin}(26, p),\quad p > 0.25 $$
Aim of a test: to find evidence against H0 in order to reject it.

28 2. Determine a test statistic
A test statistic T is a numerical value that can be determined from the outcome of a chance experiment. Note that, by definition, T is a random variable as well!
Here, T = number of matches between the two sequences (= X). There is only one realization.
Usually there is more than one realization in a random sample, and the test statistic depends on all realizations. Other examples:
$$ T(X_1, \ldots, X_n) = \frac{X_1 + \ldots + X_n}{n} = \bar{X} \quad \text{(mean)} $$
$$ T(X_1, \ldots, X_n) = \frac{\bar{X} - \mu_0}{\hat{\sigma}/\sqrt{n}} \quad \text{(T-statistic)} $$

29 3. Distribution of T under H0
In case of H0 (pure chance alignment), the distribution of T is $T \sim \mathrm{Bin}(26, 0.25)$.
(Note that in reality Bin(26, p) is not the right distribution for this problem; we only use it to illustrate the idea of statistical testing.)

30 4. Choose the significance level α
In our example we reject H0 if the number of matches is too high, so that it is unlikely to happen by chance. α determines what "unlikely" means. Let us choose α = 0.05.
The significance level α fixes the probability with which H0 is rejected although it is true.
Interpretation: In 5% of the cases in which H0 is true (1 out of 20) we will find a value of T so high that we do not believe it has happened by chance - although it did!
α = probability of rejecting a true null hypothesis = probability of making a type I error.

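31 5. Calculate the critical value
We now calculate a value C for the test statistic T, at or above which we consider it unlikely that H0 is true:
$$ P(T \geq C \mid H_0) = \alpha $$
Since T is discrete, this equation can usually not be met exactly; C is then taken as the smallest value with $P(T \geq C \mid H_0) \leq \alpha$.
In our example with H0: $T = X \sim \mathrm{Bin}(26, 0.25)$, the tail probabilities $P(X \geq k \mid H_0)$ are evaluated for k = 7, 8, 9, 10, 11 (the numerical values are not preserved in this transcript), which gives C = 11. Each of them is calculated as
$$ P(X \geq k \mid H_0) = \sum_{i=k}^{26} \binom{26}{i}\, 0.25^{i}\, 0.75^{26-i} $$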
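The tail probabilities and the critical value can be recomputed from the sum formula. This is a minimal sketch using only the standard library; the function names are illustrative, and the critical value is taken as the smallest C whose tail probability does not exceed α, which is an assumption about how the slide handles the discreteness of T.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Bin(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_value(n, p0, alpha):
    """Smallest C with P(X >= C | H0) <= alpha (n + 1 if only the empty tail qualifies)."""
    for c in range(n + 2):
        if binom_tail(c, n, p0) <= alpha:
            return c

for k in range(7, 12):
    print(k, round(binom_tail(k, 26, 0.25), 4))

C = critical_value(26, 0.25, 0.05)
print("critical value:", C)
```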

32 6. Decide
Only now is it finally allowed to calculate the value of T. Here, we already know that T = 11, since X = T. From step 5 we have the following rule: reject H0 if $T \geq 11$, and do not reject H0 if $T < 11$.
Decision: we reject H0. Thus we do not believe that 11 out of 26 matches can happen by chance. We say: there is statistical evidence that the two sequences are related due to evolution.

33 Statistical significance
Note: The decision to reject H0 on the previous slide depends on the significance level α. We would not have rejected H0 if α < 0.04! Whether the outcome of an experiment is statistically significant or not depends crucially on α! For α = 1 any result is significant... (but meaningless). Scientific results that claim statistical significance without giving α should at least be doubted...

34 p-values
The p-value is the probability, under H0, of seeing something at least as extreme as what was just observed. It depends on the data.
In our example: $P(X \geq 11 \mid H_0) \approx 0.04$ (compare the threshold α < 0.04 on the previous slide). Thus the p-value of our experiment is p ≈ 0.04.
Many statistics programmes (R, SPSS, ...) compute this directly. Your results are then significant if p < α.
Interpretation: The p-value tells you for which α your data would be significant.
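As a small illustration of the rule "significant if p < α", the sketch below computes the one-sided p-value for the observed 11 matches and compares it with two significance levels. The function name is hypothetical; the calculation is the same binomial tail sum as in step 5.

```python
from math import comb

def binomial_p_value(observed, n, p0):
    """One-sided p-value P(X >= observed) for X ~ Bin(n, p0)."""
    return sum(comb(n, i) * p0 ** i * (1 - p0) ** (n - i)
               for i in range(observed, n + 1))

p_value = binomial_p_value(11, 26, 0.25)
for alpha in (0.05, 0.01):
    verdict = "significant" if p_value < alpha else "not significant"
    print(f"alpha = {alpha}: p = {p_value:.4f} -> {verdict}")
```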

35 Type I and type II errors
A type I error is the rejection of the null hypothesis although it is true. Its probability is fixed by the significance level:
$$ \alpha = P(\text{reject } H_0 \mid H_0 \text{ true}) $$
A type II error is the other kind of false decision: the null hypothesis is not rejected although it is wrong. Its probability is
$$ \beta = P(\text{do not reject } H_0 \mid H_0 \text{ false}) $$

36 The power of a test
The power is typically more complicated to compute, especially if H1 is unknown.

37 Example

38 BUT if we had chosen α = 0.01, the power (1-β) would be lower! E.g. $1 - \beta = P(X \geq 11 \mid p = 0.26)$ and $1 - \beta = P(X \geq 11 \mid p = 0.3)$ (the numerical values are not preserved in this transcript).

39 Fact: A decrease of the type I error comes at the expense of an increased type II error - and vice versa. There is a trade-off between a low significance level α and a high power 1-β.
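The trade-off can be made concrete for the sequence example. The sketch below assumes n = 26, H0: p = 0.25 and a single alternative p = 0.3, and compares the power at α = 0.05 and α = 0.01. The function names are illustrative, and the critical value is again the smallest C with tail probability at most α.

```python
from math import comb

def binom_tail(k, n, p):
    """P(X >= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def critical_value(n, p0, alpha):
    """Smallest C with P(X >= C | p0) <= alpha."""
    for c in range(n + 2):
        if binom_tail(c, n, p0) <= alpha:
            return c

def power(n, p0, p1, alpha):
    """Probability of rejecting H0: p = p0 when the true value is p1."""
    return binom_tail(critical_value(n, p0, alpha), n, p1)

for alpha in (0.05, 0.01):
    c = critical_value(26, 0.25, alpha)
    print(f"alpha = {alpha}: C = {c}, power at p = 0.3: {power(26, 0.25, 0.3, alpha):.4f}")
```

Lowering α moves the critical value to the right, so fewer outcomes lead to rejection under either hypothesis, which is why the power drops.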

40 [Figure: Bin(20, 0.25) and Bin(20, 0.3) distributions, f(x) against x. Power if H0: p = 0.25, H1: p = 0.3.]

41 [Figure: Bin(20, 0.25) and Bin(20, 0.6) distributions, f(x) against x. Power if H0: p = 0.25, H1: p = 0.6.]

42 Bayesian Hypothesis Testing
Remember Bayes' theorem:
$$ P(A_j \mid B) = \frac{P(B \mid A_j)\, P(A_j)}{\sum_{i=1}^{n} P(B \mid A_i)\, P(A_i)} $$
Example (from Ewens/Grant): A bag contains 10 coins, of which only 3 are fair. The other 7 show heads with probability $p_H = 0.6$. Take one coin at random and flip it five times. All five flips give heads (event D). Then:
$P(H) = 0.3$ (prior probability that the coin is fair)
$P(H^c) = 0.7$ (prior probability that the coin is unfair)
$P(D \mid H) = 0.5^5$
$P(D \mid H^c) = 0.6^5$

43 Now, the posterior probability that the coin was fair, given the outcome, can be calculated:
$$ P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D \mid H)\, P(H) + P(D \mid H^c)\, P(H^c)} = 0.147 $$
This is lower than the prior probability of H (0.3), so the data are evidence against H. Moreover: $P(H^c \mid D) = 1 - 0.147 = 0.853$. So there is a much higher posterior probability (given the outcome and the prior) that the coin I picked was unfair.
The same setup works with multiple hypotheses H1, H2, ..., Hn. Identical calculations as above lead to posterior probabilities, and the hypothesis with the highest posterior is chosen.
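The coin calculation can be reproduced in a few lines. This sketch is written for any number of hypotheses, since the slide notes that the same setup carries over; the function name is illustrative.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem for a finite set of hypotheses."""
    joint = [prior * lik for prior, lik in zip(priors, likelihoods)]
    evidence = sum(joint)  # the denominator: total probability of the data
    return [j / evidence for j in joint]

# Hypotheses: coin is fair (H), coin is unfair (H^c); data: five heads in five flips.
priors = [0.3, 0.7]
likelihoods = [0.5 ** 5, 0.6 ** 5]
post_fair, post_unfair = posterior(priors, likelihoods)
print(round(post_fair, 3), round(post_unfair, 3))
```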

44 Other statistical tests
There is a large variety of statistical tests. The choice of the correct test depends on the type and quality of the data, the assumptions and the question to be answered. Examples:
z-test
t-test
sign-test
Wilcoxon-test
Mann-Whitney / U-test
χ² goodness-of-fit test / χ² test for independence
...

45 The z-test
The simplest version of a z-test:
One-sample problem
Situation: Given n independent measurements $X_i$, $1 \leq i \leq n$.
Question: Is the expected value $E[X] = \mu$ equal to, larger than or lower than some theoretical value $\mu_0$?
Paired two-sample problem
Situation: Given n independent measurements $Y_i$ and $Z_i$, $1 \leq i \leq n$, of the same feature in two different states. E.g., the blood pressure of each person is measured before and after the intake of a special drug.
Question: Is there a significant difference between the two states? I.e., is the difference $X_i = Y_i - Z_i \neq 0$ (or < 0, > 0) or, equivalently: is $E[X] \neq 0$?

46 Assumptions
In the z-test it is assumed that $X_i \sim N(\mu_X, \sigma^2)$. Thus the measurements should follow a normal distribution. Moreover, the variance $\sigma^2$ of $X_i$ is known.

47 1. Hypotheses
$H_0: X_i \sim N(\mu_0, \sigma_0^2)$, $1 \leq i \leq n$, independent, with known variance $\sigma_0^2$
$H_1: X_i \sim N(\mu_1, \sigma_0^2)$, $1 \leq i \leq n$, independent, with known variance $\sigma_0^2$
with either $\mu_1 > \mu_0$, $\mu_1 < \mu_0$ or $\mu_0 \neq \mu_1$
2. Test statistic
$$ Z = \frac{\bar{X} - \mu_0}{\sigma_0 / \sqrt{n}} $$
3. Distribution of Z under H0
$Z \sim N(0, 1)$
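A minimal sketch of the test statistic, assuming a small hypothetical sample and a known standard deviation; the function name and the numbers are illustrative only.

```python
from math import sqrt
from statistics import fmean

def z_statistic(xs, mu0, sigma0):
    """Z = (sample mean - mu0) / (sigma0 / sqrt(n)) with known sigma0."""
    n = len(xs)
    return (fmean(xs) - mu0) / (sigma0 / sqrt(n))

# Hypothetical measurements and parameters (assumption, not data from the lecture).
measurements = [2.0, 4.0, 4.0, 6.0]
z = z_statistic(measurements, mu0=3.0, sigma0=2.0)
print(z)
```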

48 4. Choose the significance level α
E.g., α = 5% (or a lower level, if stronger significance is needed).
5. Calculate the critical value
The values can be looked up in a table. The most important ones (for the α = 5% level) are given here:
$\mu_1 > \mu_0$: c = 1.64 => H0 is rejected if Z > 1.64 (with R: > qnorm(0.95))
$\mu_1 < \mu_0$: c = -1.64 => H0 is rejected if Z < -1.64 (with R: > qnorm(0.05))
$\mu_0 \neq \mu_1$: c = 1.96 => H0 is rejected if |Z| > 1.96 (with R: > qnorm(0.975))
Where do these values come from...?
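The critical values are quantiles of the standard normal distribution. The slide obtains them with R's qnorm; the sketch below does the same with Python's standard library as an illustration, with hypothetical function names and labels for the three alternatives.

```python
from statistics import NormalDist

def z_critical(alpha, alternative):
    """Critical value of the z-test for 'greater', 'less' or 'two-sided' alternatives."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":
        return standard_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return standard_normal.inv_cdf(alpha)
    if alternative == "two-sided":
        return standard_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("unknown alternative: " + alternative)

def reject_h0(z, alpha, alternative):
    """Apply the rejection rules from the slide to an observed Z."""
    c = z_critical(alpha, alternative)
    if alternative == "greater":
        return z > c
    if alternative == "less":
        return z < c
    return abs(z) > c

for alternative in ("greater", "less", "two-sided"):
    print(alternative, round(z_critical(0.05, alternative), 2))
```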

49 [Figure: One-sided test. N(0,1) density f(x) against x for $\mu_1 > \mu_0$ and for $\mu_1 < \mu_0$, each with a 5% rejection range.]

50 [Figure: Two-sided test, $\mu_0 \neq \mu_1$. N(0,1) density f(x) against x with a 2.5% rejection range in each tail.]


More information

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures. Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

More information

INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

More information

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name: Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Statistical Inference

Statistical Inference Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

More information

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing Introduction Hypothesis Testing Mark Lunt Arthritis Research UK Centre for Ecellence in Epidemiology University of Manchester 13/10/2015 We saw last week that we can never know the population parameters

More information

SECOND PART, LECTURE 3: HYPOTHESIS TESTING

SECOND PART, LECTURE 3: HYPOTHESIS TESTING Massimo Guidolin Massimo.Guidolin@unibocconi.it Dept. of Finance STATISTICS/ECONOMETRICS PREP COURSE PROF. MASSIMO GUIDOLIN SECOND PART, LECTURE 3: HYPOTHESIS TESTING Lecture 3: Hypothesis Testing Prof.

More information

Nonparametric tests, Bootstrapping

Nonparametric tests, Bootstrapping Nonparametric tests, Bootstrapping http://www.isrec.isb-sib.ch/~darlene/embnet/ Hypothesis testing review 2 competing theories regarding a population parameter: NULL hypothesis H ( straw man ) ALTERNATIVEhypothesis

More information

Terminology. 2 There is no mathematical difference between the errors, however. The bottom line is that we choose one type

Terminology. 2 There is no mathematical difference between the errors, however. The bottom line is that we choose one type Hypothesis Testing 10.2.1 Terminology The null hypothesis H 0 is a nothing hypothesis, whose interpretation could be that nothing has changed, there is no difference, there is nothing special taking place,

More information

What is Bayesian statistics and why everything else is wrong

What is Bayesian statistics and why everything else is wrong What is Bayesian statistics and why everything else is wrong 1 Michael Lavine ISDS, Duke University, Durham, North Carolina Abstract We use a single example to explain (1), the Likelihood Principle, (2)

More information

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

More information

Hypothesis Testing I

Hypothesis Testing I ypothesis Testing I The testing process:. Assumption about population(s) parameter(s) is made, called null hypothesis, denoted. 2. Then the alternative is chosen (often just a negation of the null hypothesis),

More information

Lecture 9: Bayesian hypothesis testing

Lecture 9: Bayesian hypothesis testing Lecture 9: Bayesian hypothesis testing 5 November 27 In this lecture we ll learn about Bayesian hypothesis testing. 1 Introduction to Bayesian hypothesis testing Before we go into the details of Bayesian

More information

Name: Date: Use the following to answer questions 3-4:

Name: Date: Use the following to answer questions 3-4: Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

More information

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1) Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

Introduction to Regression and Data Analysis

Introduction to Regression and Data Analysis Statlab Workshop Introduction to Regression and Data Analysis with Dan Campbell and Sherlock Campbell October 28, 2008 I. The basics A. Types of variables Your variables may take several forms, and it

More information

Hypothesis Testing: General Framework 1 1

Hypothesis Testing: General Framework 1 1 Hypothesis Testing: General Framework Lecture 2 K. Zuev February 22, 26 In previous lectures we learned how to estimate parameters in parametric and nonparametric settings. Quite often, however, researchers

More information

Analysis of numerical data S4

Analysis of numerical data S4 Basic medical statistics for clinical and experimental research Analysis of numerical data S4 Katarzyna Jóźwiak k.jozwiak@nki.nl 3rd November 2015 1/42 Hypothesis tests: numerical and ordinal data 1 group:

More information

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters.

C. The null hypothesis is not rejected when the alternative hypothesis is true. A. population parameters. Sample Multiple Choice Questions for the material since Midterm 2. Sample questions from Midterms and 2 are also representative of questions that may appear on the final exam.. A randomly selected sample

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 1, Lecture 3

Stat260: Bayesian Modeling and Inference Lecture Date: February 1, Lecture 3 Stat26: Bayesian Modeling and Inference Lecture Date: February 1, 21 Lecture 3 Lecturer: Michael I. Jordan Scribe: Joshua G. Schraiber 1 Decision theory Recall that decision theory provides a quantification

More information

Hypothesis Testing. Bluman Chapter 8

Hypothesis Testing. Bluman Chapter 8 CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 2 Test for

More information

Single sample hypothesis testing, II 9.07 3/02/2004

Single sample hypothesis testing, II 9.07 3/02/2004 Single sample hypothesis testing, II 9.07 3/02/2004 Outline Very brief review One-tailed vs. two-tailed tests Small sample testing Significance & multiple tests II: Data snooping What do our results mean?

More information

Probability of rejecting the null hypothesis when

Probability of rejecting the null hypothesis when Sample Size The first question faced by a statistical consultant, and frequently the last, is, How many subjects (animals, units) do I need? This usually results in exploring the size of the treatment

More information

Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504 Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

More information

Invariance and optimality Linear rank statistics Permutation tests in R. Rank Tests. Patrick Breheny. October 7. STA 621: Nonparametric Statistics

Invariance and optimality Linear rank statistics Permutation tests in R. Rank Tests. Patrick Breheny. October 7. STA 621: Nonparametric Statistics Rank Tests October 7 Power Invariance and optimality Permutation testing allows great freedom to use a wide variety of test statistics, all of which lead to exact level-α tests regardless of the distribution

More information

Two-sample hypothesis testing, I 9.07 3/09/2004

Two-sample hypothesis testing, I 9.07 3/09/2004 Two-sample hypothesis testing, I 9.07 3/09/2004 But first, from last time More on the tradeoff between Type I and Type II errors The null and the alternative: Sampling distribution of the mean, m, given

More information

HYPOTHESIS TESTING: POWER OF THE TEST

HYPOTHESIS TESTING: POWER OF THE TEST HYPOTHESIS TESTING: POWER OF THE TEST The first 6 steps of the 9-step test of hypothesis are called "the test". These steps are not dependent on the observed data values. When planning a research project,

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

Bayesian probability: P. State of the World: X. P(X your information I)

Bayesian probability: P. State of the World: X. P(X your information I) Bayesian probability: P State of the World: X P(X your information I) 1 First example: bag of balls Every probability is conditional to your background knowledge I : P(A I) What is the (your) probability

More information

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS

QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

Chapter 8: Introduction to Hypothesis Testing

Chapter 8: Introduction to Hypothesis Testing Chapter 8: Introduction to Hypothesis Testing We re now at the point where we can discuss the logic of hypothesis testing. This procedure will underlie the statistical analyses that we ll use for the remainder

More information

Study Guide for the Final Exam

Study Guide for the Final Exam Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

More information

4. Introduction to Statistics

4. Introduction to Statistics Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

More information

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological

More information