# 3.6: General Hypothesis Tests


The χ² goodness-of-fit tests introduced in the previous section were an example of a hypothesis test. In this section we consider hypothesis tests more generally.

## Simple Hypothesis Tests

A simple hypothesis test is one in which we test a null hypothesis, denoted by H_1 (say), against an alternative hypothesis, denoted by H_2; i.e. the test consists of only two competing hypotheses. We construct a test statistic, t, and based on the value of t observed for our real data we make one of the following two decisions:

1. accept H_1 and reject H_2, or
2. accept H_2 and reject H_1.

To carry out the hypothesis test we choose the critical region for the test statistic t: the set of values of t for which we will reject the null hypothesis and accept the alternative hypothesis. The region for which we accept the null hypothesis is known as the acceptance region. Note that we must choose the critical region and acceptance region ourselves; for example, we might choose the critical region to be the set of values for which t exceeds some threshold.

## Level of Significance

The level of significance of a hypothesis test is the maximum probability of incurring a type I error which we are willing to risk when making our decision. In practice a level of significance of 5% or 1% is common. If a level of significance of 5% is adopted, for example, then we choose our critical region so that the probability of rejecting the null hypothesis when it is true is no more than 0.05. If the test statistic is found to lie in the critical region, we say that the null hypothesis is rejected at the 5% level, or equivalently that our rejection of the null hypothesis is significant at the 5% level. This means that, if the null hypothesis is true, and we were to repeat our experiment or observation a large number of times, then we would expect to obtain by chance a value of the test statistic lying in the critical region (thus leading us to reject the NH) in no more than 5% of the repeated trials. In other words, we expect our rejection of the null hypothesis to be the wrong decision in no more than 5 out of every 100 experiments.

## Goodness of Fit for Discrete Distributions

We can illustrate some of the important ideas of hypothesis testing by considering how we test the goodness of fit of data to discrete distributions, again using the χ² statistic. Suppose we carry out n observations and obtain as our results k different discrete outcomes, E_1, ..., E_k, which occur with frequencies o_1, ..., o_k ("o" for observed). Examples of such observations might be the number of meteors observed on n different nights, or the number of photons counted in n different pixels of a CCD.

Consider the null hypothesis that the observed outcomes are a sample from some model discrete distribution (e.g. a Poisson distribution). Suppose, under this null hypothesis, that the k outcomes E_1, ..., E_k are expected to occur with frequencies e_1, ..., e_k ("e" for expected). We can test our null hypothesis by comparing the observed and expected frequencies and determining whether they differ significantly. We construct the following χ² test statistic:

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}, \qquad \text{where } \sum_i o_i = \sum_i e_i = n.$$

Under the null hypothesis this test statistic has approximately a χ² pdf with ν = k − 1 − m degrees of freedom. Here m denotes the number of parameters (possibly zero) of the model discrete distribution which one needs to estimate before one can compute the expected frequencies, and ν is reduced by one further degree of freedom because of the constraint that Σ e_i = n. In other words, once we have computed the first k − 1 expected frequencies, the k-th value is uniquely determined by the sample size n.
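As a minimal sketch of this procedure (with made-up observed counts, assuming SciPy is available), the fragment below fits a Poisson model by estimating its mean from the data, lumps the tail probability into the final bin so that Σ e_i = n, and compares the χ² statistic with the 5% critical value on ν = k − 1 − m degrees of freedom:

```python
import numpy as np
from scipy import stats

# Hypothetical observed counts of r = 0, 1, 2, 3, >=4 events over n trials
observed = np.array([20, 18, 8, 3, 1])
n = observed.sum()
r = np.arange(len(observed))

# Estimate the Poisson mean from the data (m = 1 fitted parameter;
# treating the lumped last bin as r = 4 is a simplification)
mu_hat = (r * observed).sum() / n

# Expected frequencies e_i = n * p(r; mu_hat), with the last bin
# collecting P(r >= 4) so that the expected frequencies sum to n
p = stats.poisson.pmf(r, mu_hat)
p[-1] = 1.0 - p[:-1].sum()
expected = n * p

chi2 = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: k - 1 - m (one lost to the sum constraint, one to mu_hat)
k, m = len(observed), 1
nu = k - 1 - m
critical = stats.chi2.ppf(0.95, df=nu)   # 5% significance level, one-tailed

print(f"chi2 = {chi2:.2f}, critical value = {critical:.2f}")
print("reject NH" if chi2 > critical else "accept NH")
```

In practice one would also merge bins whose expected frequencies fall below about 5, as discussed below.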
This χ² goodness-of-fit test need not be restricted to discrete random variables, since we can effectively produce discrete data from a sample drawn from a continuous pdf by binning the data. Indeed, as we remarked in an earlier section, the Central Limit Theorem will ensure that such binned data are approximately normally distributed, which means that the sum of their squares will be approximately distributed as a χ² random variable. The approximation to a χ² pdf is very good provided e_i ≥ 10, and is reasonable for 5 ≤ e_i ≤ 10.

**Example 1.** A list of 1000 random digits (integers from 0 to 9) is generated by a computer, the digit r appearing with observed frequency o_r. Can this list of digits be regarded as uniformly distributed? Let our NH be that the digits are drawn from a uniform distribution. This means that each digit is expected to occur with equal frequency, i.e. e_r = 100 for all r. For the observed frequencies the test statistic evaluates to

$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i} = 7.00.$$

Suppose we adopt a 5% level of significance. The number of degrees of freedom is ν = 9, for which the critical value of χ² is 16.9 for a one-tailed test. Thus, at the 5% significance level we accept the NH that the digits are uniformly distributed.

**Example 2.** The number of nights during a 50-night observing run on which r hours of observing time were clouded out was recorded for each value of r. Fit a Poisson distribution to these data for the pdf of r and determine whether the fit is acceptable at the 5% significance level.

Of course one might ask whether a Poisson distribution is a sensible model for the pdf of r, since a Poisson RV is defined for any non-negative integer, whereas r is clearly at most 12 hours. However, as we saw in Section 1.3.2, the shape of the Poisson pdf is sensitive to the value of the mean, µ, and in particular for small values of µ the value of the pdf will be negligible for all but the first few integers, and so we neglect all larger integers as possible outcomes. Hence, in fitting a Poisson model

we also need to estimate the value of µ. We take as our estimator of µ the sample mean, giving

$$\hat{\mu} = 0.90.$$

Substituting this value into the Poisson pdf we can compute the expected frequencies e_r = 50 p(r; 0.90) for r = 0, 1, ..., 5. If we consider only five outcomes, i.e. r ≤ 4, since the value of the pdf is negligible for r > 4, then the number of degrees of freedom is ν = 3 (remember that we had to estimate the mean, µ). The value of the test statistic is χ² = 0.68, which is smaller than the critical value. Hence we accept the NH at the 5% level; i.e. the data are well fitted by a Poisson distribution.

## The Kolmogorov-Smirnov Test

Suppose we want to test the hypothesis that a sample of data is drawn from an underlying population with some given pdf. We could do this by binning the data and comparing with the model pdf using the χ² test statistic. This approach might be suitable, for example, for comparing the number counts of photons in the pixels (i.e. the bins) of a CCD array with a bivariate normal model for the point spread function of the telescope optics, where the centre of the bivariate normal defines the position of a star. For small samples this does not work well, however, as we cannot bin the data finely enough to usefully constrain the underlying pdf. A more useful approach in this situation is to compare the sample cumulative distribution function with a theoretical model. We can do this using the Kolmogorov-Smirnov (KS) test statistic.

Let {x_1, ..., x_n} be an iid random sample from the unknown population, and suppose the {x_i} have been arranged in ascending order. The sample cdf, S_n(x), of X is defined as follows:

$$S_n(x) = \begin{cases} 0 & x < x_1 \\ \dfrac{i}{n} & x_i \le x < x_{i+1}, \quad 1 \le i \le n-1 \\ 1 & x \ge x_n \end{cases}$$

i.e. S_n(x) is a step function which increments by 1/n at each sampled value of x. Let the model cdf be P(x), corresponding to pdf p(x), and let the null hypothesis be that our random sample is drawn from p(x). The KS test statistic is

$$D_n = \max_x \left| P(x) - S_n(x) \right|.$$

It is easy to show that the maximum always occurs at one of the sampled values of x. The remarkable fact about the KS test is that the distribution of D_n under the null hypothesis is independent of the functional form of P(x). In other words, whatever the form of the model cdf, P(x), we can determine how likely it is that our actual sample data was drawn from the corresponding pdf. Critical values for the KS statistic are tabulated, or can be obtained e.g. from Numerical Recipes algorithms.

The KS test is an example of a robust, or nonparametric, test, since one can apply it with minimal assumptions about the parametric form of the underlying pdf. The price of this robustness is that the power of the KS test is lower than that of parametric tests: there is a higher probability of accepting a false null hypothesis that two samples are drawn from the same pdf, because we make no assumptions about the parametric form of that pdf.

## Hypothesis Tests on the Sample Correlation Coefficient

The final type of hypothesis test which we consider is associated with testing whether two variables are statistically independent, which we can do by considering the value of the sample correlation coefficient. In Section 3.1 we defined the covariance of two RVs, x and y, as

$$\mathrm{cov}(x, y) = E\left[(x - \mu_x)(y - \mu_y)\right]$$

and the correlation coefficient, ρ, as

$$\rho = \frac{\mathrm{cov}(x, y)}{\sigma_x \sigma_y}.$$
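Because S_n(x) jumps at each sampled value, D_n can be computed directly from the ordered sample by checking both sides of every step. The sketch below (with a made-up sample, under the NH of a standard normal model, and assuming SciPy is available) does this by hand and checks the result against `scipy.stats.kstest`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = np.sort(rng.normal(size=50))      # hypothetical sample; NH: standard normal
n = len(x)

# Model cdf P(x) evaluated at the ordered sample points
P = stats.norm.cdf(x)

# S_n steps from (i-1)/n to i/n at x_i, so the maximum deviation occurs
# at a sample point, approached from one side of the step or the other
i = np.arange(1, n + 1)
D_n = max(np.abs(P - i / n).max(), np.abs(P - (i - 1) / n).max())

# scipy computes the same statistic, plus its p-value under the NH
D_scipy, p_value = stats.kstest(x, "norm")
print(f"D_n = {D_n:.4f}  (scipy: {D_scipy:.4f}), p = {p_value:.3f}")
```

A large p-value here means the sample cdf is consistent with the model cdf, so the NH is not rejected.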

We estimate ρ by the sample correlation coefficient, ρ̂, defined by

$$\hat{\rho} = \frac{\sum (x_i - \hat{\mu}_x)(y_i - \hat{\mu}_y)}{\sqrt{\left[\sum (x_i - \hat{\mu}_x)^2\right]\left[\sum (y_i - \hat{\mu}_y)^2\right]}}$$

where, as usual, µ̂_x and µ̂_y denote the sample means of x and y respectively, and all sums run over i = 1, ..., n for sample size n. ρ̂ is also often denoted by r, and is referred to as Pearson's correlation coefficient. If x and y do have a bivariate normal pdf, then ρ corresponds precisely to the parameter defined in Section 3.1.

To test hypotheses about ρ we need to know the sampling distribution of ρ̂. We consider two special cases, in both of which x and y have a bivariate normal pdf.

**(i) ρ = 0 (i.e. x and y are independent).** If ρ = 0, then the statistic

$$t = \frac{\hat{\rho}\sqrt{n - 2}}{\sqrt{1 - \hat{\rho}^2}}$$

has a Student's t distribution with ν = n − 2 degrees of freedom. Hence, we can use t to test the hypothesis that x and y are independent.

**(ii) ρ = ρ_0 ≠ 0.** In this case, for large samples, the statistic

$$z = \frac{1}{2}\log_e\left(\frac{1 + \hat{\rho}}{1 - \hat{\rho}}\right)$$

has an approximately normal pdf with mean µ_z and variance σ_z² given by

$$\mu_z = \frac{1}{2}\log_e\left(\frac{1 + \rho_0}{1 - \rho_0}\right), \qquad \sigma_z^2 = \frac{1}{n - 3}.$$
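Case (i) can be sketched as follows; the sample, the slope 0.5 used to induce correlation, and the sample size are all made up for illustration, and SciPy is assumed available:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 30
x = rng.normal(size=n)                    # hypothetical correlated sample
y = 0.5 * x + rng.normal(size=n)

# Pearson sample correlation coefficient, computed from its definition
rho_hat = ((x - x.mean()) * (y - y.mean())).sum() / np.sqrt(
    ((x - x.mean()) ** 2).sum() * ((y - y.mean()) ** 2).sum()
)

# Under the NH rho = 0, t = rho_hat * sqrt(n-2) / sqrt(1 - rho_hat^2)
# follows a Student's t distribution with n - 2 degrees of freedom
t = rho_hat * np.sqrt(n - 2) / np.sqrt(1 - rho_hat ** 2)
p_value = 2 * stats.t.sf(abs(t), df=n - 2)   # two-tailed

print(f"rho_hat = {rho_hat:.3f}, t = {t:.3f}, p = {p_value:.4f}")
```

The same p-value is what `scipy.stats.pearsonr` reports, since it tests the identical null hypothesis with the identical t statistic.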

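For case (ii), the following is a hedged sketch of the large-sample z test based on Fisher's transformation; the helper name `fisher_z_test` and the numerical inputs are invented for illustration:

```python
import numpy as np
from scipy import stats

def fisher_z_test(rho_hat, rho_0, n):
    """Large-sample z test of the NH rho = rho_0 via Fisher's transformation."""
    z = 0.5 * np.log((1 + rho_hat) / (1 - rho_hat))        # transformed rho_hat
    mu_z = 0.5 * np.log((1 + rho_0) / (1 - rho_0))         # mean under the NH
    sigma_z = 1.0 / np.sqrt(n - 3)                         # sd under the NH
    z_score = (z - mu_z) / sigma_z
    p_value = 2 * stats.norm.sf(abs(z_score))              # two-tailed
    return z_score, p_value

# e.g. test the NH rho = 0.5 given rho_hat = 0.66 from n = 103 pairs
z_score, p = fisher_z_test(0.66, 0.5, 103)
print(f"z = {z_score:.3f}, p = {p:.4f}")
```

With n = 103 the standard deviation of z is exactly 0.1, which makes the arithmetic of the z-score easy to follow by hand.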

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Using Excel for inferential statistics

FACT SHEET Using Excel for inferential statistics Introduction When you collect data, you expect a certain amount of variation, just caused by chance. A wide variety of statistical tests can be applied

### Fundamental Probability and Statistics

Fundamental Probability and Statistics "There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are

### 15.1 Least Squares as a Maximum Likelihood Estimator

15.1 Least Squares as a Maximum Likelihood Estimator 651 should provide (i) parameters, (ii) error estimates on the parameters, and (iii) a statistical measure of goodness-of-fit. When the third item suggests

### Comparing Two Populations OPRE 6301

Comparing Two Populations OPRE 6301 Introduction... In many applications, we are interested in hypotheses concerning differences between the means of two populations. For example, we may wish to decide

### SUBMODELS (NESTED MODELS) AND ANALYSIS OF VARIANCE OF REGRESSION MODELS

1 SUBMODELS (NESTED MODELS) AND ANALYSIS OF VARIANCE OF REGRESSION MODELS We will assume we have data (x 1, y 1 ), (x 2, y 2 ),, (x n, y n ) and make the usual assumptions of independence and normality.

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### Statistics to English Translation, Part 2b: Calculating Significance

Statistics to English Translation, Part 2b: Calculating Significance Nina Zumel December, 2009 In the previous installment of the Statistics to English Translation, we discussed the technical meaning of

### Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II)

Chapter 4: Constrained estimators and tests in the multiple linear regression model (Part II) Florian Pelgrin HEC September-December 2010 Florian Pelgrin (HEC) Constrained estimators September-December

### SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation

SYSM 6304: Risk and Decision Analysis Lecture 3 Monte Carlo Simulation M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu September 19, 2015 Outline