# Hypothesis Testing for Beginners


## Transcription

1 Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, / 53

2 One year ago a friend asked me to put down some easy-to-read notes on hypothesis testing. Being a student of Osteopathy, he is unfamiliar with basic expressions like random variables or probability density functions. Nevertheless, the profession expects him to know the basics of hypothesis testing.

These notes offer a very simplified explanation of the topic. The ambition is to get the ideas across to someone whose knowledge of statistics is limited to the fact that a probability cannot be bigger than one. People familiar with the topic will find the approach too easy and not rigorous. But this is fine: these notes are not intended for them.

For comments, typos and mistakes please contact me.

3 Plan for these notes

- Describing a random variable
  - Expected value and variance
  - Probability density function
  - Normal distribution
  - Reading the table of the standard normal
- Hypothesis testing on the mean
  - The basic intuition
  - Level of significance, p-value and power of a test
  - An example

4 Random Variable: Definition

The first step to understand hypothesis testing is to understand what we mean by a random variable. A random variable is a variable whose realization is determined by chance. You can think of it as something intrinsically random, or as something that we don't understand completely and that we call random.

5 Random Variable: Example 1

What is the number that will show up when rolling a die? We don't know what it is ex-ante (i.e. before we roll the die). We only know that the numbers from 1 to 6 are equally likely, and that other numbers are impossible.

Of course, we would not consider it random if we could keep track of all the factors affecting the die (the density of air, the precise shape and weight of the hand...). Since this is impossible, we refer to this event as random.

In this case the random variable is {Number that appears when rolling a die once} and the possible realizations are {1, 2, 3, 4, 5, 6}.

6 Random Variable: Example 2

How many people are swimming in the lake of Lausanne at 4 pm? If you could count them it would be, say, 21 on June 1st, 27 on June 15th, 311 on July 1st...

Again, we would not consider it random if we could keep track of all the factors leading people to swim (number of tourists in the area, temperature, weather...). This is not feasible, so we call it a random event.

In this case the random variable is {Number of people swimming in the lake of Lausanne at 4 pm} and the possible realizations are {0, 1, 2, ...}.

7 Moments of a Random Variable

The world is stochastic, and this means that it is full of random variables. The challenge is to come up with methods for describing their behavior.

One possible description of a random variable is offered by the expected value. It is defined as the sum of all possible realizations weighted by their probabilities.

In the example of the die, the expected value is 3.5, which you obtain from

$$E[X] = 1 \cdot \tfrac{1}{6} + 2 \cdot \tfrac{1}{6} + 3 \cdot \tfrac{1}{6} + 4 \cdot \tfrac{1}{6} + 5 \cdot \tfrac{1}{6} + 6 \cdot \tfrac{1}{6} = 3.5$$
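The weighted sum can be checked with a few lines of Python. This is only a sketch: the names `outcomes`, `prob` and `expected_value` are chosen here for illustration and do not come from the slides.

```python
from fractions import Fraction

# Possible realizations of a fair die, each with probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
prob = Fraction(1, 6)

# Expected value: every realization weighted by its probability, then summed.
expected_value = sum(x * prob for x in outcomes)

print(expected_value)         # 7/2
print(float(expected_value))  # 3.5
```

Exact fractions are used so that no rounding of 1/6 enters the result.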

8 Moments of a Random Variable

The expected value is similar to an average, but with an important difference: the average is something computed ex-post (when you already have different realizations), while the expected value is ex-ante (before you have realizations).

Suppose you roll the die 3 times and obtain {1, 3, 5}. In this case the average is 3, although the expected value is 3.5. The variable is random, so if you roll the die again you will probably get different numbers. Suppose you roll the die again 3 times and obtain {3, 4, 5}. Now the average is 4, but the expected value is still 3.5.

The more observations you draw, the closer the average gets to the expected value.
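A small simulation illustrates how the average approaches the expected value as the number of rolls grows. It is a sketch under stated assumptions: the helper `average_of_rolls` and the fixed seed are illustrative choices, and the printed averages depend on the seed.

```python
import random

def average_of_rolls(n, seed=0):
    """Average of n simulated rolls of a fair die (seeded for repeatability)."""
    rng = random.Random(seed)
    rolls = [rng.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n

# The two ex-post averages from the text.
print(sum([1, 3, 5]) / 3)  # 3.0
print(sum([3, 4, 5]) / 3)  # 4.0

# Averages for increasingly many rolls drift toward the expected value 3.5.
for n in (3, 30, 3000, 100000):
    print(n, average_of_rolls(n))
```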

9 Moments of a Random Variable

The expected value gives an idea of where the realizations are centered. We might also want a measure of the volatility of the random variable around its expected value. This is given by the variance.

The variance is computed as the expected value of the squares of the distances of each possible realization from the expected value.

In our example the variance is 2.9155. In fact,

| x | probability | x − E(x) | (x − E(x))² | prob · (x − E(x))² |
|---|---|---|---|---|
| 1 | 0.1666 | −2.5 | 6.25 | 1.04125 |
| 2 | 0.1666 | −1.5 | 2.25 | 0.37485 |
| 3 | 0.1666 | −0.5 | 0.25 | 0.04165 |
| 4 | 0.1666 | 0.5 | 0.25 | 0.04165 |
| 5 | 0.1666 | 1.5 | 2.25 | 0.37485 |
| 6 | 0.1666 | 2.5 | 6.25 | 1.04125 |
| Sum | | | | 2.9155 |

The value 2.9155 comes from rounding the probability 1/6 to 0.1666. With the exact probability the variance is 35/12 ≈ 2.9167.
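The same computation can be sketched in Python, once with the exact probability 1/6 and once with the rounded value 0.1666 used in the table. The variable names are illustrative.

```python
from fractions import Fraction

outcomes = [1, 2, 3, 4, 5, 6]

# Exact computation with probability 1/6.
prob = Fraction(1, 6)
mean = sum(x * prob for x in outcomes)
variance = sum(prob * (x - mean) ** 2 for x in outcomes)
print(variance)  # 35/12

# Same computation with the rounded probability 0.1666 from the table.
variance_rounded = sum(0.1666 * (x - 3.5) ** 2 for x in outcomes)
print(round(variance_rounded, 4))  # 2.9155
```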

10 Moments of a Random Variable

Remember, our goal for the moment is to find possible tools for describing the behavior of a random variable. Expected values and variances are two fairly intuitive possible tools. They are called moments of a random variable.

More precisely, they are the first moment and the second (central) moment of a random variable (there are many more, but we don't need them here). For our example above, we have

- First moment = E(X) = 3.5
- Second (central) moment = Var(X) = 2.9155

11 PDF of a Random Variable

The moments are only one possible tool used to describe the behavior of a random variable. The second tool is the probability density function.

A probability density function (pdf) is a function such that the area under it over a given interval represents the probability that the realization falls in that interval.

Understanding a pdf is all we need to understand hypothesis testing.

Pdfs are more intuitive with continuous random variables than with discrete ones (such as those in examples 1 and 2 above). Let's move now to continuous variables.

12 PDF of a Random Variable: Example 3

Consider an extension of example 1: the realizations of X can still go from 1 to 6 with equal probability, but all intermediate values are possible as well, not only the integers {1, 2, 3, 4, 5, 6}.

Given this new (continuous) random variable, what is the probability that the realization of X will be between 1 and 3.5? What is the probability that it will be between 5 and 6? Graphically, they are the areas under the pdf over the segments [1, 3.5] and [5, 6].

13 PDF of a Random Variable

The graphic intuition is simple, but of course to compute these probabilities we need to know the value of the parameter c, i.e. how high the pdf of this random variable is.

To answer this question, you should first ask yourself a preliminary question: what is the probability that the realization will be between 1 and 6? Of course this probability is 1, as by construction the realization cannot be anywhere else. This means that the whole area under a pdf must be equal to one.

In our case this means that (6 − 1) · c must be equal to 1, which implies c = 1/5 = 0.2.

14 PDF of a Random Variable

Now, ask yourself another question: what is the probability that the realization of X will be between 8 and 12? This probability is zero, given that [8, 12] is outside [1, 6].

This means that above the segment [8, 12] (and everywhere outside the support [1, 6]) the area under the pdf must be zero. To sum up, the full pdf of this specific random variable is

$$f(x) = \begin{cases} 1/5 & \text{if } 1 \le x \le 6 \\ 0 & \text{otherwise} \end{cases}$$

15 PDF of a Random Variable

In our example we see that:

$$\mathrm{Prob}(1 < X < 3.5) = (3.5 - 1) \cdot 1/5 = 1/2$$

$$\mathrm{Prob}(5 < X < 6) = (6 - 5) \cdot 1/5 = 1/5$$

$$\mathrm{Prob}(8 < X < 24) = (24 - 8) \cdot 0 = 0$$

Hypothesis testing will rely extensively on the idea that, having a pdf, one can compute the probability of all the corresponding events. Make sure you understand this point before going ahead.
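The three probabilities above are rectangle areas, which the following sketch computes for any interval. The function name `uniform_prob` and its default bounds are illustrative assumptions matching the example on [1, 6].

```python
def uniform_prob(a, b, low=1.0, high=6.0):
    """Probability that a uniform variable on [low, high] falls in [a, b]."""
    height = 1.0 / (high - low)  # pdf height, 0.2 in the example
    # Only the part of [a, b] that overlaps the support carries probability.
    overlap = max(0.0, min(b, high) - max(a, low))
    return overlap * height

print(uniform_prob(1, 3.5))  # 0.5
print(uniform_prob(5, 6))    # 0.2
print(uniform_prob(8, 24))   # 0.0
```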

16 Normal Distribution

We have seen that the pdf of a random variable summarizes all the probabilities of realization of the underlying events.

Different random variables have different distributions, which imply different pdfs. For instance, the variable seen above is uniformly distributed on the support [1, 6]. As for all uniform distributions, the pdf is simply a constant (in our case 0.2).

Let's now introduce the most famous distribution, which we will use extensively when doing hypothesis testing: the normal distribution.

17 Normal Distribution

The normal distribution has the characteristic bell-shaped pdf.

[Figure: bell-shaped pdf of a normal distribution]

18 Normal Distribution

Contrary to the uniform distribution, a normally distributed random variable can have realizations from $-\infty$ to $+\infty$, although realizations in the tails are really rare (the area in the tails is very small).

The entire distribution is characterized by the first two moments of the variable: $\mu$ and $\sigma^2$. Having these two moments one can obtain the precise position of the pdf.

When a random variable is normally distributed with expected value $\mu$ and variance $\sigma^2$, then it is written

$$X \sim N(\mu, \sigma^2)$$

We will see this notation again.

19 Normal Distribution

Let's consider an example. Suppose we have a normally distributed random variable with $\mu = 8$ and $\sigma^2 = 16$. What is the probability that the realization will be below 7? What is the probability that it will be in the interval [8, 10]?

20 Standard Normal Distribution

Graphically this is easy. But how do we compute these probabilities? To answer this question we first need to introduce a special case of the normal distribution: the standard normal distribution.

The standard normal distribution is the distribution of a normal variable with expected value equal to zero and variance equal to 1. It is expressed by the variable Z:

$$Z \sim N(0, 1)$$

The pdf of the standard normal looks identical to the pdf of the normal variable, except that it has its center of gravity at zero and a width that is adjusted to allow the variance to be one.

21 Standard Normal Distribution

What is the big deal about the standard normal distribution? It is the fact that we have a table showing, for each point in $(-\infty, +\infty)$, the probability that we have to the left and to the right of that point.

For instance, one might want to know the probability that a standard normal has a realization lower than the point 2.33. From the table one gets that it is 0.99. What is then the probability that the realization is above 2.33? 0.01, of course.

22 Standard Normal Distribution

The following slide shows the table for the standard normal distribution. On the vertical and horizontal axes you read the z-point (respectively the tenths and the hundredths), while inside the box you read the corresponding probability. The blue box shows the point we used to compute the probability in the previous slide.

Make sure you can answer the following questions before going ahead:

- Question 1: what is the probability that the standard normal will give a realization below 1?
- Question 2: what is the point z below which the standard normal has 0.95 probability to occur?
- Question 3: what is the probability that the standard normal will give a realization above (not below) −0.5?

23 [Table: cumulative probabilities of the standard normal distribution]

24 Standard Normal Distribution

Answers to our previous questions:

1) $\mathrm{Prob}(Z < 1) = 0.8413$
2) The $z$ such that $\mathrm{Prob}(Z < z) = 0.95$ is $1.64$
3) $1 - \mathrm{Prob}(Z < -0.5) = 0.6915$

Last, make sure you understand the following theorem: if $X$ is normally distributed with expected value $\mu$ and variance $\sigma^2$, then if you subtract $\mu$ from $X$ and divide everything by $\sqrt{\sigma^2} = \sigma$ you obtain a new variable, which is a standard normal:

$$\text{Given } X \sim N(\mu, \sigma^2): \quad Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$
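If no printed table is at hand, the three answers can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch, not part of the original notes; note that `NormalDist` takes the standard deviation `sigma`, not the variance.

```python
from statistics import NormalDist

Z = NormalDist(mu=0.0, sigma=1.0)  # standard normal

# Question 1: probability of a realization below 1.
p_below_1 = Z.cdf(1.0)

# Question 2: point z with probability 0.95 to its left.
# The table lookup gives 1.64; the more precise value is about 1.645.
z_95 = Z.inv_cdf(0.95)

# Question 3: probability of a realization above -0.5.
p_above = 1.0 - Z.cdf(-0.5)

print(round(p_below_1, 4), round(z_95, 3), round(p_above, 4))
```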

25 Normal Distribution

We are finally able to answer our question from a few slides before: what is the value of $\mathrm{Prob}(X < 7)$ and $\mathrm{Prob}(8 < X < 10)$ given $X \sim N(8, 16)$?

Do we have a table for this special normal variable with E(X) = 8 and Var(X) = 16? No! But we don't need it: we can apply a transformation to X and exploit the table of the standard normal.

Remembering that you can add/subtract the same number on both sides of an inequality, or multiply/divide both sides by the same positive number, we obtain the following.

26 Normal Distribution

$$\mathrm{Prob}(X < 7) = \mathrm{Prob}(X - 8 < 7 - 8) = \mathrm{Prob}(X - 8 < -1) = \mathrm{Prob}\left(\frac{X - 8}{4} < \frac{-1}{4}\right) = \mathrm{Prob}\left(\frac{X - 8}{4} < -0.25\right)$$

The theorem tells us that the variable $\frac{X - \mu}{\sigma} = \frac{X - 8}{4}$ is a standard normal, which greatly simplifies our question to: what is the probability that a standard normal has a realization on the left of the point $-0.25$? We know how to answer this question. From the table we get

$$\mathrm{Prob}(X < 7) = \mathrm{Prob}(Z < -0.25) = 1 - \mathrm{Prob}(Z > -0.25) = 1 - 0.5987 = 0.4013 = 40.13\%$$

27 Normal Distribution

Let's now compute the other probability:

$$\mathrm{Prob}(8 < X < 10) = \mathrm{Prob}(X < 10) - \mathrm{Prob}(X < 8) = \mathrm{Prob}\left(Z < \frac{10 - 8}{4}\right) - \mathrm{Prob}\left(Z < \frac{8 - 8}{4}\right)$$

$$= \mathrm{Prob}(Z < 0.5) - \mathrm{Prob}(Z < 0) = 0.6915 - 0.5 = 0.1915 = 19.15\%$$

To sum up, a normal distribution with expected value 8 and variance 16 could have realizations from $-\infty$ to $+\infty$. But we now know that the probability that X will be lower than 7 is 40.13%, while the probability that it will be in the interval [8, 10] is 19.15%.

28 Normal Distribution

Graphically, given $X \sim N(8, 16)$:

[Figure: pdf of $N(8, 16)$]

Exercise: show that $\mathrm{Prob}(X > 11) = 22.66\%$.
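The standardization steps can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names are illustrative. Since `NormalDist` expects the standard deviation, the variance 16 enters as `sigma=4`.

```python
from statistics import NormalDist

X = NormalDist(mu=8.0, sigma=4.0)  # variance 16 means sigma = 4
Z = NormalDist()                   # standard normal

# Standardize by hand, as in the derivation: z = (x - mu) / sigma.
p_below_7 = Z.cdf((7 - 8) / 4)
p_8_to_10 = Z.cdf((10 - 8) / 4) - Z.cdf((8 - 8) / 4)

# Numerical check of the exercise, using X directly.
p_above_11 = 1.0 - X.cdf(11)

print(round(p_below_7, 4), round(p_8_to_10, 4), round(p_above_11, 4))
# 0.4013 0.1915 0.2266
```

Working with `X` directly or with the standardized `Z` gives the same probabilities, which is exactly what the theorem states.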

29 Hypothesis Testing

We are finally able to use what we have seen so far to do hypothesis testing. What is this about?

In all the exercises above, we assumed that we knew the parameter values and investigated the properties of the distribution (i.e. we knew that $\mu = 8$ and $\sigma^2 = 16$). Of course, knowing the true values of the parameters is not a straightforward task.

By inference we mean the search for the values of the parameters given some realizations of the variable (i.e. given some data).

30 Hypothesis Testing

There are two ways to proceed. The first one is to use the data to estimate the parameters, the second is to guess a value for the parameters and ask the data whether this value is true. The former approach is estimation, the latter is hypothesis testing.

In the rest of the notes we will do hypothesis testing on the expected value of a normal distribution, assuming that the true variance is known. There are of course tests on the variance, as well as all kinds of other tests.

31 Hypothesis Testing

Consider the following scenario. We have a normal random variable with unknown expected value $\mu$ and variance equal to 16.

We are told that the true value for $\mu$ is either 8 or 25. For some external reason we are inclined to think that $\mu = 8$, but of course we don't really know for sure.

What we have is one realization of the random variable. The point is, what is this information telling us? Of course, the closer the realization is to 8, the more I would be willing to conclude that the true value is 8. Similarly, the closer the realization is to 25, the more I would reject the hypothesis that $\mu = 8$.

32 Hypothesis Testing

How do we use the data to derive some meaningful procedure for inferring whether the data supports the null hypothesis of $\mu = 8$ or rejects it in favor of the alternative $\mu = 25$?

Let's first put things formally:

$$X \sim N(\mu, 16), \quad \text{with } H_0: \mu = 8 \text{ vs. } H_a: \mu = 25$$

Call $x$ the realization of $X$ which we have and which we use to do inference. What is the distribution that has generated this point, $X \sim N(8, 16)$ or $X \sim N(25, 16)$?

33 Hypothesis Testing

If the null hypothesis is true, the realization $x$ comes from $N(8, 16)$.

If the alternative hypothesis is true, the realization $x$ comes from $N(25, 16)$.

34 Hypothesis Testing

The key intuition is: how far away from 8 must the realization be for us to reject the null hypothesis that the true value of $\mu$ is 8?

Suppose $x = 12$. In this case $x$ is much more likely to come from $X \sim N(8, 16)$ than from $X \sim N(25, 16)$. Similarly, if $x = 55$ then this value is much more likely to come from $X \sim N(25, 16)$ than from $X \sim N(8, 16)$.

The procedure is to choose a point $c$ (called the critical value) and then reject $H_0$ if $x > c$, while not rejecting $H_0$ if $x < c$. The point is to find this $c$ in a meaningful way.

35 Hypothesis Testing

Start with the following exercise. For a given $c$, what is the probability that we reject $H_0$ by mistake, i.e., when the true value for $\mu$ was actually 8? As we have seen, this is nothing but the area under the pdf of $X \sim N(8, 16)$ on the interval $[c, +\infty)$.

36 Hypothesis Testing

Given $c$, we also know how to compute this probability. Under $H_0$, it is simply

$$\mathrm{Prob}(X > c) = \mathrm{Prob}\left(Z > \frac{c - 8}{4}\right)$$

What we just did is: given $c$, compute the probability of rejecting $H_0$ by mistake. The problem is that we don't have $c$ yet; that's what we need to compute!

A sensible way of choosing $c$ is to go the other way around: find $c$ so that the probability of rejecting $H_0$ by mistake is a given number, which we choose at the beginning. In other words, you first choose the probability of rejecting $H_0$ by mistake that you are willing to work with, and then you compute the corresponding critical value.

37 Hypothesis Testing

In jargon, rejecting $H_0$ by mistake is called a type 1 error, and the probability of doing so is called the level of significance. It is expressed with the letter $\alpha$.

Our problem of finding a meaningful critical value $c$ has been solved: at first, choose a level of significance $\alpha$. Then find the point $c$ corresponding to $\alpha$. Having $c$, one only needs to compare it to the realization $x$ from the data. If $x > c$, then we reject $H_0: \mu = 8$, knowing that we might be wrong in doing so with probability $\alpha$.

The most frequent values used for $\alpha$ are 0.01, 0.05 and 0.1.

38 Hypothesis Testing

Let's do this with our example. Suppose we want to work with a level of significance of 1%.

The first thing to do is to find the critical value $c$ which, if $H_0$ was true, would have 1% probability on the right. Formally, find $c$ so that

$$0.01 = \mathrm{Prob}(X > c) = \mathrm{Prob}\left(Z > \frac{c - 8}{4}\right)$$

What is the point of the standard normal distribution which has an area of 0.01 on the right? This is the first value we saw when we introduced the N(0, 1) distribution. The value was 2.33:

$$0.01 = \mathrm{Prob}\left(Z > \frac{c - 8}{4}\right) = \mathrm{Prob}(Z > 2.33)$$

Impose $\frac{c - 8}{4} = 2.33$ and obtain $c = 17.32$.
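As a sketch, the critical value can also be obtained from the inverse cdf of the standard normal instead of the table. The helper `critical_value` and its defaults (`mu0=8`, `sigma=4`) are illustrative and match the running example. The table value 2.33 is a rounding of about 2.326, so the computed value is roughly 17.31 rather than 17.32.

```python
from statistics import NormalDist

def critical_value(alpha, mu0=8.0, sigma=4.0):
    """One-sided critical value c such that Prob(X > c) = alpha under H0."""
    z = NormalDist().inv_cdf(1.0 - alpha)  # point with area alpha on its right
    return mu0 + sigma * z                 # undo the standardization

c_table = 8 + 4 * 2.33        # value obtained from the printed table
c_exact = critical_value(0.01)

print(round(c_table, 2), round(c_exact, 2))
```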

39 Hypothesis Testing

At this point we have all we need to run our test. Suppose that the realization from $X$ that we get is 15. Is $x = 15$ far enough from $\mu = 8$ to lead us to reject $H_0$? Clearly $15 < 17.32$, which means that we cannot reject the hypothesis that $\mu = 8$.

Suppose we run the test with another realization of $X$, say $x = 20$. Is 20 sufficiently close to $\mu = 25$ to lead us to reject $H_0$? Of course it is, given that 20 is above the critical value $c = 17.32$.

40 Hypothesis Testing

Note that when rejecting $H_0$ we cannot be 100% sure that the true value for the parameter $\mu$ is not 8. Similarly, when we fail to reject $H_0$ we cannot be 100% sure that the true value for $\mu$ is actually 8.

What we know is only that, in doing the test several times, we would mistakenly reject $\mu = 8$ with probability $\alpha = 0.01$.

41 Hypothesis Testing

Our critical value $c$ would have been different, had we chosen to work with a different level of significance. For instance, the critical value for $\alpha = 0.05$ is 14.56, while the critical value for $\alpha = 0.1$ is 13.12 (make sure you know how to compute these values).

| α | c |
|---|---|
| 0.01 | 17.32 |
| 0.05 | 14.56 |
| 0.10 | 13.12 |

Suppose that the realization of $X$ is 15. Do we reject $H_0$ when working with a level of significance of 1%? No. But do we reject it if we work with a level of significance of 10%? Yes!

The higher the probability of rejecting $H_0$ by mistake that we are willing to accept, the more likely it is that we reject $H_0$.
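The dependence of the decision on $\alpha$ can be sketched as a short loop. The names are illustrative; the critical values come from the inverse cdf rather than the table, so they differ slightly from 17.32, 14.56 and 13.12, but the decisions are the same.

```python
from statistics import NormalDist

MU0, SIGMA = 8.0, 4.0  # null hypothesis mean and known standard deviation
x = 15.0               # observed realization

def critical_value(alpha):
    """One-sided critical value with area alpha on its right under H0."""
    return MU0 + SIGMA * NormalDist().inv_cdf(1.0 - alpha)

decisions = {}
for alpha in (0.01, 0.05, 0.10):
    c = critical_value(alpha)
    decisions[alpha] = x > c  # reject H0 when x lies above c
    print(f"alpha={alpha:.2f}  c={c:.2f}  reject H0: {decisions[alpha]}")
```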


43 Hypothesis Testing

If, given $x$, the outcome of the test depends on the level of significance chosen, can we find a probability $p$ so that we will reject $H_0$ for all levels of significance from 100% down to $p$?

To see why this question is interesting, compute the probability $p$ on the right of $x = 15$. We will be able to reject $H_0$ for all critical values lower than $x$, or equivalently, for all levels of significance higher than $p$:

$$p = \mathrm{Prob}(X > x) = \mathrm{Prob}\left(Z > \frac{15 - 8}{4}\right) = \mathrm{Prob}(Z > 1.75) = 1 - \mathrm{Prob}(Z < 1.75) = 0.0401$$


45 Hypothesis Testing

If we have chosen $\alpha < p$, then it means that $x < c$ and we cannot reject $H_0$. If we have chosen $\alpha > p$, then it means that $x > c$ and we can reject $H_0$. In our example, we can reject $H_0$ for all levels of significance $\alpha \ge 4.01\%$.

The probability $p$ is called the p-value. The p-value is the lowest level of significance which allows us to reject $H_0$. The lower $p$ is, the easier it will be to reject $H_0$, since we will be able to reject $H_0$ with lower probabilities of rejecting it by mistake. In other words, the lower $p$ is, the more confident you are in rejecting $H_0$.
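A minimal sketch of the p-value computation for the running example, again using `statistics.NormalDist` in place of the table. The function name `p_value` and its defaults are illustrative.

```python
from statistics import NormalDist

def p_value(x, mu0=8.0, sigma=4.0):
    """Area to the right of x under the null distribution N(mu0, sigma^2)."""
    z = (x - mu0) / sigma  # standardize the realization
    return 1.0 - NormalDist().cdf(z)

p = p_value(15.0)
print(round(p, 4))  # 0.0401

# Reject H0 exactly for the levels of significance above p.
for alpha in (0.01, 0.05, 0.10):
    print(alpha, "reject" if alpha > p else "do not reject")
```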

46 Hypothesis Testing

Our last step involves the following question. What if the true value for $\mu$ was actually 25? What is the probability that we mistakenly fail to reject $H_0$ when the true $\mu$ was actually 25?

Given what we saw so far this should be easy: it is the area on the region $(-\infty, c]$ under the pdf of the alternative hypothesis.

47 Hypothesis Testing

In jargon, not rejecting $H_0$ when it is actually wrong is called a type 2 error. Its probability is expressed by the letter $\beta$.

Let's go back to our example. At this point you should be able to compute $\beta$ easily. Under $H_a$ we have

$$\beta = \mathrm{Prob}(X < c) = \mathrm{Prob}(X < 17.32) = \mathrm{Prob}\left(\frac{X - 25}{4} < \frac{17.32 - 25}{4}\right) = \mathrm{Prob}(Z < -1.92) = 0.0274$$

Last, what is the probability that we reject $H_0$ when it was actually wrong and the true one was $H_a$? It is obviously the area under the pdf of $X \sim N(25, 16)$ on the interval $[c, +\infty)$, or equivalently, $1 - \beta = 0.9726$. This is the power of the test.

48 Hypothesis Testing

The power of a test is the probability that we correctly reject the null hypothesis. It is expressed by the letter $\pi$.
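The probability of a type 2 error and the power can be sketched numerically for the running example, taking the table-based critical value $c = 17.32$ as given. Variable names are illustrative.

```python
from statistics import NormalDist

c = 17.32                                  # critical value for alpha = 0.01
alternative = NormalDist(mu=25.0, sigma=4.0)

# beta: probability of NOT rejecting H0 (x below c) when Ha is true.
beta = alternative.cdf(c)

# Power: probability of rejecting H0 (x above c) when Ha is true.
power = 1.0 - beta

print(round(beta, 4), round(power, 4))  # 0.0274 0.9726
```

The power is high here because the two candidate means, 8 and 25, are far apart relative to the standard deviation of 4.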

49 Hypothesis Testing

To sum up, the procedure for running a hypothesis test requires you to:

- Choose a level of significance $\alpha$
- Compute the corresponding critical value $c$ and define the critical region for rejecting or not rejecting the null hypothesis
- Obtain the realization of your random variable and compare it to $c$
- If you reject $H_0$, then you know that the probability of rejecting it when it is actually true is $\alpha$
- Given $H_a$ and the level $c$, compute the power of the test
- The power $\pi$ tells you the probability of rejecting $H_0$ in favor of $H_a$ when $H_a$ is actually true

Alternatively, one can compute the p-value corresponding to the realization of $X$ given by the data and infer all levels of significance at which you can reject $H_0$.

50 Hypothesis Testing

In the example seen so far, choosing $\alpha = 0.01$, we reject $H_0$ for all $x > 17.32$ and do not reject it for all $x < 17.32$.

51 Hypothesis Testing

To conclude, three remarks are necessary.

Remark 1: What if we specify that the alternative hypothesis is not a single point, but is simply $H_a: \mu \neq 8$? Such a test is called a two-sided test, compared to the one-sided test seen so far.

Nothing much changes, other than the fact that the rejection region will be divided into two parts, one in each tail. Find $c_1, c_2$ so that

$$\alpha = \mathrm{Prob}(X < c_1 \text{ or } X > c_2) = 1 - \mathrm{Prob}(c_1 < X < c_2)$$

Then, you reject $H_0$ if $x$ lies outside the interval $[c_1, c_2]$ and do not reject $H_0$ otherwise.
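The slide does not say how to split $\alpha$ between the two tails. A common choice, assumed in the sketch below, is to put $\alpha/2$ in each tail, which makes $c_1$ and $c_2$ symmetric around the null value. The helper name `two_sided_critical_values` is illustrative.

```python
from statistics import NormalDist

def two_sided_critical_values(alpha, mu0=8.0, sigma=4.0):
    """Symmetric critical values with area alpha/2 in each tail (assumption)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return mu0 - sigma * z, mu0 + sigma * z

c1, c2 = two_sided_critical_values(0.05)

# Check: under H0 the total probability of the rejection region is alpha.
null = NormalDist(mu=8.0, sigma=4.0)
prob_reject = null.cdf(c1) + (1.0 - null.cdf(c2))

print(round(c1, 2), round(c2, 2), round(prob_reject, 4))
```

With $\alpha = 0.05$ this gives roughly $c_1 \approx 0.16$ and $c_2 \approx 15.84$, so a realization of 15 would not lead to rejection in the two-sided test.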

52 Hypothesis Testing

Remark 2: why should one use only a single realization of $X$? Doesn't it make more sense to extract $n$ observations and then compute the average?

Yes, of course. Nothing much changes in the interpretation. The only note of caution is on the distribution used. It is possible to show that, if $X_1, \dots, X_n$ are independent observations with $X_i \sim N(\mu, \sigma^2)$, then with

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

we have

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$$

The construction of the test follows the same steps, but you have to use the new distribution.
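As a sketch of what changes with $n$ observations: the standard deviation of the average is $\sigma / \sqrt{n}$, so the critical value moves closer to the null value. The four data points below are hypothetical and only serve to make the example runnable; they are not from the notes.

```python
from math import sqrt
from statistics import NormalDist, mean

MU0, SIGMA, ALPHA = 8.0, 4.0, 0.01

sample = [12.0, 15.0, 9.0, 16.0]  # hypothetical observations
n = len(sample)
x_bar = mean(sample)

# Under H0 the average is N(MU0, SIGMA^2 / n), i.e. std dev SIGMA / sqrt(n).
std_of_mean = SIGMA / sqrt(n)
c = MU0 + std_of_mean * NormalDist().inv_cdf(1.0 - ALPHA)

reject = x_bar > c
print(x_bar, round(c, 2), reject)
```

With a single observation the critical value was about 17.3; with four observations it drops to about 12.65, so the hypothetical average of 13 is enough to reject $H_0$ at the 1% level.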

53 Hypothesis Testing

Remark 3: so far we assumed that we knew the true value of $\sigma^2$. But what is it? Shouldn't we estimate it? Doesn't this change something?

Yes, of course. One can show that the best thing to do is to substitute $\sigma^2$ with the following statistic:

$$s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$$

The important difference in this case is that the test statistic will not follow the normal distribution, but a different distribution (known as the t distribution). The logical steps for constructing the test are the same. For the details, have a look at a statistics book.

### 1 Sufficient statistics

1 Sufficient statistics A statistic is a function T = rx 1, X 2,, X n of the random sample X 1, X 2,, X n. Examples are X n = 1 n s 2 = = X i, 1 n 1 the sample mean X i X n 2, the sample variance T 1 =

### Introduction to Hypothesis Testing

I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

### Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011

Chicago Booth BUSINESS STATISTICS 41000 Final Exam Fall 2011 Name: Section: I pledge my honor that I have not violated the Honor Code Signature: This exam has 34 pages. You have 3 hours to complete this

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### Joint Exam 1/P Sample Exam 1

Joint Exam 1/P Sample Exam 1 Take this practice exam under strict exam conditions: Set a timer for 3 hours; Do not stop the timer for restroom breaks; Do not look at your notes. If you believe a question

### STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

### E3: PROBABILITY AND STATISTICS lecture notes

E3: PROBABILITY AND STATISTICS lecture notes 2 Contents 1 PROBABILITY THEORY 7 1.1 Experiments and random events............................ 7 1.2 Certain event. Impossible event............................

### Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes Simcha Pollack, Ph.D. St. John s University Tobin College of Business Queens, NY, 11439 pollacks@stjohns.edu

### Domain of a Composition

Domain of a Composition Definition Given the function f and g, the composition of f with g is a function defined as (f g)() f(g()). The domain of f g is the set of all real numbers in the domain of g such

### Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Spring 204 Class 9: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.) Big Picture: More than Two Samples In Chapter 7: We looked at quantitative variables and compared the

### 1 The Brownian bridge construction

The Brownian bridge construction The Brownian bridge construction is a way to build a Brownian motion path by successively adding finer scale detail. This construction leads to a relatively easy proof

### Conditional Probability, Hypothesis Testing, and the Monty Hall Problem

Conditional Probability, Hypothesis Testing, and the Monty Hall Problem Ernie Croot September 17, 2008 On more than one occasion I have heard the comment Probability does not exist in the real world, and

### Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Introduction to Hypothesis Testing CHAPTER 8 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1 Identify the four steps of hypothesis testing. 2 Define null hypothesis, alternative

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### p ˆ (sample mean and sample

Chapter 6: Confidence Intervals and Hypothesis Testing When analyzing data, we can t just accept the sample mean or sample proportion as the official mean or proportion. When we estimate the statistics

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### Chapter 4 Lecture Notes

Chapter 4 Lecture Notes Random Variables October 27, 2015 1 Section 4.1 Random Variables A random variable is typically a real-valued function defined on the sample space of some experiment. For instance,

### t Tests in Excel The Excel Statistical Master By Mark Harmon Copyright 2011 Mark Harmon

t-tests in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com www.excelmasterseries.com

### Recall this chart that showed how most of our course would be organized:

Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

### WISE Power Tutorial All Exercises

Name Date Class WISE Power Tutorial All Exercises Power: The B.E.A.N. Mnemonic Four interrelated features of power can be summarized using BEAN B Beta Error (Power = 1 - Beta Error): Beta error (or Type II

### Name: Date: Use the following to answer questions 3-4:

Name: Date: 1. Determine whether each of the following statements is true or false. A) The margin of error for a 95% confidence interval for the mean increases as the sample size increases. B) The margin

### Statistics 100A Homework 8 Solutions

Part : Chapter 7 Statistics 100A Homework 8 Solutions Ryan Rosario. A player throws a fair die and simultaneously flips a fair coin. If the coin lands heads, then she wins twice, and if tails, the one-half

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10 Introduction to Discrete Probability Probability theory has its origins in gambling analyzing card games, dice,

### STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science Mondays 2:10 4:00 (GB 220) and Wednesdays 2:10 4:00 (various) Jeffrey Rosenthal Professor of Statistics, University of Toronto

### Normality Testing in Excel

Normality Testing in Excel By Mark Harmon Copyright 2011 Mark Harmon No part of this publication may be reproduced or distributed without the express permission of the author. mark@excelmasterseries.com

### 8. THE NORMAL DISTRIBUTION

8. THE NORMAL DISTRIBUTION The normal distribution with mean μ and variance σ² has the following density function: The normal distribution is sometimes called a Gaussian Distribution, after its inventor,

### 1-3 id id no. of respondents 101-300 4 respon 1 responsible for maintenance? 1 = no, 2 = yes, 9 = blank

Basic Data Analysis Graziadio School of Business and Management Data Preparation & Entry Editing: Inspection & Correction Field Edit: Immediate follow-up (complete? legible? comprehensible? consistent?

### CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

### Stats Review Chapters 9-10

Stats Review Chapters 9-10 Created by Teri Johnson Math Coordinator, Mary Stangler Center for Academic Success Examples are taken from Statistics 4 E by Michael Sullivan, III And the corresponding Test

### Math/Stats 425 Introduction to Probability. 1. Uncertainty and the axioms of probability

Math/Stats 425 Introduction to Probability 1. Uncertainty and the axioms of probability Processes in the real world are random if outcomes cannot be predicted with certainty. Example: coin tossing, stock

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

STATISTICS/GRACEY PRACTICE TEST/EXAM 2 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Identify the given random variable as being discrete or continuous.

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Mathematical Induction

Mathematical Induction (Handout March 8, 01) The Principle of Mathematical Induction provides a means to prove infinitely many statements all at once The principle is logical rather than strictly mathematical,

### Homework 4 - KEY. Jeff Brenion. June 16, 2004. Note: Many problems can be solved in more than one way; we present only a single solution here.

Homework 4 - KEY Jeff Brenion June 16, 2004 Note: Many problems can be solved in more than one way; we present only a single solution here. 1 Problem 2-1 Since there can be anywhere from 0 to 4 aces, the

### Math/Stats 342: Solutions to Homework

Math/Stats 342: Solutions to Homework Steven Miller (sjm1@williams.edu) November 17, 2011 Abstract Below are solutions / sketches of solutions to the homework problems from Math/Stats 342: Probability

### Solutions for Practice problems on proofs

Solutions for Practice problems on proofs Definition: (even) An integer n ∈ Z is even if and only if n = 2m for some number m ∈ Z. Definition: (odd) An integer n ∈ Z is odd if and only if n = 2m + 1 for some

### 12.5: CHI-SQUARE GOODNESS OF FIT TESTS

12.5: Chi-Square Goodness of Fit Tests CD12-1 12.5: CHI-SQUARE GOODNESS OF FIT TESTS In this section, the χ² distribution is used for testing the goodness of fit of a set of data to a specific probability

### Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 9 Introduction to Hypothesis Testing

Business Statistics, 9e (Groebner/Shannon/Fry) Chapter 9 Introduction to Hypothesis Testing 1) Hypothesis testing and confidence interval estimation are essentially two totally different statistical procedures

### If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

### Review #2. Statistics

Review #2 Statistics Find the mean of the given probability distribution. 1) x P(x) 0 0.19 1 0.37 2 0.16 3 0.26 4 0.02 A) 1.64 B) 1.45 C) 1.55 D) 1.74 2) The number of golf balls ordered by customers of

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Mind on Statistics. Chapter 12

Mind on Statistics Chapter 12 Sections 12.1 Questions 1 to 6: For each statement, determine if the statement is a typical null hypothesis (H₀) or alternative hypothesis (Hₐ). 1. There is no difference

### Chapter 25: Exchange in Insurance Markets

Chapter 25: Exchange in Insurance Markets 25.1: Introduction In this chapter we use the techniques that we have been developing in the previous 2 chapters to discuss the trade of risk. Insurance markets

### 4.1 4.2 Probability Distribution for Discrete Random Variables

4.1 4.2 Probability Distribution for Discrete Random Variables Key concepts: discrete random variable, probability distribution, expected value, variance, and standard deviation of a discrete random variable.

### Statistics 104: Section 6!

Page 1 Statistics 104: Section 6! TF: Deirdre (say: Dear-dra) Bloome Email: dbloome@fas.harvard.edu Section Times Thursday 2pm-3pm in SC 109, Thursday 5pm-6pm in SC 705 Office Hours: Thursday 6pm-7pm SC

### 1. How different is the t distribution from the normal?

Statistics 101 106 Lecture 7 (20 October 98) c David Pollard Page 1 Read M&M 7.1 and 7.2, ignoring starred parts. Reread M&M 3.2. The effects of estimated variances on normal approximations. t-distributions.

### LESSON 7: IMPORTING AND VECTORIZING A BITMAP IMAGE

LESSON 7: IMPORTING AND VECTORIZING A BITMAP IMAGE In this lesson we'll learn how to import a bitmap logo, transform it into a vector and perform some editing on the vector to clean it up. The concepts

### 1. What is the critical value for this 95% confidence interval? CV = z.025 = invnorm(0.025) = 1.96

1 Final Review 2 Review 2.1 CI 1-propZint Scenario 1 A TV manufacturer claims in its warranty brochure that in the past not more than 10 percent of its TV sets needed any repair during the first two years

### Basic Concepts in Research and Data Analysis

Basic Concepts in Research and Data Analysis Introduction: A Common Language for Researchers; Steps to Follow When Conducting Research; The Research Question; The Hypothesis; Defining the

### Problem sets for BUEC 333 Part 1: Probability and Statistics

Problem sets for BUEC 333 Part 1: Probability and Statistics I will indicate the relevant exercises for each week at the end of the Wednesday lecture. Numbered exercises are back-of-chapter exercises from

### CHECKLIST FOR THE DEGREE PROJECT REPORT

Kerstin Frenckner, kfrenck@csc.kth.se Copyright CSC 25 mars 2009 CHECKLIST FOR THE DEGREE PROJECT REPORT This checklist has been written to help you check that your report matches the demands that are

### Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

ECS20 Discrete Mathematics Quarter: Spring 2007 Instructor: John Steinberger Assistant: Sophie Engle (prepared by Sophie Engle) Homework 8 Hints Due Wednesday June 6 th 2007 Section 6.1 #16 What is the

### CHANCE ENCOUNTERS. Making Sense of Hypothesis Tests. Howard Fincher. Learning Development Tutor. Upgrade Study Advice Service

CHANCE ENCOUNTERS Making Sense of Hypothesis Tests Howard Fincher Learning Development Tutor Upgrade Study Advice Service Oxford Brookes University Howard Fincher 2008 PREFACE This guide has a restricted

### Example 1. Consider the following two portfolios: 2. Buy one c(s(t), 20, τ, r) and sell one c(s(t), 10, τ, r).

Chapter 4 Put-Call Parity 1 Bull and Bear Financial analysts use words such as bull and bear to describe the trend in stock markets. Generally speaking, a bull market is characterized by rising prices.

### Basic Probability. Probability: The part of Mathematics devoted to quantify uncertainty

AMS 5 PROBABILITY Basic Probability Probability: The part of Mathematics devoted to quantify uncertainty Frequency Theory Bayesian Theory Game: Playing Backgammon. The chance of getting (6,6) is 1/36.

### ECE302 Spring 2006 HW4 Solutions February 6, 2006 1

ECE302 Spring 2006 HW4 Solutions February 6, 2006 1 Solutions to HW4 Note: Most of these solutions were generated by R. D. Yates and D. J. Goodman, the authors of our textbook. I have added comments in

### Some Essential Statistics The Lure of Statistics

Some Essential Statistics The Lure of Statistics Data Mining Techniques, by M.J.A. Berry and G.S Linoff, 2004 Statistics vs. Data Mining..lie, damn lie, and statistics mining data to support preconceived

### MATH 140 Lab 4: Probability and the Standard Normal Distribution

MATH 140 Lab 4: Probability and the Standard Normal Distribution Problem 1. Flipping a Coin Problem In this problem, we want to simulate the process of flipping a fair coin 1000 times. Note that the outcomes

### Testing for Lack of Fit

Chapter 6 Testing for Lack of Fit How can we tell if a model fits the data? If the model is correct then σ̂² should be an unbiased estimate of σ². If we have a model which is not complex enough to fit

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

15.062 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Financial Mathematics and Simulation MATH 6740 1 Spring 2011 Homework 2

Financial Mathematics and Simulation MATH 6740 1 Spring 2011 Homework 2 Due Date: Friday, March 11 at 5:00 PM This homework has 170 points plus 20 bonus points available but, as always, homeworks are graded

### Tests of Hypotheses Using Statistics

Tests of Hypotheses Using Statistics Adam Massey and Steven J. Miller Mathematics Department Brown University Providence, RI 02912 Abstract We present the various methods of hypothesis testing that one

### An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

The Islamic University of Gaza Faculty of Commerce Department of Economics and Political Sciences An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS Practice

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### The Assumption(s) of Normality

The Assumption(s) of Normality Copyright 2000, 2011, J. Toby Mordkoff This is very complicated, so I'll provide two versions. At a minimum, you should know the short one. It would be great if you knew

### Near Optimal Solutions

Near Optimal Solutions Many important optimization problems are lacking efficient solutions. NP-Complete problems unlikely to have polynomial time solutions. Good heuristics important for such problems.

### MTH6120 Further Topics in Mathematical Finance Lesson 2

MTH6120 Further Topics in Mathematical Finance Lesson 2 Contents: 1.2.3 Non-constant interest rates; 1.3 Arbitrage and Black-Scholes Theory; 1.3.1 Informal

### Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

### MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo © Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents: 2 MATH4427 Notebook 2; 2.1 Definitions and Examples

### Schooling, Political Participation, and the Economy. (Online Supplementary Appendix: Not for Publication)

Schooling, Political Participation, and the Economy (Online Supplementary Appendix: Not for Publication) Filipe R. Campante Davin Chor July 200 Abstract In this online appendix, we present the proofs for

### MA 1125 Lecture 14 - Expected Values. Friday, February 28, 2014. Objectives: Introduce expected values.

MA 1125 Lecture 14 - Expected Values Friday, February 28, 2014. Objectives: Introduce expected values. Means, Variances, and Standard Deviations of Probability Distributions Two classes ago, we computed the

### 1 Short Introduction to Time Series

ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x_1, ..., x_t, ..., x_T indexed by an integer value t. The

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Contents: Occam's razor; A look at data I

### Fractions Packet. Contents

Fractions Packet Contents: Intro to Fractions; Reducing Fractions; Ordering Fractions; Multiplication and Division of Fractions; Addition and Subtraction of Fractions; Answer Keys

### Tutorial 5: Hypothesis Testing

Tutorial 5: Hypothesis Testing Rob Nicholls nicholls@mrc-lmb.cam.ac.uk MRC LMB Statistics Course 2014 Contents: 1 Introduction; 2 Testing distributional assumptions

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### Chapter 23 Inferences About Means

Chapter 23 Inferences About Means Chapter 23 - Inferences About Means 391 Chapter 23 Solutions to Class Examples 1. See Class Example 1. 2. We want to know if the mean battery lifespan exceeds the 300-minute

### Factoring Patterns in the Gaussian Plane

Factoring Patterns in the Gaussian Plane Steve Phelps Introduction This paper describes discoveries made at the Park City Mathematics Institute, 00, as well as some proofs. Before the summer I understood

### Chapter 7 Section 1 Homework Set A

Chapter 7 Section 1 Homework Set A 7.15 Finding the critical value t *. What critical value t * from Table D (use software, go to the web and type t distribution applet) should be used to calculate the

### STATISTICS 8, FINAL EXAM. Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4

STATISTICS 8, FINAL EXAM NAME: KEY Seat Number: Last six digits of Student ID#: Circle your Discussion Section: 1 2 3 4 Make sure you have 8 pages. You will be provided with a table as well, as a separate

### Chi Square Tests. Chapter 10. 10.1 Introduction

Contents: 10 Chi Square Tests; 10.1 Introduction; 10.2 The Chi Square Distribution; 10.3 Goodness of Fit Test; 10.4 Chi Square

### Grade 4 - Module 5: Fraction Equivalence, Ordering, and Operations

Grade 4 - Module 5: Fraction Equivalence, Ordering, and Operations Benchmark (standard or reference point by which something is measured) Common denominator (when two or more fractions have the same denominator)

### Calculating P-Values. Parkland College. Isela Guerra Parkland College. Recommended Citation

Parkland College A with Honors Projects Honors Program 2014 Calculating P-Values Isela Guerra Parkland College Recommended Citation Guerra, Isela, "Calculating P-Values" (2014). A with Honors Projects.

### Basic Probability and Statistics Review. Six Sigma Black Belt Primer

Basic Probability and Statistics Review Six Sigma Black Belt Primer Pat Hammett, Ph.D. January 2003 Instructor Comments: This document contains a review of basic probability and statistics. It also includes

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### Solving Quadratic Equations by Factoring

4.7 Solving Quadratic Equations by Factoring 4.7 OBJECTIVE 1. Solve quadratic equations by factoring The factoring techniques you have learned provide us with tools for solving equations that can be written

### Online 12 - Sections 9.1 and 9.2-Doug Ensley

Student: Date: Instructor: Doug Ensley Course: MAT117 01 Applied Statistics - Ensley Assignment: Online 12 - Sections 9.1 and 9.2 1. Does a P-value of 0.001 give strong evidence or not especially strong

### Likelihood Approaches for Trial Designs in Early Phase Oncology

Likelihood Approaches for Trial Designs in Early Phase Oncology Clinical Trials Elizabeth Garrett-Mayer, PhD Cody Chiuzan, PhD Hollings Cancer Center Department of Public Health Sciences Medical University

### Practice problems for Homework 12 - confidence intervals and hypothesis testing. Open the Homework Assignment 12 and solve the problems.

Practice problems for Homework 12 - confidence intervals and hypothesis testing. Read sections 10..3 and 10.3 of the text. Solve the practice problems below. Open the Homework Assignment 12 and solve the

### MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times

### INCIDENCE-BETWEENNESS GEOMETRY

INCIDENCE-BETWEENNESS GEOMETRY MATH 410, CSUSM. SPRING 2008. PROFESSOR AITKEN This document covers the geometry that can be developed with just the axioms related to incidence and betweenness. The full

### Chapter 4. Hypothesis Tests

Chapter 4 Hypothesis Tests 1 2 CHAPTER 4. HYPOTHESIS TESTS 4.1 Introducing Hypothesis Tests Key Concepts Motivate hypothesis tests Null and alternative hypotheses Introduce Concept of Statistical significance

### A SURVEY ON CONTINUOUS ELLIPTICAL VECTOR DISTRIBUTIONS

A SURVEY ON CONTINUOUS ELLIPTICAL VECTOR DISTRIBUTIONS Eusebio GÓMEZ, Miguel A. GÓMEZ-VILLEGAS and J. Miguel MARÍN Abstract In this paper it is taken up a revision and characterization of the class of

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority Tests for One Mean Introduction This module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### Risk, Return and Market Efficiency

Risk, Return and Market Efficiency For 9.220, Term 1, 2002/03 02_Lecture16.ppt Student Version Outline 1. Introduction 2. Types of Efficiency 3. Informational Efficiency 4. Forms of Informational Efficiency

### An introduction to statistical data analysis (Summer 2014) Lecture notes. Taught by Shravan Vasishth [vasishth@uni-potsdam.de]

An introduction to statistical data analysis (Summer 2014) Lecture notes Taught by Shravan Vasishth [vasishth@uni-potsdam.de] Last edited: May 9, 2014 2 > sessioninfo() R version 3.0.2 (2013-09-25) Platform:

### Algebra Practice Problems for Precalculus and Calculus

Algebra Practice Problems for Precalculus and Calculus Solve the following equations for the unknown x: 1. 5 = 7x 16 2. 2x 3 = 5 x 3. 4. 1 2 (x 3) + x = 17 + 3(4 x) 5 x = 2 x 3 Multiply the indicated polynomials