The Normal distribution



Similar documents
The Standard Normal distribution

Chapter 4. Probability and Probability Distributions

Hypothesis Testing for Beginners

Lecture 8. Confidence intervals and the central limit theorem

Week 3&4: Z tables and the Sampling Distribution of X

4. Continuous Random Variables, the Pareto and Normal Distributions

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

MEASURES OF VARIATION

Normal distribution. ) 2 /2σ. 2π σ

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

Math 151. Rumbos Spring Solutions to Assignment #22

Lecture 10: Depicting Sampling Distributions of a Sample Proportion

Sampling Distributions

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

6.4 Normal Distribution

Estimation of σ 2, the variance of ɛ

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Descriptive Statistics

Pr(X = x) = f(x) = λe λx

MATH 10: Elementary Statistics and Probability Chapter 7: The Central Limit Theorem

EMPIRICAL FREQUENCY DISTRIBUTION

Coefficient of Determination

Department of Civil Engineering-I.I.T. Delhi CEL 899: Environmental Risk Assessment Statistics and Probability Example Part 1

3.2 Measures of Spread

5.1 Identifying the Target Parameter

MBA 611 STATISTICS AND QUANTITATIVE METHODS

BA 275 Review Problems - Week 6 (10/30/06-11/3/06) CD Lessons: 53, 54, 55, 56 Textbook: pp , ,

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

Dongfeng Li. Autumn 2010

1 Prior Probability and Posterior Probability

An Introduction to Basic Statistics and Probability

UNIT I: RANDOM VARIABLES PART- A -TWO MARKS

3.4 The Normal Distribution

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Inference for two Population Means

The normal approximation to the binomial

5/31/ Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

Chapter 7 - Practice Problems 1

Simple Regression Theory II 2010 Samuel L. Baker

Confidence intervals

Key Concept. Density Curve

Normal Distribution. Definition A continuous random variable has a normal distribution if its probability density. f ( y ) = 1.

Confidence Intervals for the Difference Between Two Means

CHAPTER 6: Continuous Uniform Distribution: 6.1. Definition: The density function of the continuous random variable X on the interval [A, B] is.

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Chapter 5: Normal Probability Distributions - Solutions

Week 4: Standard Error and Confidence Intervals

6.2 Normal distribution. Standard Normal Distribution:

Risk and return (1) Class 9 Financial Management,

An Introduction to Statistics Course (ECOE 1302) Spring Semester 2011 Chapter 10- TWO-SAMPLE TESTS

Chapter 9 Monté Carlo Simulation

HYPOTHESIS TESTING: POWER OF THE TEST

1 Sufficient statistics

5. Continuous Random Variables

1.5 Oneway Analysis of Variance

Lecture 8: More Continuous Random Variables

Beta Distribution. Paul Johnson and Matt Beverlin June 10, 2013

12.5: CHI-SQUARE GOODNESS OF FIT TESTS

Northumberland Knowledge

Chapter 3 RANDOM VARIATE GENERATION

Questions and Answers

9.07 Introduction to Statistical Methods Homework 4. Name:

AP Statistics Solutions to Packet 2

Statistical Data analysis With Excel For HSMG.632 students

Lesson 4 Measures of Central Tendency

Standard Deviation Estimator

The Normal Distribution. Alan T. Arnholt Department of Mathematical Sciences Appalachian State University

Chicago Booth BUSINESS STATISTICS Final Exam Fall 2011

Sample Term Test 2A. 1. A variable X has a distribution which is described by the density curve shown below:

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

The normal approximation to the binomial

NEW YORK CITY COLLEGE OF TECHNOLOGY The City University of New York

The Normal Distribution

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

Practice Problems and Exams

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

Exploratory Data Analysis

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

Comparing Means in Two Populations

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Confidence Intervals for One Standard Deviation Using Standard Deviation

Exact Confidence Intervals

Chapter 7 Section 1 Homework Set A

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Statistics 104: Section 6!

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

The Normal Distribution

z-scores AND THE NORMAL CURVE MODEL

Notes on Continuous Random Variables

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools

Capital Market Theory: An Overview. Return Measures

The Wilcoxon Rank-Sum Test

Lecture 2: Discrete Distributions, Normal Distributions. Chapter 1

Once saved, if the file was zipped you will need to unzip it. For the files that I will be posting you need to change the preferences.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Transcription:

The Normal distribution The normal probability distribution is the most common model for relative frequencies of a quantitative variable. Bell-shaped and described by the function f(y) = 1 2σ π e{ 1 2σ 2(y µ)2}, where y is the quantitative variable whose relative frequencies we are modeling, µ is the mean of the distribution, and σ 2 is the variance. For fixed values of µ and σ (e.g., 0 and 1) we can evaluate f(y) for a range of values of y. The plot of f(y) against y has the familiar bell shape. Figures. 1

Normal distribution (cont d) Examples of possibly normally distributed variables: Sale price of houses in a given area Interest rates offered by financial institutions in the US Corn yields across the 99 counties in Iowa. Examples of variables that are not normally distributed: Number of traffic accidents per week at each intersection in Ames. Proportion of US population who voted for Bush, Gore, or Nader in 2000. Proportion of consumers who prefer Coke or Pepsi. 2

Normal distribution (cont d) If the normal distribution is a good model for the relative frequencies of y, then we say that y is distributed as a normal random variable and write y N(µ, σ 2 ). We will be interesting in estimating µ and σ 2 from sample data. Given estimates of µ and σ 2, we can also estimate the probability that y is in the interval (a, b). From calculus: Prob(a < y < b) = b a f(y)dy. No need to know calculus! Tables of probabilities under the normal distribution can be used to calculate any probability of interest. Example: Table C.1 in text. Entries in the table give probability under the curve between 0 and z, where z = y µ σ The variable z is called a standard normal random variable and its distribution is a normal distribution with mean µ = 0 and variance σ 2 = 1. 3

Normal distribution (cont d) Example 1: y N(50, 225). Then z = y 50 N(0, 1). 15 To compute the probability that y is between 35 and 70, we first standardize and then use the table: 1. Standardize by subtracting mean and dividing by the standard deviation. For y = 70: and for y = 35: z = z = 70 50 15 35 50 15 = 1.33, = 1.00. 2. Get area under curve between 0 and 1.33 and between -1.00 and 0, and add them up: 0.4082 + 0.3413 = 0.7495. 3. Interpretation in English : if y is normal with mean 50 and standard deviation 15, we expect that the probability that y is between 35 and 70 is about 75%. 4

Normal distribution (cont d) Example 2: y N(10, 64). Find the probability that y is bigger than 15 and also the probability that y is less than 7. Prob(y > 15) = Prob( y 10 15 10 > ) 8 8 = Prob(z > 0.625) = 1 Prob(z < 0.625) = 1 (0.5 + 0.2357) = 0.264. See figure. Prob(y < 7) = Prob( y 10 < 7 10 ) 8 8 = Prob(z < 0.375) = 1 Prob(z > 0.375) = 1 (0.5 + 0.1480) = 0.352. See figure. 5

Normal distribution (cont d) For any value of (µ, σ 2 ), y is equally likely to be above or below the mean: Prob(y < µ) = Prob(y > µ) = 0.5), because the normal distribution is symmetric about the mean. Because of symmetry, the mean, median and mode of a normal variable are the same. A normal random variable can take on any value on the real line, so that Prob( < y < ) = 1. The probability that y is within a standard deviation of the mean is approximately 0.68: Prob( σ < y < σ) = Prob( 1 < z < 1) 0.68. and also: Prob( 2σ < y < 2σ) = Prob( 2 < z < 2) 0.95 and Prob( 3σ < y < 3σ) = Prob( 3 < z < 3) 0.99. 6

Sampling distributions We use sample data to make inferences about populations. In particular, we compute sample statistics and use them as estimators of population parameters. The sample mean ȳ is a good estimator of the population mean µ and the sample variance S 2 (or ˆσ 2 is a good estimator of the population variance σ 2. How reliable an estimator is a sample statistic? To answer the question, we need to know the sampling distribution of the statistic. This is one of the most difficult concepts in statistics! 7

Sampling distributions (cont d) Suppose that y N(µ, 25) and we wish to estimate µ. We proceed as follows: 1. Draw a sample of size n from the population: y 1, y 2,..., y n. 2. Compute the sample mean: ȳ = n 1 i y i. Two things to note: 1. The sample is random! If I had collected more than one sample of size n from the same population, I would have obtained different values of y. Then, ȳ is also a random variable. 2. The larger n, the more reliable is ȳ as an estimator of µ. Example using simulation: pretend that we have y N(20, 25) and draw 30 random samples, each of size n = 10 from the population using the computer. With each sample, compute ȳ. 8

Sampling distributions (cont d) The sampling distribution of a statistic computed from a sample of size n > 1 has smaller variance than the distribution of the variable itself. Theorem: If y i, y 2,..., y n are a random sample from some population, then: Mean of ȳ equals mean of y: E(ȳ) = µȳ = µ. Variance of ȳ is smaller than variance of y: σ 2 ȳ = σ 2 n. The larger the sample, the smaller the variance (or the higher the reliability) of ȳ. Central Limit Theorem: For large n, the sample mean ȳ has a distribution that is approximately normal, with mean µ and variance σ 2 /n, regardless of the shape of the distribution from which we sample the y s. The larger the sample, the better the approximation (see Figure 1.14). Example in lab. 9

Sampling distributions (cont d) Given the sampling distribution of ȳ, we can make probability statements such as: Prob(µ 2 σ n < ȳ < µ + 2 σ n ) = 0.95. Prob(ȳ > a) = Prob(z > a µ σ/ n ) and use the standard normal table to get an answer. A preview of coming attractions: if it is true that Prob(µ 2 σ n < ȳ < µ + 2 σ n ) = 0.95 then we can estimate the following interval using sample data: (ȳ 2 ˆσ n, ȳ + 2 ˆσ n ) We call it a 95% confidence interval for the population mean. As in the case of ȳ, the interval is also random and varies from sample to sample. If we drew 100 samples of size n from some population and computed 100 intervals like the one above, about 95 of them would cover the true but unknown value of µ. 10