11.2 POINT ESTIMATES AND CONFIDENCE INTERVALS


Point Estimates

Suppose we want to estimate the proportion of Americans who approve of the president. In the previous section we took a random sample of size 1200 from the population and used the proportion of the people in the sample who approved of the president to estimate the proportion of the people in the entire country who approve of the president. The proportion of people in the sample who approve of the president is an example of a point estimate. It is called a point estimate because it is a single number that estimates a population parameter, here the proportion of the people in the country who approve of the president.

Example 11.3 Suppose that out of 1200 people in the sample, 615 approve of the president. The point estimate of the proportion of the people in the country who approve of the president is 615/1200 = 0.5125.

Example 11.4 Suppose that in the example of the 10 containers of sampled water, the sample average of the 10 observed E. coli bacteria densities is $1500/\text{cm}^2$. The point estimate of the density of E. coli bacteria in the swimming pool is $1500/\text{cm}^2$.

Example 11.5 It was desired to find the population mean and standard deviation of the ages of the students enrolled at a particular community college. It was impossible to survey all the students, so a random sample of 100 students was taken, and the mean and the standard deviation of the ages of the students in the sample were calculated. The sample mean age, $\bar{X}$, is a point estimate of the population mean age, $\mu$. The sample standard deviation of the ages, $S = 2.25$, is a point estimate of the population standard deviation, $\sigma$.

In summary, a point estimate for a population parameter of interest is a statistic computed from the sample. It is believed to be an effective estimate of the unknown population parameter. Often the formula to use for the statistic is obvious. For example, for the population parameter $\mu$, which is the mean over a (presumably large) population, it seems that the sample mean will be a good estimate. However, in more complex situations the choice of an estimate of a population parameter is not always clear.
Statisticians may find either that they have no idea how to use the sample data to estimate the population parameter of interest, or that they have several equally plausible competing estimates to select from. A relatively simple example of the latter case is the estimation of the population variance $\sigma^2$. Let $n$ denote the sample size. Many statisticians would use $S^2 = \sum_i (X_i - \bar{X})^2 / n$, while many others would use $\sum_i (X_i - \bar{X})^2 / (n - 1)$. Here $X_i$ denotes the $i$th observation of the random sample. A solid argument can be made for either estimate. In fact, a significant amount of statistical theory is devoted to finding the best estimate of a population parameter. Let's consider an example in which the formula for a good estimate is not obvious.

Example 11.6 Suppose the waiting time for a train that goes to the parking area from a particular terminal at Chicago's O'Hare International Airport obeys a continuous uniform distribution on the time interval $[0, T]$, where $T$ is an unknown population parameter. Thus a train always arrives within $T$ minutes, but it is equally likely to arrive at any moment during this waiting time. Suppose we interview five randomly selected passengers and find that their waiting times were (in minutes) 1.4, 4.7, 4.2, 5.1, and 2.1. We want to estimate the parameter $T$. Clearly the sample mean, 3.5 minutes, is a poor choice for an estimate of $T$: it estimates the average wait, $T/2$, and three of the five observed waits already exceed it. What formula should we use? A widely used technique for producing an estimate when we have none in mind, called maximum likelihood estimation, leads us to an estimate that is equal to the maximum of the five observed times, which is 5.1 minutes. This choice of the maximum of the observed values as our point estimate is not obvious. Even though it might not be the best choice for an estimate of $T$, it seems clearly better than $\bar{X}$. We will not consider maximum likelihood estimation in this book.

A point estimate reports one single value that we estimate to be the true value of the population parameter. However, point estimates have the important limitation of not informing us how much the estimate is likely to be in error. An estimate computed from a sample will almost always differ from the actual population parameter, so it will almost always have some amount of error associated with it. By simply reporting the point estimate of a parameter, we have essentially ignored the important issue of the likely error size associated with the estimate.

Example 11.7 Reconsider Example 11.5. A second random sample, this time of 200 students, was taken. Suppose, to keep our explanation simple, that the point estimate of the population mean is the same in both cases. The standard deviation of the second sample mean would be lower than the standard deviation of the first sample mean, because the second sample is larger (doubling the sample size shrinks it by a factor of about $1/\sqrt{2} \approx 0.71$). So even though the two point estimates of the population mean are the same, the second one can be expected to be more accurate, because it is based on more information, namely twice as many observations.
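The estimates discussed in Examples 11.6 and 11.7 can be reproduced with a short Python sketch. The waiting times and the value S = 2.25 come from the text; the function name `standard_error` is illustrative, and reusing S = 2.25 for the second sample of 200 students is an assumption made only to isolate the effect of sample size.

```python
import math
import statistics

# Waiting times in minutes from Example 11.6.
times = [1.4, 4.7, 4.2, 5.1, 2.1]
n = len(times)

mean = statistics.fmean(times)   # sample mean: a poor estimate of T
t_hat = max(times)               # largest observed wait: the estimate named in the text

# Two competing estimates of the population variance.
ss = sum((x - mean) ** 2 for x in times)   # sum of squared deviations
var_n = ss / n                             # divide by n
var_n_minus_1 = ss / (n - 1)               # divide by n - 1


def standard_error(s, sample_size):
    """Estimated standard deviation of a sample mean: s / sqrt(sample_size)."""
    return s / math.sqrt(sample_size)


# Example 11.7, ASSUMING both samples have the same sample standard deviation.
se_100 = standard_error(2.25, 100)
se_200 = standard_error(2.25, 200)

print(f"sample mean = {mean:.2f}, estimate of T = {t_hat}")
print(f"variance with n: {var_n:.3f}, with n - 1: {var_n_minus_1:.3f}")
print(f"standard error, n = 100: {se_100:.4f}; n = 200: {se_200:.4f}")
```

The two variance estimates differ noticeably here only because the sample is tiny; for large n the divisors n and n - 1 give nearly the same value.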

Confidence Intervals

To improve on point estimates, statisticians usually report an interval of values that they believe the parameter is highly likely to lie in. Usually the point estimate is the midpoint of the interval, and the endpoints of the interval communicate the size of the error associated with the estimate (recall that point estimates ignore this error) and how confident we are that the population parameter is in the interval. The intervals are called confidence intervals. Typical confidence levels used in practice for confidence intervals are 90%, 95%, or 99%, with 95% occurring most frequently in applications.

In Examples 11.1 and 11.3, if we are given a 95% confidence interval for the proportion of the population that approves of the president, which can be shown to be the interval (0.48, 0.54), we say we are 95% confident that the population proportion is contained in the confidence interval. In Examples 11.2 and 11.4, given a 90% confidence interval for the density of bacteria in the swimming pool, (1490, 1510) say, we say we are 90% confident that the density of bacteria in the swimming pool is contained in the confidence interval. (We will learn how to calculate confidence intervals for different population parameters later in the chapter.)

What does it really mean to state a 95% confidence interval for the unknown population proportion approving of the president? Although 95% sounds impressive, we cannot be satisfied unless we understand what it means. Theoretically, it means that the probability is 0.95 that such a confidence interval, which is random because the sample it is formed from is random, will contain (surround) the unknown proportion in the population approving of the president. Our experimental view of probability based on the five-step method will help us understand more clearly and deeply what this probability of 0.95 means in practice.

Just as we do simulations over and over in the five-step method, imagine that a statistician does the sampling experiment of Example 11.1 over and over (1000 times, say) and each time computes a 95% confidence interval from the 1200 sampled people. Now we can find the experimental probability given by the proportion of the 1000 confidence intervals that actually cover the true fraction of the population favoring the president. Since 0.95 is the theoretical probability of the interval containing the population proportion, this experimental probability of the 1000 confidence intervals including the true value will also be close to 0.95. (Below, we will simulate 100 such 95% confidence intervals and calculate the experimental confidence interval probability.)

Of course, in a real application there will be only one random sample and hence only one such confidence interval, such as the (0.48, 0.54) interval of Examples 11.1 and 11.3. But the statistician obtaining this one sample knows, because of the experimental probability viewpoint, that this confidence interval is very likely to be correct in the sense that it contains the true value (since about 95% of such confidence intervals would cover the true population proportion). In the case of the (0.48, 0.54) interval of Examples 11.1 and 11.3, we know it is very likely that the true proportion of people favoring the president lies between 0.48 and 0.54.

Now, in light of this insight into how to interpret the confidence interval percentage, let's return to the Key Problem. The St. Louis Post Dispatch explained the concept of its reported confidence interval this way: "[A 95% confidence interval] means if the survey were taken 100 times, the results for the [random] group of respondents would each vary no more than 5.7 percent in either direction from the true population percentage opposing the stadium [about] 95% of these times."

The Post Dispatch quote is a bit roundabout and hence forces us to go through a slightly tricky piece of logic (draw yourself a picture if needed). If the midpoint of the interval, which extends 5.7% in either direction from that midpoint, indeed varies no more than 5.7% in either direction from the true population percentage about 95% of the time, then about 95% of these intervals must contain the true population percentage, as desired. The Post Dispatch could have more simply and more directly told its readers that such an interval can be expected to contain the true population parameter about 95% of the time. In other words, just as explained above in the presidential popularity example, the Post Dispatch is pointing out that if you take 100 random samples from the same population and calculate the confidence interval for the population proportion for each sample, about 95% of the confidence intervals will include the true population proportion. That is, you will be correct in your claim that the unknown population proportion is in the interval computed using the sample for about 95 of the 100 samples.

Using our five-step method, we now simulate 100 confidence intervals and determine how many of them contain the true population parameter. Suppose that for the Key Problem the true proportion of people in the county who are opposed to the stadium is 50% (remember that in practice the parameter value is never known to the statistician). In that case the probability that a person in the sample will be opposed to the stadium is the same as the probability of heads in flipping a fair coin: 0.5. Our goal is to obtain 100 samples of 301 people and calculate the confidence interval for each of these 100 samples (we will learn to compute such confidence intervals in Section 11.6 below). Each sampling is the same as flipping a coin 301 times, recording the number of heads, and calculating the confidence interval for the proportion of heads in the sample. We then repeat the process 100 times. The 100 confidence intervals obtained from this process are represented in Figure 11.1.
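The coin-flipping procedure just described can be sketched in Python. This section does not give the interval formula (it is deferred to Section 11.6), so the sketch assumes the common large-sample interval, the sample proportion plus or minus 1.96 estimated standard errors. The function names, the seed, and that formula are assumptions for illustration, not the book's own method.

```python
import math
import random

Z_95 = 1.96  # approximate standard normal multiplier for 95% confidence


def wald_interval(successes, n, z=Z_95):
    """ASSUMED formula: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def count_covering(true_p=0.5, n=301, repetitions=100, seed=0):
    """Simulate `repetitions` samples of size n and count the intervals
    that contain the true proportion."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    for _ in range(repetitions):
        # One sample: n independent "coin flips" with success probability true_p.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = wald_interval(successes, n)
        if low <= true_p <= high:
            hits += 1
    return hits


hits = count_covering()
print(f"{hits} of 100 simulated intervals cover the true proportion 0.50")
```

The count changes with the seed, but it should usually land near 95, in line with the interpretation given above.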

Figure 11.1 One hundred simulated 95% confidence intervals for the Key Problem assuming a 50/50 population split.

The line across the middle of the graph represents the true population proportion, 0.50. The 100 confidence intervals are the vertical lines on the graph. If the confidence interval covers the 0.50 line, then we say that the true population proportion is contained in the interval. Likewise, if the confidence interval does not cover the 0.50 line, then we say that the true population proportion is not contained in the interval. For the graph of Figure 11.1, we see that 96 out of the 100 confidence intervals cover the 0.50 line. So for the 100 confidence intervals, 96% of them (about 95%, as expected) contain the true population parameter, 0.50.

Thus, this example of the five-step method clearly illustrates how we are to correctly interpret a 95% confidence interval. As already discussed above, it is a sort of statistician's success rate or batting average. If a statistician constructs 95% confidence intervals during a year's work, then, as our five-step simulation confirms, we can expect about 95% of them to be hits: cases in which the population parameter is contained in the interval. Similarly, we would expect about 5% of them to be outs, or misses: cases in which the population parameter is not contained in the interval. Compared with baseball, in which even a great batting average is far lower, a coverage rate of about 95% is what is usually required of statisticians. A hit is never guaranteed in baseball, but a statistician can in fact guarantee the 95% coverage rate of a confidence interval procedure, as we shall see.

Confidence intervals have two basic characteristics that we need to understand. First, given the same set of data, a 95% confidence interval is wider than a 90% confidence interval, and a 99% confidence interval is wider than a 95% confidence interval. Thus, the higher the confidence level we require, the wider the interval we are forced to accept!
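The first characteristic can be checked numerically on the data of Example 11.3 (615 approvals out of 1200). As before, the interval formula is an assumption (the large-sample normal approximation, which this section does not state), and `proportion_interval` is an illustrative name.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, level):
    """ASSUMED large-sample interval for a proportion at the given confidence level."""
    # Two-sided multiplier: e.g. level 0.95 uses the 0.975 quantile (about 1.96).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


widths = {}
for level in (0.90, 0.95, 0.99):
    low, high = proportion_interval(615, 1200, level)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.4f}, {high:.4f}), width {high - low:.4f}")
```

Under this assumed formula the width grows with the confidence level because only the multiplier z changes; the data, and hence the estimated standard error, stay fixed. Intervals quoted to two decimals elsewhere in the text may differ from these values in the last digit because of rounding.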
Of course, a very wide interval is of little use to the scientist who has sought statistical advice. Thus there is no free lunch in specifying 99% confidence instead of 95% confidence, because the price paid is a wider interval. Here are the 90%, 95%, and 99% confidence intervals for the proportion of people in the country who approve of the president in Example 11.1:

90%: (0.49, 0.53)
95%: (0.48, 0.54)
99%: (0.47, 0.55)

As you can see from this example, the higher the level of confidence, the wider the confidence interval needs to be in order to contain the population proportion with the specified confidence.

The second characteristic of confidence intervals is that, given the same confidence level, a shorter and hence more informative confidence interval is associated with more data points. Suppose in the situation of Example 11.1 a sample of 2400 people was taken from the population and the number of people in the sample that approved of the president was 1230. The point estimate of the population proportion would be the same: 1230/2400 = 0.5125. However, a 95% confidence interval for the population proportion based on this sample of 2400 can be shown to be (0.4925, 0.5325). This is shorter than the 95% confidence interval for the original sample of 1200, which can be shown to be (0.4842, 0.5408). Indeed, the result of increasing the sample size is a shorter interval (of length 0.04 compared with 0.057) for the same confidence level of 95%.

In summary, point estimates provide only a single number to estimate the value of a population parameter. By contrast, confidence intervals give a range of values that we reasonably expect will contain the population parameter. Again, a 95% confidence interval means that if we were to take a large number of samples (like 100 or 1000) of equal size from the same population and calculate a confidence interval for the population parameter from each, about 95% of the confidence intervals would contain the true value of the population parameter.

SECTION 11.2 EXERCISES

1. Suppose we want to estimate the proportion of a city's residents who drive to work. What is a good choice for the point estimate of this proportion?
2. Suppose, instead, we want to estimate the average number of miles people living in a city drive to work. What are two possible choices for the point estimate of this average?
3. Explain the meaning of a 99% confidence level.
4. Suppose you want a confidence interval for a population proportion. You want to be as accurate as possible, so you select a 100% confidence level. What would your confidence interval have to be?
5. Which confidence interval, when based on the same data, is wider: an 80% or an 85% confidence interval?
6. True or false: If you flip a fair coin 100 times, then calculate a 95% confidence interval, there is an approximate 95% chance 1/2 will be in the interval.
7. A Gallup poll of 1013 adults found 61% of the people in the sample drink alcoholic beverages, which yields a confidence interval of (58%, 64%). True or false: There is an approximate 95% chance that the percentage of adults in the population who drink alcoholic beverages is between 58% and 64%.
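The sample-size comparison made earlier in this section (1200 versus 2400 respondents, both with sample proportion 0.5125) can also be checked with a few lines of Python. The formula is again the assumed large-sample interval with multiplier 1.96, and `interval_length` is an illustrative helper rather than anything defined in the text.

```python
import math

Z_95 = 1.96  # approximate multiplier for 95% confidence


def interval_length(successes, n, z=Z_95):
    """Length of the ASSUMED interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    return 2 * z * math.sqrt(p_hat * (1 - p_hat) / n)


length_1200 = interval_length(615, 1200)
length_2400 = interval_length(1230, 2400)

print(f"n = 1200: length {length_1200:.4f}")
print(f"n = 2400: length {length_2400:.4f}")
print(f"ratio: {length_2400 / length_1200:.4f}")
```

With the sample proportion held fixed, doubling the sample size multiplies the length by 1/sqrt(2), about 0.71, so halving the length would take roughly four times as many observations.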

Statistical Inference

Statistical Inference Statistical Inference Idea: Estimate parameters of the population distribution using data. How: Use the sampling distribution of sample statistics and methods based on what would happen if we used this

More information

Chapter 7 Part 2. Hypothesis testing Power

Chapter 7 Part 2. Hypothesis testing Power Chapter 7 Part 2 Hypothesis testing Power November 6, 2008 All of the normal curves in this handout are sampling distributions Goal: To understand the process of hypothesis testing and the relationship

More information

Point and Interval Estimates

Point and Interval Estimates Point and Interval Estimates Suppose we want to estimate a parameter, such as p or µ, based on a finite sample of data. There are two main methods: 1. Point estimate: Summarize the sample by a single number

More information

Margin of Error When Estimating a Population Proportion

Margin of Error When Estimating a Population Proportion Margin of Error When Estimating a Population Proportion Student Outcomes Students use data from a random sample to estimate a population proportion. Students calculate and interpret margin of error in

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

5.1 Identifying the Target Parameter

5.1 Identifying the Target Parameter University of California, Davis Department of Statistics Summer Session II Statistics 13 August 20, 2012 Date of latest update: August 20 Lecture 5: Estimation with Confidence intervals 5.1 Identifying

More information

SAMPLING DISTRIBUTIONS

SAMPLING DISTRIBUTIONS 0009T_c07_308-352.qd 06/03/03 20:44 Page 308 7Chapter SAMPLING DISTRIBUTIONS 7.1 Population and Sampling Distributions 7.2 Sampling and Nonsampling Errors 7.3 Mean and Standard Deviation of 7.4 Shape of

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Chapter 4. Probability and Probability Distributions

Chapter 4. Probability and Probability Distributions Chapter 4. robability and robability Distributions Importance of Knowing robability To know whether a sample is not identical to the population from which it was selected, it is necessary to assess the

More information

Models for Discrete Variables

Models for Discrete Variables Probability Models for Discrete Variables Our study of probability begins much as any data analysis does: What is the distribution of the data? Histograms, boxplots, percentiles, means, standard deviations

More information

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe

Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Introduction to the Practice of Statistics Fifth Edition Moore, McCabe Section 5.1 Homework Answers 5.7 In the proofreading setting if Exercise 5.3, what is the smallest number of misses m with P(X m)

More information

Tossing a Biased Coin

Tossing a Biased Coin Tossing a Biased Coin Michael Mitzenmacher When we talk about a coin toss, we think of it as unbiased: with probability one-half it comes up heads, and with probability one-half it comes up tails. An ideal

More information

How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

More information

Hypothesis Testing Summary

Hypothesis Testing Summary Hypothesis Testing Summary Hypothesis testing begins with the drawing of a sample and calculating its characteristics (aka, statistics ). A statistical test (a specific form of a hypothesis test) is an

More information

Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

More information

Carolyn Anderson & Youngshil Paek (Slides created by Shuai Sam Wang) Department of Educational Psychology University of Illinois at Urbana-Champaign

Carolyn Anderson & Youngshil Paek (Slides created by Shuai Sam Wang) Department of Educational Psychology University of Illinois at Urbana-Champaign Carolyn Anderson & Youngshil Paek (Slides created by Shuai Sam Wang) Department of Educational Psychology University of Illinois at Urbana-Champaign Key Points 1. Data 2. Variable 3. Types of data 4. Define

More information

Social Studies 201 Notes for November 19, 2003

Social Studies 201 Notes for November 19, 2003 1 Social Studies 201 Notes for November 19, 2003 Determining sample size for estimation of a population proportion Section 8.6.2, p. 541. As indicated in the notes for November 17, when sample size is

More information

4. Introduction to Statistics

4. Introduction to Statistics Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

More information

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11 CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

More information

Math 251, Review Questions for Test 3 Rough Answers

Math 251, Review Questions for Test 3 Rough Answers Math 251, Review Questions for Test 3 Rough Answers 1. (Review of some terminology from Section 7.1) In a state with 459,341 voters, a poll of 2300 voters finds that 45 percent support the Republican candidate,

More information

Lesson 17: Margin of Error When Estimating a Population Proportion

Lesson 17: Margin of Error When Estimating a Population Proportion Margin of Error When Estimating a Population Proportion Classwork In this lesson, you will find and interpret the standard deviation of a simulated distribution for a sample proportion and use this information

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1.

MATH10212 Linear Algebra. Systems of Linear Equations. Definition. An n-dimensional vector is a row or a column of n numbers (or letters): a 1. MATH10212 Linear Algebra Textbook: D. Poole, Linear Algebra: A Modern Introduction. Thompson, 2006. ISBN 0-534-40596-7. Systems of Linear Equations Definition. An n-dimensional vector is a row or a column

More information

What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago

What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago What is a P-value? Ronald A. Thisted, PhD Departments of Statistics and Health Studies The University of Chicago 8 June 1998, Corrections 14 February 2010 Abstract Results favoring one treatment over another

More information

Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

More information

Introduction to Hypothesis Testing

Introduction to Hypothesis Testing I. Terms, Concepts. Introduction to Hypothesis Testing A. In general, we do not know the true value of population parameters - they must be estimated. However, we do have hypotheses about what the true

More information

Testing Scientific Explanations (In words slides page 7)

Testing Scientific Explanations (In words slides page 7) Testing Scientific Explanations (In words slides page 7) Most people are curious about the causes for certain things. For example, people wonder whether exercise improves memory and, if so, why? Or, we

More information

MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample

MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample MATH 10: Elementary Statistics and Probability Chapter 9: Hypothesis Testing with One Sample Tony Pourmohamad Department of Mathematics De Anza College Spring 2015 Objectives By the end of this set of

More information

Statistical Foundations:

Statistical Foundations: Statistical Foundations: Hypothesis Testing Psychology 790 Lecture #9 9/19/2006 Today sclass Hypothesis Testing. General terms and philosophy. Specific Examples Hypothesis Testing Rules of the NHST Game

More information

Probability distributions

Probability distributions Probability distributions (Notes are heavily adapted from Harnett, Ch. 3; Hayes, sections 2.14-2.19; see also Hayes, Appendix B.) I. Random variables (in general) A. So far we have focused on single events,

More information

Ungrouped data. A list of all the values of a variable in a data set is referred to as ungrouped data.

Ungrouped data. A list of all the values of a variable in a data set is referred to as ungrouped data. 1 Social Studies 201 September 21, 2006 Presenting data See text, chapter 4, pp. 87-160. Data sets When data are initially obtained from questionnaires, interviews, experiments, administrative sources,

More information

Basic Probability Theory I

Basic Probability Theory I A Probability puzzler!! Basic Probability Theory I Dr. Tom Ilvento FREC 408 Our Strategy with Probability Generally, we want to get to an inference from a sample to a population. In this case the population

More information

Chapter 7 Review. Confidence Intervals. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Chapter 7 Review. Confidence Intervals. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Chapter 7 Review Confidence Intervals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) Suppose that you wish to obtain a confidence interval for

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora

princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora princeton univ. F 13 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Sanjeev Arora Scribe: One of the running themes in this course is the notion of

More information

3 Some Integer Functions

3 Some Integer Functions 3 Some Integer Functions A Pair of Fundamental Integer Functions The integer function that is the heart of this section is the modulo function. However, before getting to it, let us look at some very simple

More information

8. You have three six-sided dice (one red, one green, and one blue.) The numbers of the dice are

8. You have three six-sided dice (one red, one green, and one blue.) The numbers of the dice are Probability Warm up problem. Alice and Bob each pick, at random, a real number between 0 and 10. Call Alice s number A and Bob s number B. The is the probability that A B? What is the probability that

More information

Reasoning with Uncertainty More about Hypothesis Testing. P-values, types of errors, power of a test

Reasoning with Uncertainty More about Hypothesis Testing. P-values, types of errors, power of a test Reasoning with Uncertainty More about Hypothesis Testing P-values, types of errors, power of a test P-Values and Decisions Your conclusion about any null hypothesis should be accompanied by the P-value

More information

Lab 11. Simulations. The Concept

Lab 11. Simulations. The Concept Lab 11 Simulations In this lab you ll learn how to create simulations to provide approximate answers to probability questions. We ll make use of a particular kind of structure, called a box model, that

More information

Hypothesis Testing for Beginners

Hypothesis Testing for Beginners Hypothesis Testing for Beginners Michele Piffer LSE August, 2011 Michele Piffer (LSE) Hypothesis Testing for Beginners August, 2011 1 / 53 One year ago a friend asked me to put down some easy-to-read notes

More information
