Chapter 20: chance error in sampling



Similar documents
Stat 20: Intro to Probability and Statistics

AMS 5 CHANCE VARIABILITY

The Normal Approximation to Probability Histograms. Dice: Throw a single die twice. The Probability Histogram: Area = Probability. Where are we going?

Elementary Statistics and Inference. Elementary Statistics and Inference. 17 Expected Value and Standard Error. 22S:025 or 7P:025.

Stat 20: Intro to Probability and Statistics

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Chapter 16: law of averages

The overall size of these chance errors is measured by their RMS HALF THE NUMBER OF TOSSES NUMBER OF HEADS MINUS NUMBER OF TOSSES

Chapter 16. Law of averages. Chance. Example 1: rolling two dice Sum of draws. Setting up a. Example 2: American roulette. Summary.

$ ( $1) = 40

Elementary Statistics and Inference. Elementary Statistics and Inference. 16 The Law of Averages (cont.) 22S:025 or 7P:025.

MA 1125 Lecture 14 - Expected Values. Friday, February 28, Objectives: Introduce expected values.

Lecture 13. Understanding Probability and Long-Term Expectations

The Math. P (x) = 5! = = 120.

Chapter 4 - Practice Problems 1

John Kerrich s coin-tossing Experiment. Law of Averages - pg. 294 Moore s Text

STA 130 (Winter 2016): An Introduction to Statistical Reasoning and Data Science

Probability Distribution for Discrete Random Variables

Northumberland Knowledge

Ch. 13.3: More about Probability

Experimental Design. Power and Sample Size Determination. Proportions. Proportions. Confidence Interval for p. The Binomial Test

Introductory Probability. MATH 107: Finite Mathematics University of Louisville. March 5, 2014

Problem Solving and Data Analysis

Chapter 4 & 5 practice set. The actual exam is not multiple choice nor does it contain like questions.

2. How many ways can the letters in PHOENIX be rearranged? 7! = 5,040 ways.

How are polls conducted? by Frank Newport, Lydia Saad, David Moore from Where America Stands, 1997 John Wiley & Sons, Inc.

Problem sets for BUEC 333 Part 1: Probability and Statistics

Week 5: Expected value and Betting systems

Question: What is the probability that a five-card poker hand contains a flush, that is, five cards of the same suit?

Math 58. Rumbos Fall Solutions to Review Problems for Exam 2

V. RANDOM VARIABLES, PROBABILITY DISTRIBUTIONS, EXPECTED VALUE

Sampling Procedures Y520. Strategies for Educational Inquiry. Robert S Michael

Lotto Master Formula (v1.3) The Formula Used By Lottery Winners

9. Sampling Distributions

The Margin of Error for Differences in Polls

WISE Sampling Distribution of the Mean Tutorial

Basic Probability Theory II

Elementary Statistics

Am I Decisive? Handout for Government 317, Cornell University, Fall Walter Mebane

13.0 Central Limit Theorem

A probability experiment is a chance process that leads to well-defined outcomes. 3) What is the difference between an outcome and an event?

In all 1336 (35.9%) said yes, 2051 (55.1%) said no and 335 (9.0%) said not sure

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 13. Random Variables: Distribution and Expectation

Building Java Programs

Probabilities. Probability of a event. From Random Variables to Events. From Random Variables to Events. Probability Theory I

Example. A casino offers the following bets (the fairest bets in the casino!) 1 You get $0 (i.e., you can walk away)

6.3 Conditional Probability and Independence

Probability. Sample space: all the possible outcomes of a probability experiment, i.e., the population of outcomes

THE STATISTICAL TREATMENT OF EXPERIMENTAL DATA 1

Need for Sampling. Very large populations Destructive testing Continuous production process

Windows-Based Meta-Analysis Software. Package. Version 2.0

Mathematical Expectation

STT 200 LECTURE 1, SECTION 2,4 RECITATION 7 (10/16/2012)

In the situations that we will encounter, we may generally calculate the probability of an event

SAMPLING DISTRIBUTIONS

Chapter 4 Lecture Notes

Discrete Mathematics and Probability Theory Fall 2009 Satish Rao, David Tse Note 10

Lesson 1. Basics of Probability. Principles of Mathematics 12: Explained! 314

CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

Programming Your Calculator Casio fx-7400g PLUS

6.042/18.062J Mathematics for Computer Science. Expected Value I

36 Odds, Expected Value, and Conditional Probability

Probability and Expected Value

6.4 Special Factoring Rules

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

Random variables, probability distributions, binomial random variable

Measurement in ediscovery

If, under a given assumption, the of a particular observed is extremely. , we conclude that the is probably not

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Regular smoker

Unit 19: Probability Models

Find the indicated probability. 1) If a single fair die is rolled, find the probability of a 4 given that the number rolled is odd.

Probability Using Dice

How to Verify Performance Specifications

CALCULATIONS & STATISTICS

Basic Probability. Probability: The part of Mathematics devoted to quantify uncertainty

Math 55: Discrete Mathematics

Simple Random Sampling

Ch. 13.2: Mathematical Expectation

7.S.8 Interpret data to provide the basis for predictions and to establish

Statistical & Technical Team

MAT 155. Chapter 1 Introduction to Statistics. Key Concept. Basics of Collecting Data. 155S1.5_3 Collecting Sample Data.

Contemporary Mathematics- MAT 130. Probability. a) What is the probability of obtaining a number less than 4?

Section 7C: The Law of Large Numbers

5.1 Identifying the Target Parameter

Chapter 4 - Practice Problems 2

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Inaugural Lecture. Jan Vecer

MrMajik s Money Management Strategy Copyright MrMajik.com 2003 All rights reserved.

Queuing Theory. Long Term Averages. Assumptions. Interesting Values. Queuing Model

Monte Carlo simulations and option pricing

Point and Interval Estimates

Statistics and Random Variables. Math 425 Introduction to Probability Lecture 14. Finite valued Random Variables. Expectation defined

Free Pre-Algebra Lesson 8 page 1

American Association for Laboratory Accreditation

Statistics 100A Homework 2 Solutions

Probability, Statistics, & Data Analysis (PSD) Numbers: Concepts & Properties (NCP)

Transcription:

Chapter 20: chance error in sampling Context 2 Overview................................................................ 3 Population and parameter..................................................... 4 Sample and statistic........................................................ 5 Expected value and SE for percentage 6 Example................................................................ 7 When we do this 10,000 times................................................... 8 Example................................................................ 9 Strategy................................................................. 10 Example................................................................ 11 Formulas for percentage...................................................... 12 Exercise................................................................. 13 Accuracy depends on sample size................................................ 14 How does accuracy depend on sample size?......................................... 15 Compare sample size 100 and 400, for 10,000 repetitions................................. 16 Normal approximation....................................................... 17 Other sampling methods..................................................... 18 1

Context 2 / 18 Overview So far we have looked at simple chance processes: flipping coins, rolling dice, playing roulette We will now look at a new chance process: sampling Outlook for the coming lectures: Today: Assume we know the box/population, and study counts and percentages in samples (Ch 20) Next: We do not assume that we know the box/population. From the sample, we want to make inference about the population, like in election polls (Ch 21, 23). 3 / 18 Population and parameter Recall: Usually we want to know a parameter (=numerical fact) about a population. Example: What is the percentage of Democratic voters in the next Presidential election? Population: voters in next Presidential election Parameter: percentage of Democratic voters The population parameter is a fixed number that we will never know exactly (because it is infeasible to look at the entire population) 4 / 18 Sample and statistic We only examine part of the population, a sample We compute a statistic from the sample Percentage of Democratic voters in our sample The statistic is a random variable. It changes if we take a different sample, due to chance error. can be computed exactly for each sample. The parameter is what we want to know. The statistic is what we know. Statistic = parameter ( + bias) + chance error How big is the chance error? 5 / 18 2

Expected value and SE for percentage 6 / 18 Example Village: 5,000 voters 3,000 Democrats (60%) 2,000 Republicans (40%) Take a simple random sample of size 400, that is, drawing at random without replacement and compute number and percentage of Democrats. Do this 10,000 times... 7 / 18 When we do this 10,000 times... nr of democrats, sample size = 400 percentage of democrats, sample size = 400 210 220 230 240 250 260 270 nr of democrats percentage of democrats 8 / 18 Example The number of democrats in a sample of size 400 is around... (expected value for count), give or take...(se for count) or so. Fill in the blank. The percentage of democrats in a sample of size 400 is around... (expected value for percentage), give or take... (SE for percentage) or so. Fill in the blank. 9 / 18 3

Strategy Make a box model: 3000 tickets 1, and 2000 tickets 0 We first look at the count of democrats in our sample. The number of democrats in a sample of size 400 is like the sum of 400 draws from this box without replacement We first consider the sum of the draws with replacement. Why? Because this is easier and we know how to do it. Average of the box: 3000 1+2000 0 5000 = 0.6 SD of the box: shortcut formula (large nr - small nr) x fraction largenr fraction smallnr = (1 0) 0.6 0.4 0.5 Expected value for sum of the draws: (nr of draws) (average of the box) = 400 0.6 = 240 SE for sum of the draws: square root formula: nr of draws (SDof box) = 400 0.5 = 10 10 / 18 Example The number of democrats in a sample of size 400 is around 240, give or take 10 or so. Convert to percentages: 240 out of 400 = 240 400 100% = 60% 10 out of 400 = 10 400 100% = 2.5% So the percentage of democrats in a sample of size 400 is around 60%, give or take 2.5% or so 11 / 18 Formulas for percentage Expected value for percentage = percentage of the box SEfor sum SEfor percentage = nr of draws 100% These formulas are exact when we draw with replacement If the sample is small compared to the population (say, less then 1/10th), then the formulas are good approximations for drawing without replacement. Why? The book gives a correction factor for switching between drawing with and without replacement. You should know that a correction factor exists, but you don t need to use it. 12 / 18 4

Exercise Ch 20, Exercise set E, Problem 1 A public opinion poll uses a simple random sample of size 1500, drawn from a population of 25,000 Another poll uses a simple random sample of size 1500 from a town with a population of 250,000 The polls are trying to estimate the percentage of voters who favor single-payer health insurance Choose one: The 1st poll is quite a bit more accurate than the 2nd The 2nd poll is quite a bit more accurate than the 1st There is not much difference in the accuracy of the two polls 13 / 18 Accuracy depends on sample size SEfor percentage = = = SEfor sum nr of draws 100% nr of draws (SDof box) nr of draws SDof box 100% nr of draws 100% The SE for a percentage depends on the number of draws = sample size, and not on the population size. (This is true if the sample is small relative to the population, so that we don t have to worry about the correction factor) 14 / 18 How does accuracy depend on sample size? What happens to the SE for a percentage if we multiply the sample size by a factor of 4? Then the SE is divided by a factor 4 = 2. So if the sample size is 4 times as large, our estimate becomes twice as precise. With a larger sample size we can estimate percentages more accurately. 15 / 18 5

Compare sample size 100 and 400, for 10,000 repetitions... nr of democrats, sample size = 100 percentage of democrats, sample size = 100 nr of democrats percentage of democrats nr of democrats, sample size = 400 percentage of democrats, sample size = 400 210 220 230 240 250 260 270 nr of democrats percentage of democrats 16 / 18 Normal approximation We can use the normal approximation as before: Determine whether the question is about counts or percentages Use new average = expected value for count/percentage Use new SD = SE for count/percentage Example: what is the chance to get more than 65% democrats in a sample of size 400 from the village we looked at before? This question is about percentages New average: expected value for percentage = 60% New SD: SE for percentage = 2.5% See overhead. Answer: the chance to get more than 65% democrats in a sample of size 400 is about 2.5%. 17 / 18 Other sampling methods Our formulas work for simple random samples The Gallup Poll and Iraq study were (multistage) cluster samples. For such samples we need other, more complicated formulas. But statisticians can still figure out the chance error. For non-probability sampling methods, it is very hard/impossible to determine the chance error. 18 / 18 6