3. Statistical Inference

3.1 Introduction

In order to say something about the distribution of a variable in a given population, it is usually impractical to observe all the values taken by that variable in the population. Hence, we observe a sample of n individuals. It is assumed that these n observations are independent and are taken from the distribution of the variable in the population as a whole. When the sample is chosen at random from the appropriate sampling frame (see Chapter 1), this is a reasonable assumption.

Parameters

We will refer to such a sample as a simple sample from the population. A parameter is a value describing the distribution of a variable (e.g. the population mean, $\mu$, or the population standard deviation, $\sigma$). In general, these parameters are unknown.

Statistics

From a sample we can calculate various statistics, e.g.

i) the sample mean, $\bar{X}$
ii) the sample variance, $s^2$
iii) the sample proportion, $\hat{p}$ (e.g. the proportion of people in a sample wishing to vote for the Labour party).

Using these statistics, we wish to say something about the corresponding parameters of the population (distribution):

i) the population (theoretical) mean, $\mu$
ii) the population variance, $\sigma^2$
iii) the population proportion, $p$.

Sampling errors

As the example illustrates, when we use the sample mean, $\bar{x}$, to estimate the population mean, $\mu$, there will be a random error which depends on the sample observed, e.g.

Sampling error for mean = $\bar{x} - \mu$

We do not know what value the sampling error takes, but we can say something about its distribution for samples of a fixed size. As the sample size increases, the distribution of the sample mean becomes more concentrated around the population mean. Hence, estimation becomes (on average) more accurate.

Confidence intervals

A statistic used to estimate the value of a parameter is called an estimator, e.g. the sample mean is used to estimate the population mean. We consider the following two problems.

1. Given an estimator of a parameter (e.g. the sample mean as an estimator of the population mean), can we define an interval such that the parameter (the population mean) is very likely to belong to that interval? This is done by interval estimation (using confidence intervals).

Statistical testing

2. Suppose we have a hypothesis regarding a parameter of the population (e.g. 12% of the population want to vote for the Green Party). How do we decide whether this is a realistic hypothesis or not? This is the goal of hypothesis testing.

3.1.1 The distribution of the sample mean and sample proportion

Before addressing these goals we consider a few results concerning two important statistics:

1. the sample mean, $\bar{X}$
2. the sample proportion, $\hat{p}$.

The distribution of the sample mean

Let $X_i$ denote the value of the i-th observation from the sample. Since the $X_i$ all come from the distribution of the variable in the population, we have

$E(X_i) = \mu; \quad \mathrm{Var}(X_i) = \sigma^2$

It follows that

$E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = n\mu$

Since the $X_i$ are independent,

$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i) = n\sigma^2$

It follows from the central limit theorem that

$\sum_{i=1}^{n} X_i \sim_{\text{approx}} N(n\mu, n\sigma^2)$.

Dividing by n (the expected value is divided by n and the variance by $n^2$),

$\bar{X} \sim_{\text{approx}} N\left(\mu, \frac{\sigma^2}{n}\right)$
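The result above can be checked numerically. The following is a minimal sketch, assuming a uniform population on [0, 1] (so $\mu = 0.5$ and $\sigma^2 = 1/12$); the function name, sample size, number of trials and seed are illustrative choices, not part of the original notes.

```python
import random
import statistics

def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from Uniform(0, 1) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(trials)
    ]

# Hypothetical settings: samples of n = 50 observations, 5000 repetitions.
means = simulate_sample_means(n=50, trials=5000)

# The average of the sample means should be close to mu = 0.5,
# and their variance close to sigma^2 / n = (1/12) / 50.
mean_of_means = statistics.fmean(means)
variance_of_means = statistics.pvariance(means)
print(mean_of_means, variance_of_means, (1 / 12) / 50)
```

A histogram of `means` would also look roughly bell-shaped, even though the population itself is flat.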

The standard error of the sample mean

The standard deviation of the sample mean is called the standard error of the sample mean and is denoted $\mathrm{S.E.}(\bar{X})$, where

$\mathrm{S.E.}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$.

It can be seen from this distribution that, given that a random sample is taken from the population in question, the expected value of the sample mean is the population mean (there is no systematic error in estimation). Also, the dispersion of the sample mean decreases as the sample size increases. Since we do not know $\sigma$, we estimate the standard error of the sample mean using

$\mathrm{S.E.}(\bar{X}) \approx \frac{s}{\sqrt{n}}$.
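As a small illustration of the estimate $s/\sqrt{n}$, the sketch below computes it from raw observations. The data values and the function name are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math
import statistics

def estimated_standard_error(sample):
    """Estimate S.E. of the sample mean as s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    return s / math.sqrt(len(sample))

# Hypothetical observations: mean 5, sum of squared deviations 32.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
se = estimated_standard_error(observations)
print(round(se, 4))  # sqrt((32 / 7) / 8) = sqrt(4 / 7)
```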

Systematic errors in estimation

It should be noted that systematic errors may occur, e.g. if the sampling frame is inappropriate. For example, if we use the average height of a sample of Irish students to estimate the average height of all Irish adults, we will tend to overestimate the mean height of Irish adults.

The distribution of the sample proportion

Let p be the proportion of a population showing a given trait and Y be the number of individuals in a sample exhibiting that trait. Then $Y \sim \mathrm{Bin}(n, p)$. For large n,

$Y \sim_{\text{approx}} N(np, np[1 - p])$.

It follows that the distribution of the sample proportion $\hat{p} = Y/n$ is

$\hat{p} \sim_{\text{approx}} N\left(p, \frac{p[1 - p]}{n}\right)$

Hence, for large samples the distributions of the sample mean and the sample proportion are both approximately normal.
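To see how close the normal approximation is for a large sample, the sketch below compares an exact binomial probability with its normal approximation. The values n = 1000, p = 0.15 and the cut-off 160 are illustrative assumptions, and the continuity correction (adding 0.5) is a standard refinement that the notes themselves do not mention.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(Y <= k) for Y ~ Bin(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Approximate P(Y <= k) using N(np, np(1 - p))."""
    approx = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
    return approx.cdf(k + 0.5)  # continuity correction (not in the original notes)

n, p = 1000, 0.15  # hypothetical sample size and population proportion
exact = binomial_cdf(160, n, p)
approx = normal_approx_cdf(160, n, p)
print(exact, approx)
```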

Standard error of the sample proportion

The standard error of the sample proportion is its standard deviation, i.e.

$\mathrm{S.E.}(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}}$.

Since we do not know the population proportion p, we may estimate the standard error of the sample proportion using

$\mathrm{S.E.}(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$.

The standard error of the sample proportion is approximately the average error made when using the sample proportion to estimate the population proportion. Similarly, the standard error of the sample mean is approximately the average error made when using the sample mean to estimate the population mean.

3.2 Confidence Intervals

For the population mean $\mu$, large sample size ($n > 30$)

Suppose we take a large number of samples of a variable, each with n observations. Since n is large, the sample means will have a normal (bell-shaped) distribution. From the tables for the normal distribution, $P(Z > 1.96) = 0.025$. It follows that 95% of such samples will have a sample mean less than 1.96 standard errors from the population mean (see next slide).

[Figure: Confidence intervals]

Confidence intervals for the population mean with a large sample

It follows that if the population variance is known, then

$\bar{X} \pm 1.96\,\mathrm{S.E.}(\bar{X})$

is a 95% confidence interval for the population mean. The term $1.96\,\mathrm{S.E.}(\bar{X})$ is called the radius of the confidence interval. This means that if we took a large number of such samples, approximately 95% of the confidence intervals calculated from these samples would contain the population mean.
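The repeated-sampling interpretation can be illustrated by simulation. This is a minimal sketch, assuming a normal population with known $\sigma$; the parameter values, the number of trials and the seed are hypothetical.

```python
import math
import random
import statistics

def coverage(mu, sigma, n, trials, z=1.96, seed=1):
    """Fraction of intervals xbar +/- z * sigma / sqrt(n) that contain mu."""
    rng = random.Random(seed)
    radius = z * sigma / math.sqrt(n)  # sigma is treated as known here
    hits = 0
    for _ in range(trials):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        if abs(xbar - mu) <= radius:
            hits += 1
    return hits / trials

# Hypothetical population: mean 174, standard deviation 12, samples of size 100.
rate = coverage(mu=174.0, sigma=12.0, n=100, trials=2000)
print(rate)  # expected to be near 0.95
```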

Similarly, 99% of the sample means lie less than 2.576 standard errors from the population mean. Hence, a 99% confidence interval for the population mean is given by

$\bar{X} \pm 2.576\,\mathrm{S.E.}(\bar{X})$.

This means that if we took a large number of such samples, approximately 99% of the confidence intervals calculated from these samples would contain the population mean. The confidence levels used here, 95% and 99%, are the most commonly used confidence levels.

The confidence level will be denoted as $100(1 - \alpha)\%$. For large sample sizes, the approximation $\mathrm{S.E.}(\bar{X}) \approx s/\sqrt{n}$ will be reasonably good. Hence we can use the following as an approximate $100(1 - \alpha)\%$ confidence interval for the population mean:

$\bar{X} \pm \frac{s Z_{\alpha/2}}{\sqrt{n}} \approx \bar{X} \pm Z_{\alpha/2}\,\mathrm{S.E.}(\bar{X})$,

where $Z_q$ satisfies $P(Z > Z_q) = q$ and $Z \sim N(0, 1)$. $Z_{\alpha/2}$ is called the critical value.
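Critical values can be read from tables, or computed. As a sketch, Python's standard library `statistics.NormalDist` provides the inverse cumulative distribution function of the normal distribution; the helper name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Return Z_{alpha/2}, the value with P(Z > Z_{alpha/2}) = alpha / 2."""
    return NormalDist().inv_cdf(1 - alpha / 2)

z95 = z_critical(0.05)  # critical value for a 95% confidence level
z99 = z_critical(0.01)  # critical value for a 99% confidence level
print(round(z95, 3), round(z99, 3))
```

These agree with the tabulated values 1.96 and 2.576 used above.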

[Figure: Confidence intervals for the population mean with a large sample]

Confidence intervals for the population mean with a small sample

In this case s is no longer a good approximation of the population standard deviation. In order to reflect the increased uncertainty, we take our critical values from the Student t distribution with $n - 1$ degrees of freedom (see Table 7 in the script). Assume that we have n observations from a normal distribution. Then

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1); \quad T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$,

where $t_{n-1}$ denotes the Student distribution with $n - 1$ degrees of freedom.

The Student distribution

The Student distribution is very similar to the standard normal distribution, but has a greater variance. It is symmetric about 0. As the number of degrees of freedom tends to infinity, the Student distribution tends to the standard normal distribution. This follows from the definition of the Student distribution (see previous slide).

Assuming that the observations come from a distribution which is similar to the normal distribution, the following is a $100(1 - \alpha)\%$ confidence interval for the population mean:

$\bar{X} \pm \frac{s\, t_{n-1,\alpha/2}}{\sqrt{n}}$,

where $t_{n-1,q}$ satisfies $P(T > t_{n-1,q}) = q$ when T has a Student distribution with $n - 1$ degrees of freedom. This can be written in the form

$\bar{X} \pm t_{n-1,\alpha/2}\,\mathrm{S.E.}(\bar{X})$,

where $\mathrm{S.E.}(\bar{X})$ is our approximation of the standard error of the sample mean, $s/\sqrt{n}$.

Table for the Student distribution

In Table 7 the number of degrees of freedom corresponds to the rows and $\alpha/2$ corresponds to the columns. Hence, $t_{6,0.005} = 3.707$. As $n \to \infty$, the $t_{n-1}$ distribution tends to the standard normal distribution, so

$Z_{\alpha/2} = t_{\infty,\alpha/2}$

Hence, for large samples ($n > 30$) we can read the appropriate critical values from the bottom row of Table 7.

Example

Suppose the mean and variance of the height of 100 Irish students are 174 cm and 144 cm², respectively. Calculate 95% and 99% confidence intervals for the average height of all Irish students.

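For a 95% confidence interval, $100(1 - \alpha) = 95$, so $\alpha = 0.05$. Since the sample size is large, the 95% confidence interval is given by

$\bar{X} \pm Z_{\alpha/2}\,\mathrm{S.E.}(\bar{X}) = \bar{X} \pm \frac{s\, t_{\infty,0.025}}{\sqrt{n}} = 174 \pm \frac{1.96 \times 12}{\sqrt{100}} = 174 \pm 2.35 = [171.65, 176.35]$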

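For a 99% confidence interval, $100(1 - \alpha) = 99$, so $\alpha = 0.01$. Since the sample size is large, the 99% confidence interval is given by

$\bar{X} \pm Z_{\alpha/2}\,\mathrm{S.E.}(\bar{X}) = \bar{X} \pm \frac{s\, t_{\infty,0.005}}{\sqrt{n}} = 174 \pm \frac{2.576 \times 12}{\sqrt{100}} \approx 174 \pm 3.09 \approx [170.9, 177.1]$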
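The two intervals in this example can be reproduced with a few lines of code. This is a sketch only: the function name is illustrative, and the critical value is computed from the standard normal distribution rather than read from Table 7.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, confidence):
    """Large-sample confidence interval xbar +/- Z_{alpha/2} * s / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    radius = z * s / math.sqrt(n)
    return xbar - radius, xbar + radius

# Data from the example: n = 100, mean 174 cm, variance 144 cm^2 (so s = 12 cm).
low95, high95 = mean_confidence_interval(174.0, math.sqrt(144.0), 100, 0.95)
low99, high99 = mean_confidence_interval(174.0, math.sqrt(144.0), 100, 0.99)
print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```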

Estimating the population mean to a given accuracy

It should be noted that the higher the confidence level, the wider the confidence interval (at a higher confidence level a confidence interval must be more likely to contain the population mean). We can also use these formulae to calculate the number of observations required to estimate the population mean to within $\delta$ with a given probability (normally 0.95 or 0.99). It is assumed that the required sample size is large and that we have a reasonable estimate of the population variance (i.e. we have an initial sample of reasonably large size). We require that the radius of the confidence interval is less than or equal to $\delta$, i.e.

$t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X}) \le \delta$

Example

Calculate the sample size needed to estimate the mean height of all Irish students to within 1 cm with a probability of a) 0.95, b) 0.99.

Here, we use the data from the height example above. The sample variance was 144 cm².

We require that the radius of the confidence interval is bounded above by 1 cm. In case a) the confidence level is 95% (i.e. $\alpha = 0.05$). The radius of the confidence interval is

$t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X}) = \frac{s\, t_{\infty,0.025}}{\sqrt{n}} = \frac{12 \times 1.96}{\sqrt{n}} \le 1$.

Hence,

$\sqrt{n} \ge 23.52 \;\Rightarrow\; n \ge 23.52^2 \;\Rightarrow\; n \ge 553.19$

Thus, at least 554 observations are required to estimate the population mean to within 1 cm with a probability of 95%.

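In case b) the confidence level is 99% (i.e. $\alpha = 0.01$). The radius of the confidence interval is

$t_{\infty,\alpha/2}\,\mathrm{S.E.}(\bar{X}) = \frac{s\, t_{\infty,0.005}}{\sqrt{n}} = \frac{12 \times 2.576}{\sqrt{n}} \le 1$.

Hence,

$\sqrt{n} \ge 30.912 \;\Rightarrow\; n \ge 30.912^2 \;\Rightarrow\; n \ge 955.55$

Thus, at least 956 observations are required to estimate the population mean to within 1 cm with a probability of 99%.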
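Rearranging the radius condition gives $n \ge (t_{\infty,\alpha/2}\, s/\delta)^2$. The sketch below applies this to both cases of the example; the function name is illustrative, and the critical value is computed rather than read from the table.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(s, delta, confidence):
    """Smallest n with Z_{alpha/2} * s / sqrt(n) <= delta."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * s / delta) ** 2)

# Data from the example: s = 12 cm, required accuracy delta = 1 cm.
n95 = sample_size_for_mean(s=12.0, delta=1.0, confidence=0.95)
n99 = sample_size_for_mean(s=12.0, delta=1.0, confidence=0.99)
print(n95, n99)
```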

Assumptions underlying the calculation of a confidence interval

It should be noted that these calculations assume that the sample mean has a normal distribution. If the sample is large ($n > 30$), then this is a reasonable assumption. However, if the sample size is small, this assumption is only reasonable when the observations come from a distribution which is similar to the normal distribution.

Example

The masses of a sample of 25 Irish students were taken. The sample mean was 74 kg and the sample variance 121 kg². Calculate an approximate 90% confidence interval for the mean mass of all Irish students.

In this case we have a small sample. The confidence level is 90%, thus $\alpha = 0.1$. The confidence interval is given by

$\bar{X} \pm \frac{t_{n-1,\alpha/2}\, s}{\sqrt{n}} = 74 \pm \frac{11\, t_{24,0.05}}{\sqrt{25}}$

From Table 7, $t_{24,0.05} = 1.711$. Hence, the confidence interval is given by

$74 \pm \frac{11 \times 1.711}{5} = 74 \pm 3.76 = [70.24, 77.76]$
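The same calculation as a short sketch. Python's standard library has no Student t quantile function, so the critical value is passed in as read from the table; the function and parameter names are illustrative.

```python
import math

def small_sample_interval(xbar, s, n, t_crit):
    """Confidence interval xbar +/- t_crit * s / sqrt(n), t_crit taken from a t table."""
    radius = t_crit * s / math.sqrt(n)
    return xbar - radius, xbar + radius

# Data from the example: n = 25, mean 74 kg, variance 121 kg^2, t_{24,0.05} = 1.711.
low, high = small_sample_interval(74.0, math.sqrt(121.0), 25, t_crit=1.711)
print(round(low, 2), round(high, 2))
```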

Assumptions used in this calculation

The distribution of mass is slightly right skewed (i.e. not normal). It follows that for small samples the sample mean will not have a normal distribution. However, the approximation here will be reasonable, since the sample size is not very small and the distribution of mass is not highly skewed. In this case the confidence level will be approximately 90%.

3.2.3 Confidence intervals for the population proportion

In this case we only consider large samples ($n > 30$). The standard error of the sample proportion is

$\mathrm{S.E.}(\hat{p}) = \sqrt{\frac{p(1 - p)}{n}}$

It can be seen that the standard error of the sample proportion depends on the population proportion, which is unknown.

Estimation of the standard error of the sample proportion

We may estimate the standard error in 2 ways.

1. Using the sample proportion:

$\mathrm{S.E.}(\hat{p}) \approx \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

2. Using a conservative estimate of the standard error (i.e. we use the maximum possible standard error for the given sample size). Since $0 \le p \le 1$, it follows that $p(1 - p) \le \frac{1}{4}$. Hence,

$\mathrm{S.E.}(\hat{p}) \le \sqrt{\frac{1}{4n}} = \frac{1}{2\sqrt{n}}$.
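The bound on $p(1 - p)$ can be verified by completing the square; the short derivation below is added for completeness and uses only the symbols defined above.

```latex
p(1 - p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4},
\qquad \text{with equality if and only if } p = \frac{1}{2}.
```

So the conservative standard error is exact when the population is split evenly and overstates the true standard error otherwise.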

Confidence interval for a population proportion

A $100(1 - \alpha)\%$ confidence interval for the population proportion p is given by

$\hat{p} \pm Z_{\alpha/2}\,\mathrm{S.E.}(\hat{p}) = \hat{p} \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p})$.

Note that this formula is analogous to the formula for a confidence interval for the population mean for large samples.

When we simply wish to calculate a confidence interval, we use the first approximation for the standard error of the sample proportion. Hence,

$\hat{p} \pm t_{\infty,\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

is an approximate $100(1 - \alpha)\%$ confidence interval for the population proportion.

The conservative approximation of the standard error of the sample proportion is used when we wish to determine the sample size required to estimate a population proportion to a required accuracy $\delta$. In this case the maximum radius of the confidence interval for the population proportion is given by

$R_{\max} = \frac{t_{\infty,\alpha/2}}{2\sqrt{n}}$.

The advantage of this approach is that such a sample size is sufficient regardless of what the population proportion is.

It should be noted that when p is close to 0 or 1, the confidence interval may not be accurate (in this case the normal approximation to the binomial tends to be inaccurate).

Example

In a survey, 150 out of 1000 voters questioned said that they would vote Labour.

i) Calculate a 95% confidence interval for the proportion of the population that would vote Labour.
ii) What sample size is required to measure any population proportion to an accuracy of ±3% with a probability of 95%?

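The confidence level is 95%, hence $\alpha = 0.05$. The appropriate confidence interval is

$\hat{p} \pm t_{\infty,\alpha/2}\,\mathrm{S.E.}(\hat{p}) \approx \hat{p} \pm t_{\infty,\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

$\hat{p}$ is the sample proportion, $\hat{p} = 150/1000 = 0.15$, and $t_{\infty,0.025} = 1.96$.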

Hence, an approximate 95% confidence interval for the proportion of the population that would vote Labour is

$0.15 \pm 1.96 \sqrt{\frac{0.15(1 - 0.15)}{1000}} = 0.15 \pm 0.022 = [0.128, 0.172]$
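Part i) of the example can be reproduced as follows. This is a sketch with an illustrative function name; it uses the sample proportion to estimate the standard error, as in the first approximation described earlier.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Large-sample interval p_hat +/- Z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    radius = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - radius, p_hat + radius

# Data from the example: 150 of 1000 voters said they would vote Labour.
low, high = proportion_confidence_interval(150, 1000, 0.95)
print(round(low, 3), round(high, 3))
```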

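The maximum radius of the confidence interval is given by

$R_{\max} = \frac{t_{\infty,\alpha/2}}{2\sqrt{n}}$.

Since the confidence level is 95%, $\alpha = 0.05$. We need to find n such that $R_{\max} \le 0.03$. Hence,

$\frac{t_{\infty,0.025}}{2\sqrt{n}} \le 0.03 \;\Rightarrow\; \frac{1.96}{2\sqrt{n}} \le 0.03 \;\Rightarrow\; \sqrt{n} \ge \frac{1.96}{0.06} \approx 32.67 \;\Rightarrow\; n \ge 32.67^2 \approx 1067.1$

Since n must be an integer, the sample size must be at least 1068.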
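Part ii) amounts to solving $R_{\max} \le \delta$ for n, which gives $n \ge (t_{\infty,\alpha/2}/(2\delta))^2$. A minimal sketch, with an illustrative function name:

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(delta, confidence):
    """Smallest n with Z_{alpha/2} / (2 * sqrt(n)) <= delta (conservative bound)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z / (2 * delta)) ** 2)

# Required accuracy of +/- 3% with probability 95%, as in the example.
n_required = sample_size_for_proportion(delta=0.03, confidence=0.95)
print(n_required)
```

Because the conservative standard error is used, this sample size does not depend on the unknown population proportion.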

Public opinion polls

Note that in most public opinion polls the error is given as ±3%. Normally such polls use a sample of just over 1000 individuals. The maximum radius of a 95% confidence interval is 0.03 in this case.


Probability Distributions

Probability Distributions Learning Objectives Probability Distributions Section 1: How Can We Summarize Possible Outcomes and Their Probabilities? 1. Random variable 2. Probability distributions for discrete random variables 3.

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Sampling Distributions

Sampling Distributions Sampling Distributions You have seen probability distributions of various types. The normal distribution is an example of a continuous distribution that is often used for quantitative measures such as

More information

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties:

Density Curve. A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: Density Curve A density curve is the graph of a continuous probability distribution. It must satisfy the following properties: 1. The total area under the curve must equal 1. 2. Every point on the curve

More information

Non-random/non-probability sampling designs in quantitative research

Non-random/non-probability sampling designs in quantitative research 206 RESEARCH MET HODOLOGY Non-random/non-probability sampling designs in quantitative research N on-probability sampling designs do not follow the theory of probability in the choice of elements from the

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

Random variables, probability distributions, binomial random variable

Random variables, probability distributions, binomial random variable Week 4 lecture notes. WEEK 4 page 1 Random variables, probability distributions, binomial random variable Eample 1 : Consider the eperiment of flipping a fair coin three times. The number of tails that

More information

Important Probability Distributions OPRE 6301

Important Probability Distributions OPRE 6301 Important Probability Distributions OPRE 6301 Important Distributions... Certain probability distributions occur with such regularity in real-life applications that they have been given their own names.

More information

CURVE FITTING LEAST SQUARES APPROXIMATION

CURVE FITTING LEAST SQUARES APPROXIMATION CURVE FITTING LEAST SQUARES APPROXIMATION Data analysis and curve fitting: Imagine that we are studying a physical system involving two quantities: x and y Also suppose that we expect a linear relationship

More information

Chapter 3 RANDOM VARIATE GENERATION

Chapter 3 RANDOM VARIATE GENERATION Chapter 3 RANDOM VARIATE GENERATION In order to do a Monte Carlo simulation either by hand or by computer, techniques must be developed for generating values of random variables having known distributions.

More information

Fixed-Effect Versus Random-Effects Models

Fixed-Effect Versus Random-Effects Models CHAPTER 13 Fixed-Effect Versus Random-Effects Models Introduction Definition of a summary effect Estimating the summary effect Extreme effect size in a large study or a small study Confidence interval

More information

Stat 704 Data Analysis I Probability Review

Stat 704 Data Analysis I Probability Review 1 / 30 Stat 704 Data Analysis I Probability Review Timothy Hanson Department of Statistics, University of South Carolina Course information 2 / 30 Logistics: Tuesday/Thursday 11:40am to 12:55pm in LeConte

More information

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test

Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Bivariate Statistics Session 2: Measuring Associations Chi-Square Test Features Of The Chi-Square Statistic The chi-square test is non-parametric. That is, it makes no assumptions about the distribution

More information

An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

More information

Opgaven Onderzoeksmethoden, Onderdeel Statistiek

Opgaven Onderzoeksmethoden, Onderdeel Statistiek Opgaven Onderzoeksmethoden, Onderdeel Statistiek 1. What is the measurement scale of the following variables? a Shoe size b Religion c Car brand d Score in a tennis game e Number of work hours per week

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing

Chapter 8 Hypothesis Testing Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing Chapter 8 Hypothesis Testing 1 Chapter 8 Hypothesis Testing 8-1 Overview 8-2 Basics of Hypothesis Testing 8-3 Testing a Claim About a Proportion 8-5 Testing a Claim About a Mean: s Not Known 8-6 Testing

More information

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS

CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS CHAPTER 7 INTRODUCTION TO SAMPLING DISTRIBUTIONS CENTRAL LIMIT THEOREM (SECTION 7.2 OF UNDERSTANDABLE STATISTICS) The Central Limit Theorem says that if x is a random variable with any distribution having

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

PROBABILITY AND SAMPLING DISTRIBUTIONS

PROBABILITY AND SAMPLING DISTRIBUTIONS PROBABILITY AND SAMPLING DISTRIBUTIONS SEEMA JAGGI AND P.K. BATRA Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 0 0 seema@iasri.res.in. Introduction The concept of probability

More information

You flip a fair coin four times, what is the probability that you obtain three heads.

You flip a fair coin four times, what is the probability that you obtain three heads. Handout 4: Binomial Distribution Reading Assignment: Chapter 5 In the previous handout, we looked at continuous random variables and calculating probabilities and percentiles for those type of variables.

More information

Two-Sample T-Tests Assuming Equal Variance (Enter Means)

Two-Sample T-Tests Assuming Equal Variance (Enter Means) Chapter 4 Two-Sample T-Tests Assuming Equal Variance (Enter Means) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when the variances of

More information

2 ESTIMATION. Objectives. 2.0 Introduction

2 ESTIMATION. Objectives. 2.0 Introduction 2 ESTIMATION Chapter 2 Estimation Objectives After studying this chapter you should be able to calculate confidence intervals for the mean of a normal distribution with unknown variance; be able to calculate

More information

Math 108 Exam 3 Solutions Spring 00

Math 108 Exam 3 Solutions Spring 00 Math 108 Exam 3 Solutions Spring 00 1. An ecologist studying acid rain takes measurements of the ph in 12 randomly selected Adirondack lakes. The results are as follows: 3.0 6.5 5.0 4.2 5.5 4.7 3.4 6.8

More information

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE 1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 3 Problems with small populations 8 II. Why Random Sampling is Important 9 A myth,

More information

Chapter 2. Hypothesis testing in one population

Chapter 2. Hypothesis testing in one population Chapter 2. Hypothesis testing in one population Contents Introduction, the null and alternative hypotheses Hypothesis testing process Type I and Type II errors, power Test statistic, level of significance

More information

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference)

Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Chapter 45 Two-Sample T-Tests Allowing Unequal Variance (Enter Difference) Introduction This procedure provides sample size and power calculations for one- or two-sided two-sample t-tests when no assumption

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

TImath.com. F Distributions. Statistics

TImath.com. F Distributions. Statistics F Distributions ID: 9780 Time required 30 minutes Activity Overview In this activity, students study the characteristics of the F distribution and discuss why the distribution is not symmetric (skewed

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

0 x = 0.30 x = 1.10 x = 3.05 x = 4.15 x = 6 0.4 x = 12. f(x) =

0 x = 0.30 x = 1.10 x = 3.05 x = 4.15 x = 6 0.4 x = 12. f(x) = . A mail-order computer business has si telephone lines. Let X denote the number of lines in use at a specified time. Suppose the pmf of X is as given in the accompanying table. 0 2 3 4 5 6 p(.0.5.20.25.20.06.04

More information

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved.

3.4. The Binomial Probability Distribution. Copyright Cengage Learning. All rights reserved. 3.4 The Binomial Probability Distribution Copyright Cengage Learning. All rights reserved. The Binomial Probability Distribution There are many experiments that conform either exactly or approximately

More information

Probability Distributions

Probability Distributions CHAPTER 5 Probability Distributions CHAPTER OUTLINE 5.1 Probability Distribution of a Discrete Random Variable 5.2 Mean and Standard Deviation of a Probability Distribution 5.3 The Binomial Distribution

More information

Objectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI)

Objectives. 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) CI) Objectives 6.1, 7.1 Estimating with confidence (CIS: Chapter 10) Statistical confidence (CIS gives a good explanation of a 95% CI) Confidence intervals. Further reading http://onlinestatbook.com/2/estimation/confidence.html

More information

Notes on Continuous Random Variables

Notes on Continuous Random Variables Notes on Continuous Random Variables Continuous random variables are random quantities that are measured on a continuous scale. They can usually take on any value over some interval, which distinguishes

More information

12: Analysis of Variance. Introduction

12: Analysis of Variance. Introduction 1: Analysis of Variance Introduction EDA Hypothesis Test Introduction In Chapter 8 and again in Chapter 11 we compared means from two independent groups. In this chapter we extend the procedure to consider

More information

Multivariate Normal Distribution

Multivariate Normal Distribution Multivariate Normal Distribution Lecture 4 July 21, 2011 Advanced Multivariate Statistical Methods ICPSR Summer Session #2 Lecture #4-7/21/2011 Slide 1 of 41 Last Time Matrices and vectors Eigenvalues

More information

Lesson 17: Margin of Error When Estimating a Population Proportion

Lesson 17: Margin of Error When Estimating a Population Proportion Margin of Error When Estimating a Population Proportion Classwork In this lesson, you will find and interpret the standard deviation of a simulated distribution for a sample proportion and use this information

More information

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution

A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution A Primer on Mathematical Statistics and Univariate Distributions; The Normal Distribution; The GLM with the Normal Distribution PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 4: September

More information

Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

More information

TEACHER NOTES MATH NSPIRED

TEACHER NOTES MATH NSPIRED Math Objectives Students will understand that normal distributions can be used to approximate binomial distributions whenever both np and n(1 p) are sufficiently large. Students will understand that when

More information

Confidence Intervals

Confidence Intervals Confidence Intervals I. Interval estimation. The particular value chosen as most likely for a population parameter is called the point estimate. Because of sampling error, we know the point estimate probably

More information

Inference for two Population Means

Inference for two Population Means Inference for two Population Means Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison October 27 November 1, 2011 Two Population Means 1 / 65 Case Study Case Study Example

More information

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics

Chapter 6: Point Estimation. Fall 2011. - Probability & Statistics STAT355 Chapter 6: Point Estimation Fall 2011 Chapter Fall 2011 6: Point1 Estimat / 18 Chap 6 - Point Estimation 1 6.1 Some general Concepts of Point Estimation Point Estimate Unbiasedness Principle of

More information

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1

Data Modeling & Analysis Techniques. Probability & Statistics. Manfred Huber 2011 1 Data Modeling & Analysis Techniques Probability & Statistics Manfred Huber 2011 1 Probability and Statistics Probability and statistics are often used interchangeably but are different, related fields

More information

A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion

A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion A Determination of g, the Acceleration Due to Gravity, from Newton's Laws of Motion Objective In the experiment you will determine the cart acceleration, a, and the friction force, f, experimentally for

More information

REPEATED TRIALS. The probability of winning those k chosen times and losing the other times is then p k q n k.

REPEATED TRIALS. The probability of winning those k chosen times and losing the other times is then p k q n k. REPEATED TRIALS Suppose you toss a fair coin one time. Let E be the event that the coin lands heads. We know from basic counting that p(e) = 1 since n(e) = 1 and 2 n(s) = 2. Now suppose we play a game

More information