1 Computing the Standard Deviation of Sample Means




Quality control charts are based on sample means, not on individual values within a sample. A sample is a group of items that are considered all together for our analysis. Items within a sample lose their individual characteristics in the analysis. Rather, a summary statistic, e.g. the sample mean, is used to represent the information in the sample. See the examples of samples below:

1. A section of BA3352 students in the current semester is a sample of students. The sample size is then the number of students in the section. Different sections constitute different samples. The number of sections offered in the current semester would be the number of samples.
2. Voters surveyed by a given polling agency on a single day are a sample. The sample size is the number of voters surveyed on that particular day. Polls made on different days constitute different samples. The number of polls is the number of samples.
3. Customers buying a particular brand of perfume over a specified month can be considered as a sample. The sample size is the number of customers buying the perfume over the specified month. Another sample can be generated by considering customers buying another brand of perfume. If we consider four brands of perfume, we end up with four samples.

The number of samples and the sample size can potentially be confused. Sample size is the number of items within a group. Number of samples is the number of groups.

Example 1: After a midterm exam for a course that is given in five sections, the average exam grade $\bar{x}_j$ in section $j$ is computed and reported below.

                 Sec 1   Sec 2   Sec 3   Sec 4   Sec 5
Average grade     68      72      74      82      71

Suppose that there are 50 students in each section and use $x_{i,j}$ to denote the $i$th student's grade in Sec $j$. Then the average grades are computed by

$$\bar{x}_j = \frac{1}{50} \sum_{i=1}^{50} x_{i,j} \quad \text{for } j \in \{1, 2, 3, 4, 5\}.$$

Since all 50 grades within a section are reduced to a single summary statistic (the sample mean), all the students within a section are represented merely by that statistic. Individual student grades are immaterial for an analysis that checks whether a certain section is performing better than the others. Clearly, the sample size is 50 and the number of samples is 5.

There are two ways to compute the standard deviation $\sigma_{\bar{x}}$ of sample means. The first way requires knowledge of the standard deviation $\sigma_x$ of the individual values within a sample; the second way does not require $\sigma_x$.

1.1 Computing $\sigma_{\bar{x}}$ with known $\sigma_x$

In order to understand what we have and what we want, first recall that $\operatorname{Var}(X) = \sigma_x^2$ and $\operatorname{Var}(\bar{X}) = \sigma_{\bar{x}}^2$. Note that $\operatorname{Var}(X)$ is known and we want to compute $\operatorname{Var}(\bar{X})$. In order to perform this computation, we need to recall the following proposition from statistics:

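Proposition 1. i) If $X$ is a random variable and $c$ is a constant, then $\operatorname{Var}(cX) = c^2 \operatorname{Var}(X)$. ii) If $X_1$ and $X_2$ are two independent random variables, then $\operatorname{Var}(X_1 + X_2) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2)$.

Proof: i) First convince yourself that the mean of $cX$ would be $c\bar{x}$, where $\bar{x}$ is the mean of $X$. We start with $\operatorname{Var}(cX)$ and use the definition of variance:

$$\operatorname{Var}(cX) = \frac{1}{n}\sum_{i=1}^{n} (c x_i - c\bar{x})^2 = c^2\, \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2 = c^2 \operatorname{Var}(X).$$

ii) Again by using the definition,

$$\begin{aligned}
\operatorname{Var}(X_1 + X_2) &= \frac{1}{n}\sum_{i=1}^{n} (x_{1,i} + x_{2,i} - \bar{x}_1 - \bar{x}_2)^2 \\
&= \frac{1}{n}\sum_{i=1}^{n} \left\{ (x_{1,i} - \bar{x}_1)^2 + (x_{2,i} - \bar{x}_2)^2 + 2 (x_{1,i} - \bar{x}_1)(x_{2,i} - \bar{x}_2) \right\} \\
&= \frac{1}{n}\sum_{i=1}^{n} (x_{1,i} - \bar{x}_1)^2 + \frac{1}{n}\sum_{i=1}^{n} (x_{2,i} - \bar{x}_2)^2 + \frac{2}{n}\sum_{i=1}^{n} (x_{1,i} - \bar{x}_1)(x_{2,i} - \bar{x}_2) \\
&= \frac{1}{n}\sum_{i=1}^{n} (x_{1,i} - \bar{x}_1)^2 + \frac{1}{n}\sum_{i=1}^{n} (x_{2,i} - \bar{x}_2)^2 + 0 \\
&= \operatorname{Var}(X_1) + \operatorname{Var}(X_2).
\end{aligned}$$

The fourth equality is due to the fact that $X_1$ and $X_2$ are independent, so the sum of the cross products is zero. This sum would be the covariance of $X_1$ and $X_2$ if $X_1$ and $X_2$ were not independent.

Now Proposition 1 can be used to relate the variance of the sample mean to the variance of the observations within the samples. We start with the definition of the sample mean and proceed as follows:

$$\operatorname{Var}(\bar{X}) = \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) \overset{\text{Prop. 1.i}}{=} \frac{1}{n^2} \operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) \overset{\text{Prop. 1.ii}}{=} \frac{1}{n^2} \sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n^2}\, n \operatorname{Var}(X) = \frac{1}{n} \operatorname{Var}(X) \qquad (1)$$

where we use the fact that each individual observation has the same variance as the other individuals: $\operatorname{Var}(X_1) = \operatorname{Var}(X_2) = \dots = \operatorname{Var}(X_i) = \operatorname{Var}(X)$, where $X$ stands for a generic observation and represents one of $X_1, X_2, \dots, X_n$. This fact is assumed when constructing samples; otherwise, we would be grouping apples with oranges.

Given (1), which relates variances, relating the standard deviations is easy. Just take the square root of both sides in (1) to arrive at

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}}. \qquad (2)$$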
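Equation (2) can be checked empirically with a small simulation. The sketch below is only an illustration, not part of the original notes: it assumes normally distributed grades with an arbitrary mean of 70, and the function name `simulate_sd_of_sample_means`, the seed, and the number of simulated samples are hypothetical choices. It draws many samples of size $n$, computes each sample mean, and compares the standard deviation of those means with $\sigma_x / \sqrt{n}$.

```python
import math
import random
import statistics


def simulate_sd_of_sample_means(sigma, n, m, seed=0):
    """Draw m samples of size n and return the population standard
    deviation of the m sample means (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(m):
        # Assumption: individual values are Normal(70, sigma).
        sample = [rng.gauss(70.0, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.pstdev(means)


sigma, n = 20.0, 50
theoretical = sigma / math.sqrt(n)  # equation (2)
empirical = simulate_sd_of_sample_means(sigma, n, m=2000)

print(f"sigma / sqrt(n) = {theoretical:.3f}")
print(f"simulated value = {empirical:.3f}")
```

With a few thousand simulated samples the two numbers should agree to roughly one decimal place; the agreement tightens as the number of simulated samples grows.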

Example 2: Refer to Example 1 and suppose that the individual scores have a standard deviation of 20. Compute the standard deviation of the sample means.

Solution: We are given $\sigma_x = 20$, and the sample size is already known to be $n = 50$. Then by using (2),

$$\sigma_{\bar{x}} = \frac{\sigma_x}{\sqrt{n}} = \frac{20}{\sqrt{50}} \approx 2.83.$$

1.2 Computing $\sigma_{\bar{x}}$ with unknown $\sigma_x$

This method is rather direct. Without $\sigma_x$, the only information available is the population of the sample means $\{\bar{x}_1, \bar{x}_2, \dots, \bar{x}_m\}$, where the number of samples is denoted by $m$. We can use this population to estimate the standard deviation of the sample means. First let us compute the variance:

$$\operatorname{Var}(\bar{X}) = \frac{1}{m} \sum_{j=1}^{m} (\bar{x}_j - \bar{\bar{x}})^2,$$

where $\bar{\bar{x}}$ is the grand mean, which can be computed by

$$\bar{\bar{x}} = \frac{1}{m} \sum_{j=1}^{m} \bar{x}_j.$$

Finally, the standard deviation of the sample means is

$$\sigma_{\bar{x}} = \sqrt{\frac{1}{m} \sum_{j=1}^{m} (\bar{x}_j - \bar{\bar{x}})^2}. \qquad (3)$$

Example 3: Refer to Example 1 and compute the standard deviation of the sample means from the population $\{68, 72, 74, 82, 71\}$.

Solution: First we compute the grand mean $\bar{\bar{x}} = \frac{1}{m} \sum_{j=1}^{m} \bar{x}_j = 73.4$. Then the standard deviation of the sample means by (3) is

$$\sigma_{\bar{x}} = \sqrt{\frac{1}{5} \left\{ (68 - 73.4)^2 + (72 - 73.4)^2 + (74 - 73.4)^2 + (82 - 73.4)^2 + (71 - 73.4)^2 \right\}} = \sqrt{22.24} \approx 4.72.$$

1.3 Remark

When $\sigma_x$ is unknown, you must use (3) to compute $\sigma_{\bar{x}}$. In this case, you do not have any choice. When $\sigma_x$ is known, you have to choose between equations (2) and (3). Unless otherwise specified, use (2) to find $\sigma_{\bar{x}}$. The rationale is that the computation in (2) is exact whereas (3) gives you only an estimate. The general principle applies: use the information available to you as much as possible and refrain from estimation unless absolutely necessary.
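The two routes to the standard deviation of the sample means can be compared side by side in a short script. This is a minimal sketch rather than code from the original notes: the function names `sd_of_means_known_sigma` and `sd_of_means_from_means` are hypothetical, and the fifth section mean is taken as 71, the value consistent with the stated grand mean of 73.4.

```python
import math


def sd_of_means_known_sigma(sigma_x, n):
    """Equation (2): exact value when sigma_x is known."""
    return sigma_x / math.sqrt(n)


def sd_of_means_from_means(sample_means):
    """Equation (3): estimate from the sample means alone,
    dividing by m (population form), as in the notes."""
    m = len(sample_means)
    grand_mean = sum(sample_means) / m
    variance = sum((xbar - grand_mean) ** 2 for xbar in sample_means) / m
    return math.sqrt(variance)


section_means = [68, 72, 74, 82, 71]  # average grades of the five sections

sd_known = sd_of_means_known_sigma(20, 50)             # Example 2
sd_estimated = sd_of_means_from_means(section_means)   # Example 3

print(round(sd_known, 3))      # 2.828
print(round(sd_estimated, 3))  # 4.716
```

The two results differ because (3) is based on only five sample means, which is why the remark prefers (2) whenever the standard deviation of the individual values is known.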

2 Exercise Questions

1. Every year about 500 people apply for UTD's full-time MBA program. Over the years it has been observed that the GMAT scores of these people are distributed normally with mean 600 and variance 300.
a) If UTD decides to accept all applicants whose GMAT score is above 620, on average how many people will be accepted per year?
b) If UTD decides to accept the 50 students with the highest GMAT scores every year, what should be the cut-off GMAT score (the lowest score among the 50 accepted students)?

2. Draw an Ishikawa diagram listing the possible causes of your midterm grade. Include Environment, Materials, Method, Personnel, etc.

3. Read "Continuous Improvement on the Free-Throw Line", pp.42-44 of the textbook. In a couple of sentences, explain a process from your own life which you have improved by studying reasons for failure or substandard performance. Example processes are parallel parking, speaking in public, washing dishes, finding the closest parking spot to your office/class, etc.

4. The DFW passenger data below pertains to the first eight months of 200. Suppose that every month has 30 days. The number of passengers flying out of DFW airport per day and the number of passengers who are searched per day are:

                                                         Jan    Feb    Mar    Apr    May    Jun    Jul    Aug
Average # of passengers/day ($\bar{y}_j$)                5000   4000   2600   3300   4700   400    6800   7500
Average # of searched passengers/day ($\bar{z}_j$)       47     53     6      4      42     44     5      43

The average number of passengers per day is computed as follows. Let $y_{i,j}$ be the number of passengers on the $i$th day of month $j$. The average number of passengers per day for month $j$ is $\bar{y}_j$, defined as

$$\bar{y}_j = \frac{1}{30} \sum_{i=1}^{30} y_{i,j} \quad \text{for } j \in \{\text{Jan}, \text{Feb}, \text{Mar}, \text{Apr}, \text{May}, \text{Jun}, \text{Jul}, \text{Aug}\}.$$

The average number of passengers searched per day is computed similarly. Let $z_{i,j}$ be the number of passengers searched on the $i$th day of month $j$. The average number of passengers searched per day for month $j$ is $\bar{z}_j$, defined as

$$\bar{z}_j = \frac{1}{30} \sum_{i=1}^{30} z_{i,j} \quad \text{for } j \in \{\text{Jan}, \text{Feb}, \text{Mar}, \text{Apr}, \text{May}, \text{Jun}, \text{Jul}, \text{Aug}\}.$$

a) What is the sample size for computing the averages in the table?
b) Suppose that the standard deviation of the number of passengers ($y_{i,j}$) flying out of DFW every day is 3000. What is the standard deviation of the average number of passengers ($\bar{y}_j$) flying out of DFW per day?
c) Assuming a Normal distribution for the number of passengers, how many sigmas ($\sigma$) will give you a Type I error of 20% for an $\bar{x}$-chart on the average number of passengers flying out of DFW per day?

5. Refer to question 4.
a) Find the 3-sigma UCL and LCL for an $\bar{x}$-chart on the average number of passengers flying out of DFW per day.
b) Is the process in control during the first eight months? Explain.

6. Refer to question 4.
a) Compute the variance of the average number of passengers searched ($\bar{z}_j$) per day during the first eight months. In other words, find the variance of the population $\{\bar{z}_{Jan}, \bar{z}_{Feb}, \bar{z}_{Mar}, \bar{z}_{Apr}, \bar{z}_{May}, \bar{z}_{Jun}, \bar{z}_{Jul}, \bar{z}_{Aug}\}$ by using the data in the table. Let us call this variance $\sigma_{\bar{z}}^2$.
b) Compute the ratio of $\sigma_{\bar{z}}^2$ to the grand mean of the averages of the passengers searched per day during the first eight months. Looking at this ratio and considering the fact that the number of searches per day is an integer, what distribution would be appropriate to study the number of searches?
c) What are the UCL and LCL for a 2.5-sigma c-control chart for the number of passengers searched per day?

7. Refer to question 4.
a) Obtain the proportion $\bar{r}_j$ of passengers searched per day for each month. In other words, construct the population $\{\bar{r}_{Jan}, \bar{r}_{Feb}, \bar{r}_{Mar}, \bar{r}_{Apr}, \bar{r}_{May}, \bar{r}_{Jun}, \bar{r}_{Jul}, \bar{r}_{Aug}\}$ by using the data in the table.
b) Compute the grand mean and the variance $\sigma_{\bar{r}}^2$ of the population in a).
c) What are the UCL and LCL for a 2.5-sigma p-control chart for the proportion of passengers searched?

8. Refer to questions 4, 6 and 7. Below are the average number of passengers and the average number of passengers searched in September and October 200.

                                                   Sep    Oct
Average number of passengers/day                   900    6200
Average number of searched passengers/day          57     63

Using the c- and p-control charts obtained in questions 6 and 7 and the recent numbers above, determine whether
a) the number of passengers searched per day is in control;
b) the proportion of passengers searched per day is in control.
c) How can you reconcile your answers if you say yes to either a) or b) above, and no to the other?