7. Sample Covariance and Correlation


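Virtual Laboratories > 6. Random Samples > Sample Covariance and Correlation (7/16/2009)

The Bivariate Model

Suppose again that we have a basic random experiment, and that $X$ and $Y$ are real-valued random variables for the experiment. Equivalently, $(X, Y)$ is a random vector taking values in $\mathbb{R}^2$. Please recall the basic properties of the means, $\mathbb{E}(X)$ and $\mathbb{E}(Y)$, the variances, $\operatorname{var}(X)$ and $\operatorname{var}(Y)$, and the covariance $\operatorname{cov}(X, Y)$. In particular, recall that the correlation is

$$\operatorname{cor}(X, Y) = \frac{\operatorname{cov}(X, Y)}{\operatorname{sd}(X) \, \operatorname{sd}(Y)}$$

We will also need a higher order bivariate moment. Let

$$d(X, Y) = \mathbb{E}\left(\left[(X - \mathbb{E}(X)) (Y - \mathbb{E}(Y))\right]^2\right)$$

Now suppose that we run the basic experiment $n$ times. This creates a compound experiment with a sequence of independent random vectors $((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n))$, each with the same distribution as $(X, Y)$. In statistical terms, this is a random sample of size $n$ from the distribution of $(X, Y)$. As usual, we will let $\boldsymbol{X} = (X_1, X_2, \ldots, X_n)$ denote the sequence of first coordinates; this is a random sample of size $n$ from the distribution of $X$. Similarly, we will let $\boldsymbol{Y} = (Y_1, Y_2, \ldots, Y_n)$ denote the sequence of second coordinates; this is a random sample of size $n$ from the distribution of $Y$.

Recall that the sample means and sample variances for $\boldsymbol{X}$ are defined as follows (and of course analogous definitions hold for $\boldsymbol{Y}$):

$$M(\boldsymbol{X}) = \frac{1}{n} \sum_{i=1}^n X_i, \quad W^2(\boldsymbol{X}) = \frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}(X))^2, \quad S^2(\boldsymbol{X}) = \frac{1}{n-1} \sum_{i=1}^n (X_i - M(\boldsymbol{X}))^2$$

In this section, we will define and study statistics that are natural estimators of the distribution covariance and correlation. These statistics will be measures of the linear relationship of the sample points in the plane. As usual, the definitions depend on what other parameters are known and unknown.

A Special Sample Covariance

Suppose first that the distribution means $\mathbb{E}(X)$ and $\mathbb{E}(Y)$ are known. This is usually an unrealistic assumption, of course, but is still a good place to start because the analysis is very simple and the results we obtain will be useful below. A natural estimator of $\operatorname{cov}(X, Y)$ in this case is

$$W(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}(X)) (Y_i - \mathbb{E}(Y))$$

1. Show that $W(\boldsymbol{X}, \boldsymbol{Y})$ is the sample mean for a random sample of size $n$ from the distribution of $(X - \mathbb{E}(X)) (Y - \mathbb{E}(Y))$.

2. Use the result of Exercise 1 to show that
a. $\mathbb{E}(W(\boldsymbol{X}, \boldsymbol{Y})) = \operatorname{cov}(X, Y)$
b. $\operatorname{var}(W(\boldsymbol{X}, \boldsymbol{Y})) = \frac{1}{n} \left(d(X, Y) - \operatorname{cov}^2(X, Y)\right)$
c. $W(\boldsymbol{X}, \boldsymbol{Y}) \to \operatorname{cov}(X, Y)$ as $n \to \infty$ with probability 1.

In particular, $W(\boldsymbol{X}, \boldsymbol{Y})$ is an unbiased and consistent estimator of $\operatorname{cov}(X, Y)$.

Properties

The formula in the following exercise is sometimes better than the definition for computational purposes.

3. With $\boldsymbol{X} \boldsymbol{Y}$ defined to be the sequence $(X_1 Y_1, X_2 Y_2, \ldots, X_n Y_n)$, show that

$$W(\boldsymbol{X}, \boldsymbol{Y}) = M(\boldsymbol{X} \boldsymbol{Y}) - M(\boldsymbol{X}) \, \mathbb{E}(Y) - M(\boldsymbol{Y}) \, \mathbb{E}(X) + \mathbb{E}(X) \, \mathbb{E}(Y)$$

The properties established in the following exercises are analogues of properties of the distribution covariance.

4. Show that $W(\boldsymbol{X}, \boldsymbol{X}) = W^2(\boldsymbol{X})$.

5. Show that $W(\boldsymbol{X}, \boldsymbol{Y}) = W(\boldsymbol{Y}, \boldsymbol{X})$.

6. Show that if $a$ is a constant then $W(a \boldsymbol{X}, \boldsymbol{Y}) = a \, W(\boldsymbol{X}, \boldsymbol{Y})$.

7. Show that $W(\boldsymbol{X} + \boldsymbol{Y}, \boldsymbol{Z}) = W(\boldsymbol{X}, \boldsymbol{Z}) + W(\boldsymbol{Y}, \boldsymbol{Z})$.

The following exercise gives a formula for the sample variance of a sum. The result extends naturally to larger sums.

8. Show that $W^2(\boldsymbol{X} + \boldsymbol{Y}) = W^2(\boldsymbol{X}) + W^2(\boldsymbol{Y}) + 2 \, W(\boldsymbol{X}, \boldsymbol{Y})$.

The Standard Sample Covariance

Consider now the more realistic assumption that the distribution means $\mathbb{E}(X)$ and $\mathbb{E}(Y)$ are unknown. A natural approach in this case is to average $(X_i - M(\boldsymbol{X})) (Y_i - M(\boldsymbol{Y}))$ over $i \in \{1, 2, \ldots, n\}$. But rather than dividing by $n$ in our average, we should divide by whatever constant gives an unbiased estimator of $\operatorname{cov}(X, Y)$.

9. Interpret the sign of $(X_i - M(\boldsymbol{X})) (Y_i - M(\boldsymbol{Y}))$ geometrically, in terms of the scatterplot of points and its center.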
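The special sample covariance $W(\boldsymbol{X}, \boldsymbol{Y})$ and the computational formula of Exercise 3 can be checked numerically. The following Python sketch is only an illustration, not part of the original text: the function names and the simulated data are assumptions. It takes $X$ uniform on $[0, 1]$ and $Y = X + U$ with $U$ an independent uniform variable, so that $\mathbb{E}(X) = 1/2$, $\mathbb{E}(Y) = 1$ and $\operatorname{cov}(X, Y) = \operatorname{var}(X) = 1/12$ are known by construction.

```python
import random


def mean(values):
    """Sample mean M of a sequence of numbers."""
    return sum(values) / len(values)


def special_cov(xs, ys, mu_x, mu_y):
    """W(X, Y): average of (x_i - E(X)) (y_i - E(Y)), distribution means known."""
    n = len(xs)
    return sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n


def special_cov_shortcut(xs, ys, mu_x, mu_y):
    """Computational form of W(X, Y) from Exercise 3."""
    m_xy = mean([x * y for x, y in zip(xs, ys)])
    return m_xy - mean(xs) * mu_y - mean(ys) * mu_x + mu_x * mu_y


# Hypothetical simulated sample (an assumption made for illustration only).
rng = random.Random(0)
xs = [rng.random() for _ in range(1000)]
ys = [x + rng.random() for x in xs]
mu_x, mu_y = 0.5, 1.0  # known distribution means for this construction

w = special_cov(xs, ys, mu_x, mu_y)
w_short = special_cov_shortcut(xs, ys, mu_x, mu_y)
print(w, w_short)  # the two forms agree up to rounding; both estimate 1/12
```

Fixing the seed keeps the run reproducible. With a different seed or sample size the estimate changes, but the definition and the shortcut formula should still agree up to rounding.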

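Derivation

10. Use the bilinearity of the covariance operator to show that

$$\operatorname{cov}(M(\boldsymbol{X}), M(\boldsymbol{Y})) = \frac{\operatorname{cov}(X, Y)}{n}$$

11. Expand and sum term by term to show that

$$\sum_{i=1}^n (X_i - M(\boldsymbol{X})) (Y_i - M(\boldsymbol{Y})) = \sum_{i=1}^n X_i Y_i - n \, M(\boldsymbol{X}) \, M(\boldsymbol{Y})$$

12. Use the results of Exercises 10 and 11, and basic properties of expected value, to show that

$$\mathbb{E}\left(\sum_{i=1}^n (X_i - M(\boldsymbol{X})) (Y_i - M(\boldsymbol{Y}))\right) = (n - 1) \operatorname{cov}(X, Y)$$

Therefore, to have an unbiased estimator of $\operatorname{cov}(X, Y)$, we should define the sample covariance to be the random variable

$$S(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{n-1} \sum_{i=1}^n (X_i - M(\boldsymbol{X})) (Y_i - M(\boldsymbol{Y}))$$

As with the sample variance, when the sample size $n$ is large, it makes little difference whether we divide by $n$ or $n - 1$.

Properties

The formula in the following exercise is sometimes better than the definition for computational purposes.

13. With $\boldsymbol{X} \boldsymbol{Y}$ defined as in Exercise 3, show that

$$S(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{n-1} \sum_{i=1}^n X_i Y_i - \frac{n}{n-1} M(\boldsymbol{X}) \, M(\boldsymbol{Y}) = \frac{n}{n-1} \left(M(\boldsymbol{X} \boldsymbol{Y}) - M(\boldsymbol{X}) \, M(\boldsymbol{Y})\right)$$

14. Use the result of the previous exercise and the strong law of large numbers to show that $S(\boldsymbol{X}, \boldsymbol{Y}) \to \operatorname{cov}(X, Y)$ as $n \to \infty$ with probability 1.

The properties established in the following exercises are analogues of properties of the distribution covariance.

15. Show that $S(\boldsymbol{X}, \boldsymbol{X}) = S^2(\boldsymbol{X})$.

16. Show that $S(\boldsymbol{X}, \boldsymbol{Y}) = S(\boldsymbol{Y}, \boldsymbol{X})$.

17. Show that if $a$ is a constant then $S(a \boldsymbol{X}, \boldsymbol{Y}) = a \, S(\boldsymbol{X}, \boldsymbol{Y})$.

18. Show that $S(\boldsymbol{X} + \boldsymbol{Y}, \boldsymbol{Z}) = S(\boldsymbol{X}, \boldsymbol{Z}) + S(\boldsymbol{Y}, \boldsymbol{Z})$.

19. Show that

$$S(\boldsymbol{X}, \boldsymbol{Y}) = \frac{n}{n-1} \left(W(\boldsymbol{X}, \boldsymbol{Y}) - (M(\boldsymbol{X}) - \mathbb{E}(X)) (M(\boldsymbol{Y}) - \mathbb{E}(Y))\right)$$

The following exercise gives a formula for the sample variance of a sum. The result extends naturally to larger sums.

20. Show that $S^2(\boldsymbol{X} + \boldsymbol{Y}) = S^2(\boldsymbol{X}) + S^2(\boldsymbol{Y}) + 2 \, S(\boldsymbol{X}, \boldsymbol{Y})$.

Variance

In this subsection we will derive the following formula for the variance of the sample covariance. The derivation was contributed by Ranjith Unnikrishnan, and is similar to the derivation of the variance of the sample variance.

$$\operatorname{var}(S(\boldsymbol{X}, \boldsymbol{Y})) = \frac{1}{n} \left(d(X, Y) + \frac{1}{n-1} \operatorname{var}(X) \operatorname{var}(Y) - \frac{n-2}{n-1} \operatorname{cov}^2(X, Y)\right)$$

21. Verify the following result. Hint: Start with the expression on the right. Expand the product $(X_i - X_j) (Y_i - Y_j)$, and take the sums term by term.

$$S(\boldsymbol{X}, \boldsymbol{Y}) = \frac{1}{2 n (n-1)} \sum_{i=1}^n \sum_{j=1}^n (X_i - X_j) (Y_i - Y_j)$$

It follows that $\operatorname{var}(S(\boldsymbol{X}, \boldsymbol{Y}))$ is $1 / [2 n (n-1)]^2$ times the sum of all of the pairwise covariances of the products $(X_i - X_j) (Y_i - Y_j)$ in the expansion of Exercise 21.

22. Now, derive the formula for $\operatorname{var}(S(\boldsymbol{X}, \boldsymbol{Y}))$ by showing that
a. $\operatorname{cov}((X_i - X_j) (Y_i - Y_j), (X_k - X_l) (Y_k - Y_l)) = 0$ if $i = j$ or $k = l$ or $i, j, k, l$ are distinct.
b. $\operatorname{cov}((X_i - X_j) (Y_i - Y_j), (X_i - X_j) (Y_i - Y_j)) = 2 \, d(X, Y) + 2 \operatorname{var}(X) \operatorname{var}(Y)$ if $i \ne j$, and there are $2 n (n-1)$ such terms in the sum of covariances.
c. $\operatorname{cov}((X_i - X_j) (Y_i - Y_j), (X_k - X_j) (Y_k - Y_j)) = d(X, Y) - \operatorname{cov}^2(X, Y)$ if $i, j, k$ are distinct, and there are $4 n (n-1) (n-2)$ such terms in the sum of covariances.

23. Show that $\operatorname{var}(S(\boldsymbol{X}, \boldsymbol{Y})) > \operatorname{var}(W(\boldsymbol{X}, \boldsymbol{Y}))$. Does this seem reasonable?

24. Show that $\operatorname{var}(S(\boldsymbol{X}, \boldsymbol{Y})) \to 0$ as $n \to \infty$. Thus, the sample covariance is a consistent estimator of the distribution covariance.

Sample Correlation

By analogy with the distribution correlation, the sample correlation is obtained by dividing the sample covariance by the product of the sample standard deviations:

$$R(\boldsymbol{X}, \boldsymbol{Y}) = \frac{S(\boldsymbol{X}, \boldsymbol{Y})}{S(\boldsymbol{X}) \, S(\boldsymbol{Y})}$$

25. Use the strong law of large numbers to show that $R(\boldsymbol{X}, \boldsymbol{Y}) \to \operatorname{cor}(X, Y)$ as $n \to \infty$ with probability 1.

26. Click in the interactive scatterplot to define 20 points and try to come as close as possible to the following conditions: sample means 0, sample standard deviations 1, sample correlation as follows: 0, 0.5, −0.5, 0.7, −0.7, 0.9, −0.9.

27. Click in the interactive scatterplot to define 20 points and try to come as close as possible to the following conditions: X sample mean 1, Y sample mean 3, X sample standard deviation 2, Y sample standard deviation 1, sample correlation as follows: 0, 0.5, −0.5, 0.7, −0.7, 0.9, −0.9.

The Best Linear Predictor

The Distribution Version

Recall that in the section on (distribution) correlation and regression, we showed that the best linear predictor of $Y$ based on $X$, in the sense of minimizing mean square error, is the random variable

$$L(Y \mid X) = \mathbb{E}(Y) + \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)} (X - \mathbb{E}(X))$$

Moreover, the (minimum) value of the mean square error is

$$\mathbb{E}\left((Y - L(Y \mid X))^2\right) = \operatorname{var}(Y) \left(1 - \operatorname{cor}(X, Y)^2\right)$$

The distribution regression line is given by

$$y = L(Y \mid X = x) = \mathbb{E}(Y) + \frac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)} (x - \mathbb{E}(X))$$

The Sample Version

Of course, in real applications, we are unlikely to know the distribution parameters $\mathbb{E}(X)$, $\mathbb{E}(Y)$, $\operatorname{var}(X)$, and $\operatorname{cov}(X, Y)$. Thus, in this section, we are interested in the problem of estimating the best linear predictor of $Y$ based on $X$ from our random sample $((X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n))$. One natural approach is to find the line $y = A x + B$ that fits the sample points best. This is a basic and important problem in many areas of mathematics, not just statistics. The term best means that we want to find the line (that is, find $A$ and $B$) that minimizes the average of the squared errors between the actual $y$ values in our data and the predicted $y$ values, where the average uses the divisor $n - 1$ to match the sample variance:

$$\operatorname{MSE}(A, B) = \frac{1}{n-1} \sum_{i=1}^n (Y_i - (A X_i + B))^2$$

Finding $A$ and $B$ that minimize MSE is a standard problem in calculus.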
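The sample covariance and sample correlation defined above can be checked numerically. The Python sketch below is an illustration only: the function names and the five data points are assumptions, not part of the original text. It computes $S(\boldsymbol{X}, \boldsymbol{Y})$ from the definition, from the computational formula of Exercise 13, and from the pairwise-difference form of Exercise 21, and then computes $R(\boldsymbol{X}, \boldsymbol{Y})$.

```python
import math


def mean(values):
    """Sample mean M of a sequence of numbers."""
    return sum(values) / len(values)


def sample_cov(xs, ys):
    """S(X, Y): sum of (x_i - M(X)) (y_i - M(Y)) divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def sample_cov_shortcut(xs, ys):
    """Exercise 13: n / (n - 1) times (M(XY) - M(X) M(Y))."""
    n = len(xs)
    m_xy = mean([x * y for x, y in zip(xs, ys)])
    return n / (n - 1) * (m_xy - mean(xs) * mean(ys))


def sample_cov_pairwise(xs, ys):
    """Exercise 21: double sum of (x_i - x_j) (y_i - y_j) over 2 n (n - 1)."""
    n = len(xs)
    total = sum(
        (xs[i] - xs[j]) * (ys[i] - ys[j]) for i in range(n) for j in range(n)
    )
    return total / (2 * n * (n - 1))


def sample_cor(xs, ys):
    """R(X, Y) = S(X, Y) / (S(X) S(Y)), using S(X)^2 = S(X, X)."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


# Hypothetical data points, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

print(sample_cov(xs, ys), sample_cov_shortcut(xs, ys), sample_cov_pairwise(xs, ys))
print(sample_cor(xs, ys))
```

The pairwise form costs on the order of $n^2$ operations, so it is useful mainly as a check on the other two, which need only a single pass over the data.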

28. Show that MSE is minimized for

$$A(\boldsymbol{X}, \boldsymbol{Y}) = \frac{S(\boldsymbol{X}, \boldsymbol{Y})}{S^2(\boldsymbol{X})}, \quad B(\boldsymbol{X}, \boldsymbol{Y}) = M(\boldsymbol{Y}) - \frac{S(\boldsymbol{X}, \boldsymbol{Y})}{S^2(\boldsymbol{X})} M(\boldsymbol{X})$$

and thus the sample regression line is

$$y = M(\boldsymbol{Y}) + \frac{S(\boldsymbol{X}, \boldsymbol{Y})}{S^2(\boldsymbol{X})} (x - M(\boldsymbol{X}))$$

29. Show that the minimum mean square error, using the coefficients in the previous exercise, is

$$\operatorname{MSE}(A(\boldsymbol{X}, \boldsymbol{Y}), B(\boldsymbol{X}, \boldsymbol{Y})) = S^2(\boldsymbol{Y}) \left(1 - R^2(\boldsymbol{X}, \boldsymbol{Y})\right)$$

30. Use the result of the previous exercise to show that
a. $-1 \le R(\boldsymbol{X}, \boldsymbol{Y}) \le 1$
b. $R(\boldsymbol{X}, \boldsymbol{Y}) = -1$ if and only if the sample points lie on a line with negative slope.
c. $R(\boldsymbol{X}, \boldsymbol{Y}) = 1$ if and only if the sample points lie on a line with positive slope.

Thus, the sample correlation measures the degree of linearity of the sample points. The results in the previous exercise can also be obtained by noting that the sample correlation is simply the correlation of the empirical distribution. Of course, properties (a), (b), and (c) are known for the distribution correlation.

The fact that the results in Exercise 28 and Exercise 29 are the sample analogues of the corresponding distribution results is beautiful and reassuring. Note that the sample regression line passes through $(M(\boldsymbol{X}), M(\boldsymbol{Y}))$, the center of the empirical distribution. Naturally, the coefficients of the sample regression line can be viewed as estimators of the respective coefficients in the distribution regression line.

31. Assuming that the appropriate higher order moments are finite, use the law of large numbers to show that, with probability 1, the coefficients of the sample regression line converge to the coefficients of the distribution regression line:
a. $\dfrac{S(\boldsymbol{X}, \boldsymbol{Y})}{S^2(\boldsymbol{X})} \to \dfrac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)}$ as $n \to \infty$
b. $M(\boldsymbol{Y}) - \dfrac{S(\boldsymbol{X}, \boldsymbol{Y})}{S^2(\boldsymbol{X})} M(\boldsymbol{X}) \to \mathbb{E}(Y) - \dfrac{\operatorname{cov}(X, Y)}{\operatorname{var}(X)} \mathbb{E}(X)$ as $n \to \infty$

As with the distribution regression lines, the choice of predictor and response variables is important.

32. Show that the sample regression line for $Y$ based on $X$ and the sample regression line for $X$ based on $Y$ are not the same line, except in the trivial case where the sample points all lie on a line.

Recall that the constant $B$ that minimizes

$$\operatorname{MSE}(B) = \frac{1}{n-1} \sum_{i=1}^n (Y_i - B)^2$$

is the sample mean $M(\boldsymbol{Y})$, and the minimum value of the mean square error is the sample variance $S^2(\boldsymbol{Y})$. Thus, the difference between this value of the mean square error and the one in Exercise 29, namely $S^2(\boldsymbol{Y}) \, R^2(\boldsymbol{X}, \boldsymbol{Y})$, is the reduction in the variability of the $Y$ data when the linear term in $X$ is added to the predictor. The fractional reduction is $R^2(\boldsymbol{X}, \boldsymbol{Y})$, and hence this statistic is called the (sample) coefficient of determination.

Exercises

Simulation Exercises

33. Click in the interactive scatterplot, in various places, and watch how the regression line changes.

34. Click in the interactive scatterplot to define 20 points. Try to generate a scatterplot in which the mean of the x values is 0, the standard deviation of the x values is 1, and in which the regression line has
a. slope 1, intercept 1
b. slope 3, intercept 0
c. slope 2, intercept […]

35. Click in the interactive scatterplot to define 20 points with the following properties: the mean of the x values is 1, the mean of the y values is 1, and the regression line has slope 1 and intercept 2.

If you had a difficult time with the previous exercise, it's because the conditions imposed are impossible to satisfy! The regression line must pass through the point of sample means $(1, 1)$, but the line with slope 1 and intercept 2 has height 3 at $x = 1$.

36. Run the bivariate uniform experiment 2000 times, with an update frequency of 10, in each of the following cases. Note the apparent convergence of the sample means to the distribution means, the sample standard deviations to the distribution standard deviations, the sample correlation to the distribution correlation, and the sample regression line to the distribution regression line.
a. The uniform distribution on the square.
b. The uniform distribution on the triangle.
c. The uniform distribution on the circle.

37. Run the bivariate normal experiment 2000 times, with an update frequency of 10, in each of the following cases. Note the apparent convergence of the sample means to the distribution means, the sample standard deviations to the distribution standard deviations, the sample correlation to the distribution correlation, and the sample regression line to the distribution regression line.
a. sd(X) = 1, sd(Y) = 2, cor(X, Y) = 0.5
b. sd(X) = 1.5, sd(Y) = 0.5, cor(X, Y) = 0.7

Data Analysis Exercises

38. Compute the correlation between petal length and petal width for the following cases in Fisher's iris data. Comment on the differences.
a. All cases
b. Setosa only
c. Virginica only
d. Versicolor only

39. Compute the correlation between each pair of color count variables in the M&M data.

40. Consider all cases in Fisher's iris data.
a. Compute the least squares regression line with petal length as the predictor variable and petal width as the response variable.
b. Draw the scatterplot and the regression line together.
c. Predict the petal width of an iris with petal length […].

41. Consider the Setosa cases only in Fisher's iris data.
a. Compute the least squares regression line with sepal length as the predictor variable and sepal width as the response variable.
b. Draw the scatterplot and regression line together.
c. Predict the sepal width of an iris with sepal length 45.
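The least squares computations asked for in the data analysis exercises can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method of the original text: the function names and the five data points are hypothetical, and the iris and M&M data are not reproduced here. The sketch computes the slope and intercept of the sample regression line from Exercise 28 and compares the minimum mean square error with $S^2(\boldsymbol{Y}) (1 - R^2(\boldsymbol{X}, \boldsymbol{Y}))$ from Exercise 29.

```python
import math


def mean(values):
    """Sample mean M of a sequence of numbers."""
    return sum(values) / len(values)


def sample_cov(xs, ys):
    """S(X, Y), with divisor n - 1; sample_cov(xs, xs) is the sample variance."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def regression_line(xs, ys):
    """Slope A = S(X, Y) / S^2(X) and intercept B = M(Y) - A M(X)."""
    slope = sample_cov(xs, ys) / sample_cov(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


def mse(xs, ys, a, b):
    """MSE(A, B) with divisor n - 1, matching the sample variance."""
    n = len(xs)
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data points, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 1.0, 4.0, 3.0, 6.0]

a, b = regression_line(xs, ys)  # approximately slope 1.0 and intercept 0.2
r2 = sample_cov(xs, ys) ** 2 / (sample_cov(xs, xs) * sample_cov(ys, ys))

print(a, b)
print(mse(xs, ys, a, b), sample_cov(ys, ys) * (1 - r2))  # the two values agree
print(a * 6.0 + b)  # predicted y for a new x value of 6
```

For a real data set, `xs` and `ys` would hold the predictor and response columns, and the prediction step is the same evaluation of `a * x + b` at the new predictor value.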

3. Covariance and Correlation

3. Covariance and Correlation Virtual Laboratories > 3. Expected Value > 1 2 3 4 5 6 3. Covariace ad Correlatio Recall that by takig the expected value of various trasformatios of a radom variable, we ca measure may iterestig characteristics

More information

1 Correlation and Regression Analysis

1 Correlation and Regression Analysis 1 Correlatio ad Regressio Aalysis I this sectio we will be ivestigatig the relatioship betwee two cotiuous variable, such as height ad weight, the cocetratio of a ijected drug ad heart rate, or the cosumptio

More information

NPTEL STRUCTURAL RELIABILITY

NPTEL STRUCTURAL RELIABILITY NPTEL Course O STRUCTURAL RELIABILITY Module # 0 Lecture 1 Course Format: Web Istructor: Dr. Aruasis Chakraborty Departmet of Civil Egieerig Idia Istitute of Techology Guwahati 1. Lecture 01: Basic Statistics

More information

Properties of MLE: consistency, asymptotic normality. Fisher information.

Properties of MLE: consistency, asymptotic normality. Fisher information. Lecture 3 Properties of MLE: cosistecy, asymptotic ormality. Fisher iformatio. I this sectio we will try to uderstad why MLEs are good. Let us recall two facts from probability that we be used ofte throughout

More information

Chapter 7 Methods of Finding Estimators

Chapter 7 Methods of Finding Estimators Chapter 7 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 011 Chapter 7 Methods of Fidig Estimators Sectio 7.1 Itroductio Defiitio 7.1.1 A poit estimator is ay fuctio W( X) W( X1, X,, X ) of

More information

Gregory Carey, 1998 Linear Transformations & Composites - 1. Linear Transformations and Linear Composites

Gregory Carey, 1998 Linear Transformations & Composites - 1. Linear Transformations and Linear Composites Gregory Carey, 1998 Liear Trasformatios & Composites - 1 Liear Trasformatios ad Liear Composites I Liear Trasformatios of Variables Meas ad Stadard Deviatios of Liear Trasformatios A liear trasformatio

More information

AQA STATISTICS 1 REVISION NOTES

AQA STATISTICS 1 REVISION NOTES AQA STATISTICS 1 REVISION NOTES AVERAGES AND MEASURES OF SPREAD www.mathsbox.org.uk Mode : the most commo or most popular data value the oly average that ca be used for qualitative data ot suitable if

More information

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n

THE REGRESSION MODEL IN MATRIX FORM. For simple linear regression, meaning one predictor, the model is. for i = 1, 2, 3,, n We will cosider the liear regressio model i matrix form. For simple liear regressio, meaig oe predictor, the model is i = + x i + ε i for i =,,,, This model icludes the assumptio that the ε i s are a sample

More information

Hypothesis testing. Null and alternative hypotheses

Hypothesis testing. Null and alternative hypotheses Hypothesis testig Aother importat use of samplig distributios is to test hypotheses about populatio parameters, e.g. mea, proportio, regressio coefficiets, etc. For example, it is possible to stipulate

More information

Section 7-3 Estimating a Population. Requirements

Section 7-3 Estimating a Population. Requirements Sectio 7-3 Estimatig a Populatio Mea: σ Kow Key Cocept This sectio presets methods for usig sample data to fid a poit estimate ad cofidece iterval estimate of a populatio mea. A key requiremet i this sectio

More information

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5

0.7 0.6 0.2 0 0 96 96.5 97 97.5 98 98.5 99 99.5 100 100.5 96.5 97 97.5 98 98.5 99 99.5 100 100.5 Sectio 13 Kolmogorov-Smirov test. Suppose that we have a i.i.d. sample X 1,..., X with some ukow distributio P ad we would like to test the hypothesis that P is equal to a particular distributio P 0, i.e.

More information

I. Chi-squared Distributions

I. Chi-squared Distributions 1 M 358K Supplemet to Chapter 23: CHI-SQUARED DISTRIBUTIONS, T-DISTRIBUTIONS, AND DEGREES OF FREEDOM To uderstad t-distributios, we first eed to look at aother family of distributios, the chi-squared distributios.

More information

THE LEAST SQUARES REGRESSION LINE and R 2

THE LEAST SQUARES REGRESSION LINE and R 2 THE LEAST SQUARES REGRESSION LINE ad R M358K I. Recall from p. 36 that the least squares regressio lie of y o x is the lie that makes the sum of the squares of the vertical distaces of the data poits from

More information

Definition. Definition. 7-2 Estimating a Population Proportion. Definition. Definition

Definition. Definition. 7-2 Estimating a Population Proportion. Definition. Definition 7- stimatig a Populatio Proportio I this sectio we preset methods for usig a sample proportio to estimate the value of a populatio proportio. The sample proportio is the best poit estimate of the populatio

More information

Section IV.5: Recurrence Relations from Algorithms

Section IV.5: Recurrence Relations from Algorithms Sectio IV.5: Recurrece Relatios from Algorithms Give a recursive algorithm with iput size, we wish to fid a Θ (best big O) estimate for its ru time T() either by obtaiig a explicit formula for T() or by

More information

Chapter 9: Correlation and Regression: Solutions

Chapter 9: Correlation and Regression: Solutions Chapter 9: Correlatio ad Regressio: Solutios 9.1 Correlatio I this sectio, we aim to aswer the questio: Is there a relatioship betwee A ad B? Is there a relatioship betwee the umber of emploee traiig hours

More information

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method

Chapter 6: Variance, the law of large numbers and the Monte-Carlo method Chapter 6: Variace, the law of large umbers ad the Mote-Carlo method Expected value, variace, ad Chebyshev iequality. If X is a radom variable recall that the expected value of X, E[X] is the average value

More information

Stat 104 Lecture 16. Statistics 104 Lecture 16 (IPS 6.1) Confidence intervals - the general concept

Stat 104 Lecture 16. Statistics 104 Lecture 16 (IPS 6.1) Confidence intervals - the general concept Statistics 104 Lecture 16 (IPS 6.1) Outlie for today Cofidece itervals Cofidece itervals for a mea, µ (kow σ) Cofidece itervals for a proportio, p Margi of error ad sample size Review of mai topics for

More information

Alternatives To Pearson s and Spearman s Correlation Coefficients

Alternatives To Pearson s and Spearman s Correlation Coefficients Alteratives To Pearso s ad Spearma s Correlatio Coefficiets Floreti Smaradache Chair of Math & Scieces Departmet Uiversity of New Mexico Gallup, NM 8730, USA Abstract. This article presets several alteratives

More information

ORDERS OF GROWTH KEITH CONRAD

ORDERS OF GROWTH KEITH CONRAD ORDERS OF GROWTH KEITH CONRAD Itroductio Gaiig a ituitive feel for the relative growth of fuctios is importat if you really wat to uderstad their behavior It also helps you better grasp topics i calculus

More information

1. C. The formula for the confidence interval for a population mean is: x t, which was

1. C. The formula for the confidence interval for a population mean is: x t, which was s 1. C. The formula for the cofidece iterval for a populatio mea is: x t, which was based o the sample Mea. So, x is guarateed to be i the iterval you form.. D. Use the rule : p-value

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio We have bee itroduced to the otio that a categorical variable could deped o differet levels of aother variable whe we discussed cotigecy tables. We ll exted this idea to the case

More information

Output Analysis (2, Chapters 10 &11 Law)

Output Analysis (2, Chapters 10 &11 Law) B. Maddah ENMG 6 Simulatio 05/0/07 Output Aalysis (, Chapters 10 &11 Law) Comparig alterative system cofiguratio Sice the output of a simulatio is radom, the comparig differet systems via simulatio should

More information

Overview of some probability distributions.

Overview of some probability distributions. Lecture Overview of some probability distributios. I this lecture we will review several commo distributios that will be used ofte throughtout the class. Each distributio is usually described by its probability

More information

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring

Non-life insurance mathematics. Nils F. Haavardsson, University of Oslo and DNB Skadeforsikring No-life isurace mathematics Nils F. Haavardsso, Uiversity of Oslo ad DNB Skadeforsikrig Mai issues so far Why does isurace work? How is risk premium defied ad why is it importat? How ca claim frequecy

More information

Confidence Intervals for One Mean with Tolerance Probability

Confidence Intervals for One Mean with Tolerance Probability Chapter 421 Cofidece Itervals for Oe Mea with Tolerace Probability Itroductio This procedure calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) with

More information

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008

In nite Sequences. Dr. Philippe B. Laval Kennesaw State University. October 9, 2008 I ite Sequeces Dr. Philippe B. Laval Keesaw State Uiversity October 9, 2008 Abstract This had out is a itroductio to i ite sequeces. mai de itios ad presets some elemetary results. It gives the I ite Sequeces

More information

Maximum Likelihood Estimators.

Maximum Likelihood Estimators. Lecture 2 Maximum Likelihood Estimators. Matlab example. As a motivatio, let us look at oe Matlab example. Let us geerate a radom sample of size 00 from beta distributio Beta(5, 2). We will lear the defiitio

More information

Fourier Series and the Wave Equation Part 2

Fourier Series and the Wave Equation Part 2 Fourier Series ad the Wave Equatio Part There are two big ideas i our work this week. The first is the use of liearity to break complicated problems ito simple pieces. The secod is the use of the symmetries

More information

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1)

BASIC STATISTICS. f(x 1,x 2,..., x n )=f(x 1 )f(x 2 ) f(x n )= f(x i ) (1) BASIC STATISTICS. SAMPLES, RANDOM SAMPLING AND SAMPLE STATISTICS.. Radom Sample. The radom variables X,X 2,..., X are called a radom sample of size from the populatio f(x if X,X 2,..., X are mutually idepedet

More information

Sampling Distribution And Central Limit Theorem

Sampling Distribution And Central Limit Theorem () Samplig Distributio & Cetral Limit Samplig Distributio Ad Cetral Limit Samplig distributio of the sample mea If we sample a umber of samples (say k samples where k is very large umber) each of size,

More information

Confidence Intervals and Sample Size

Confidence Intervals and Sample Size 8/7/015 C H A P T E R S E V E N Cofidece Itervals ad Copyright 015 The McGraw-Hill Compaies, Ic. Permissio required for reproductio or display. 1 Cofidece Itervals ad Outlie 7-1 Cofidece Itervals for the

More information

USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR

USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR USING STATISTICAL FUNCTIONS ON A SCIENTIFIC CALCULATOR Objective:. Improve calculator skills eeded i a multiple choice statistical eamiatio where the eam allows the studet to use a scietific calculator..

More information

Measures of Spread and Boxplots Discrete Math, Section 9.4

Measures of Spread and Boxplots Discrete Math, Section 9.4 Measures of Spread ad Boxplots Discrete Math, Sectio 9.4 We start with a example: Example 1: Comparig Mea ad Media Compute the mea ad media of each data set: S 1 = {4, 6, 8, 10, 1, 14, 16} S = {4, 7, 9,

More information

Lesson 12. Sequences and Series

Lesson 12. Sequences and Series Retur to List of Lessos Lesso. Sequeces ad Series A ifiite sequece { a, a, a,... a,...} ca be thought of as a list of umbers writte i defiite order ad certai patter. It is usually deoted by { a } =, or

More information

Confidence Intervals for One Mean

Confidence Intervals for One Mean Chapter 420 Cofidece Itervals for Oe Mea Itroductio This routie calculates the sample size ecessary to achieve a specified distace from the mea to the cofidece limit(s) at a stated cofidece level for a

More information

1 Computing the Standard Deviation of Sample Means

1 Computing the Standard Deviation of Sample Means Computig the Stadard Deviatio of Sample Meas Quality cotrol charts are based o sample meas ot o idividual values withi a sample. A sample is a group of items, which are cosidered all together for our aalysis.

More information

Soving Recurrence Relations

Soving Recurrence Relations Sovig Recurrece Relatios Part 1. Homogeeous liear 2d degree relatios with costat coefficiets. Cosider the recurrece relatio ( ) T () + at ( 1) + bt ( 2) = 0 This is called a homogeeous liear 2d degree

More information

PSYCHOLOGICAL STATISTICS

PSYCHOLOGICAL STATISTICS UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION B Sc. Cousellig Psychology (0 Adm.) IV SEMESTER COMPLEMENTARY COURSE PSYCHOLOGICAL STATISTICS QUESTION BANK. Iferetial statistics is the brach of statistics

More information

Section 11.3: The Integral Test

Section 11.3: The Integral Test Sectio.3: The Itegral Test Most of the series we have looked at have either diverged or have coverged ad we have bee able to fid what they coverge to. I geeral however, the problem is much more difficult

More information

Sequences and Series

Sequences and Series CHAPTER 9 Sequeces ad Series 9.. Covergece: Defiitio ad Examples Sequeces The purpose of this chapter is to itroduce a particular way of geeratig algorithms for fidig the values of fuctios defied by their

More information

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution

University of California, Los Angeles Department of Statistics. Distributions related to the normal distribution Uiversity of Califoria, Los Ageles Departmet of Statistics Statistics 100B Istructor: Nicolas Christou Three importat distributios: Distributios related to the ormal distributio Chi-square (χ ) distributio.

More information

Simple linear regression

Simple linear regression Simple liear regressio Tro Aders Moger 3..7 Example 6: Populatio proportios Oe sample X Assume X ~ Bi(, P, so that P ˆ is a frequecy. P The ~ N(, P( P / (approximately, for large P Thus ~ N(, ( / (approximately,

More information

1 Introduction to reducing variance in Monte Carlo simulations

1 Introduction to reducing variance in Monte Carlo simulations Copyright c 007 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a uow mea µ = E(X) of a distributio by

More information

Math C067 Sampling Distributions

Math C067 Sampling Distributions Math C067 Samplig Distributios Sample Mea ad Sample Proportio Richard Beigel Some time betwee April 16, 2007 ad April 16, 2007 Examples of Samplig A pollster may try to estimate the proportio of voters

More information

Joint Probability Distributions and Random Samples

Joint Probability Distributions and Random Samples STAT5 Sprig 204 Lecture Notes Chapter 5 February, 204 Joit Probability Distributios ad Radom Samples 5. Joitly Distributed Radom Variables Chapter Overview Joitly distributed rv Joit mass fuctio, margial

More information

Riemann Sums y = f (x)

Riemann Sums y = f (x) Riema Sums Recall that we have previously discussed the area problem I its simplest form we ca state it this way: The Area Problem Let f be a cotiuous, o-egative fuctio o the closed iterval [a, b] Fid

More information

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown

Z-TEST / Z-STATISTIC: used to test hypotheses about. µ when the population standard deviation is unknown Z-TEST / Z-STATISTIC: used to test hypotheses about µ whe the populatio stadard deviatio is kow ad populatio distributio is ormal or sample size is large T-TEST / T-STATISTIC: used to test hypotheses about

More information

Methods of Evaluating Estimators

Methods of Evaluating Estimators Math 541: Statistical Theory II Istructor: Sogfeg Zheg Methods of Evaluatig Estimators Let X 1, X 2,, X be i.i.d. radom variables, i.e., a radom sample from f(x θ), where θ is ukow. A estimator of θ is

More information

Discrete Random Variables and Probability Distributions. Random Variables. Chapter 3 3.1

Discrete Random Variables and Probability Distributions. Random Variables. Chapter 3 3.1 UCLA STAT A Applied Probability & Statistics for Egieers Istructor: Ivo Diov, Asst. Prof. I Statistics ad Neurology Teachig Assistat: Neda Farziia, UCLA Statistics Uiversity of Califoria, Los Ageles, Sprig

More information

Module 4: Mathematical Induction

Module 4: Mathematical Induction Module 4: Mathematical Iductio Theme 1: Priciple of Mathematical Iductio Mathematical iductio is used to prove statemets about atural umbers. As studets may remember, we ca write such a statemet as a predicate

More information

Now here is the important step
