Question 2: What are the variance and standard deviation of a dataset?

The variance of the data uses all of the data to compute a measure of the spread in the data. The variance may be computed for a sample of data or a population of data. In either case, we must compute how much each data value differs from the mean and square that difference. Let's compute the variance for the mileage of Toyota sedans.

Vehicle                               Miles per Gallon x
Prius                                 50
Camry Hybrid LE 2.5 liter, automatic  41
Camry Hybrid XLE 2.5 liter, automatic 40
Yaris 1.5 liter, manual               33
Yaris 1.5 liter, automatic            32
Corolla 1.8 liter, manual             30
Corolla 1.8 liter, automatic          29
Camry 2.5 liter, automatic            28
Camry 3.5 liter, automatic            25
Avalon 3.5 liter, automatic           23
Start by computing the mean of this population,

$$\mu = \frac{50 + 41 + 40 + 33 + 32 + 30 + 29 + 28 + 25 + 23}{10} = 33.1$$

Next we subtract the mean from each data value and square the result.

Miles per Gallon x    x - μ      (x - μ)²
50                    16.9       285.61
41                    7.9        62.41
40                    6.9        47.61
33                    -0.1       0.01
32                    -1.1       1.21
30                    -3.1       9.61
29                    -4.1       16.81
28                    -5.1       26.01
25                    -8.1       65.61
23                    -10.1      102.01
                      Sum = 0    Sum = 616.9

The sum at the bottom is found by adding the values in the column. The second column measures how much each data value deviates from the mean. Values higher than the mean give a positive deviation and values lower than the mean give a negative deviation. Since the mean is in the center of the data, the sum of the deviations is zero.

Whether a data value falls above or below the mean should not affect the spread of the data. For this reason, each deviation is squared. The farther the data value is from the mean, the larger the squared deviation is. Values like 23 or 50 have a high squared deviation since they are farther from the mean of 33.1.

Population Variance

The population variance $\sigma^2$ (sigma squared) of data $x$ is the mean of the squared deviations,

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

where $\mu$ is the population mean and $N$ is the population size.

The variance measures the average squared distance of the data values from the mean. Based on the table above,

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{616.9}{10} = 61.69$$

The sum in the numerator is the sum of the entries in the third column of the table. On average, the squared distance of a data value from the mean is 61.69 mpg².

Working in terms of the squared distance is inconvenient. To remedy this, take the square root of the variance. This measure is called the population standard deviation and measures the spread of the data in terms of the units on the data.

Population Standard Deviation

The population standard deviation $\sigma$ is the square root of the population variance,

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

where $\mu$ is the population mean and $N$ is the population size.

For the Toyota fleet, the standard deviation is

$$\sigma = \sqrt{61.69} \approx 7.85 \text{ miles per gallon}$$

The larger the variance or standard deviation is, the more spread out the data values are about the mean.

If the data is from a sample instead of a population, the definitions for variance and standard deviation are slightly different.

Sample Variance

The sample variance $s^2$ of data $x$ is the sum of the squared deviations divided by one less than the sample size,

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

where $\bar{x}$ is the sample mean and $n$ is the sample size.
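The population and sample definitions differ only in the divisor, which a short Python sketch can make concrete. The helper names below (`sum_squared_deviations`, `population_variance`, `sample_variance`) are illustrative and not part of the text. The ten Toyota mileage values are treated once as a population and once as a sample purely to show the two divisors side by side, and the results are cross-checked against Python's standard `statistics` module.

```python
import math
import statistics

# Mileage (mpg) of the ten Toyota sedans from the table.
mpg = [50, 41, 40, 33, 32, 30, 29, 28, 25, 23]


def sum_squared_deviations(data):
    """Return the sum of (x - mean)^2 over all data values."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)


def population_variance(data):
    """Divide the sum of squared deviations by the population size N."""
    return sum_squared_deviations(data) / len(data)


def sample_variance(data):
    """Divide the sum of squared deviations by n - 1."""
    return sum_squared_deviations(data) / (len(data) - 1)


pop_var = population_variance(mpg)
pop_std = math.sqrt(pop_var)
samp_var = sample_variance(mpg)

# Cross-check the hand-rolled helpers against the standard library.
assert math.isclose(pop_var, statistics.pvariance(mpg))
assert math.isclose(samp_var, statistics.variance(mpg))

print(f"population variance = {pop_var:.2f}")
print(f"population standard deviation = {pop_std:.2f}")
print(f"sample variance (same data, n - 1 divisor) = {samp_var:.2f}")
```

Because the numerator is the same in both formulas, the sample variance of any dataset with more than one value is always slightly larger than its population variance.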
Sample Standard Deviation

The sample standard deviation $s$ is the square root of the sample variance,

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where $\bar{x}$ is the sample mean and $n$ is the sample size.

The main difference between the sample and population standard deviation is the denominator. In the population expressions, the sum of the squared deviations from the mean is divided by the population size. In the sample expressions, the sum of the squared deviations from the mean is divided by one less than the sample size $n$. Although the reason for this difference is beyond the scope of this text, using $n - 1$ instead of $n$ ensures that the variance is well behaved. Specifically, if we were to average all sample variances from a population, the resulting average would equal the population variance.

Despite this difference, the steps for calculating variance and standard deviation for samples or populations are very similar.

Steps for Computing the Variance and Standard Deviation

1. Identify the data values x.
2. Find the mean of the data values.
3. Compute the difference between the data and the mean for each data value.
4. Square each difference between the data and the mean.
5. Sum the squares of the differences.
6. If the data is a population, divide the sum by the number of data values to find the variance. If the data is a sample, divide the sum by one less than the sample size, $n - 1$.
7. To find the standard deviation, take the square root of the variance.

Let's apply these steps to compute the spread in several datasets.

Example 1 Compute the Sample Variance and Sample Standard Deviation

The table below shows the dividend yields of six companies in the New York Stock Exchange energy sector.
Company               Dividend Yield July 2012 (%)
BP                    4.80
Chevron               3.41
Exxon Mobil           2.66
PetroChina            3.50
Petroleo Brasileiro   1.20
Royal Dutch Shell     4.30

a. Find the sample mean.

Solution The data in this example are the dividend yields for each company. The sample mean is

$$\bar{x} = \frac{\sum x}{n} = \frac{4.80 + 3.41 + 2.66 + 3.50 + 1.20 + 4.30}{6} \approx 3.312$$

The mean has been rounded to three decimal places.

b. Find the sample variance.

Solution Use a table to compute the differences from the mean and the squared differences from the mean.
x       x - x̄     (x - x̄)²
4.80    1.488     2.214
3.41    0.098     0.010
2.66    -0.652    0.425
3.50    0.188     0.035
1.20    -2.112    4.461
4.30    0.988     0.976
                  Sum = 8.121

Divide the sum at the bottom of the third column by 5 to give the sample variance,

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} = \frac{8.121}{6 - 1} \approx 1.624$$

c. Find the sample standard deviation.

Solution The sample standard deviation is the square root of the sample variance,

$$s = \sqrt{s^2} = \sqrt{1.624} \approx 1.27$$
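The seven steps listed earlier translate almost line for line into code. The following is a minimal Python sketch applied to the dividend yields of Example 1. The function name `variance_and_std` and its `is_sample` flag are hypothetical names chosen for illustration.

```python
import math


def variance_and_std(data, is_sample):
    """Follow the seven steps; return (mean, variance, standard deviation)."""
    n = len(data)                              # Step 1: identify the data values
    mean = sum(data) / n                       # Step 2: find the mean
    deviations = [x - mean for x in data]      # Step 3: difference from the mean
    squared = [d ** 2 for d in deviations]     # Step 4: square each difference
    total = sum(squared)                       # Step 5: sum the squares
    divisor = n - 1 if is_sample else n        # Step 6: n - 1 for a sample, n for a population
    variance = total / divisor
    return mean, variance, math.sqrt(variance)  # Step 7: square root of the variance


# Dividend yields (%) of the six energy companies in Example 1.
yields = [4.80, 3.41, 2.66, 3.50, 1.20, 4.30]

mean, s2, s = variance_and_std(yields, is_sample=True)
print(f"mean = {mean:.3f}, s^2 = {s2:.3f}, s = {s:.2f}")
# prints: mean = 3.312, s^2 = 1.624, s = 1.27
```

The code carries the mean at full precision instead of rounding it to three decimal places, so its intermediate values differ slightly from the hand calculation, but the results agree to the number of digits reported.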
In this example, the original data was written to two decimal places. To ensure that we can write the standard deviation to the same number of decimal places, we write numbers in the intermediate steps to one extra decimal place.

Example 2 Compute the Population Variance and Population Standard Deviation

Stock quotes also give the percentage change in a stock from the previous day's closing price. For instance, the quote above indicates that Ford closed at $9.33 per share. This was down from $9.35 per share on the previous day's close. This is a percentage change of

$$\text{Percent Change} = \frac{9.33 - 9.35}{9.35} \approx -0.21\%$$

Percentage changes are often used to determine the volatility of a company's stock. By computing some statistics on the percentage change, we can get an idea whether a change in the price is normal or not.

Consider the percentage changes in Ford's price per share over ten trading days in June.

Date      6/1    6/4    6/5   6/6   6/7    6/8   6/11   6/12  6/13   6/14
% Change  -4.17  -0.79  1.49  3.73  -0.19  1.04  -1.97  0.48  -1.90  1.07
a. Find the population mean.

Solution For the purpose of this example, we'll consider the percentage changes over the ten-day period to be a population. The mean is

$$\mu = \frac{-4.17 - 0.79 + 1.49 + 3.73 - 0.19 + 1.04 - 1.97 + 0.48 - 1.90 + 1.07}{10} = -0.121$$

b. Find the population variance.

Solution Calculate the difference from the mean and the squared difference from the mean.

x         -4.17   -0.79   1.49   3.73    -0.19   1.04   -1.97   0.48   -1.90   1.07
x - μ     -4.049  -0.669  1.611  3.851   -0.069  1.161  -1.849  0.601  -1.779  1.191
(x - μ)²  16.394  0.448   2.595  14.830  0.005   1.348  3.419   0.361  3.165   1.418

The sum of the bottom row is 43.983. The population variance is

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{43.983}{10} = 4.3983$$

c. Find the population standard deviation.

Solution The standard deviation is the square root of the variance,
$$\sigma = \sqrt{\sigma^2} = \sqrt{4.3983} \approx 2.10$$

We'll see in later chapters that stock traders assume that 68% of stock changes lie within one standard deviation of the mean. A change in price of greater than 2.10% indicates above normal strength or weakness, depending on whether the price rises or falls.
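As a rough check on Example 2, the Python sketch below recomputes the population mean, variance, and standard deviation for the ten Ford percentage changes. It then counts how many of those changes fall within one standard deviation of the mean. Measuring the band around the mean (rather than around zero) is an assumption made here for illustration, and the variable names are not from the text.

```python
import math

# Daily percentage changes in Ford's share price over the ten trading days.
changes = [-4.17, -0.79, 1.49, 3.73, -0.19, 1.04, -1.97, 0.48, -1.90, 1.07]

N = len(changes)
mu = sum(changes) / N                                # population mean
sigma2 = sum((x - mu) ** 2 for x in changes) / N     # population variance
sigma = math.sqrt(sigma2)                            # population standard deviation

# Band of one standard deviation on either side of the mean.
lower, upper = mu - sigma, mu + sigma
within = [x for x in changes if lower <= x <= upper]
outside = [x for x in changes if not (lower <= x <= upper)]

print(f"mean = {mu:.3f}, variance = {sigma2:.4f}, std dev = {sigma:.2f}")
print(f"{len(within)} of {N} changes lie within one standard deviation of the mean")
print(f"changes outside the band: {outside}")
```

With only ten trading days, the share of changes inside the band is a loose indication at best and should not be read as a test of the 68% assumption.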