1/8/2014
AGSC 30 Statistical Methods
Numerical descriptive measures

Data representation
1. Measures of central tendency, e.g., mean, mode, median, midrange
2. Measures of variation, e.g., range, variance, standard deviation
3. Measures of distribution shape, e.g., normal, skewed, uniform, random
4. Measures of position, e.g., percentiles, quartiles, standard scores

Data organization
Height of 20 trees: 50, 45, 32, 48, 56, 38, 42, 48, 55, 36, 41, 51, 30, 59, 53, 47, 57, 51, 46, 44
[Figure: histogram of the tree heights; x-axis: Height, 30 to 60 in steps of 5; y-axis: counts, 0 to 7]
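The histogram of the tree heights can be rebuilt by counting how many values fall in each class. The sketch below is only an illustration: it assumes classes of width 5 starting at 30 (read off the axis ticks, so the exact class limits are an assumption), and the entries 32 and 42 are an assumed reading of the source data. The helper name `frequency_table` is hypothetical.

```python
# Tree heights from the "Data organization" slide.
# ASSUMPTION: the entries 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]


def frequency_table(values, start, stop, width):
    """Count the values falling in each half-open class [lower, lower + width)."""
    table = []
    for lower in range(start, stop, width):
        upper = lower + width
        count = sum(1 for v in values if lower <= v < upper)
        table.append((lower, upper, count))
    return table


# ASSUMPTION: classes 30-35, 35-40, ..., 55-60, taken from the axis ticks.
table = frequency_table(heights, 30, 60, 5)
for lower, upper, count in table:
    # One asterisk per tree gives a rough text histogram.
    print(f"{lower}-{upper}: {'*' * count} ({count})")
```

A value that sits exactly on a class limit (for example 45) is counted in the class that starts there; a different convention would shift some counts.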
Measures of central tendency

1. Mean or arithmetic average
Definition: sum of the values divided by the total number of observations
Sample: $\bar{x} = \frac{\sum x}{n}$
Population: $\mu = \frac{\sum x}{N}$

2. Median
Definition: the midpoint / middle value in a group of data
The point that separates the data into two sets with the same number of observations
Steps:
  arrange the data in order
  find the midpoint (with an even number of observations, average the two middle values)

3. Mode
Definition: the most frequently occurring value / observation
Notes:
  not always unique
  a data set can also be bimodal or multimodal

4. Midrange
Definition: sum of the lowest and highest values divided by 2
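The four measures of central tendency can be computed for the tree heights with Python's standard `statistics` module. This is a minimal sketch, not part of the original slides; the entries 32 and 42 in the list are an assumed reading of the source data.

```python
import statistics

# Tree heights from the "Data organization" slide.
# ASSUMPTION: the entries 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]

mean = statistics.mean(heights)          # sum of values / number of observations
median = statistics.median(heights)      # middle value of the sorted data
modes = statistics.multimode(heights)    # every most-frequent value (may be several)
midrange = (min(heights) + max(heights)) / 2

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
print("midrange:", midrange)
```

`multimode` is used instead of `mode` because the mode is not always unique; it returns all values that share the highest frequency.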
Measures of central tendency
Summary
[Table: columns Statistic and Value; first row: Mean]

Relationship among mean, median, and mode
Depending on the shape of the histogram / frequency distribution, the mean can lie on either side of the median and the mode:
  Mean = Median = Mode (symmetric distribution)
  Mean < Median < Mode (left-skewed distribution)
  Mode < Median < Mean (right-skewed distribution)

Measures of variation

1. Range
Definition: the difference between the largest and the smallest observation
$\text{Range} = x_{\max} - x_{\min}$
where
  $x_{\max}$ = largest observation
  $x_{\min}$ = smallest observation
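2. Variance
Definition: sum of the squared differences between each observation and the mean, divided by the number of observations (by $n - 1$ for a sample)
Population: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Working formulas for variance and standard deviation
Population: $\sigma^2 = \frac{\sum x^2 - (\sum x)^2 / N}{N}$
Sample: $s^2 = \frac{\sum x^2 - (\sum x)^2 / n}{n - 1}$

3. Standard deviation
Definition: the square root of the variance
A measure of the spread of the observations in the original units
Population: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$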
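As an illustration, the definition of the sample variance and the usual shortcut ("working") form can be compared numerically. The sketch below assumes the common textbook shortcut, the sum of squares minus the squared sum over n, all divided by n - 1; the layout on the original slide may differ. The function names are hypothetical, and the entries 32 and 42 in the tree data are an assumed reading of the source.

```python
import math

# Tree heights from the "Data organization" slide.
# ASSUMPTION: the entries 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]


def sample_variance_definition(xs):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


def sample_variance_working(xs):
    """Shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / (n - 1)


var_def = sample_variance_definition(heights)
var_work = sample_variance_working(heights)
std_dev = math.sqrt(var_def)  # standard deviation, in the original units

print("variance (definition):", var_def)
print("variance (working):   ", var_work)
print("standard deviation:   ", std_dev)
```

Both forms agree up to floating-point rounding. The shortcut avoids computing the mean first, which is convenient by hand, but it can lose precision on a computer when the values are large relative to their spread.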
Variance and standard deviation
Using the definition:
Using the working formulas:

Range rule of thumb
A rough estimate of the standard deviation is a quarter of the range:
$s \approx \frac{\text{range}}{4}$
Example using tree data: $s \approx$ ...

4. Coefficient of variation
Ratio between the standard deviation and the mean, expressed as a percentage
Sample: $CV = \frac{s}{\bar{x}} \cdot 100$
Population: $CV = \frac{\sigma}{\mu} \cdot 100$
Example using tree data: $CV = \frac{...}{...} \cdot 100 = ...$ [%]
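The range rule of thumb and the coefficient of variation can be tried on the tree heights as follows. This is a sketch under stated assumptions: the entries 32 and 42 are an assumed reading of the source data, and the sample (n - 1) standard deviation is used for the comparison.

```python
import statistics

# Tree heights from the "Data organization" slide.
# ASSUMPTION: the entries 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]

data_range = max(heights) - min(heights)
s_rule_of_thumb = data_range / 4        # rough estimate: a quarter of the range

s = statistics.stdev(heights)           # sample standard deviation (n - 1)
x_bar = statistics.mean(heights)
cv_percent = s / x_bar * 100            # coefficient of variation, in percent

print("range:", data_range)
print("s (range rule of thumb):", s_rule_of_thumb)
print("s (sample formula):", round(s, 2))
print("CV [%]:", round(cv_percent, 2))
```

The rule of thumb is only a rough check: it should land in the same ballpark as the computed standard deviation, not match it.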
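Measures of central tendency
Mean or arithmetic average for a frequency distribution
Definition: sum of the values divided by the total number of observations
Sample data: $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
Population data: $\mu = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$
where
  $x_i$ = value of the $i$-th class midpoint
  $f_i$ = frequency of the $i$-th class
  $k$ = number of classes

Variance and standard deviation for a frequency distribution
Sample data: $s^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{\sum_{i=1}^{k} f_i - 1}$
Population data: $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{\sum_{i=1}^{k} f_i}$
where
  $x_i$ = value of the $i$-th class midpoint
  $f_i$ = frequency of the $i$-th class
  $k$ = number of classes

Example: Daily commuting times, in minutes
Calculate the mean, variance, standard deviation, and CV.

  Daily commuting time    Number of employees
  Less than 10 min        4
  10-20 min               9
  20-30 min               6
  30-40 min               4
  40-50 min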
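The grouped-data formulas can be sketched in code as below. Two things are assumptions rather than facts from the slides: the frequency of the last class (40-50 min) is not legible in the source and is set to 2 purely for illustration, and the first class ("Less than 10 min") is treated as 0-10 with midpoint 5. The function name `grouped_stats` is hypothetical.

```python
import math

# Class midpoints: every individual in a class is assumed to sit at its mid-value.
midpoints = [5, 15, 25, 35, 45]
# ASSUMPTION: the last frequency (2) is illustrative; it is missing in the source.
frequencies = [4, 9, 6, 4, 2]


def grouped_stats(midpoints, frequencies, sample=False):
    """Mean, variance, standard deviation and CV [%] of a frequency distribution."""
    n = sum(frequencies)
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    squared_dev = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints))
    # Divide by n for a population, by n - 1 for a sample.
    variance = squared_dev / (n - 1 if sample else n)
    std_dev = math.sqrt(variance)
    cv_percent = std_dev / mean * 100
    return mean, variance, std_dev, cv_percent


mean, variance, std_dev, cv_percent = grouped_stats(midpoints, frequencies)
print("mean:", mean)
print("variance:", variance)
print("standard deviation:", std_dev)
print("CV [%]:", cv_percent)
```

Passing `sample=True` switches to the n - 1 denominator used in the sample formula.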
Remember: within a class, all individuals are assumed to have the mid-value of that class
Mid-value of the class = class mean

  Commuting time    # employees (f)    Class mean (x)    f x
  < 10 min          4                  5                 20
  10-20 min         9
  20-30 min         6
  30-40 min         4
  40-50 min
  Total

Mean commuting time: $\mu =$
Variance: $\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{\sum_{i=1}^{k} f_i} = \frac{4(5 - ...)^2 + 9(15 - ...)^2 + ...}{4 + 9 + 6 + 4 + ...}$
Standard deviation: $\sigma =$
Coefficient of variation: $CV = \frac{\sigma}{\mu} \cdot 100 =$

Use of standard deviation
Connects the mean with the standard deviation
Chebyshev's theorem: for any $k > 1$, at least $1 - 1/k^2$ of the data lie within $k$ standard deviations of the mean
Example: if $k = 2$, then $1 - 1/k^2 = 1 - 1/4 = 0.75$, or 75%
This means that at least 75% of the data values are within two standard deviations of the mean
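Chebyshev's bound can be checked against actual data. The sketch below computes the guaranteed minimum fraction for a given k and compares it with the observed fraction for the tree heights; the function names are hypothetical, and the entries 32 and 42 in the data are an assumed reading of the source.

```python
import statistics

# Tree heights from the "Data organization" slide.
# ASSUMPTION: the entries 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]


def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k ** 2


def fraction_within(values, k):
    """Observed fraction of values within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # standard deviation of the data set itself
    inside = sum(1 for v in values if abs(v - mean) <= k * sd)
    return inside / len(values)


bound = chebyshev_lower_bound(2)
observed = fraction_within(heights, 2)
print("guaranteed at least:", bound)
print("observed:", observed)
```

The observed fraction is usually well above the bound, because the theorem holds for any distribution and is therefore conservative.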
Measures of distribution shape
Skewness: a measure of the asymmetry of the frequency distribution
$\text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / (n - 1)}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) \right]^{3/2}}$
Kurtosis: a measure of the "peakedness" of the frequency distribution
$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4 / (n - 1)}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) \right]^{2}} - 3$

Measures of position
Locate the relative position of an observation within the data set

PERCENTILES
  divide the data set into 100 groups with equal numbers of observations
  indicate the position of an individual in a group
  areas of application: education, health-related industry, life sciences
$\text{percentile} = \frac{(\text{\# observations less than } x) + 0.5}{\text{total \# observations}} \cdot 100$ [%]

Percentile charts
[Figure: percentile charts]
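The shape and position measures above can be sketched as small functions. The formulas follow one reading of the slide: moments divided by n - 1, and a "- 3" term in the kurtosis so that a normal distribution scores close to zero. Both points are assumptions, as are the function names and the small illustrative data sets. The entries 32 and 42 in the tree data are an assumed reading of the source.

```python
def moment_about_mean(xs, power):
    """Sum of (x - mean)^power divided by n - 1, as in the formulas above."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** power for x in xs) / (n - 1)


def skewness(xs):
    """Third moment scaled by the variance to the power 3/2."""
    return moment_about_mean(xs, 3) / moment_about_mean(xs, 2) ** 1.5


def kurtosis(xs):
    """Fourth moment scaled by the squared variance, minus 3 (ASSUMPTION)."""
    return moment_about_mean(xs, 4) / moment_about_mean(xs, 2) ** 2 - 3


def percentile_rank(xs, x):
    """Percentile of x: ((# observations less than x) + 0.5) / total * 100."""
    below = sum(1 for v in xs if v < x)
    return (below + 0.5) / len(xs) * 100


# Hypothetical illustrative data sets.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 10]

# Tree heights; ASSUMPTION: 32 and 42 are an assumed reading of the source data.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]

print("skewness, symmetric data:", skewness(symmetric))
print("skewness, right-skewed data:", skewness(right_skewed))
print("kurtosis, symmetric data:", kurtosis(symmetric))
print("percentile of a 48-unit tree:", percentile_rank(heights, 48))
```

A positive skewness indicates a longer right tail (Mode < Median < Mean), a negative one a longer left tail.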
Standard scores
Compare the relative position of observations within their defining data set
Standard score or z-score:
$z = \frac{\text{observation's value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$
Allows comparison of different data sets or different types of data

Standard score
Example: a student received 92% in Statistics and 75% in English
Was the student's overall performance bad?
Additional info:
  Mean grade for Statistics was 85 and for English was 70
  Variance for Statistics was 36 and for English was 9
Compute the z-scores:
$z_{\text{Statistics}} = \frac{x - \bar{x}}{s} = \frac{... - ...}{...} = ...$
$z_{\text{English}} = \frac{x - \bar{x}}{s} = \frac{... - ...}{...} = ...$
Conclusion:

Population vs. statistics
Various numerical measures can be computed for the population as well as for a sample:
  mean, median, variance, coefficient of variation
When the measure is computed for the entire population, it is called a population parameter, or simply a PARAMETER.
When the measure is computed for a portion of the population (namely a sample), it is called a sample statistic, or simply a STATISTIC.
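The standard score example from the Statistics and English grades can be worked through in a few lines. This is a sketch: the Statistics grade is taken as 92, which is an assumed reading of the source, and the standard deviations are obtained as the square roots of the stated variances. The function name `z_score` is hypothetical.

```python
import math


def z_score(x, mean, variance):
    """Standard score: distance from the mean in units of standard deviation."""
    return (x - mean) / math.sqrt(variance)


# ASSUMPTION: the Statistics grade (92) is an assumed reading of the source.
z_statistics = z_score(92, mean=85, variance=36)
z_english = z_score(75, mean=70, variance=9)

print("z (Statistics):", round(z_statistics, 2))
print("z (English):", round(z_english, 2))

# The larger z-score marks the stronger result relative to the class.
better = "English" if z_english > z_statistics else "Statistics"
print("Relatively stronger result:", better)
```

Both z-scores are positive, so both grades are above their class means; comparing the two z-scores shows which grade stands out more within its own class, even though the raw percentages suggest the opposite order.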