1.3 Measuring Center & Spread, the Five-Number Summary & Boxplots
Describing Quantitative Data with Numbers
1.3 I can
- Calculate and interpret measures of center (mean, median) in context.
- Calculate and interpret measures of spread (IQR, range, standard deviation) in context.
- Identify outliers using the 1.5 × IQR rule.
- Make a boxplot.
- Select appropriate measures of center and spread.
- Use appropriate graphs and numerical summaries to compare distributions of quantitative variables.
Example: Amount of fat in McDonald's beef sandwiches
9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42
Calculations
- Measures of Center: mean, median
- Measures of Spread: range, IQR, standard deviation & variance (later)
- Outliers
Measuring Center
- Mean: average value, balance point
- Median: typical value, midpoint
- Mean and Median Applet: www.whfreeman.com/tps4e
- When are the mean and median approximately equal?
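
As a quick check on the two measures of center, here is a minimal Python sketch that computes the mean and median of the McDonald's fat data from the example slide. It uses the standard library's `statistics` module; the variable names are illustrative and not from the slides.

```python
from statistics import mean, median

# Fat content of McDonald's beef sandwiches (data from the example slide)
fat = [9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42]

fat_mean = mean(fat)      # balance point: sum of the values / number of values
fat_median = median(fat)  # midpoint: average of the two middle values (n = 14 is even)

print(f"mean   = {fat_mean:.2f}")  # mean   = 26.79
print(f"median = {fat_median}")    # median = 26.0
```

The two values are close here, which is worth keeping in mind when looking at the shape of this distribution.
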
Skewed Distributions
- In a right-skewed distribution, where is the mean compared to the median?
- In a left-skewed distribution, where is the mean compared to the median?
Resistance
- Which is more resistant to extreme values, the mean or the median?
Check Your Understanding
- Pg. 55
Center isn't enough
- The correct mean concentration of hair dye isn't good enough if some boxes are extremely weak and others are extremely strong.
- The correct mean weight of a football isn't good enough if some are extremely light and some are extremely heavy.
Measuring Spread
- Can two dotplots with the same center and the same shape have different spreads? Explain.
Range and Interquartile Range
- Range = High - Low (maximum minus minimum)
- Interquartile Range: IQR = Q3 - Q1
- See pg. 56
- Which is a more resistant measure of spread, the range or the IQR? Explain.
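
The two definitions above can be sketched in Python for the McDonald's fat data. The `quartiles` helper is a hypothetical name, and it assumes the common hand method: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving the overall median out of both halves when n is odd. Calculators and software may use slightly different quartile rules.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two observations. When n is odd, the overall median
    is left out of both halves (an assumed convention; software may differ).
    """
    xs = sorted(data)
    half = len(xs) // 2
    lower, upper = xs[:half], xs[len(xs) - half:]
    return median(lower), median(upper)

fat = [9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42]

data_range = max(fat) - min(fat)  # 42 - 9
q1, q3 = quartiles(fat)           # medians of the two halves of 7 values
iqr = q3 - q1

print(f"range = {data_range}")             # range = 33
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")  # Q1 = 19, Q3 = 39, IQR = 20
```
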
Outliers
- The 1.5 × IQR rule
- If an observation falls more than 1.5 × IQR above the third quartile or below the first quartile, it can be considered an outlier.
- Outliers may reveal interesting information or they may reveal errors.
- In the NASA Nimbus-7 data story, the outliers were not the error; they were the message!
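
A minimal Python sketch of the 1.5 × IQR rule, applied to the McDonald's fat data. The helper names `quartiles` and `find_outliers` are hypothetical, the quartiles are assumed to be the medians of the lower and upper halves of the sorted data, and the value 75 in the second call is made up purely to show the rule flagging something.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves (n >= 2)."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[:half]), median(xs[len(xs) - half:])

def find_outliers(data):
    """Return the observations flagged by the 1.5 x IQR rule."""
    q1, q3 = quartiles(data)
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step
    # An outlier lies strictly beyond a fence
    return [x for x in data if x < low_fence or x > high_fence]

fat = [9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42]

print(find_outliers(fat))         # [] because the fences are -11 and 69
print(find_outliers(fat + [75]))  # [75], using a made-up extra value
```
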
Mean or Median? Range or IQR?
- The mean is sensitive to a few extreme values while the median is not.
- A statistician could have his head in an oven and his feet in ice, and he will say that on average he feels fine.
- The range is sensitive to extreme values while the IQR is not.
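
One way to see this sensitivity is to introduce a single extreme value and recompute all four summaries. The sketch below assumes a hypothetical recording error in which the largest fat value, 42, is entered as 420; the `iqr` helper is an illustrative name and uses the median-of-halves method for quartiles.

```python
from statistics import mean, median

def iqr(data):
    """IQR = Q3 - Q1, with quartiles taken as medians of the two halves."""
    xs = sorted(data)
    half = len(xs) // 2
    return median(xs[len(xs) - half:]) - median(xs[:half])

fat = [9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42]
fat_error = fat[:-1] + [420]  # hypothetical typo: 42 recorded as 420

for label, data in (("original", fat), ("with error", fat_error)):
    print(
        f"{label:<11} mean={mean(data):6.2f}  median={median(data)}  "
        f"range={max(data) - min(data)}  IQR={iqr(data)}"
    )
```

With the error, the mean and range jump while the median and IQR stay where they were.
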
Boxplots
- Make a boxplot, without and with a calculator, for the McDonald's beef sandwiches data on amount of fat.
- Assess the center, spread, symmetry, and skewness from the boxplot.
- Boxplots are a way to visualize the 5-number summary.
- Unlike some other graphs, boxplots do not show the mode.
5-number Summary & Boxplot
- Minimum
- Q1
- Median
- Q3
- Maximum
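
The five values a boxplot is built from can be computed with a short Python sketch. `five_number_summary` is a hypothetical helper name, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data (other quartile conventions exist).

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])             # median of the lower half
    q3 = median(xs[len(xs) - half:])   # median of the upper half
    return xs[0], q1, median(xs), q3, xs[-1]

fat = [9, 12, 19, 19, 23, 24, 26, 26, 28, 29, 39, 39, 40, 42]
summary = five_number_summary(fat)

for name, value in zip(("Minimum", "Q1", "Median", "Q3", "Maximum"), summary):
    print(f"{name:<8}{value}")
```
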
Comparing Distributions
- Boxplots show less detail than histograms or stemplots, so they are best used for side-by-side comparison of more than one distribution.
Comparing Quiz Grades
- Class A: 10, 5, 6, 5, 6, 7, 8, 5, 6, 2
- Class B: 2, 10, 10, 4, 2, 5, 1, 10, 9, 7
Comparing Quiz Grades
            Class A    Class B
Mean
Median
Range
IQR
Outliers?
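
After working the comparison by hand, the results can be checked with a sketch like the following. The `summarize` helper is a hypothetical name; it assumes quartiles are the medians of the lower and upper halves of the sorted scores and flags outliers with the 1.5 × IQR rule.

```python
from statistics import mean, median

def summarize(scores):
    """Mean, median, range, IQR, and 1.5 x IQR outliers for one class."""
    xs = sorted(scores)
    half = len(xs) // 2
    q1, q3 = median(xs[:half]), median(xs[len(xs) - half:])
    iqr = q3 - q1
    outliers = [x for x in xs if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr]
    return {
        "mean": mean(xs),
        "median": median(xs),
        "range": xs[-1] - xs[0],
        "IQR": iqr,
        "outliers": outliers,
    }

class_a = [10, 5, 6, 5, 6, 7, 8, 5, 6, 2]
class_b = [2, 10, 10, 4, 2, 5, 1, 10, 9, 7]

summary_a, summary_b = summarize(class_a), summarize(class_b)
print(f"{'':<9}{'Class A':>8}{'Class B':>8}")
for name in summary_a:
    print(f"{name:<9}{summary_a[name]!s:>8}{summary_b[name]!s:>8}")
```

The two classes share the same mean and median, so any difference between them shows up in the measures of spread.
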
Comparing Quiz Grades: Side-by-Side Boxplots
- Make boxplots by hand and/or on the calculator.
- Which class did better?
One more measure
Exploring Data: Shape, Outliers, Center, Spread
- Graphically: histogram, pie chart, dotplot, stemplot, back-to-back stemplot, boxplot
- Numerically:
  - Measures of center: mean, mode, median
  - Measures of spread or variability: range, IQR, standard deviation & variance
Standard Deviation
- Standard deviation measures spread by looking at how far the observations are from the mean.
- Be able to interpret the standard deviation: it is roughly the average distance of the data values from the mean of the distribution.
Standard Deviation Formula
- The standard deviation is the square root of the average squared difference from the mean.
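
Written in symbols, for a sample of n observations x_1, ..., x_n with mean x-bar, the sample standard deviation is commonly given as follows. The notation s_x is an assumption here, and the "average" divides by n - 1 rather than n because the data are treated as a sample.

```latex
s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}
```
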
Variance
- Variance is the standard deviation squared.
- Standard deviation is the square root of the variance.
Find & interpret the standard deviation for Class A and for Class B
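
A step-by-step sketch of the calculation in Python, using the Class A and Class B quiz grades. `sample_sd` is a hypothetical helper that treats each class as a sample and therefore divides by n - 1; the standard library's `statistics.stdev` is used only as a cross-check.

```python
from math import sqrt
from statistics import mean, stdev

def sample_sd(data):
    """Sample standard deviation, step by step (divides by n - 1)."""
    x_bar = mean(data)
    squared_devs = [(x - x_bar) ** 2 for x in data]  # squared distances from the mean
    variance = sum(squared_devs) / (len(data) - 1)   # "average" squared distance
    return sqrt(variance)                            # back to the original units

class_a = [10, 5, 6, 5, 6, 7, 8, 5, 6, 2]
class_b = [2, 10, 10, 4, 2, 5, 1, 10, 9, 7]

sd_a, sd_b = sample_sd(class_a), sample_sd(class_b)
print(f"Class A: s = {sd_a:.2f}")  # Class A: s = 2.11
print(f"Class B: s = {sd_b:.2f}")  # Class B: s = 3.65

# Cross-check against the library routine
assert abs(sd_a - stdev(class_a)) < 1e-9
assert abs(sd_b - stdev(class_b)) < 1e-9
```

One way to phrase the interpretation: both classes have a mean of 6, but Class B's grades typically fall about 3.65 points from that mean, compared with about 2.11 points for Class A.
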
Standard deviation with a calculator
- Understanding what the standard deviation means is more important than being able to calculate it by hand.
- Use s if the data are from a sample, and use σ if the data consist of an entire population.
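
The same sample versus population distinction appears in software. As an illustration, Python's `statistics` module provides `stdev` (sample, divides by n - 1) and `pstdev` (population, divides by n). The sketch applies both to the Class A quiz grades; which one is appropriate depends on whether the ten grades are regarded as a sample or as the whole population.

```python
from statistics import pstdev, stdev

class_a = [10, 5, 6, 5, 6, 7, 8, 5, 6, 2]

s = stdev(class_a)       # sample standard deviation: divides by n - 1
sigma = pstdev(class_a)  # population standard deviation: divides by n

print(f"s     = {s:.3f}")      # s     = 2.108
print(f"sigma = {sigma:.3f}")  # sigma = 2.000
```

Dividing by n - 1 always gives a slightly larger value than dividing by n, and the gap shrinks as n grows.
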
Questions about standard deviation
- Standard deviation is a measure of spread about the mean as center, so:
  - When is the standard deviation 0?
  - What makes the standard deviation larger?
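
A small experiment can help with these questions. The three data sets below are made up for illustration; each has a mean of 5, and they differ only in how far the values sit from that mean.

```python
from statistics import stdev

# Three made-up data sets, each with mean 5
identical = [5, 5, 5, 5]   # no value differs from the mean
close = [4, 5, 5, 6]       # values near the mean
spread_out = [1, 5, 5, 9]  # values far from the mean

sd_identical = stdev(identical)
sd_close = stdev(close)
sd_spread_out = stdev(spread_out)

print(f"identical:  s = {sd_identical:.3f}")   # identical:  s = 0.000
print(f"close:      s = {sd_close:.3f}")       # close:      s = 0.816
print(f"spread out: s = {sd_spread_out:.3f}")  # spread out: s = 3.266
```
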
What to use and when
- Mean with standard deviation (not resistant)
- Median with range & IQR (the median and IQR are resistant; the range is not)
- Which is best used to describe symmetric distributions without outliers (such as the Normal distribution)? Skewed distributions?
* Numerical summaries do not fully describe the shape of a distribution. Always plot your data!
Assignment
- Read pg. 50-69
- Do pg. 70 (79, 81, 83, 89, 91, 95, 97, 103, 105, 107-110)