Chapter 2. Descriptive Statistics: Organizing, Displaying and Summarizing Data


Objectives

Students should be able to:
- Organize data
  - Tabulate data into frequency and relative frequency tables
- Display data graphically
  - Qualitative data: pie charts, bar charts, Pareto charts
  - Quantitative data: histograms, stemplots, dot plots, and boxplots
  - Describe the shape of the plot
- Summarize data numerically (quantitative data only)
  - Measures of center: mean, median, midrange, and mode
  - Measures of position: quartiles and percentiles
  - Measures of spread/variation: range, variance, standard deviation, and interquartile range
- Use a TI graphing calculator to obtain statistics

Tabulate Qualitative Data

Qualitative data values can be organized by a frequency distribution. A frequency distribution lists:
- Each of the categories
- The frequency (count) for each category

Frequency Table

A simple data set is: blue, blue, green, red, red, blue, red, blue

A frequency table for this qualitative data is:

Color   Frequency
Blue    4
Green   1
Red     3

The most commonly occurring color is blue.

What Is a Relative Frequency?

The relative frequencies are the proportions (or percents) of the observations out of the total. A relative frequency distribution lists:
- Each of the categories
- The relative frequency for each category

Relative frequency = Frequency / Total
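The tabulation above can be sketched in a few lines of Python. This is only an illustration using the slide's colour data; the function name `frequency_table` is an assumption of this sketch, not something from the course materials.

```python
from collections import Counter

# The simple colour data set from the slide.
colors = ["blue", "blue", "green", "red", "red", "blue", "red", "blue"]

def frequency_table(values):
    """Return {category: (frequency, relative frequency)} (hypothetical helper)."""
    counts = Counter(values)
    total = sum(counts.values())
    # Relative frequency = frequency / total
    return {category: (n, n / total) for category, n in counts.items()}

table = frequency_table(colors)
for category, (freq, rel) in sorted(table.items()):
    print(f"{category:<6} {freq:>2}  {rel:.3f}")
```

The relative frequencies always add up to 1 (or 100% when expressed as percents), which is a quick check on any hand-built table.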

Relative Frequency Table

A relative frequency table for this qualitative data is:

Color   Relative Frequency
Blue    0.500 (= 4/8)
Green   0.125 (= 1/8)
Red     0.375 (= 3/8)

A relative frequency table can also be constructed with percents (50%, 12.5%, and 37.5% for the table above).

Tabulate Quantitative Data

Suppose we recorded the number of customers served each day for a total of 40 days.
[Data table: number of customers served on each of the 40 days]
We would like to compute the frequencies and the relative frequencies.

Frequency/Relative Frequency Table
[Table: the resulting frequencies and relative frequencies]

Display Data Graphically
- Qualitative data: bar charts, Pareto charts, pie charts
- Quantitative data: histograms, stemplots, dot plots

Bar Charts for Qualitative Data

Bar charts for our simple data (generated with the Chart command in Excel):
[Figure: frequency bar chart; x-axis Color (Blue, Green, Red), y-axis Frequency]
[Figure: relative frequency bar chart; x-axis Color (Blue, Green, Red), y-axis Relative Frequency]

Note: Always label the axes, provide category and numeric scales, and give the graph a title when you present it.

Pareto Charts

A Pareto chart is a particular type of bar graph. It differs from a bar chart only in that the categories are arranged in order of frequency:
- The category with the highest frequency is placed first (on the extreme left)
- The category with the second highest frequency is placed second
- And so on

Pareto charts are often used when there are many categories but only the top few are of interest.

Here is a Pareto chart for the simple data set:
[Figure: Pareto chart; x-axis Color in the order Blue, Red, Green; y-axis Relative Frequency]

Side-by-Side Bar Charts

Use side-by-side bar charts to compare several sets of qualitative data.
[Figure: side-by-side bar chart comparing educational attainment in 1990 versus 2003]

Pie Charts

Pie charts are used to display qualitative data. A pie chart shows the amount of data that belongs to each category as a proportional part of a circle.
[Figure: pie chart; Blue 50%, Red 38%, Green 13%]

By contrast, bar charts show the amount of data that belongs to each category as a proportionally sized rectangular area.
[Figure: another example of a pie chart]

Summary

Qualitative data can be organized in several ways:
- Tables are useful for listing the data, its frequencies, and its relative frequencies
- Charts such as bar charts, Pareto charts, and pie charts are useful visual methods for organizing data
- Side-by-side bar charts are useful for comparing multiple sets of qualitative data

Graphic Display for Quantitative Data: Histograms, Stemplots, Dot Plots

Histogram

A histogram is a bar graph that represents the frequency distribution of a quantitative variable. The term is used only for a bar graph of quantitative data.

A histogram is made up of the following components:
1. A title, which identifies the population of interest
2. A vertical scale, which identifies the frequencies or relative frequencies in the various classes
3. A horizontal scale, which identifies the variable x. Values or ranges of values may be labeled along the x-axis. Use whichever method of labeling the axis best presents the variable.

When you make a graph, label both axes clearly and give the graph a title.

Histogram for Discrete Quantitative Data
[Figure: example histograms for discrete data, one showing frequencies and one showing relative frequencies]

Note: The term histogram is used only for a bar graph that summarizes quantitative data. A bar chart for qualitative data cannot be called a histogram. Also, there are no gaps between the bars in a histogram.

Categorize/Group Continuous Quantitative Data

Continuous quantitative data cannot be put directly into frequency tables because they do not have any obvious categories.
- Categories are created using classes, that is, intervals (ranges) of numbers
- The continuous data is then put into the classes

For ages of adults, a possible set of classes uses 10-year intervals such as 30-39, ending with the class "60 and older".

For the class 30-39:
- 30 is the lower class limit
- 39 is the upper class limit

The class width is the difference between the lower class limits of two adjacent classes. For the class 30-39, the class width is 40 - 30 = 10.

The class midpoint is the average of the lower limits of the two adjacent classes. For the class 30-39, the midpoint is (30 + 40) / 2 = 35.

All the classes should have the same width, except for open-ended classes. The class "60 and older" is an open-ended class because it has no upper limit. Classes with no lower limit are also called open-ended classes.
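As a rough sketch of how ages could be assigned to classes like these, the Python below maps each age to a 10-year class and counts the results. The starting class (20-29), the function name `class_label`, and the sample ages are all assumptions made for illustration; only the 10-year width and the open-ended class "60 and older" come from the slides.

```python
from collections import Counter

def class_label(age, first_lower=20, width=10, open_from=60):
    """Return the class an age falls into (hypothetical helper).

    first_lower=20 is an assumed lowest class limit; width and open_from
    follow the slide's 10-year classes and its '60 and older' class.
    """
    if age < first_lower:
        raise ValueError("age is below the lowest class limit")
    if age >= open_from:
        return f"{open_from} and older"
    lower = first_lower + ((age - first_lower) // width) * width
    return f"{lower}-{lower + width - 1}"

# Made-up ages, for illustration only.
ages = [23, 34, 38, 41, 59, 60, 72, 30]
age_table = Counter(class_label(a) for a in ages)
print(age_table)
```

Because every age maps to exactly one label, the classes produced this way neither overlap nor leave gaps.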

Categorize/Group Continuous Quantitative Data

The classes and the number of values in each can be put into a frequency table.
[Table: Age class versus Number (frequency); the class 30-39 has frequency 47]
In this table, there are 47 subjects between 30 and 39 years old.

Good practices for constructing tables for continuous variables:
- The classes should not overlap
- The classes should not have any gaps between them
- The classes should have the same width (except for possible open-ended classes at the extreme low or extreme high ends)
- The class boundaries should be reasonable numbers
- The class width should be a reasonable number

Histogram for Continuous Quantitative Data

Just as for discrete data, a histogram can be created from the frequency table. Instead of individual data values, the categories are the classes (the intervals of data). You can label the bars with the lower class limits or with the class midpoints.

Stemplots

A stem-and-leaf plot (or simply stemplot) is a different way to represent data that is similar to a histogram. To draw a stem-and-leaf plot, each data value must be broken up into two components:
- The stem consists of all the digits except the rightmost one
- The leaf consists of the rightmost digit

For the number 73, for example, the stem would be 7 and the leaf would be 3.

Example of a Stemplot
[Figure: stem-and-leaf plot]
In the stem-and-leaf plot:
- The smallest value is 56
- The largest value is 80
- The second largest value is 78

Stemplot Construction

To draw a stem-and-leaf plot:
1. Write all the values in ascending order
2. Find the stems and write them vertically in ascending order
3. For each data value, write its leaf in the row next to its stem
4. The resulting leaves will also be in ascending order

The list of stems with their corresponding leaves is the stem-and-leaf plot.
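The construction steps above can be sketched in Python as follows. This is a minimal illustration that assumes non-negative whole-number data; the function name `stemplot` is hypothetical, and the sample values are made up apart from 56, 78 and 80, which the slide example mentions.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stem-and-leaf plot as strings (hypothetical helper)."""
    leaves = defaultdict(list)
    for v in sorted(values):          # step 1: ascending order
        stem, leaf = divmod(v, 10)    # stem = all digits but the last, leaf = last digit
        leaves[stem].append(leaf)
    rows = []
    # step 2: list every stem from smallest to largest, even if it has no leaves
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem} | " + " ".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

# Illustrative data only.
for row in stemplot([73, 56, 80, 61, 78, 64]):
    print(row)
```

Listing stems that have no leaves keeps the vertical scale even, so the plot's shape can be read the same way as a histogram turned on its side.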

Modifications to Stemplots

- Sometimes there are too many values with the same stem. In that case we split the stems (for example, leaves 0-4 in one row and leaves 5-9 in another).
- To compare two sets of data, we can draw two stem-and-leaf plots that share the same stems, with leaves going left for one set of data and right for the other. This is a side-by-side stemplot.

Dot Plots

A dot plot is a graph in which a dot is placed over a value each time that value is observed.
[Figure: example of a dot plot]

Shapes of Plots for Quantitative Data

The pattern of variability displayed by the data of a variable is called its distribution. The distribution shows how frequently each value of the variable occurs. A useful way to describe a quantitative variable is by the shape of its distribution. Some common distribution shapes are:
- Uniform
- Bell-shaped (or normal)
- Skewed right
- Skewed left
- Bimodal

Note: We are not concerned with the shapes of plots for qualitative data, because the categories of nominal data have no particular order. Changing the order of the categories changes the shape of the graph.

Uniform Distribution

A variable has a uniform distribution when:
- Each of the values tends to occur with the same frequency
- The histogram looks flat

Normal Distribution

A variable has a bell-shaped (normal) distribution when:
- Most of the values fall in the middle
- The frequencies tail off to the left and to the right
- It is symmetric

Right-skewed Distribution

A variable has a skewed right distribution when:
- The distribution is not symmetric
- The tail to the right is longer than the tail to the left
- The arrow from the middle to the long tail points right

In other words, the direction of skewness is determined by the side of the distribution with the longer tail. If a distribution has a longer tail on its right side, it is called a right-skewed distribution.

Left-skewed Distribution

A variable has a skewed left distribution when:
- The distribution is not symmetric
- The tail to the left is longer than the tail to the right
- The arrow from the middle to the long tail points left

Bimodal Distribution

A bimodal distribution has two peaks (humps, or highest points). This often implies that two populations were sampled.
[Figure: bimodal distribution of body mass]
The graph suggests that the data come from two populations, each with its own average. Here, one group has an average body mass of 47 grams and the other has an average body mass of 78 grams.

Summary

Quantitative data can be organized in several ways:
- The histogram is the most commonly used graphical tool
- Histograms based on data values are good for discrete data
- Histograms based on classes (intervals) are good for continuous data
- The shape of a distribution describes a variable, and histograms are useful for identifying the shape

Summarize Data Numerically: Measures of Center, Spread, and Position

Measures of Center: Mean, Median, Mode, Midrange

Measures of center are numerical values used to locate the middle of a set of data, or where the data is most clustered. The term mean (or average) is often associated with the center of a distribution.

Mean

The mean here is the arithmetic mean.

For a population, the population mean:
- Is computed using all the observations in the population
- Is denoted by the Greek letter $\mu$ (called mu)
- Is a parameter

For a sample, the sample mean:
- Is computed using only the observations in the sample
- Is denoted by $\bar{x}$ (called x bar)
- Is a statistic

Note: We usually cannot measure $\mu$ (because of the size of the population), but we would like to estimate its value with a sample mean $\bar{x}$.

Formulas for Means

The sample mean is the sum of all the values divided by the size of the sample, n:

$\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + \dots + x_n}{n}$

The population mean is the sum of all the values divided by the size of the population, N:

$\mu = \frac{\sum x_i}{N} = \frac{x_1 + x_2 + \dots + x_N}{N}$

Note: $\sum$ is called summation and means summing all the values. It is shorthand notation for adding a set of numbers.

Example

The following sample data represents the number of accidents in each of the last 8 years at a dangerous intersection. Find the mean number of accidents: 8, 9, 3, 5, 2, 6, 4, 5.

Solution: $\bar{x} = \frac{1}{8}(8 + 9 + 3 + 5 + 2 + 6 + 4 + 5) = \frac{42}{8} = 5.25$

If the 6 in the data above is replaced by a much larger value, the mean increases noticeably even though only one observation has changed.

Note: The mean can be greatly influenced by outliers (extremely large or small values).

Median

The median of a variable, denoted by M, is the center of the data.
- The median splits the data into halves
- When the data is sorted in order, the median is the middle value
- The calculation differs slightly depending on whether the number of data points is odd or even

How to Obtain a Median?

To calculate the median of a data set:
1. Arrange the data in order
2. Count the number of observations, n
3. If n is odd, there is a value that is exactly in the middle. That value is the median.
4. If n is even, there are two values on either side of the exact middle. Take their mean to be the median.

Example

With an odd number of observations (5 observations), sort the values in order and take the middle (3rd) value. In the slide's example the middle value of the sorted data is 6, so the median is 6.

Example

An example with an even number of observations (4 observations): compute the median of 6, 1, 2, 11.
- Sort them in order: 1, 2, 6, 11
- Take the mean of the two middle values: (2 + 6) / 2 = 4
- The median is 4

Quick Way to Locate the Median

1. Rank the data (suppose the sample size is n).
2. Find the position of the median (counting from either end) using the formula $i = \frac{n + 1}{2}$.

The median is then the i-th smallest value.

Example

Suppose a data set has n = 9 values and, once ranked, its 3rd through 8th values are 3, 3, 4, 8, 8, 9. The median position is $\frac{9 + 1}{2} = 5$, so the median is the 5th smallest (or 5th largest) value, which is 4.

Example

Now add one more large value to that data set, so that n = 10. The median position is $\frac{10 + 1}{2} = 5.5$, so the median lies halfway between the 5th and 6th smallest values. The 5th value is 4 and the 6th value is 8, so we average 4 and 8 and the median is 6.

Mode

The mode of a variable is the most frequently occurring value.

For instance, find the mode of the data 6, 1, 2, 6, 11, 7, 3. The data contain 6 distinct values: 1, 2, 3, 6, 7, and 11. The value 6 occurs twice and all the other values occur only once, so the mode is 6.

Note: If two or more values in a sample are tied for the highest frequency (number of occurrences), there is no mode.

Midrange

Another useful measure of the center of a distribution is the midrange, which is the number exactly midway between the lowest data value L and the highest data value H. It is found by averaging the low and the high values:

$\text{midrange} = \frac{L + H}{2}$
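The position rule for the median, together with the mode and the midrange, can be sketched in Python as below. The function names are assumptions of this sketch, and the data values are the slide examples as best they can be read from the transcript. Following the slide's convention, the `mode` helper reports no mode (`None`) when values tie for the highest frequency.

```python
from collections import Counter

def median(values):
    """Median located with the position formula i = (n + 1) / 2."""
    ranked = sorted(values)
    position = (len(ranked) + 1) / 2       # 1-based position in the ranked data
    if position.is_integer():
        return ranked[int(position) - 1]   # odd n: the exact middle value
    lower = ranked[int(position) - 1]      # even n: value just below the middle
    upper = ranked[int(position)]          # and the value just above it
    return (lower + upper) / 2

def mode(values):
    """Most frequent value, or None when there is a tie for the top frequency."""
    counts = Counter(values).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]

def midrange(values):
    """Average of the lowest and highest values."""
    return (min(values) + max(values)) / 2

sample = [6, 1, 2, 6, 11, 7, 3]
print(median([6, 1, 2, 11]), mode(sample), midrange(sample))
```

Note that Python's own `statistics.mode` behaves differently on ties, which is why a small helper is used here.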

Comparing Mean and Median

The mean and the median are often different. This difference gives us clues about the shape of the distribution:
- Is it symmetric?
- Is it skewed left?
- Is it skewed right?
- Are there any extreme values?

Mean and Median
- Symmetric: the mean will usually be close to the median
- Skewed left: the mean will usually be smaller than the median
- Skewed right: the mean will usually be larger than the median

Symmetric Distribution

If a distribution is symmetric, the data values above and below the mean will balance.
- The mean will be in the middle
- The median will be in the middle

Thus the mean will, in general, be close to the median for a symmetric distribution.

Left-skewed Distribution

If a distribution is skewed left, there will be some data values that are much smaller than the others.
- The mean will decrease
- The median will not decrease as much

Thus the mean will, in general, be smaller than the median for a distribution that is skewed left.

Right-skewed Distribution

If a distribution is skewed right, there will be some data values that are much larger than the others.
- The mean will increase
- The median will not increase as much

Thus the mean will, in general, be larger than the median for a distribution that is skewed right.

Mean and Median

What if one value in a data set is extremely different from the others? For instance, suppose we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2.
- The mean is now (6000 + 1 + 2) / 3 = 2001
- The median is still 2

The median is more resistant to extreme values than the mean.
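The recording-mistake example can be checked directly with Python's standard `statistics` module. This is just an illustrative check of the arithmetic, using the values 6, 1, 2 and 6000, 1, 2 as read from the slide.

```python
from statistics import mean, median

original = [6, 1, 2]
mistyped = [6000, 1, 2]   # the 6 recorded as 6000 by mistake

# The mean jumps from 3 to 2001; the median stays at 2.
print(mean(original), median(original))
print(mean(mistyped), median(mistyped))
```

A single extreme value moves the mean by almost two thousand while leaving the median untouched, which is what "resistant" means here.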

Round-off Rule

- When rounding off an answer, a common rule of thumb is to keep one more decimal place in the answer than was present in the original data
- To avoid round-off buildup, round off only the final answer, not the intermediate steps

Measures of Spread: Range, Variance, Standard Deviation

Measures of Spread/Dispersion

Measures of central tendency alone cannot completely characterize a set of data. Two very different data sets may have similar measures of central tendency. Measures of dispersion are used to describe the spread, or variability, of a distribution. Common measures of dispersion are the range, variance, and standard deviation.

Range

The range of a variable is the largest data value minus the smallest data value.

Compute the range of 6, 1, 2, 6, 11, 7, 3, 3:
- The largest value is 11
- The smallest value is 1
- Subtracting the two gives 11 - 1 = 10, so the range is 10

Note: Do not confuse the range with the midrange, which is a measure of the center of a distribution.

The range uses only two values in the data set, the largest and the smallest, so it is easily affected by extreme values (that is, it is not resistant to outliers). If we made a mistake and 6, 1, 2 was recorded as 6000, 1, 2, the range would become 6000 - 1 = 5999.

Deviations From the Mean

The variance is based on the deviations from the mean:
- $(x_i - \mu)$ for populations
- $(x_i - \bar{x})$ for samples

A deviation is positive when the value is above the mean and negative when it is below the mean. The positive and negative deviations cancel, so the sum of all deviations is always zero. To avoid this cancellation when we add them up, we square the deviations first:
- $(x_i - \mu)^2$ for populations
- $(x_i - \bar{x})^2$ for samples

Population Variance

The population variance of a variable is the average of the squared deviations, that is, the sum of the squared deviations divided by the number of values in the population:

$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N} = \frac{(x_1 - \mu)^2 + (x_2 - \mu)^2 + \dots + (x_N - \mu)^2}{N}$

The population variance is represented by $\sigma^2$ (sigma squared).

Note: For accuracy, if the mean is not a whole number, use as many decimal places as your calculator allows while computing the squared deviations.

Example

Compute the population variance of 6, 1, 2, 11.
- Compute the population mean first: $\mu = (6 + 1 + 2 + 11) / 4 = 5$
- Compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$
- Average the squared deviations: $(16 + 9 + 1 + 36) / 4 = 15.5$
- The population variance $\sigma^2$ is 15.5

Sample Variance

The sample variance of a variable is approximately the average of the squared deviations for the sample data: it is the sum of the squared deviations divided by one less than the number of values in the sample:

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2}{n - 1}$

The sample variance is represented by $s^2$.

Note: We use n - 1 as the divisor.

Example

Compute the sample variance of 6, 1, 2, 11.
- Compute the sample mean first: $\bar{x} = (6 + 1 + 2 + 11) / 4 = 5$
- Compute the squared deviations: $(1 - 5)^2 = 16$, $(2 - 5)^2 = 9$, $(6 - 5)^2 = 1$, $(11 - 5)^2 = 36$
- Divide the sum of the squared deviations by n - 1: $(16 + 9 + 1 + 36) / 3 = 20.7$
- The sample variance $s^2$ is 20.7

Computational Formula for the Sample Variance

A shortcut formula for the sample variance (quicker because you do not need to compute all the deviations from the mean):

$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$

Here $\sum x^2$ is the sum of the squares of each data value, and $(\sum x)^2$ is the square of the sum of all data values.

For the example above, $\sum x^2 = 6^2 + 1^2 + 2^2 + 11^2 = 162$ and $(\sum x)^2 = (6 + 1 + 2 + 11)^2 = 400$, so

$s^2 = \frac{162 - \frac{400}{4}}{3} = \frac{62}{3} = 20.7$

Compare Population and Sample Variances

Why are the population variance (15.5) and the sample variance (20.7) different for the same set of numbers?
- In the first case, {6, 1, 2, 11} was the entire population (divide by N)
- In the second case, {6, 1, 2, 11} was just a sample from the population (divide by n - 1)

These are two different situations.
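The two variances and the shortcut formula can be checked numerically. The sketch below uses `pvariance` and `variance` from Python's standard `statistics` module; the helper `sample_variance_shortcut` is a hypothetical name introduced only for this illustration, and the data are the slide's values 6, 1, 2, 11.

```python
from statistics import pvariance, variance

def sample_variance_shortcut(values):
    """Shortcut formula: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)

data = [6, 1, 2, 11]

print(pvariance(data))                 # divides by N
print(variance(data))                  # divides by n - 1
print(sample_variance_shortcut(data))  # agrees with variance(data)
```

The shortcut avoids computing each deviation, but on a computer it can lose precision when the values are large and close together, so library routines are usually preferable outside hand calculation.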

Why Are the Population and Sample Variances Different?

Why do we use different formulas? The reason is that using the sample mean is not quite as accurate as using the population mean. If we used n in the denominator for the sample variance calculation, we would get a biased result. Bias here means that we would tend to underestimate the true variance.

Standard Deviation

The standard deviation is the square root of the variance.

The population standard deviation:
- Is the square root of the population variance ($\sigma^2$)
- Is represented by $\sigma$

The sample standard deviation:
- Is the square root of the sample variance ($s^2$)
- Is represented by s

Note: The standard deviation can be interpreted, roughly, as a typical deviation of the data from the mean. It has the same unit of measurement as the original data (e.g. inches). The variance has a squared unit (e.g. square inches).

Example

If the population is {6, 1, 2, 11}:
- The population variance is $\sigma^2 = 15.5$
- The population standard deviation is $\sigma = \sqrt{15.5} = 3.9$

If the sample is {6, 1, 2, 11}:
- The sample variance is $s^2 = 20.7$
- The sample standard deviation is $s = \sqrt{20.7} = 4.5$

The population standard deviation and the sample standard deviation apply in different situations.

Compute the Mean and Variance for a Frequency Distribution

To calculate the mean and variance for a set of sample data:
- In a grouped frequency distribution, use the frequency of occurrence associated with each class midpoint
- In an ungrouped frequency distribution, use the frequency of occurrence, f, of each observation x

$\bar{x} = \frac{\sum x f}{\sum f}$

$s^2 = \frac{\sum x^2 f - \frac{(\sum x f)^2}{\sum f}}{\sum f - 1}$

Grouped Data

To compute the mean, variance, and standard deviation for grouped data:
- Assume that, within each class, the mean of the data is equal to the class midpoint (the average of two adjacent lower class limits)
- Use the class midpoint as an approximate value for all data in the same class, since the actual values are not provided
- The number of times the class midpoint value is used is equal to the frequency of the class

For instance, if 6 values are in the interval [8, 10], then we assume that all 6 values are equal to 9 (the midpoint of [8, 10]).

Example of Grouped Data

As an example, consider the following frequency table:

Class Midpoint   Frequency
1                3
3                7
5                6
7                1

We calculate the mean as if:
- The value 1 occurred 3 times
- The value 3 occurred 7 times
- The value 5 occurred 6 times
- The value 7 occurred 1 time

Example of Grouped Data

Class Midpoint   Frequency
1                3
3                7
5                6
7                1

The sample size is $\sum f = 3 + 7 + 6 + 1 = 17$.

The calculation for the mean is

$\bar{x} = \frac{\sum x f}{\sum f} = \frac{(1 \times 3) + (3 \times 7) + (5 \times 6) + (7 \times 1)}{17} = \frac{61}{17} = 3.6$

The sum of squared values is $\sum x^2 f = (1^2 \times 3) + (3^2 \times 7) + (5^2 \times 6) + (7^2 \times 1) = 265$, and the square of the sum is $(\sum x f)^2 = 61^2 = 3721$.

Following the shortcut formula for the sample variance, we obtain

$s^2 = \frac{265 - \frac{3721}{17}}{16} = 2.88$

and the sample standard deviation

$s = \sqrt{2.88} = 1.7$

Summary

The mean for grouped data:
- Uses the class midpoints
- Gives an approximation of the mean

The variance and standard deviation for grouped data:
- Use the class midpoints
- Give approximations of the variance and standard deviation

Example of Ungrouped Data

A survey of students in the first grade at a local school asked for the number of brothers and/or sisters of each child. The results are summarized in a table in which each row gives a number of siblings, x, and the number of students, f, who reported it.
[Table: columns x, f, xf, and x^2 f, with column sums]
The total number of students in this survey is $n = \sum f = 62$. Find 1) the mean, 2) the variance, and 3) the standard deviation.

Solutions:
1) $\bar{x} = \frac{\sum x f}{\sum f} = \frac{93}{62} = 1.5$
2) $s^2 = \frac{\sum x^2 f - \frac{(93)^2}{62}}{62 - 1} = 1.63$
3) $s = \sqrt{1.63} = 1.28$

Measures of Position: Percentiles, Quartiles

Measures of position are used to describe the relative location of an observation within a data set. Quartiles and percentiles are two of the most popular measures of position. Quartiles are part of the 5-number summary.
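The grouped-data calculation can be reproduced with a short Python sketch that applies the frequency-distribution formulas for the mean and sample variance. The function name `grouped_stats` is hypothetical; the midpoints 1, 3, 5, 7 and frequencies 3, 7, 6, 1 are the ones used in the worked example.

```python
from math import sqrt

def grouped_stats(midpoints, freqs):
    """Return (n, mean, sample variance, sample standard deviation)
    for a frequency distribution, treating each midpoint as occurring f times."""
    n = sum(freqs)                                               # n = sum of f
    sum_xf = sum(x * f for x, f in zip(midpoints, freqs))        # sum of x*f
    sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))   # sum of x^2*f
    mean = sum_xf / n
    var = (sum_x2f - sum_xf ** 2 / n) / (n - 1)
    return n, mean, var, sqrt(var)

n, mean, var, sd = grouped_stats([1, 3, 5, 7], [3, 7, 6, 1])
print(n, round(mean, 1), round(var, 2), round(sd, 1))
```

The same function works for an ungrouped frequency distribution: pass the observed values in place of the class midpoints.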

Percentile

- The median divides the lower 50% of the data from the upper 50%
- The median is the 50th percentile
- If a number divides the lower 34% of the data from the upper 66%, that number is the 34th percentile

Quartiles

Quartiles divide the data set into four equal parts. The quartiles are the 25th, 50th, and 75th percentiles:
- $Q_1$ = 25th percentile
- $Q_2$ = 50th percentile = median
- $Q_3$ = 75th percentile

Quartiles are the most commonly used percentiles. The 50th percentile and the second quartile $Q_2$ are both other ways of defining the median.

How to Find Quartiles?

1. Order the data from smallest to largest.
2. Find the median $Q_2$.
3. The first quartile ($Q_1$) is the median of the lower half of the data; that is, it is the median of the data falling below the median ($Q_2$) position (not including $Q_2$).
4. The third quartile ($Q_3$) is the median of the upper half of the data; that is, it is the median of the data falling above the $Q_2$ position (not including $Q_2$).

Note: Excel uses a different set of rules to compute quartiles than the TI graphing calculator, which follows the rules stated above. Different software may therefore give different quartiles, particularly if the sample size is odd. For a large data set, however, the values are often not much different. In our class, we will only follow the rules stated here.

Example

The following data represents the pH levels of a random sample of swimming pools in a California town. Find the three quartiles.
[Data table: pH levels of the 20 sampled pools]

Solutions:
1) Median = $Q_2$ = the average of the 10th and 11th smallest values = 6.55
2) The first quartile = $Q_1$ = the median of the 10 values below the median = the average of the 5th and 6th smallest values = 6.0
3) The third quartile = $Q_3$ = the median of the 10 values above the median = the average of the 5th and 6th smallest values in that upper half (the 15th and 16th smallest overall) = 6.95

Outliers

Extreme observations in the data are referred to as outliers. Outliers should be investigated. Outliers could be:
- Chance occurrences
- Measurement errors
- Data entry errors
- Sampling errors

Outliers are not necessarily invalid data.

How to Detect Outliers?

One way to check for outliers uses the quartiles. Outliers can be detected as values that are far too high or too low relative to the known spread. The fences used to identify outliers are:
- Lower fence = LF = $Q_1 - 1.5 \times IQR$
- Upper fence = UF = $Q_3 + 1.5 \times IQR$

Here IQR is the interquartile range, $Q_3 - Q_1$. Values less than the lower fence or more than the upper fence could be considered outliers.

Example

Is the value 54 an outlier in the following data?
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54

Calculations (using the quartile rules stated for this class):
- $Q_1$ = median of the lower 7 values = 7
- $Q_3$ = median of the upper 7 values = 27
- IQR = 27 - 7 = 20
- UF = $Q_3 + 1.5 \times IQR$ = 27 + 30 = 57

Using the fence rule, the value 54 is not an outlier.

Another Measure of Spread: Interquartile Range (IQR)

The interquartile range (IQR) is the difference between the third and first quartiles:

$IQR = Q_3 - Q_1$

The IQR is a resistant measure of spread. Its value is not easily affected by extremely large or small values in a data set, since the IQR covers only the middle 50% of the values.

Another Graphical Tool to Summarize Data: Five-number Summary and Boxplot

Five-number Summary

The five-number summary is the collection of:
- The smallest value
- The first quartile ($Q_1$ or $P_{25}$)
- The median (M or $Q_2$ or $P_{50}$)
- The third quartile ($Q_3$ or $P_{75}$)
- The largest value

These five numbers give a concise description of the distribution of a variable.

Why These Five Numbers?

- The median: information about the center of the data; a resistant measure of center
- The first quartile and the third quartile: information about the spread of the data; a resistant measure of spread
- The smallest value and the largest value: information about the tails of the data
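The quartile rule used in this class (median of the lower half and median of the upper half, excluding the median itself when n is odd) and the fence rule for outliers can be sketched in Python as follows. The function names are assumptions of this sketch, and the data list is the example's values as best they can be read from the transcript. Other software may use different quartile rules and give different answers.

```python
def median_of_ranked(ranked):
    """Median of an already sorted list."""
    n = len(ranked)
    mid = n // 2
    return ranked[mid] if n % 2 else (ranked[mid - 1] + ranked[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ranked = sorted(values)
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]              # values below the median position
    upper = ranked[half + (n % 2):]    # values above it (skips the median when n is odd)
    return median_of_ranked(lower), median_of_ranked(ranked), median_of_ranked(upper)

def fences(q1, q3):
    """Return (lower fence, upper fence) = (Q1 - 1.5 IQR, Q3 + 1.5 IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54]
q1, q2, q3 = quartiles(data)
lf, uf = fences(q1, q3)
outliers = [x for x in data if x < lf or x > uf]
five_number_summary = (min(data), q1, q2, q3, max(data))
print(five_number_summary, (lf, uf), outliers)
```

With these rules the largest value, 54, falls inside the upper fence, so the list of outliers is empty.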

Example

Compute the five-number summary for the ordered data:
1, 3, 4, 7, 8, 15, 16, 19, 23, 24, 27, 31, 33, 54

Calculations:
- The minimum = 1
- $Q_1 = P_{25} = 7$
- $M = Q_2 = P_{50} = (16 + 19) / 2 = 17.5$
- $Q_3 = P_{75} = 27$
- The maximum = 54

The five-number summary is 1, 7, 17.5, 27, 54.

Boxplot

The five-number summary can be illustrated using a graph called the boxplot.
[Figure: example of a basic boxplot]
- The middle box shows $Q_1$, $Q_2$, and $Q_3$
- The horizontal lines (sometimes called "whiskers") show the minimum and maximum

How to Draw a Boxplot?

To draw a (basic) boxplot:
1. Calculate the five-number summary
2. Draw and scale a horizontal number line that covers all the data from the minimum to the maximum
3. Mark the five numbers on the number line according to the scale
4. Copy these five marked points to a position some distance above the number line
5. Draw a box with the left edge at $Q_1$ and the right edge at $Q_3$
6. Draw a line inside the box at $M = Q_2$
7. Draw a horizontal line from the $Q_1$ edge of the box to the minimum and one from the $Q_3$ edge of the box to the maximum

Example

To draw a (basic) boxplot:
- Draw the middle box
- Draw in the median
- Draw the minimum and maximum
[Figure: the finished basic boxplot]

A Modified Boxplot

[Figure: example of a more sophisticated (modified) boxplot]
- The middle box shows $Q_1$, $Q_2$, and $Q_3$
- The horizontal lines (whiskers) extend to the smallest and largest values that lie within the fences
- The asterisk on the right shows an outlier (determined by using the upper fence)

How to Draw a Modified Boxplot?

To draw a modified boxplot:
1. Draw the center box and mark the median, as before
2. Compute the upper fence and the lower fence
3. Temporarily remove the outliers identified by the upper fence and the lower fence (they are added back later as asterisks)
4. Draw the horizontal lines to the new minimum and new maximum (the minimum and maximum within the fences)
5. Mark each of the outliers with an asterisk

Note: Sometimes the data contain no outliers. In that case you obtain a basic boxplot.

Example

To draw this boxplot:
- Draw the middle box and the median
- Draw in the fences and remove the outliers (temporarily)
- Draw the minimum and maximum
- Draw the outliers as asterisks

Interpret a Boxplot

The distribution shape and the boxplot are related through:
- Symmetry (or lack of symmetry)
- Quartiles
- Maximum and minimum

We relate the distribution shape to the boxplot for symmetric, skewed left, and skewed right distributions.

Symmetric Distribution

Distribution:
- $Q_1$ is equally far from the median as $Q_3$ is
- The min is equally far from the median as the max is

Boxplot:
- The median line is in the center of the box
- The left whisker is equal to the right whisker

[Figure: symmetric distribution and its boxplot, marked Min, Q1, M, Q3, Max]

Left-skewed Distribution

Distribution:
- $Q_1$ is further from the median than $Q_3$ is
- The min is further from the median than the max is

Boxplot:
- The median line is to the right of center in the box
- The left whisker is longer than the right whisker

[Figure: left-skewed distribution and its boxplot, marked Min, Q1, M, Q3, Max]

Right-skewed Distribution

Distribution:
- $Q_1$ is closer to the median than $Q_3$ is
- The min is closer to the median than the max is

Boxplot:
- The median line is to the left of center in the box
- The left whisker is shorter than the right whisker

[Figure: right-skewed distribution and its boxplot, marked Min, Q1, M, Q3, Max]

Side-by-side Boxplot

We can compare two distributions by examining their boxplots. We draw the boxplots on the same horizontal scale, and then we can visually compare:
- The centers
- The spreads
- The extremes

Example

Comparing the flight with the control samples:
Center
Spread

Summary

5-number summary: minimum, first quartile, median, third quartile, maximum
Resistant measures of center (median) and spread (interquartile range)
Boxplots:
Visual representation of the 5-number summary
Related to the shape of the distribution
Can be used to compare multiple distributions

Using Technology for Statistics: Instructions for the TI Graphing Calculator

Entering Data into TI Calculator

Enter data in lists: Press STAT, then choose the EDIT menu. (We'll denote the sequence of keystrokes by STAT EDIT.) Enter the data one by one (press ENTER after each entry) under a blank column, which represents a variable (a list).

Note:
1. Clear a list: on the EDIT screen, use the up arrow to place the cursor on the list name, press CLEAR, then ENTER (that is, CLEAR ENTER). Always clear a list before entering a new set of data into it. Warning! Pressing the DEL key instead of CLEAR will delete the list from the editor. You can get it back with the INS key. See "Insert a new list" below.
2. List name: there are six built-in lists, L1 through L6, and you can add more with your own names. You can get the L1 symbol by pressing the 2ND key, then the 1 key [2nd 1]. (The instruction in the brackets shows the sequence of keys you need to press; here, you press the 2ND key, then the 1 key, to get the L1 symbol.)
3. Insert a new list (optional): STAT EDIT, use the up arrow to place the cursor on a list name, then press INS [2nd DEL]. Type the name of the list using the alpha character keys. The ALPHA key is locked down for you. Press ENTER. The new list is placed just before the point where the cursor was.

To obtain quick statistics, just use one of the built-in lists L1 through L6 to enter the data; you do not need to create a new list with a name.

Obtain Numeric Measures from TI Calculator

1. After entering data, return to the home screen by pressing QUIT [2nd MODE].
2. Press the STAT key, select the CALC menu, then choose operation number 1, 1-Var Stats, then ENTER. Enter the name of the list, say L1. That is, STAT CALC ENTER L1

Note: L1 is the default list. You do not need to enter it if the data is in L1.

Obtain Statistics from a Frequency Distribution

Enter the values in one list, say L1, and their corresponding frequencies in another list, say L2. Then, STAT CALC ENTER L1, L2

Note: You need to enter a comma and L2 after L1. The calculator will use the second list as the frequencies for the values in the first list when calculating the statistics.
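For readers without a calculator at hand, the quantities reported by 1-Var Stats for a single list can be reproduced with Python's standard `statistics` module. The sketch below is an illustration, not the calculator's own routine; the function name `one_var_stats`, the dictionary keys, and the data are hypothetical.

```python
from statistics import mean, pstdev, stdev


def one_var_stats(data):
    """Return the summary values listed on the first 1-Var Stats screen."""
    xs = list(data)
    return {
        "mean": mean(xs),                  # x-bar
        "sum_x": sum(xs),                  # sum of the values
        "sum_x2": sum(x * x for x in xs),  # sum of the squared values
        "Sx": stdev(xs),                   # sample standard deviation (divides by n - 1)
        "sigma_x": pstdev(xs),             # population standard deviation (divides by n)
        "n": len(xs),
    }


values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, as if typed into L1
for name, value in one_var_stats(values).items():
    print(name, "=", value)
```

Use `Sx` when the list holds a sample and `sigma_x` when it holds an entire population.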

Example

Example: A random sample of students in a sixth grade class was selected. Their weights are given in the table below. Find the mean, variance, standard deviation, and 5-number summary for this data using the TI calculator.

The output shows:
x̄ =
Σx = 6
Σx² =
Sx =
σx =
n = 5
minX = 63
Q1 = 84
Med = 9
Q3 = 99
maxX =

Note:
1. Since this is sample data, we take Sx as the standard deviation.
2. You may need to press the arrow key on the calculator several times to view all of these statistics.

Example

Consider the grouped data we considered previously:

Class | Midpoint | Frequency

Use the TI calculator to obtain the statistics. The output shows:
x̄ =
Σx = 6
Σx² = 65
Sx =
σx =
n = 7
minX =
Q1 = 3
Med = 3
Q3 = 5
maxX = 7

Note: Here, the notations used in the calculator correspond to the notations used in the formulas for computing the mean, variance and standard deviation of a frequency distribution:
n = Σf
Σx = Σ(x·f)
Σx² = Σ(x²·f)
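The correspondence between the calculator's notation and the frequency-distribution formulas can be checked with a short Python sketch. It assumes the usual computational formula for the sample variance, s² = (Σx²f − (Σxf)²/n) / (n − 1); the function name `grouped_stats` and the midpoints and frequencies are hypothetical and are not the values from the example above.

```python
from math import sqrt


def grouped_stats(midpoints, freqs):
    """Mean, sample variance and standard deviation from a frequency table."""
    if len(midpoints) != len(freqs):
        raise ValueError("each midpoint needs exactly one frequency")
    n = sum(freqs)                                            # n = sum of f
    sum_x = sum(x * f for x, f in zip(midpoints, freqs))      # sum of x*f
    sum_x2 = sum(x * x * f for x, f in zip(midpoints, freqs)) # sum of x^2*f
    mean = sum_x / n
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)            # sample variance
    return {"n": n, "sum_x": sum_x, "sum_x2": sum_x2,
            "mean": mean, "variance": variance, "Sx": sqrt(variance)}


midpoints = [1, 2, 3]  # hypothetical class midpoints (as if in L1)
freqs = [1, 2, 1]      # hypothetical frequencies (as if in L2)
print(grouped_stats(midpoints, freqs))
```

When class midpoints stand in for the raw values, the results approximate the statistics of the original data.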

Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

More information

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

More information

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

More information

Descriptive Statistics

Descriptive Statistics Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

More information

Variables. Exploratory Data Analysis

Variables. Exploratory Data Analysis Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

More information

2 Describing, Exploring, and

2 Describing, Exploring, and 2 Describing, Exploring, and Comparing Data This chapter introduces the graphical plotting and summary statistics capabilities of the TI- 83 Plus. First row keys like \ R (67$73/276 are used to obtain

More information

Center: Finding the Median. Median. Spread: Home on the Range. Center: Finding the Median (cont.)

Center: Finding the Median. Median. Spread: Home on the Range. Center: Finding the Median (cont.) Center: Finding the Median When we think of a typical value, we usually look for the center of the distribution. For a unimodal, symmetric distribution, it s easy to find the center it s just the center

More information

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

More information

Exercise 1.12 (Pg. 22-23)

Exercise 1.12 (Pg. 22-23) Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

More information

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

More information

Describing, Exploring, and Comparing Data

Describing, Exploring, and Comparing Data 24 Chapter 2. Describing, Exploring, and Comparing Data Chapter 2. Describing, Exploring, and Comparing Data There are many tools used in Statistics to visualize, summarize, and describe data. This chapter

More information

3: Summary Statistics

3: Summary Statistics 3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes

More information

Using SPSS, Chapter 2: Descriptive Statistics

Using SPSS, Chapter 2: Descriptive Statistics 1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

More information

MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

Chapter 2: Frequency Distributions and Graphs

Chapter 2: Frequency Distributions and Graphs Chapter 2: Frequency Distributions and Graphs Learning Objectives Upon completion of Chapter 2, you will be able to: Organize the data into a table or chart (called a frequency distribution) Construct

More information

How To Write A Data Analysis

How To Write A Data Analysis Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

More information

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple. Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

More information

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175) Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

More information

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers 1.3 Measuring Center & Spread, The Five Number Summary & Boxplots Describing Quantitative Data with Numbers 1.3 I can n Calculate and interpret measures of center (mean, median) in context. n Calculate

More information

AP * Statistics Review. Descriptive Statistics

AP * Statistics Review. Descriptive Statistics AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

More information

SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS

SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS SECTION 2-1: OVERVIEW Chapter 2 Describing, Exploring and Comparing Data 19 In this chapter, we will use the capabilities of Excel to help us look more carefully at sets of data. We can do this by re-organizing

More information

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

More information

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

More information

Means, standard deviations and. and standard errors

Means, standard deviations and. and standard errors CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

Statistics Chapter 2

Statistics Chapter 2 Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

a. mean b. interquartile range c. range d. median

a. mean b. interquartile range c. range d. median 3. Since 4. The HOMEWORK 3 Due: Feb.3 1. A set of data are put in numerical order, and a statistic is calculated that divides the data set into two equal parts with one part below it and the other part

More information

Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data A Few Sources for Data Examples Used Introduction to Environmental Statistics Professor Jessica Utts University of California, Irvine jutts@uci.edu 1. Statistical Methods in Water Resources by D.R. Helsel

More information

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4) Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

Lecture 2. Summarizing the Sample

Lecture 2. Summarizing the Sample Lecture 2 Summarizing the Sample WARNING: Today s lecture may bore some of you It s (sort of) not my fault I m required to teach you about what we re going to cover today. I ll try to make it as exciting

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

THE BINOMIAL DISTRIBUTION & PROBABILITY

THE BINOMIAL DISTRIBUTION & PROBABILITY REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques

More information

Mind on Statistics. Chapter 2

Mind on Statistics. Chapter 2 Mind on Statistics Chapter 2 Sections 2.1 2.3 1. Tallies and cross-tabulations are used to summarize which of these variable types? A. Quantitative B. Mathematical C. Continuous D. Categorical 2. The table

More information

Box-and-Whisker Plots

Box-and-Whisker Plots Mathematics Box-and-Whisker Plots About this Lesson This is a foundational lesson for box-and-whisker plots (boxplots), a graphical tool used throughout statistics for displaying data. During the lesson,

More information

Bar Graphs and Dot Plots

Bar Graphs and Dot Plots CONDENSED L E S S O N 1.1 Bar Graphs and Dot Plots In this lesson you will interpret and create a variety of graphs find some summary values for a data set draw conclusions about a data set based on graphs

More information

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck!

EXAM #1 (Example) Instructor: Ela Jackiewicz. Relax and good luck! STP 231 EXAM #1 (Example) Instructor: Ela Jackiewicz Honor Statement: I have neither given nor received information regarding this exam, and I will not do so until all exams have been graded and returned.

More information

+ Chapter 1 Exploring Data

+ Chapter 1 Exploring Data Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1 Analyzing Categorical Data 1.2 Displaying Quantitative Data with Graphs 1.3 Describing Quantitative Data with Numbers Introduction

More information

Data exploration with Microsoft Excel: univariate analysis

Data exploration with Microsoft Excel: univariate analysis Data exploration with Microsoft Excel: univariate analysis Contents 1 Introduction... 1 2 Exploring a variable s frequency distribution... 2 3 Calculating measures of central tendency... 16 4 Calculating

More information

Exploratory Data Analysis. Psychology 3256

Exploratory Data Analysis. Psychology 3256 Exploratory Data Analysis Psychology 3256 1 Introduction If you are going to find out anything about a data set you must first understand the data Basically getting a feel for you numbers Easier to find

More information

Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

More information

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

More information

AP STATISTICS REVIEW (YMS Chapters 1-8)

AP STATISTICS REVIEW (YMS Chapters 1-8) AP STATISTICS REVIEW (YMS Chapters 1-8) Exploring Data (Chapter 1) Categorical Data nominal scale, names e.g. male/female or eye color or breeds of dogs Quantitative Data rational scale (can +,,, with

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

SPSS Manual for Introductory Applied Statistics: A Variable Approach

SPSS Manual for Introductory Applied Statistics: A Variable Approach SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All

More information

Lesson 4 Measures of Central Tendency

Lesson 4 Measures of Central Tendency Outline Measures of a distribution s shape -modality and skewness -the normal distribution Measures of central tendency -mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central

More information

How Does My TI-84 Do That

How Does My TI-84 Do That How Does My TI-84 Do That A guide to using the TI-84 for statistics Austin Peay State University Clarksville, Tennessee How Does My TI-84 Do That A guide to using the TI-84 for statistics Table of Contents

More information

How To Check For Differences In The One Way Anova

How To Check For Differences In The One Way Anova MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

Module 4: Data Exploration

Module 4: Data Exploration Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

More information

4. Continuous Random Variables, the Pareto and Normal Distributions

4. Continuous Random Variables, the Pareto and Normal Distributions 4. Continuous Random Variables, the Pareto and Normal Distributions A continuous random variable X can take any value in a given range (e.g. height, weight, age). The distribution of a continuous random

More information

Chapter 3. The Normal Distribution

Chapter 3. The Normal Distribution Chapter 3. The Normal Distribution Topics covered in this chapter: Z-scores Normal Probabilities Normal Percentiles Z-scores Example 3.6: The standard normal table The Problem: What proportion of observations

More information

Introduction; Descriptive & Univariate Statistics

Introduction; Descriptive & Univariate Statistics Introduction; Descriptive & Univariate Statistics I. KEY COCEPTS A. Population. Definitions:. The entire set of members in a group. EXAMPLES: All U.S. citizens; all otre Dame Students. 2. All values of

More information

Measures of Central Tendency and Variability: Summarizing your Data for Others

Measures of Central Tendency and Variability: Summarizing your Data for Others Measures of Central Tendency and Variability: Summarizing your Data for Others 1 I. Measures of Central Tendency: -Allow us to summarize an entire data set with a single value (the midpoint). 1. Mode :

More information

Gestation Period as a function of Lifespan

Gestation Period as a function of Lifespan This document will show a number of tricks that can be done in Minitab to make attractive graphs. We work first with the file X:\SOR\24\M\ANIMALS.MTP. This first picture was obtained through Graph Plot.

More information

6 3 The Standard Normal Distribution

6 3 The Standard Normal Distribution 290 Chapter 6 The Normal Distribution Figure 6 5 Areas Under a Normal Distribution Curve 34.13% 34.13% 2.28% 13.59% 13.59% 2.28% 3 2 1 + 1 + 2 + 3 About 68% About 95% About 99.7% 6 3 The Distribution Since

More information

Intro to Statistics 8 Curriculum

Intro to Statistics 8 Curriculum Intro to Statistics 8 Curriculum Unit 1 Bar, Line and Circle Graphs Estimated time frame for unit Big Ideas 8 Days... Essential Question Concepts Competencies Lesson Plans and Suggested Resources Bar graphs

More information

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles.

2. Here is a small part of a data set that describes the fuel economy (in miles per gallon) of 2006 model motor vehicles. Math 1530-017 Exam 1 February 19, 2009 Name Student Number E There are five possible responses to each of the following multiple choice questions. There is only on BEST answer. Be sure to read all possible

More information

Sta 309 (Statistics And Probability for Engineers)

Sta 309 (Statistics And Probability for Engineers) Instructor: Prof. Mike Nasab Sta 309 (Statistics And Probability for Engineers) Chapter 2 Organizing and Summarizing Data Raw Data: When data are collected in original form, they are called raw data. The

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives.

5/31/2013. 6.1 Normal Distributions. Normal Distributions. Chapter 6. Distribution. The Normal Distribution. Outline. Objectives. The Normal Distribution C H 6A P T E R The Normal Distribution Outline 6 1 6 2 Applications of the Normal Distribution 6 3 The Central Limit Theorem 6 4 The Normal Approximation to the Binomial Distribution

More information

Chapter 2 Data Exploration

Chapter 2 Data Exploration Chapter 2 Data Exploration 2.1 Data Visualization and Summary Statistics After clearly defining the scientific question we try to answer, selecting a set of representative members from the population of

More information

Topic 9 ~ Measures of Spread

Topic 9 ~ Measures of Spread AP Statistics Topic 9 ~ Measures of Spread Activity 9 : Baseball Lineups The table to the right contains data on the ages of the two teams involved in game of the 200 National League Division Series. Is

More information

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables

STT315 Chapter 4 Random Variables & Probability Distributions KM. Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Chapter 4.5, 6, 8 Probability Distributions for Continuous Random Variables Discrete vs. continuous random variables Examples of continuous distributions o Uniform o Exponential o Normal Recall: A random

More information

Describing and presenting data

Describing and presenting data Describing and presenting data All epidemiological studies involve the collection of data on the exposures and outcomes of interest. In a well planned study, the raw observations that constitute the data

More information

AP Statistics Solutions to Packet 2

AP Statistics Solutions to Packet 2 AP Statistics Solutions to Packet 2 The Normal Distributions Density Curves and the Normal Distribution Standard Normal Calculations HW #9 1, 2, 4, 6-8 2.1 DENSITY CURVES (a) Sketch a density curve that

More information

Common Tools for Displaying and Communicating Data for Process Improvement

Common Tools for Displaying and Communicating Data for Process Improvement Common Tools for Displaying and Communicating Data for Process Improvement Packet includes: Tool Use Page # Box and Whisker Plot Check Sheet Control Chart Histogram Pareto Diagram Run Chart Scatter Plot

More information

Microsoft Excel 2010 Part 3: Advanced Excel

Microsoft Excel 2010 Part 3: Advanced Excel CALIFORNIA STATE UNIVERSITY, LOS ANGELES INFORMATION TECHNOLOGY SERVICES Microsoft Excel 2010 Part 3: Advanced Excel Winter 2015, Version 1.0 Table of Contents Introduction...2 Sorting Data...2 Sorting

More information

Measurement with Ratios

Measurement with Ratios Grade 6 Mathematics, Quarter 2, Unit 2.1 Measurement with Ratios Overview Number of instructional days: 15 (1 day = 45 minutes) Content to be learned Use ratio reasoning to solve real-world and mathematical

More information

Practice#1(chapter1,2) Name

Practice#1(chapter1,2) Name Practice#1(chapter1,2) Name Solve the problem. 1) The average age of the students in a statistics class is 22 years. Does this statement describe descriptive or inferential statistics? A) inferential statistics

More information

consider the number of math classes taken by math 150 students. how can we represent the results in one number?

consider the number of math classes taken by math 150 students. how can we represent the results in one number? ch 3: numerically summarizing data - center, spread, shape 3.1 measure of central tendency or, give me one number that represents all the data consider the number of math classes taken by math 150 students.

More information

3 Describing Distributions

3 Describing Distributions www.ck12.org CHAPTER 3 Describing Distributions Chapter Outline 3.1 MEASURES OF CENTER 3.2 RANGE AND INTERQUARTILE RANGE 3.3 FIVE-NUMBER SUMMARY 3.4 INTERPRETING BOX-AND-WHISKER PLOTS 3.5 REFERENCES 46

More information

Week 3&4: Z tables and the Sampling Distribution of X

Week 3&4: Z tables and the Sampling Distribution of X Week 3&4: Z tables and the Sampling Distribution of X 2 / 36 The Standard Normal Distribution, or Z Distribution, is the distribution of a random variable, Z N(0, 1 2 ). The distribution of any other normal

More information

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur

Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Probability and Statistics Prof. Dr. Somesh Kumar Department of Mathematics Indian Institute of Technology, Kharagpur Module No. #01 Lecture No. #15 Special Distributions-VI Today, I am going to introduce

More information

Expression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds

Expression. Variable Equation Polynomial Monomial Add. Area. Volume Surface Space Length Width. Probability. Chance Random Likely Possibility Odds Isosceles Triangle Congruent Leg Side Expression Equation Polynomial Monomial Radical Square Root Check Times Itself Function Relation One Domain Range Area Volume Surface Space Length Width Quantitative

More information

Module 3: Correlation and Covariance

Module 3: Correlation and Covariance Using Statistical Data to Make Decisions Module 3: Correlation and Covariance Tom Ilvento Dr. Mugdim Pašiƒ University of Delaware Sarajevo Graduate School of Business O ften our interest in data analysis

More information
