# Table 2-1. Sucrose concentration (% fresh wt.) of 100 sugar beet roots. Beet No. % Sucrose. Beet No.

Save this PDF as:

Size: px
Start display at page:

Download "Table 2-1. Sucrose concentration (% fresh wt.) of 100 sugar beet roots. Beet No. % Sucrose. Beet No."

## Transcription

1 Chapter 2. DATA EXPLORATION AND SUMMARIZATION 2.1 Frequency Distributions Commonly, people refer to a population as the number of individuals in a city or county, for example, all the people in California. In the statistical sense, however, there are many populations that can be defined for the people of California such as their heights, weights, annual incomes and so on. Statistically speaking, therefore, a population is the entire set of measurements of a specific characteristic for a defined number of individuals. Populations are characterized by certain constants called parameters. Most of the populations we deal with in biological research are too large to be completely observed. Samples must be drawn to obtain information about a population. Parameters estimated from samples are called statistics. To summarize the information from a large sample, a frequency table may be constructed as the first step in data analysis. To construct a frequency table the range (i.e., the largest minus the smallest) of the observed values is divided into equal sized classes, and the number of frequency of observations falling into each class is recorded. The set of frequencies so obtained, together with their respective classes or class midpoints, is called a frequency distribution. In Table 2-1 are shown the sucrose concentrations of 100 sugar beets which were collected in 1949 by F. J. Hills at Woodland, California. Table 2-1. Sucrose concentration (% fresh wt.) of 100 sugar beet roots. Beet No. % Sucrose Beet No. % Sucrose Beet No. % Sucrose Beet No. % Sucrose Steps in the procedure for developing the frequency table (Table 2-2) are enumerated below and will be applied to the sucrose data for Table 2-1.

2 (1) Determine the range (R) which is the largest measurement minus the smallest measurement of the data. In Table 2-1, R = = (2) Select the number (k) of classes into which the data are to be grouped. (a) (b) For most data 8 to 20 classes are recommended, i.e., 8 < k < 20. With fewer classes too much accuracy is lost, and with more the summary is too extensive. A rough approximation for k may be obtained from Sturges' rule, k = log N, where N is the number of measurements in the data. For Table 2-1, k = log 100 = (2) = 7.6. Therefore, no less than 8 classes should be used. We shall select 9 classes. (3) Select a convenient class interval (C) which is the difference between the upper and lower boundaries of each class. The quantity R/k provides an estimate for the class interval, but the final choice is a matter of udgment depending upon the research worker's knowledge of the variable under study and fidelity and clarity with which the table will reveal the essential characteristics of the mass of data being classified. Interpretation and subsequent computation is facilitated if the class interval is kept simple and the same for all classifications in the table. From Table 2-1, r/k = 12.8/9 = We will, however, use a class interval of 1.5. (4) Select a value for the lower boundary of the lowest class to include the smallest measurements. Successive additions of the class interval will establish the other class boundaries or class limits. The process is continued until all measurements are included. Class boundaries should be selected to avoid ambiguity and in such a way that there is no question as to the class in which each measurement falls. In the case of discrete data, since counts are involved, noncontinuous boundaries are usually used, and there is no ambiguity as to the class into which each count falls. Various methods used in representing class boundaries for continuous data are described below. (a) The most satisfactory method to avoid having measurements falling on class boundaries is to express the boundaries to one-half unit beyond the accuracy of the measurements. Since the sucrose figures in Table 2-1 have been rounded to the nearest one-tenth of one percent, class boundaries would be expressed as , , etc. No sucrose values would therefore fall on a class boundary. The midpoints of these classes are 4.8, 6.3, etc. (b) Sometimes the classes are semi-open intervals indicated as 4.0 and under 5.5, 5.5 and under 7.0, etc., with midpoints 4.75, 6.25, etc. (5) Tally the number of measurements falling into each class. The resulting frequency table is shown in Table 2-2.

3 Table 2-2. Frequency table from data of Table 2-1. Class Class Boundaries Class Midpoints Tallied Frequency Frequency (f) Cumulative Frequency This table shows more clearly than the textual data that the category contains more sucrose values than any other category. Fifty-six percent of the values are less than percent. Graphic presentations are sometimes more effective portrayals of quantitative data than are numbers in a frequency table. The histogram (block diagram) and frequency polygon are common methods of presentation. The histogram presents the data as a series of adacent rectangles with horizontal bases centered at the class midpoints and equal in length to the class interval, and heights equal to the corresponding frequencies. Since these rectangles have equal bases, their areas are also proportional to the frequencies. The frequency polygon is a broken-line graph having class midpoints represented on the horizontal axis and frequencies of the vertical axis. The frequency corresponding to each class midpoint is marked by a point, and consecutive points are connected by straight lines. This procedure is equivalent to oining with straight lines the midpoints of the tops of the rectangles of the histogram. The histogram and frequency polygon for the data of Table 2-2 are shown in Figure 2-1. Figure 2-1. Histogram and frequency polygon for the data of Table 2-2.

4 Relative cumulative frequencies are calculated by dividing the cumulative frequencies by the total number of observations and can be plotted as in Figure 2-2. Figure 2-2. A cumulative frequency plot for the data of Table 2-2. The figure estimates the percentage of the individuals in the population whose sucrose concentration is less than or equal to a given value. For example, 22% of the individual beets have sucrose concentrations less than or equal to 10.05%, or 79% have concentrations less than or equal to 14.55%. Assume similar data from two varieties of sugarbeet were available. The relative frequencies of both varieties can be plotted to compare the likelihood of their sugar concentrations. In figure 2-3 the relative cumulative frequency of the sugar content of variety A is compared with variety B. Figure 2-3. Relative cumulative frequencies of sugar concentrations of two varieties of sugarbeets

5 It is clear from the figure that variety A is superior as 50% of the beets contain 12.3% or less sucrose, while in the case of variety B 50% of the beets have sucrose contents of only 9.5% or less Stem-and-leaf display. The conventional frequency distribution loses track of the individual observations. The stemand-leaf display is a frequency distribution in a tabular form that retains all individual observations. The sucrose concentrations of Table 2-1 are displayed as "stems and leaves" in Figure 2-4. Each line is a stem starting with a whole number. The leaves are the numbers after the decimal. The asterisks "*" indicate the number of digits in a leaf. Therefore, one "*" means that one digit equals one leaf and "**" means that two digits equal one leaf. for example stem 7 contains the leaves 10.0, 10.1, 10.4, 10.5, 10.6 and If there were two asterisks then stem 7 would contain the leaves 10.01, 10.45, It is possible for a stem-and-leaf display to have stems with different numbers of digits per leaf. Stem No. Initial # Digits Values of Leaves # Leaves Value in a Leaf (unit = 0.1%) on Stem 1 4 * * * * * * * * * * * * * * 2 1 Figure 2-4. A stem-and-leaf display of the data of Table 2-1. The unit of the leaf is 0.1 of a percentage point. It may be desirable to examine in more detail the distribution of the observations. For this, the stems can be further divided. For example, stems starting with 12, 13 and 14 can be "stretched" by placing all leaves with values of 4 or less in one line ("*") and 5 or more in the second line ("."). Thus each stem is stretched into two as shown in Figure 2-5A. The stems can be stretched further if needed. In B of Figure 2-5, each starting number has been stretched into 5 stems. The symbols after the starting number represent the following grouping of leaves: "*" for 0 and 1; "t" and for 2 and 3; "f" for 4 and 5; "s" for 6 and 7; and "." for 8 and 9.

6 12 *. 13 *. 14 * * t f s. 13 * t f s. 14 * t f s A B Figure 2-5. Stretched stem-and-leaf displays. While frequency distributions and graphs summarize a mass of data, it is useful to simplify a presentation further by defining certain measures which describe important features of the distribution. Two important feature are used to describe a distribution, a measure of location or central tendency and a measure of spread or dispersion. The more common measure of central tendency and dispersion will be briefly discussed Measures of Central Tendency Mean. The mean is the most frequently used measure of central tendency. Two main advantages of the mean are that it lends itself well to algebraic manipulation, and its reliability in sampling. The symbol used to represent a population mean is μ and Y or X is used to represent a sample mean. For a sample of size r: r Y = Σ Y / r = ( Y1 + Y Yr) / r = 1 For the data in Table 2-1: Y = ( )/100 = 12.2 Median. The median of a set of r numbers arranged in order of magnitude is the middle number of the set if r is odd and the mean of the middle numbers if r is even. The median is easy to define, easy to calculate, and is not influenced by extreme values. In economic statistics it is often desirable to disregard extreme variates, which may be due to unusual circumstances. The median, however, does not lend itself to algebraic expression and manipulation.

7 When the data of Table 2-1 are ranked in ascending order, the 50th and 51st variates are Therefore the median is Mode. The mode of a set of r numbers is the value in the set which occurs most frequently. There may be no mode, and there may be several if more than one variate occurs with equal maximum frequency. The mode is even less important than the median because of its ambiguity. The possibility of more than one modal value and the impossibility of expressing it algebraically render it unsatisfactory as a useful measure. The mode for the data of Table 2-1 is 15.0 as this value occurred with the greatest frequency (5 times). We can also define the modal class of grouped data as the class containing the largest number of variates. For Table 2-2, the modal class is the one with the midpoint 12.4 and a frequency of 24 observations. If a distribution is symmetrical, the mean, the median and mode are the same. If a distribution is skewed to the right, the mean is greater than the median which in turn is greater than the mode. The relationships are reversed if the distribution is skewed to the left. Figure 2-6 illustrates these relationships. 2.4 Measures of Dispersion Figure 2-6. The relationships among mean, median and mode for symmetrical and asymmetrical distributions. The variability among the variates of a population is an important property of the population and can be quantified in several ways. A measure of variability along with a measure of central tendency gives a more complete description of a population that either parameter alone. Variance and Standard Deviation The most common measure of variability and the one shown to be most useful is the standard deviation and its square, the variance.

8 The population variance is usually denoted by 2. The best estimator of 2 from a sample of variates is S 2 : S = r Σ ( Y Y) /( r 1) 2 2 = 1 which will be referred to as the variance of a sample. There are two important aspects of this definition. The sum of the squared deviates (numerator of the expression) is the minimal sum of squares that can be obtained, that is, substituting any other value for Y will result in a larger sum of squares. The deviates (Y Y) sum to zero. Because of this property there are only r-1 independent squares in the numerator so that when r-1 squares are specified, the last one is also determined. The number of independent terms is called the degrees of freedom and symbolized df. It can be shown that: 2 2 Σ( Y Y) = ΣY ( ΣY) 2 / r which provides a second expression of S 2, i.e., S = [ ΣY ( ΣY) / r]/( r 1) This expression is known as the computing form for the sample variance, and the right hand term ( Y ) 2 / r, is called the correction term or correction factor. The variance is expressed in units which are the squares of the units in which the original variates and the mean are expressed. For practical purposes it is desirable to have all these quantities expressed in the same units. The square root of the variance, called the standard deviation, provides a measure of dispersion about the mean expressed in the same units as the original variates and the mean: For the data in Table 2-1, S = S = [ Σ( Y Y) ] / r and = 1 [( ) + (.. ) + (.. ) (.. ) 2 ] =. S S = 69. = 26. % sucrose Range The simplest measure of dispersion is the difference between the largest and smallest values, called the range. Although the range is easy to calculate, it possesses the following disadvantages:

9 (a) (b) (c) It only uses a fraction of the available information concerning variation in the data. It tends to become larger as the number of observations is increased. Thus ranges calculated from sets of data with unequal number of values cannot be compared directly. It depends on extreme values and therefore is not stable from sample to sample, particularly for large sample sizes. For small numbers of observations, the range is a reasonably good measure of variation. The range value for the data in Table 2-1 is ( ) = Measures of Central Tendency and Dispersion for Grouped Data For large samples when the data have been organized in a frequency table (e.g. Table 2-2), measures of central tendency and dispersion may be closely approximated. Computation made from the frequency distribution are based on the assumption that all values included within the limits of a class interval are distributed to represent each of the values in the class, i.e., the midpoint is considered as occurring as many times as the class frequency. Formulas for the measures of central tendency and dispersion for grouped data are shown below. Mean. For a set of r measurements arranged in a frequency table, in which the class midpoints are Y 1, Y 2, Y 3,..., Y k with corresponding class frequencies f 1, f 2, f 3,..., f k, the mean is: Y = (f 1 Y 1 + f 2 Y 2 _ f 3 Y f k Y k ) / r = Σf Y /r, and r = Σf. For the data in Table 2-2, Y = [4.8(1) + 6.3(4) (4)] / 100 = 1230/100 = 12.3 Median. The median class is the class which contains the median. The median (M) may be estimated from the formula. M = L + C [(r/2) - F] / f m where C is the class interval, L is the lower boundary of the median class, F is the cumulative frequency of all classes below the median class, and f m is the frequency of the median class. For the data in Table 2-2, M = [100/2-32]/24 = 12.7 Modal Class. The class with the greatest frequency is called the modal class. For the data in Table 2-2, the 6th class is the modal class. Variance S 2 = [Σf (Y - Y) 2 /(r-1)] = [Σf Y 2 - (Σf Y ) 2 /Σf ] / (Σf -1) for the data in Table 2-2,

10 S 2 = 1( ) 2 /99 + 4( ) 2 / ( ) 2 /99 = 706.5/99 = 7.1 Standard Deviation S = S 2 For the sucrose concentration data in Table 2-1, S = 71. = The Five-Number Summaries We have shown how to describe a population from a large sample by computing measures of central tendency and dispersion. Among these, the median and the extreme values are relatively easy to find with little computation. With additional numbers, called "hinges" we will have a so-called fivenumber summary. Hinges are the values mid-way between each extreme and the median. Hinges can be easily found by adding a cumulative frequency column to stem-and-leaf diagram Figure 2-4. This is obtained by counting and summing the leaves for each stand from the extremes to the stem that contains the median. The count for the median stem is the number of leaves it contains. Figure 2-7 shows the stem-and-leaf display of Figure 2-4 and a column of cumulative counts from the two extremes. Cumulative Leaves on Stem Initial Value of the Stem Leaves 1 4* (14) Figure 2-7 Stem-and-leaf display with cumulative counts from the extremes to the median class. The median is the average of the 50th and 51st value in the ranked data, Note that the median can be determined by counting to the 50th and 51st values. The rank (H) of the hinge number is defined as:

11 1 N + 1 H = ( I+ ) if N is odd, I = 0 if (N+1)/2 is even otherwise I = = 1 N (I + ) if N is even, I = 0 if N/2 id even otherwise I = For example the rank of hinge for various N-values are shown below: N H For the sucrose data of Figure 2-7, the hinge rank is (0+100/2)/2 = 25. The hinge values are found by counting to the 25th rank from each extreme and are 10.5 and The five values are 4.4 (smallest), 10.5 (hinge), 12.6 (median), 14.5 (hinge) and 17.2 (greatest). These can be summarized in the following form: N = 100 M H where N is total observations, M is median, H is hinge rank and the numbers outside of the box are ranks from each extreme value. These values can be plotted as a "box-and-whisker" diagram. Figure 2-8. Box-and-whisker plot of sucrose concentration data.

12 It is common practice to describe a large sample by its mean, standard deviation and extreme values. Such a description for the sucrose data is plotted as B in Figure Measures of central tendency. SUMMARY (a) Ungrouped data (1) Mean. Y = ΣY / r for a set of r items Y 1, Y 2, Y 3,..., Y r (2) Median. The median is the middle item of an odd number of ranked measurements or the mean of the two middle items of an even number of ranked measurements. (3) Mode. The mode is the item(s) which appears most frequently. (b) Grouped data (1) Mean. Y = Σf Y /r for a set of items Y 1, Y 2, Y 3,..., Y k with corresponding frequencies f 1, f 2,..., f k (2) Median. Median = L + C (r/2 - F) / f m where L is the lower boundary of the median class, C is the class interval, F is the cumulative frequency of all classes below the median class, and f m is the frequency of the median class. 2. Measures of dispersion (a) Ungrouped Data (1) Sample Variance S = Σ( Y Y) /( r 1) = [ ΣfY) / r]/( r 1) (2) Sample Standard deviation, S = S 2 (3) Range. The range is the largest item minus the smallest item. (b) Grouped Data (1) Sample Variance S = Σf ( Y Y) /( r = 1) = [ ΣfY ( ΣfY) / r]/( r 1) (2) Sample Standard Deviation, S = S 2 3. Data summaries

13 (a) Stem-and-leaf display Frequency table keeps track of the individual observations. (b) Five-number summary The five numbers are the extreme values, the hinges, and the median. 1EXERCISES 1. For the set of numbers 9, 6, 10, 8, 12, 11, 8, 8, 9 find the a) mean (9) b) median (9) c) mode (8) d) range (6) e) variance by use of the formula 2 2 S = Σ( Y Y) /( r 1) (3.25) f) variance by use of the formula S = [ ΣY ( ΣY) / r]/( r 1) (3.25) 2. For the set of numbers 6.5, , 12.1, 10.5, 9.2, 7.8, 12.9 calculate to two decimal places the mean, variance and standard deviation. (9.00, 8.14, 2.85) 3. Eight rats on a special diet had their weight gains recorded in grams as follows: 22.7, 25.8, 6.4, 2.18, 17.2, 18.6, 19.2, Find the mean and standard deviation of the gains in weight. (20.3, 3.10) 4. Find the mean and standard deviation for the following frequency distribution: Y: f: (6.20, 2.59) 5. The following data are the prices in cents per pound of a particular brand of cheese in 12 different stores: For these prices what is the

14 a) mean (107) b) median (107.5) c) mode (110) d) range (40) e) variance (134.36) f) standard deviation (11.59) 6. Subtract 100 cents from all prices in Exercise 5, and calculate the mean and variance. What conclusions do you draw? (7, ) 7. Given the frequency distribution: Class boundaries Frequency Find to two decimal places the a) mean b) variance c) median class d) median e) modal class (10.90, 15.20, , 11.00, ) Draw the histogram for the data of Exercise For a set of 20 values of Y it is known that ΣY i = 900. and S = 6. What is the mean of the set? (3.3) 10. Give some reasons why frequency distributions are important in statistics. 11. Based on a clinical survey of 100 patients, it was found that 10% of the patients wait 10 minutes before being seen by a physician, 15% wait 20 minutes, 25% wait 30 minutes, 35% wait 40 minutes, 10% wait 50 minutes and 5% wait 60 minutes. Find the mean and median amount of the time patients spend waiting to see a physician. What is the standard deviation? (33.5, 35.0, S = 12.82) 12. the following are the weights of 24 eggs measured in grams. 60.3, 53.8, 49.9, 52.4, 53.2, 46.6, 53.2, 51.0, 51.8, 52.9, 50.3, 52.6, 48.4, 52.6, 48.9, 53.9, 52.4, 48.7, 46.3, 45.4, 44.4, 44.4, 46.8, a) Calculate the mean, variance and standard deviation of this set of data. (50.37, 13.79, 3.71) b) Make a frequency table for these weights. c) Complete the mean, variance and standard deviation using your frequency table.

15 13. Use data in 12. a) Draw a stem-and-leaf display of the data. b) Make a five-number summary of the data. c) Draw a box-and-whisker plot.

### A frequency distribution is a table used to describe a data set. A frequency table lists intervals or ranges of data values called data classes

A frequency distribution is a table used to describe a data set. A frequency table lists intervals or ranges of data values called data classes together with the number of data values from the set that

### Chapter 3: Data Description Numerical Methods

Chapter 3: Data Description Numerical Methods Learning Objectives Upon successful completion of Chapter 3, you will be able to: Summarize data using measures of central tendency, such as the mean, median,

### Descriptive Statistics. Frequency Distributions and Their Graphs 2.1. Frequency Distributions. Chapter 2

Chapter Descriptive Statistics.1 Frequency Distributions and Their Graphs Frequency Distributions A frequency distribution is a table that shows classes or intervals of data with a count of the number

### Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

### We will use the following data sets to illustrate measures of center. DATA SET 1 The following are test scores from a class of 20 students:

MODE The mode of the sample is the value of the variable having the greatest frequency. Example: Obtain the mode for Data Set 1 77 For a grouped frequency distribution, the modal class is the class having

### Chapter 2. Objectives. Tabulate Qualitative Data. Frequency Table. Descriptive Statistics: Organizing, Displaying and Summarizing Data.

Objectives Chapter Descriptive Statistics: Organizing, Displaying and Summarizing Data Student should be able to Organize data Tabulate data into frequency/relative frequency tables Display data graphically

### Sampling, frequency distribution, graphs, measures of central tendency, measures of dispersion

Statistics Basics Sampling, frequency distribution, graphs, measures of central tendency, measures of dispersion Part 1: Sampling, Frequency Distributions, and Graphs The method of collecting, organizing,

### Desciptive Statistics Qualitative data Quantitative data Graphical methods Numerical methods

Desciptive Statistics Qualitative data Quantitative data Graphical methods Numerical methods Qualitative data Data are classified in categories Non numerical (although may be numerically codified) Elements

### Chapter 3: Central Tendency

Chapter 3: Central Tendency Central Tendency In general terms, central tendency is a statistical measure that determines a single value that accurately describes the center of the distribution and represents

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### REVISED GCSE Scheme of Work Mathematics Higher Unit T3. For First Teaching September 2010 For First Examination Summer 2011

REVISED GCSE Scheme of Work Mathematics Higher Unit T3 For First Teaching September 2010 For First Examination Summer 2011 Version 1: 28 April 10 Version 1: 28 April 10 Unit T3 Unit T3 This is a working

### Frequency Distributions

Displaying Data Frequency Distributions After collecting data, the first task for a researcher is to organize and summarize the data to get a general overview of the results. Remember, this is the goal

### Lecture I. Definition 1. Statistics is the science of collecting, organizing, summarizing and analyzing the information in order to draw conclusions.

Lecture 1 1 Lecture I Definition 1. Statistics is the science of collecting, organizing, summarizing and analyzing the information in order to draw conclusions. It is a process consisting of 3 parts. Lecture

### 2. Describing Data. We consider 1. Graphical methods 2. Numerical methods 1 / 56

2. Describing Data We consider 1. Graphical methods 2. Numerical methods 1 / 56 General Use of Graphical and Numerical Methods Graphical methods can be used to visually and qualitatively present data and

### STATISTICS FOR PSYCH MATH REVIEW GUIDE

STATISTICS FOR PSYCH MATH REVIEW GUIDE ORDER OF OPERATIONS Although remembering the order of operations as BEDMAS may seem simple, it is definitely worth reviewing in a new context such as statistics formulae.

### Chapter 2: Frequency Distributions and Graphs

Chapter 2: Frequency Distributions and Graphs Learning Objectives Upon completion of Chapter 2, you will be able to: Organize the data into a table or chart (called a frequency distribution) Construct

### Introduction to Descriptive Statistics

Mathematics Learning Centre Introduction to Descriptive Statistics Jackie Nicholas c 1999 University of Sydney Acknowledgements Parts of this booklet were previously published in a booklet of the same

### 4. Introduction to Statistics

Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

### F. Farrokhyar, MPhil, PhD, PDoc

Learning objectives Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc To recognize different types of variables To learn how to appropriately explore your data How to display data using graphs How

### Slides by. JOHN LOUCKS St. Edward s University

s by JOHN LOUCKS St. Edward s University 1 Chapter 2, Part A Descriptive Statistics: Tabular and Graphical Presentations Summarizing Qualitative Data Summarizing Quantitative Data 2 Summarizing Qualitative

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### Central Tendency and Variation

Contents 5 Central Tendency and Variation 161 5.1 Introduction............................ 161 5.2 The Mode............................. 163 5.2.1 Mode for Ungrouped Data................ 163 5.2.2 Mode

### Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

### Unit 24 Hypothesis Tests about Means

Unit 24 Hypothesis Tests about Means Objectives: To recognize the difference between a paired t test and a two-sample t test To perform a paired t test To perform a two-sample t test A measure of the amount

### Content DESCRIPTIVE STATISTICS. Data & Statistic. Statistics. Example: DATA VS. STATISTIC VS. STATISTICS

Content DESCRIPTIVE STATISTICS Dr Najib Majdi bin Yaacob MD, MPH, DrPH (Epidemiology) USM Unit of Biostatistics & Research Methodology School of Medical Sciences Universiti Sains Malaysia. Introduction

### Sixth Grade Math Pacing Guide Page County Public Schools MATH 6/7 1st Nine Weeks: Days Unit: Decimals B

Sixth Grade Math Pacing Guide MATH 6/7 1 st Nine Weeks: Unit: Decimals 6.4 Compare and order whole numbers and decimals using concrete materials, drawings, pictures and mathematical symbols. 6.6B Find

### Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

### Methods for Describing Data Sets

1 Methods for Describing Data Sets.1 Describing Data Graphically In this section, we will work on organizing data into a special table called a frequency table. First, we will classify the data into categories.

### Chapter 2 Summarizing and Graphing Data

Chapter 2 Summarizing and Graphing Data 2-1 Review and Preview 2-2 Frequency Distributions 2-3 Histograms 2-4 Graphs that Enlighten and Graphs that Deceive Preview Characteristics of Data 1. Center: A

### Homework 3. Part 1. Name: Score: / null

Name: Score: / Homework 3 Part 1 null 1 For the following sample of scores, the standard deviation is. Scores: 7, 2, 4, 6, 4, 7, 3, 7 Answer Key: 2 2 For any set of data, the sum of the deviation scores

### Graphical and Tabular. Summarization of Data OPRE 6301

Graphical and Tabular Summarization of Data OPRE 6301 Introduction and Re-cap... Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information

### Organizing & Graphing Data

AGSC 320 Statistical Methods Organizing & Graphing Data 1 DATA Numerical representation of reality Raw data: Data recorded in the sequence in which they are collected and before any processing Qualitative

### Chapter 2: Exploring Data with Graphs and Numerical Summaries. Graphical Measures- Graphs are used to describe the shape of a data set.

Page 1 of 16 Chapter 2: Exploring Data with Graphs and Numerical Summaries Graphical Measures- Graphs are used to describe the shape of a data set. Section 1: Types of Variables In general, variable can

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### CHAPTER 3 CENTRAL TENDENCY ANALYSES

CHAPTER 3 CENTRAL TENDENCY ANALYSES The next concept in the sequential statistical steps approach is calculating measures of central tendency. Measures of central tendency represent some of the most simple

### Utah Core Curriculum for Mathematics

Core Curriculum for Mathematics correlated to correlated to 2005 Chapter 1 (pp. 2 57) Variables, Expressions, and Integers Lesson 1.1 (pp. 5 9) Expressions and Variables 2.2.1 Evaluate algebraic expressions

### Chapter 2: Frequency Distributions and Graphs (or making pretty tables and pretty pictures)

Chapter 2: Frequency Distributions and Graphs (or making pretty tables and pretty pictures) Example: Titanic passenger data is available for 1310 individuals for 14 variables, though not all variables

### Chapter 7 What to do when you have the data

Chapter 7 What to do when you have the data We saw in the previous chapters how to collect data. We will spend the rest of this course looking at how to analyse the data that we have collected. Stem and

### TYPES OF DATA TYPES OF VARIABLES

TYPES OF DATA Univariate data Examines the distribution features of one variable. Bivariate data Explores the relationship between two variables. Univariate and bivariate analysis will be revised separately.

### 13.2 Measures of Central Tendency

13.2 Measures of Central Tendency Measures of Central Tendency For a given set of numbers, it may be desirable to have a single number to serve as a kind of representative value around which all the numbers

### Data Analysis: Describing Data - Descriptive Statistics

WHAT IT IS Return to Table of ontents Descriptive statistics include the numbers, tables, charts, and graphs used to describe, organize, summarize, and present raw data. Descriptive statistics are most

### In mathematics, there are four attainment targets: using and applying mathematics; number and algebra; shape, space and measures, and handling data.

MATHEMATICS: THE LEVEL DESCRIPTIONS In mathematics, there are four attainment targets: using and applying mathematics; number and algebra; shape, space and measures, and handling data. Attainment target

### MATHS LEVEL DESCRIPTORS

MATHS LEVEL DESCRIPTORS Number Level 3 Understand the place value of numbers up to thousands. Order numbers up to 9999. Round numbers to the nearest 10 or 100. Understand the number line below zero, and

### Descriptive Statistics

Chapter 2 Descriptive Statistics 2.1 Descriptive Statistics 1 2.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Display data graphically and interpret graphs:

### Summarizing Data: Measures of Variation

Summarizing Data: Measures of Variation One aspect of most sets of data is that the values are not all alike; indeed, the extent to which they are unalike, or vary among themselves, is of basic importance

### ! x sum of the entries

3.1 Measures of Central Tendency (Page 1 of 16) 3.1 Measures of Central Tendency Mean, Median and Mode! x sum of the entries a. mean, x = = n number of entries Example 1 Find the mean of 26, 18, 12, 31,

### Each exam covers lectures from since the previous exam and up to the exam date.

Sociology 301 Exam Review Liying Luo 03.22 Exam Review: Logistics Exams must be taken at the scheduled date and time unless 1. You provide verifiable documents of unforeseen illness or family emergency,

### MATH CHAPTER 2 EXAMPLES & DEFINITIONS

MATH 10043 CHAPTER 2 EXAMPLES & DEFINITIONS Section 2.2 Definition: Frequency distribution a chart or table giving the values of a variable together with their corresponding frequencies. A frequency distribution

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

### Year 8 - Maths Autumn Term

Year 8 - Maths Autumn Term Whole Numbers and Decimals Order, add and subtract negative numbers. Recognise and use multiples and factors. Use divisibility tests. Recognise prime numbers. Find square numbers

### Means, standard deviations and. and standard errors

CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

### Domain Essential Question Common Core Standards Resources

Middle School Math 2016-2017 Domain Essential Question Common Core Standards First Ratios and Proportional Relationships How can you use mathematics to describe change and model real world solutions? How

### Chapter 3 Descriptive Statistics: Numerical Measures. Learning objectives

Chapter 3 Descriptive Statistics: Numerical Measures Slide 1 Learning objectives 1. Single variable Part I (Basic) 1.1. How to calculate and use the measures of location 1.. How to calculate and use the

### Data Mining Part 2. Data Understanding and Preparation 2.1 Data Understanding Spring 2010

Data Mining Part 2. and Preparation 2.1 Spring 2010 Instructor: Dr. Masoud Yaghini Introduction Outline Introduction Measuring the Central Tendency Measuring the Dispersion of Data Graphic Displays References

### CAMI Education linked to CAPS: Mathematics

- 1 - TOPIC 1.1 Whole numbers _CAPS curriculum TERM 1 CONTENT Mental calculations Revise: Multiplication of whole numbers to at least 12 12 Ordering and comparing whole numbers Revise prime numbers to

### LearnStat MEASURES OF CENTRAL TENDENCY, Learning Statistics the Easy Way. Session on BUREAU OF LABOR AND EMPLOYMENT STATISTICS

LearnStat t Learning Statistics the Easy Way Session on MEASURES OF CENTRAL TENDENCY, DISPERSION AND SKEWNESS MEASURES OF CENTRAL TENDENCY, DISPERSION AND SKEWNESS OBJECTIVES At the end of the session,

### 32 Measures of Central Tendency and Dispersion

32 Measures of Central Tendency and Dispersion In this section we discuss two important aspects of data which are its center and its spread. The mean, median, and the mode are measures of central tendency

### Statistics Summary (prepared by Xuan (Tappy) He)

Statistics Summary (prepared by Xuan (Tappy) He) Statistics is the practice of collecting and analyzing data. The analysis of statistics is important for decision making in events where there are uncertainties.

### YEAR 8 SCHEME OF WORK - SECURE

YEAR 8 SCHEME OF WORK - SECURE Autumn Term 1 Number Spring Term 1 Real-life graphs Summer Term 1 Calculating with fractions Area and volume Decimals and ratio Straight-line graphs Half Term: Assessment

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### 909 responses responded via telephone survey in U.S. Results were shown by political affiliations (show graph on the board)

1 2-1 Overview Chapter 2: Learn the methods of organizing, summarizing, and graphing sets of data, ultimately, to understand the data characteristics: Center, Variation, Distribution, Outliers, Time. (Computer

### Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

### Graphical methods for presenting data

Chapter 2 Graphical methods for presenting data 2.1 Introduction We have looked at ways of collecting data and then collating them into tables. Frequency tables are useful methods of presenting data; they

### Northumberland Knowledge

Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

### Report of for Chapter 2 pretest

Report of for Chapter 2 pretest Exam: Chapter 2 pretest Category: Organizing and Graphing Data 1. "For our study of driving habits, we recorded the speed of every fifth vehicle on Drury Lane. Nearly every

### There are some general common sense recommendations to follow when presenting

Presentation of Data The presentation of data in the form of tables, graphs and charts is an important part of the process of data analysis and report writing. Although results can be expressed within

### Mathematics Teachers Self Study Guide on the national Curriculum Statement. Book 2 of 2

Mathematics Teachers Self Study Guide on the national Curriculum Statement Book 2 of 2 1 WORKING WITH GROUPED DATA Material written by Meg Dickson and Jackie Scheiber RADMASTE Centre, University of the

### MCQ S OF MEASURES OF CENTRAL TENDENCY

MCQ S OF MEASURES OF CENTRAL TENDENCY MCQ No 3.1 Any measure indicating the centre of a set of data, arranged in an increasing or decreasing order of magnitude, is called a measure of: (a) Skewness (b)

### Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

### not to be republished NCERT Measures of Central Tendency

You have learnt in previous chapter that organising and presenting data makes them comprehensible. It facilitates data processing. A number of statistical techniques are used to analyse the data. In this

### Introduction. One of the most convincing and appealing ways in which statistical results may be presented is through diagrams and graphs.

Introduction One of the most convincing and appealing ways in which statistical results may be presented is through diagrams and graphs. Just one diagram is enough to represent a given data more effectively

### 10-3 Measures of Central Tendency and Variation

10-3 Measures of Central Tendency and Variation So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used.

### RESOURCE FOR THE COURSE 1 CARNEGIE TEXTBOOK. SECTION and PAGE NUMBER TOPIC. 1.1 Pages 5-14 Factors and Multiples

RESOURCE FOR THE COURSE 1 CARNEGIE TEXTBOOK SECTION and PAGE NUMBER TOPIC 1.1 Pages 5-14 Factors and Multiples Properties of Addition and Multiplication (Commutative, Associative and Distributive) Apply

### CH.6 Random Sampling and Descriptive Statistics

CH.6 Random Sampling and Descriptive Statistics Population vs Sample Random sampling Numerical summaries : sample mean, sample variance, sample range Stem-and-Leaf Diagrams Median, quartiles, percentiles,

### 2.2-Frequency Distributions

2.2-Frequency Distributions When working with large data sets, it is often helpful to organize and summarize data by constructing a table called a frequency distribution, defined later. Because computer

### The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

### Central Tendency. n Measures of Central Tendency: n Mean. n Median. n Mode

Central Tendency Central Tendency n A single summary score that best describes the central location of an entire distribution of scores. n Measures of Central Tendency: n Mean n The sum of all scores divided

### Assessment Anchors and Eligible Content Aligned to the Algebra 1 Pennsylvania Core Standards

Assessment Anchors and Eligible Content Aligned to the Algebra 1 Pennsylvania Core Standards MODULE 1 A1.1.1 ASSESSMENT ANCHOR Operations and Linear Equations & Inequalities Demonstrate an understanding

### MEI Statistics 1. Exploring data. Section 1: Introduction. Looking at data

MEI Statistics Exploring data Section : Introduction Notes and Examples These notes have sub-sections on: Looking at data Stem-and-leaf diagrams Types of data Measures of central tendency Comparison of

### Ungrouped data. A list of all the values of a variable in a data set is referred to as ungrouped data.

1 Social Studies 201 September 21, 2006 Presenting data See text, chapter 4, pp. 87-160. Data sets When data are initially obtained from questionnaires, interviews, experiments, administrative sources,

### GCSE HIGHER Statistics Key Facts

GCSE HIGHER Statistics Key Facts Collecting Data When writing questions for questionnaires, always ensure that: 1. the question is worded so that it will allow the recipient to give you the information

### 3: Summary Statistics

3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes

### Maths Level Targets. This booklet outlines the maths targets for each sub-level in maths from Level 1 to Level 5.

Maths Level Targets This booklet outlines the maths targets for each sub-level in maths from Level 1 to Level 5. Expected National Curriculum levels for the end of each year group are: Year 1 Year 2 Year

Algebra I COURSE DESCRIPTION: The purpose of this course is to allow the student to gain mastery in working with and evaluating mathematical expressions, equations, graphs, and other topics, with an emphasis

### Unit 21 Student s t Distribution in Hypotheses Testing

Unit 21 Student s t Distribution in Hypotheses Testing Objectives: To understand the difference between the standard normal distribution and the Student's t distributions To understand the difference between

### Interactive Math Glossary Terms and Definitions

Terms and Definitions Absolute Value the magnitude of a number, or the distance from 0 on a real number line Additive Property of Area the process of finding an the area of a shape by totaling the areas

### Curriculum overview for Year 1 Mathematics

Curriculum overview for Year 1 Counting forward and back from any number to 100 in ones, twos, fives and tens identifying one more and less using objects and pictures (inc number lines) using the language

### Maths Targets Year 1 Addition and Subtraction Measures. N / A in year 1.

Number and place value Maths Targets Year 1 Addition and Subtraction Count to and across 100, forwards and backwards beginning with 0 or 1 or from any given number. Count, read and write numbers to 100

### Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1 Review 1. As part of survey of college students a researcher is interested in the variable class standing. She records a 1 if the student is a freshman, a 2 if the student

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### In this chapter, you will learn to use descriptive statistics to organize, summarize, analyze, and interpret data for contract pricing.

3.0 - Chapter Introduction In this chapter, you will learn to use descriptive statistics to organize, summarize, analyze, and interpret data for contract pricing. Categories of Statistics. Statistics is

### MAT S2.1 2_3 Review and Preview; Frequency Distribution. January 14, Preview. Chapter 2 Summarizing and Graphing Data

MAT 155 Dr. Claude Moore Cape Fear Community College Chapter 2 Summarizing and Graphing Data 2 1 Review and Preview 2 3 Histograms 2 4 Statistical Graphics 2 5 Critical Thinking: Bad Graphs Preview 1.

### Statistical Concepts and Market Return

Statistical Concepts and Market Return 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 2 2. Some Fundamental Concepts... 2 3. Summarizing Data Using Frequency Distributions...

### Descriptive Statistics and Measurement Scales

Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

### Year 1 Maths Expectations

Times Tables I can count in 2 s, 5 s and 10 s from zero. Year 1 Maths Expectations Addition I know my number facts to 20. I can add in tens and ones using a structured number line. Subtraction I know all

### vs. relative cumulative frequency

Variable - what we are measuring Quantitative - numerical where mathematical operations make sense. These have UNITS Categorical - puts individuals into categories Numbers don't always mean Quantitative...

### Mathematics Scheme of Work: Form

Textbook: Formula One Maths B2 Hodder and Stoughton Chapter No. First Term Revising Time Topic and Subtopic October - December Handout Notes 2 Numbers 2.1 Basic conversions: Changing from large units to

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. C) (a) 3 (b) 51

Chapter 2- Problems to look at Use the given frequency distribution to find the (a) class width. (b) class midpoints of the first class. (c) class boundaries of the first class. 1) Height (in inches) 1)