How To Write A Statement Of Central Tendency




10-3 Measures of Central Tendency and Variation

So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used. Statements of central tendency are attempts to describe the whole distribution of a data set by reporting one most typical value. The most typical values represent the point or points about which most of the values in the distribution are centered. Three statements of central tendency are commonly used: the mean, the median, and the mode.

Computing Means

The mean is the average value. Imagine that we have recorded the ages of 11 children with chicken pox:

2, 4, 5, 5, 6, 6, 6, 7, 7, 8, and 10 years

One way to represent these 11 ages with one most typical value is to calculate the mean age for the group. To find the mean, we first sum the ages (2 + 4 + 5 + 5 + 6 + 6 + 6 + 7 + 7 + 8 + 10 = 66) and then divide the total by the number of cases (66 ÷ 11 = 6 years). The mean for this age group is 6 years.

Definition of Mean

The mean is the average, the location in the distribution of values at which the deviations above it and the deviations below it are equal.
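As a quick check of the arithmetic for the chicken pox ages, the mean can be computed in a few lines of Python. This is a minimal sketch; the variable names (`ages`, `total`, `mean_age`) are illustrative and not part of the original slides.

```python
# Ages of the 11 children with chicken pox (from the slide)
ages = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 10]

total = sum(ages)             # sum of all the ages
mean_age = total / len(ages)  # divide by the number of cases

print(total, mean_age)        # 66 6.0
```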

We can think of the mean as the balance point: graphically, the sum of the distances to the data points below the mean equals the sum of the distances to the data points above the mean. In the slide's illustration, the distances on one side of the mean (1 + 4 + 4 + 5 = 14) equal the distances on the other side (1 + 1 + 1 + 1 + 2 + 2 + 2 + 4 = 14). Because every value contributes to this balance, the mean is sensitive to every value, and especially to exceptional values.

Computing Median

The median is the middle value in a group of ordered values. Look again at the ages of the 11 children with chicken pox:

2, 4, 5, 5, 6, 6, 6, 7, 7, 8, and 10 years

The ages have been ordered from the youngest to the oldest. The middle value is the sixth value from either end, with 5 values on its left and 5 values on its right. The median for this group is 6 years. In this example there is an odd number of values, so the median is actually one of the values.
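The balance-point description of the mean can be checked numerically for the chicken pox ages. The sketch below assumes nothing beyond the slide's data; the names `below` and `above` are illustrative.

```python
ages = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 10]
mean_age = sum(ages) / len(ages)  # 6.0

# Total distance from the mean to the values below it and above it
below = sum(mean_age - x for x in ages if x < mean_age)
above = sum(x - mean_age for x in ages if x > mean_age)

print(below, above)  # 8.0 8.0
```

The two totals agree, which is what it means for the mean to be the balance point of the data.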

If we have an even number of values, the median is found by averaging the two middle values. For example, imagine that the ordered ages of a group of six children are:

1, 2, 3, 4, 5, and 6

In this case, there is no single middle data value. The two middle values are 3 and 4 years. The average of these is (3 + 4) ÷ 2 = 3.5. Therefore, 3.5 years represents the median value for this group of six ages, but it is not one of the ages.

In general, to find the median for a set of n numbers:

1. Sort the values in order.
2. If the number of values is odd, the median is the number located in the exact middle of the list.
3. If the number of values is even, the median is the mean of the two middle numbers.

Recall that the mean is dramatically affected by extreme values, whereas the median is not.

Finding the Modes

The mode is the most frequently occurring value in a group of values. The most frequently occurring age in the group of 11 children with chicken pox (2, 4, 5, 5, 6, 6, 6, 7, 7, 8, and 10 years) is 6 years. Three of the children were 6 years old, and none of the other ages occurred more than twice. It is important to remember that a distribution may have no single most frequently occurring value, and if a mode exists, it may not be unique; i.e., there may be more than one mode.
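The three-step median procedure and the idea of multiple modes can be sketched in Python as follows. The function name `find_median` is hypothetical, and the list `[1, 1, 2, 2, 3]` is made-up data used only to show a tie; `statistics.multimode` requires Python 3.8 or later.

```python
from statistics import multimode

def find_median(values):
    """Median by the three-step procedure: sort, then take the middle."""
    ordered = sorted(values)       # step 1: sort the values
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                 # step 2: odd count, exact middle value
        return ordered[mid]
    # step 3: even count, mean of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

ages = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 10]
median_odd = find_median(ages)                  # 6
median_even = find_median([1, 2, 3, 4, 5, 6])   # 3.5

# multimode returns every value tied for most frequent
modes = multimode(ages)                  # [6]
two_modes = multimode([1, 1, 2, 2, 3])   # [1, 2] (hypothetical data)
```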

Choosing the Most Appropriate Average

When we choose the most appropriate statement of central tendency for describing a set of data, two factors must be considered.

The first is the shape of the distribution. If the distribution is symmetrical, the mean, median, and mode will be equal or very close, and each may be used as the most typical representative value. If the distribution is not symmetrical (skewed), the median is the best choice as a measure of central tendency.

The second factor is the scale of measurement. If we are dealing with unordered categories, then our only choice is the mode. For continuous data we may use the mean, median, or mode, depending on symmetry.

Measures of Spread or Dispersion

Consider the following data:

20, 22, 22, 25, 26, 27, 27, 28, 30, 35

Range = upper extreme − lower extreme = 35 − 20 = 15
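The range calculation, and the reason the median is preferred for skewed data, can be illustrated with a short Python sketch. The `skewed` list is a hypothetical variation of the slide's data (the largest value 35 is replaced by 135) and is an assumption made only for illustration.

```python
from statistics import mean, median

data = [20, 22, 22, 25, 26, 27, 27, 28, 30, 35]

# Range = upper extreme - lower extreme
data_range = max(data) - min(data)   # 15

# Hypothetical skewed version: the largest value is replaced by 135
skewed = data[:-1] + [135]

# The mean moves by 10, while the median does not move at all
mean_shift = mean(skewed) - mean(data)
median_shift = median(skewed) - median(data)
```

One extreme value pulls the mean upward but leaves the median where it was, which is why the median is the better summary when the distribution is skewed.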

Box Plots

Box plots are another graphical means of displaying key characteristics of data. The idea is to arrange the data in increasing order and choose three numbers Q1, Q2, and Q3 that divide it into four equal parts. Q2 is the median. The bottom 25% of the data lie between the minimum data point and Q1, and the top 25% lie between Q3 and the maximum data point.

[Figure: box plot marking Min, Q1, Q2 (median), Q3, and Max on a number line from 15 to 45]

Outliers

An outlier is a value that is located very far away from almost all of the other data values. Relative to the other data, an outlier is an extreme value. An outlier is any value that is more than 1.5 times the interquartile range (Q3 − Q1) above the upper quartile or below the lower quartile. Outliers are commonly indicated with an asterisk.

[Figure: box plot with an outlier marked by an asterisk]
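The quartiles and the 1.5 × IQR outlier rule can be sketched in Python. Several conventions exist for computing quartiles; this sketch assumes the "median of each half" method (Q1 is the median of the values below the overall median, Q3 the median of the values above it), which may differ from what a calculator or software package reports. The function names and the extra value 45 are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For an odd count, leave the middle value out of both halves
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def find_outliers(values):
    """Values more than 1.5 * IQR below Q1 or above Q3."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]

data = [20, 22, 22, 25, 26, 27, 27, 28, 30, 35]
with_extra = data + [45]   # hypothetical extra value, for illustration

print(quartiles(data))            # (22, 26.5, 28)
print(find_outliers(data))        # []
print(find_outliers(with_extra))  # [45]
```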

Mean Absolute Deviation

The mean absolute deviation (MAD) uses the absolute value to find the distance of each data point from the mean. The following steps are used to determine the MAD:

1. Find each deviation by subtracting the mean from the data value.
2. Find the absolute value of each difference.
3. Sum those absolute values.
4. Find the mean of the absolute values by dividing the sum by the number of scores.

[Figure: bar graph of the 11 ages of children with chicken pox, listed numerically, with a segment marking the mean age, giving a visual picture of the mean absolute deviation]

Compute the MAD for the following two sets of data.

[Data sets not reproduced in the transcription]
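The four MAD steps can be sketched in Python using the 11 chicken pox ages from earlier in the section. The function name `mean_absolute_deviation` is illustrative, not something defined in the slides.

```python
def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    mean_value = sum(values) / len(values)
    deviations = [x - mean_value for x in values]   # step 1: subtract the mean
    distances = [abs(d) for d in deviations]        # step 2: absolute values
    total = sum(distances)                          # step 3: sum them
    return total / len(values)                      # step 4: divide by the count

ages = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 10]
mad = mean_absolute_deviation(ages)   # 16 / 11, roughly 1.45 years
```

For these ages the absolute deviations are 4, 2, 1, 1, 0, 0, 0, 1, 1, 2, and 4, which sum to 16, so the MAD is 16 ÷ 11, or about 1.45 years.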

Variance and Standard Deviation

The variance and the standard deviation are two commonly used statements of dispersion. The variance of a sample may be defined as the sum of the squared deviations from the mean value divided by the number of values. The variance is calculated as follows:

1. Find the deviation of each value in the set from the mean value.
2. Square each of these deviations.
3. Sum all of these squared deviations.
4. Divide the sum of the squared deviations from the mean by the number of values.

Formula for the variance, v, of n values x_1, ..., x_n with mean x̄:

$v = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$

The standard deviation, s, is the square root of the variance: $s = \sqrt{v}$.

Now, let us calculate the variance and standard deviation of the ages of the 11 children with chicken pox. The mean is 6, so the squared deviations are 16, 4, 1, 1, 0, 0, 0, 1, 1, 4, and 16, which sum to 44.

Variance: v = 44 ÷ 11 = 4
Standard deviation: s = √4 = 2 years
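The four variance steps can be followed literally in Python. Note that this section divides by the number of values n; in Python's `statistics` module that definition corresponds to `pvariance` and `pstdev`, whereas `variance` and `stdev` divide by n − 1. The variable names below are illustrative.

```python
import math
from statistics import pvariance, pstdev

ages = [2, 4, 5, 5, 6, 6, 6, 7, 7, 8, 10]
n = len(ages)
mean_age = sum(ages) / n

deviations = [x - mean_age for x in ages]   # step 1: deviations from the mean
squared = [d ** 2 for d in deviations]      # step 2: square each deviation
sum_squared = sum(squared)                  # step 3: sum the squares
v = sum_squared / n                         # step 4: divide by the number of values
s = math.sqrt(v)                            # standard deviation

print(sum_squared, v, s)   # 44.0 4.0 2.0

# The library functions that divide by n give the same results
assert pvariance(ages) == v
assert pstdev(ages) == s
```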

Normal Distributions

A normal distribution is a frequency distribution with continuous, randomly occurring data plotted on the x-axis and frequency (counts) plotted on the y-axis. This distribution is actually a theoretical distribution, but many real-world situations are close to this ideal.

1. The normal curve has a bell shape.
2. The curve extends infinitely in both directions and gets closer and closer to the x-axis but never reaches it.
3. The curve is symmetrical about its center point, but not all symmetrical distributions are normal.
4. The three statements of central tendency (mean, median, and mode) all fall in exactly the same place.

On a normal curve, about 68% of the values lie within 1 standard deviation of the mean, about 95% lie within 2 standard deviations, and about 99.7% lie within 3 standard deviations.
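The 68%, 95%, and 99.7% figures can be reproduced with the `NormalDist` class in Python's `statistics` module (Python 3.8 or later). This is a sketch for checking the percentages; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The percentages do not depend on the particular mean and standard deviation, so the standard normal distribution is used here for convenience.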

Applications of the Normal Curve

Suppose that cholesterol values for a population have a mean of 200 mg/dl and a standard deviation of 40 mg/dl. The raw cholesterol levels within plus or minus 1 standard deviation of the mean run from 160 to 240 mg/dl, and those within plus or minus 2 standard deviations run from 120 to 280 mg/dl.

Example 1

Calculate the mean, median, and mode for the following data sets:

a. 2, 5, 7, 5, 8, 9, 5, 10, 8
b. 10, 12, 12, 15, 17, 12, 18, 14, 11, 13
c. 17, 21, 21, 18, 39, 17, 13
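Answers to Example 1 worked by hand can be checked with a small Python helper. This is one possible sketch using the standard `statistics` module; the function name `summarize` and the dictionary layout are assumptions, and `multimode` (Python 3.8 or later) is used so that a data set with more than one mode reports all of them.

```python
from statistics import mean, median, multimode

def summarize(values):
    """Return the mean, median, and all modes of a data set."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
    }

data_sets = {
    "a": [2, 5, 7, 5, 8, 9, 5, 10, 8],
    "b": [10, 12, 12, 15, 17, 12, 18, 14, 11, 13],
    "c": [17, 21, 21, 18, 39, 17, 13],
}

for label, values in data_sets.items():
    print(label, summarize(values))
```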

Example 2

John's fall-quarter grades are listed below. Find his grade point average for the term (A = 4, B = 3, C = 2, D = 1, F = 0).

Course     Credits   Grade
Math       5         B
English    3         A
Physics    5         C
German     3         D
Handball   1         A

Example 3

For certain workers, the mean wage is $5.00/hr, with a standard deviation of $0.50. If a worker is chosen at random, what is the probability that the worker's wage is between $4.50 and $5.50? Assume a normal distribution of wages.

Example 4

Ginny's median score on three tests was 90. Her mean score was 92 and her range was 6. What were her three test scores?

10.3 #A-1, 3, 9, 13, 15, 17, 19, 21, 23, B-1, 13, 19, 21
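A grade point average such as the one in Example 2 is a weighted mean: each grade's point value is weighted by the course's credits. The sketch below assumes that interpretation; the names `grade_points`, `courses`, and `gpa` are illustrative.

```python
# Point values given in the example
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# (course, credits, grade) rows from the example's table
courses = [
    ("Math", 5, "B"),
    ("English", 3, "A"),
    ("Physics", 5, "C"),
    ("German", 3, "D"),
    ("Handball", 1, "A"),
]

total_credits = sum(credits for _, credits, _ in courses)
total_points = sum(credits * grade_points[grade] for _, credits, grade in courses)

# Weighted mean: total grade points divided by total credits
gpa = total_points / total_credits

print(total_points, total_credits, round(gpa, 2))   # 44 17 2.59
```

Weighting by credits matters here: a plain average of the five grade values (3, 4, 2, 1, 4) would give 2.8, because it would count the 1-credit Handball course as heavily as the 5-credit Math course.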