Measures of Central Tendency and Variability: Summarizing your Data for Others




I. Measures of Central Tendency
-Allow us to summarize an entire data set with a single central value.
1. Mode: the value (score) that occurs most often in a data set.
-Mo_x = Sample mode
-Mo = Population mode
2. Median: the point (score) that divides the data set in half, e.g. ½ of the subjects are above the median and ½ are below it.
-Mdn_x = Sample median
-Mdn = Population median
3. Mean: the arithmetic average. It directly considers every score in a distribution.
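The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the scores are hypothetical values invented for illustration and do not come from the slides.

```python
import statistics

# Hypothetical scores, invented for illustration (not from the slides).
scores = [2, 3, 3, 5, 7, 8, 14]

mode = statistics.mode(scores)      # the value that occurs most often
median = statistics.median(scores)  # the middle value of the sorted scores
mean = statistics.mean(scores)      # arithmetic average; uses every score

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

Note that only the mean changes if the largest score is made even larger, because it is the only one of the three that uses every score's value.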

II. Skewed Distributions & the 3 M's
-Skewness refers to the shape of the distribution, which can be influenced by extreme scores.
-Skewness also shows up as a separation between the mean, median, and mode.

-Symmetrical distribution (single peak) = mean, median, and mode are all in the same location in the distribution.

-Skewed right (positively skewed) = mode at the peak of the distribution (left of center), median near the center, mean pulled toward the right tail. Typically mode < median < mean.

-Skewed left (negatively skewed) = mode at the peak of the distribution (right of center), median near the center, mean pulled toward the left tail. Typically mean < median < mode.
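The ordering of the three measures under skew can be checked on a small example. This is a sketch with hypothetical data: one extreme score is placed in the right tail, and the left-skewed set is simply its mirror image.

```python
import statistics

# Hypothetical data with one extreme score in the right tail.
skewed_right = [1, 2, 2, 2, 3, 3, 4, 15]
# Mirror image of the same data: the extreme score is now in the left tail.
skewed_left = [16 - x for x in skewed_right]

def three_ms(scores):
    """Return (mode, median, mean) for a list of scores."""
    return (statistics.mode(scores),
            statistics.median(scores),
            statistics.mean(scores))

right_mode, right_median, right_mean = three_ms(skewed_right)
left_mode, left_median, left_mean = three_ms(skewed_left)

print("skewed right:", right_mode, right_median, right_mean)
print("skewed left: ", left_mode, left_median, left_mean)
```

In both cases the mean moves farthest toward the extreme score, the mode stays at the peak, and the median sits between them.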

I. Measures of Variability (Dispersion)
-Allow us to summarize our data set with a single value.
-Central tendency + variability = a more accurate picture of our data set.
-Tell us how much the observations in a data set vary (differ from one another): how are they dispersed within the distribution?
-The 3 main measures of variability: range, variance, and standard deviation.
-These formulas are the root formulas for many of the statistical tests that will be covered later: t-test, ANOVA, and correlation.


-Although measures of central tendency summarize some aspects of our data, they don't tell us much about the variability within our data.
Example: number of miles traveled before a traveling companion appears human (n = 8).
-Mean = 5, Mode = 5, Median = 5 for both data sets (they do not differ).
-All zoo penguins hallucinate after traveling 5 miles, while there is much more variability in the distances traveled by South Pole penguins.
-To draw accurate conclusions about our data, both central tendency and variability must be considered.

II. Range: the numerical distance between the largest value (X_max) and the smallest value (X_min). It tells us something about the variation in the scores in our data, or the width of our data set.
Range = X_max - X_min
-Range for Zoo penguins = 5 - 5 = 0
-Range for South Pole penguins = 8 - 2 = 6
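The range calculation can be sketched in a few lines. The original data table is not reproduced in this transcript, so the South Pole scores below are illustrative values chosen only to match the summary figures quoted in the slides (n = 8, mean = median = mode = 5, smallest score 2, largest score 8); the zoo scores follow from the statement that every zoo penguin traveled 5 miles.

```python
def value_range(scores):
    """Range = largest score minus smallest score."""
    return max(scores) - min(scores)

# Every zoo penguin traveled 5 miles (n = 8).
zoo = [5, 5, 5, 5, 5, 5, 5, 5]
# Illustrative values only: chosen to match the quoted summary statistics.
south_pole = [2, 3, 4, 5, 5, 6, 7, 8]

zoo_range = value_range(zoo)
south_pole_range = value_range(south_pole)

print("Zoo range:", zoo_range)
print("South Pole range:", south_pole_range)
```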

-Problems with range: it does not directly consider every value in the data set, only the two extreme values (largest and smallest). We do not know whether most of the scores occur at the extremes of the distribution or toward the center.
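A short sketch makes the limitation concrete. Both data sets below are hypothetical: they share the same range, yet one has nearly all scores at the center and the other has every score at an extreme.

```python
def value_range(scores):
    """Range = largest score minus smallest score."""
    return max(scores) - min(scores)

def count_at_extremes(scores):
    """Count how many scores equal the smallest or the largest value."""
    lo, hi = min(scores), max(scores)
    return sum(1 for x in scores if x in (lo, hi))

# Hypothetical data sets with identical ranges.
clustered = [2, 5, 5, 5, 5, 5, 5, 8]    # most scores near the center
at_extremes = [2, 2, 2, 2, 8, 8, 8, 8]  # every score at an extreme

for name, data in [("clustered", clustered), ("at_extremes", at_extremes)]:
    print(name, "range:", value_range(data),
          "scores at extremes:", count_at_extremes(data))
```

The range reports the same value for both sets, so it cannot distinguish them.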

III. Variance: indicates the total amount of variability (differences between scores) in a data set by directly considering every observation.
-Requires a point to which each observation can be compared, to assess how much the observations differ.
-The mean can be used as the point of comparison, since it considers every observation in its calculation.

-The sum of the deviations from the mean, Σ(X - Mean), is always 0 for any data set, because the negative and positive deviations cancel out. This makes the summed deviations useless as a single summary value.
-If we square each deviation, every value becomes zero or positive, so the deviations can no longer cancel and we are left with a more meaningful value.
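The cancellation can be verified directly. This sketch uses illustrative scores (chosen to have a mean of 5, since the original data table is not reproduced here).

```python
# Illustrative scores with a mean of 5 (not the original data table).
scores = [2, 3, 4, 5, 5, 6, 7, 8]

mean = sum(scores) / len(scores)

# Deviation of each score from the mean: negative below it, positive above it.
deviations = [x - mean for x in scores]
# Squaring makes every deviation zero or positive, so nothing cancels.
squared_deviations = [d ** 2 for d in deviations]

sum_of_deviations = sum(deviations)
sum_of_squares = sum(squared_deviations)

print("deviations:", deviations)
print("sum of deviations:", sum_of_deviations)
print("sum of squared deviations:", sum_of_squares)
```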

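-If we sum the squared deviations we no longer get 0, but a number that reflects the total variability in this data set (the sum of squares).
-If we divide that number by N (population) or n - 1 (sample) we get the average squared deviation, which is the variance.

Definitional population formula:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$

Definitional sample formula:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Here $\mu$ is the population mean and $\bar{X}$ is the sample mean.
-Note: the sample variance uses n - 1 rather than N because it is an estimate of the population variance. Deviations measured from the sample mean tend to understate the variability in the population, and the smaller denominator corrects for this. For the same set of scores, dividing by n - 1 gives a slightly larger value than dividing by n (unless every score is identical).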
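The two definitional formulas differ only in the denominator, which the following sketch makes explicit. The function names are my own, and the scores are illustrative values rather than the original data table; the results are cross-checked against `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import statistics

def sum_of_squares(scores):
    """Sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

def population_variance(scores):
    """Definitional population formula: divide by N."""
    return sum_of_squares(scores) / len(scores)

def sample_variance(scores):
    """Definitional sample formula: divide by n - 1."""
    return sum_of_squares(scores) / (len(scores) - 1)

# Illustrative scores (not the original data table).
scores = [2, 3, 4, 5, 5, 6, 7, 8]

pop_var = population_variance(scores)
samp_var = sample_variance(scores)

print("population variance:", pop_var)
print("sample variance:", samp_var)
print("matches statistics module:",
      pop_var == statistics.pvariance(scores),
      samp_var == statistics.variance(scores))
```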


-The definitional formula is time consuming for large data sets.
-There are algebraically equivalent formulas that are easier to calculate by hand.

Computational population formula:
$$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}$$

Computational sample formula:
$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$$

-Note: as with the definitional formula, the sample variance divides by n - 1 rather than N because it is an estimate of the population variance.
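A quick check that the computational and definitional formulas agree, sketched for the sample case. The function names and the scores are illustrative assumptions, not part of the original slides.

```python
def sample_variance_definitional(scores):
    """s^2 = sum((X - mean)^2) / (n - 1)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / (n - 1)

def sample_variance_computational(scores):
    """s^2 = (sum(X^2) - (sum(X))^2 / n) / (n - 1)."""
    n = len(scores)
    sum_x = sum(scores)                      # sum of the scores
    sum_x_sq = sum(x ** 2 for x in scores)   # sum of the squared scores
    return (sum_x_sq - sum_x ** 2 / n) / (n - 1)

# Illustrative scores (not the original data table).
scores = [2, 3, 4, 5, 5, 6, 7, 8]

definitional = sample_variance_definitional(scores)
computational = sample_variance_computational(scores)

print("definitional:", definitional)
print("computational:", computational)
```

The computational form needs only two running totals (the sum of the scores and the sum of their squares), which is what makes it convenient for hand calculation. In floating-point arithmetic it can lose precision when the mean is large relative to the spread, so the definitional form is often the safer choice in software.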


-Problems with variance: this formula is the base for many other statistical formulas, but as a single summary measure it has little numerical meaning until it is converted back to the original units. Right now it represents the average squared distance of each penguin from the mean, in squared-mile units.

IV. Standard Deviation: the square root of the variance.
-The variance expressed in the original units of measurement.
-It provides us with a numerically meaningful measure of variability: roughly, the typical distance of each observation from the mean.
-This value (when combined with other statistical methods) allows us to infer what percentage of our observations are within a certain distance of the mean.

Standard deviation (based on the computational formula for variance):
$$\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}} \qquad s = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}}$$

With respect to sample standard deviations (s), we can say:
-Zoo penguins are an average of 0 miles from the mean number of miles walked before hallucinating.
-South Pole penguins are an average of 2 miles from the mean number of miles walked before hallucinating.
-North Pole penguins are an average of 1.69 miles from the mean number of miles walked before hallucinating.
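The sample standard deviation can be sketched as the square root of the sample variance. The zoo scores follow from the statement that every zoo penguin traveled 5 miles; the South Pole scores are illustrative values chosen to match the summary statistics quoted above, since the original data tables are not reproduced here. The North Pole data are not available, so that group is left out.

```python
import math

def sample_sd(scores):
    """Sample standard deviation: square root of the sample variance."""
    n = len(scores)
    mean = sum(scores) / n
    sum_sq = sum((x - mean) ** 2 for x in scores)  # sum of squared deviations
    return math.sqrt(sum_sq / (n - 1))

# Every zoo penguin traveled 5 miles (n = 8).
zoo = [5, 5, 5, 5, 5, 5, 5, 5]
# Illustrative values only: chosen to match the quoted summary statistics.
south_pole = [2, 3, 4, 5, 5, 6, 7, 8]

zoo_sd = sample_sd(zoo)
south_pole_sd = sample_sd(south_pole)

print("Zoo s:", zoo_sd)
print("South Pole s:", south_pole_sd)
```

Unlike the variance, these values are in miles, the same units as the original scores.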