Numerical descriptive measures

1/8/014
AGSC 30 Statistical Methods
Numerical descriptive measures

Data representation
1. Measures of central tendency, e.g., mean, mode, median, midrange
2. Measures of variation, e.g., range, variance, standard deviation
3. Measures of distribution shape, e.g., normal, skewed, uniform, random
4. Measures of position, e.g., percentiles, quartiles, standard scores

Data organization
Height of 20 trees: 50, 45, 32, 48, 56, 38, 42, 48, 55, 36, 41, 51, 30, 59, 53, 47, 57, 51, 46, 44
[Figure: histogram of the tree heights; x-axis: Height, 30 to 60]

Measures of central tendency

1. Mean, or arithmetic average
Definition: the sum of the values divided by the total number of observations.

2. Median
Definition: the midpoint (middle value) of a group of data; the point that separates the data into two sets with the same number of observations.
Steps: arrange the data in order, then find the midpoint.

3. Mode
Definition: the most frequently occurring value or observation.
Notes: the mode is not always unique; a data set can also be bimodal or multimodal.

4. Midrange
Definition: the sum of the lowest and highest values divided by 2.
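The four definitions above can be sketched in a few lines of Python. This is a minimal illustration that uses the standard-library `statistics` module; the sample values and the helper name `midrange` are hypothetical and do not come from the slides.

```python
from statistics import mean, median, multimode


def midrange(values):
    """Midrange: (lowest value + highest value) / 2."""
    return (min(values) + max(values)) / 2


# Hypothetical sample, chosen only for illustration (not from the slides).
sample = [3, 5, 5, 6, 8, 9, 13]

sample_mean = mean(sample)          # sum of values / number of observations
sample_median = median(sample)      # middle value of the sorted data
sample_modes = multimode(sample)    # list of all most frequent values
sample_midrange = midrange(sample)  # (3 + 13) / 2

print(sample_mean, sample_median, sample_modes, sample_midrange)

# A bimodal data set: multimode returns every tied value.
bimodal_modes = multimode([1, 1, 2, 2, 3])
```

`multimode` returns a list rather than a single value, which matches the note that the mode is not always unique.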

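Measures of central tendency: summary
[Table: columns "Statistics" and "Value"; only the row label "Mean" survives in the transcription]

Relationship among mean, median, and mode
Depending on the shape of the histogram (frequency distribution), the mean can lie on either side of the median and the mode, or coincide with them:
Mean = Median = Mode (symmetric distribution)
Mean < Median < Mode (left-skewed distribution)
Mode < Median < Mean (right-skewed distribution)

Measures of variation

1. Range
Definition: the difference between the largest and smallest observations.
$\text{Range} = x_{\max} - x_{\min}$, where $x_{\max}$ is the largest observation and $x_{\min}$ is the smallest observation.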
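As a rough check of the orderings above, the sketch below compares mean, median, and mode on two small hypothetical samples, one with a long right tail and its mirror image with a long left tail, and also computes the range. The data and the helper name `data_range` are illustrative assumptions.

```python
from statistics import mean, median, mode


def data_range(values):
    """Range: largest observation minus smallest observation."""
    return max(values) - min(values)


# Hypothetical right-skewed sample (one large value pulls the mean up).
right_skewed = [1, 2, 2, 3, 4, 5, 18]
# Mirror image of the same sample, which is left-skewed.
left_skewed = [20 - x for x in right_skewed]

for name, data in (("right-skewed", right_skewed), ("left-skewed", left_skewed)):
    print(name, "mode:", mode(data), "median:", median(data),
          "mean:", mean(data), "range:", data_range(data))
```

These orderings are a tendency for unimodal data rather than a strict rule, so the samples here were picked to show the typical case.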

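2. Variance
Definition: the sum of the squared differences between each observation and the mean, divided by the number of observations (for a sample, divided by $n - 1$).
Population: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

Working formulas for variance and standard deviation
[Formulas not preserved in the transcription]

3. Standard deviation
Definition: the square root of the variance; a measure of the spread of the observations in the original units.
Population: $\sigma = \sqrt{\sigma^2}$
Sample: $s = \sqrt{s^2}$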
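A minimal sketch of the variance and standard deviation definitions follows. The slides' own working formulas are not legible, so the computational form used here, $s^2 = \frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}$, is an assumption: it is the usual textbook shortcut and may differ from what the slide showed. The data are hypothetical.

```python
from math import sqrt


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_working(values):
    """Assumed shortcut form: (sum of x^2 - (sum of x)^2 / n) / (n - 1)."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


# Hypothetical data, not taken from the slides.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("population variance:", population_variance(data))
print("population std dev: ", sqrt(population_variance(data)))
print("sample variance:    ", sample_variance(data))
print("sample std dev:     ", sqrt(sample_variance(data)))
print("working formula:    ", sample_variance_working(data))
```

The definitional and shortcut forms agree up to floating-point rounding; the shortcut avoids computing the mean first, which is why it is convenient for hand calculation.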

Variance and standard deviation
Using the definition: ...
Using the working formulas: ...

Range rule of thumb
A rough estimate of the standard deviation is a quarter of the range:
$s \approx \frac{\text{range}}{4}$
Example using tree data: $s \approx \ldots$

4. Coefficient of variation
The ratio between the standard deviation and the mean, expressed as a percentage.
Sample: $CV = \frac{s}{\bar{x}} \times 100$
Population: $CV = \frac{\sigma}{\mu} \times 100$
Example using tree data: $CV = \ldots \times 100 = \ldots$ [%]
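The range rule of thumb and the sample coefficient of variation can be sketched as below. The data are hypothetical rather than the tree heights, and the function names are illustrative.

```python
from statistics import mean, stdev


def range_rule_estimate(values):
    """Rough estimate of the standard deviation: range / 4."""
    return (max(values) - min(values)) / 4


def coefficient_of_variation(values):
    """Sample CV in percent: (s / x_bar) * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical data, not taken from the slides.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print("range-rule estimate of s:", range_rule_estimate(data))
print("actual sample s:         ", stdev(data))
print("CV [%]:                  ", coefficient_of_variation(data))
```

Comparing the two printed standard deviations shows how coarse the rule of thumb can be on a small sample.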

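Measures of central tendency for a frequency distribution

Mean, or arithmetic average
Definition: the sum of the values divided by the total number of observations.
Sample data: $\bar{x} = \frac{\sum_{j=1}^{k} f_j x_j}{\sum_{j=1}^{k} f_j}$
Population data: $\mu = \frac{\sum_{j=1}^{k} f_j x_j}{\sum_{j=1}^{k} f_j}$
where $x_j$ is the value of the midpoint of the $j$th class, $k$ is the number of classes, and $f_j$ is the frequency of the $j$th class.

Variance and standard deviation for a frequency distribution
Sample data: $s^2 = \frac{\sum_{j=1}^{k} f_j (x_j - \bar{x})^2}{\sum_{j=1}^{k} f_j - 1}$
Population data: $\sigma^2 = \frac{\sum_{j=1}^{k} f_j (x_j - \mu)^2}{\sum_{j=1}^{k} f_j}$
where $x_j$ is the value of the midpoint of the $j$th class, $k$ is the number of classes, and $f_j$ is the frequency of the $j$th class.

Example: daily commuting times, in minutes
Calculate the mean, variance, standard deviation, and CV.

Daily commuting time    Number of employees
Less than 10 min        4
10-20 min               9
20-30 min               6
30-40 min               4
40-50 min               2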
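The grouped-data formulas can be tried on the commuting-time example with a short sketch. Several things here are assumptions: the class midpoints are taken as 5, 15, 25, 35, and 45 minutes (treating "less than 10 min" as 0 to 10), the last class count is taken as 2 because that cell is illegible in the transcription, and the employees are treated as a population.

```python
from math import sqrt

# Class midpoints (assumed) and frequencies for the commuting-time example.
midpoints = [5, 15, 25, 35, 45]
frequencies = [4, 9, 6, 4, 2]  # the final count (2) is an assumed value


def grouped_mean(x, f):
    """Mean of grouped data: sum(f_j * x_j) / sum(f_j)."""
    return sum(fj * xj for xj, fj in zip(x, f)) / sum(f)


def grouped_population_variance(x, f):
    """Population variance of grouped data: sum(f_j * (x_j - mu)^2) / sum(f_j)."""
    mu = grouped_mean(x, f)
    return sum(fj * (xj - mu) ** 2 for xj, fj in zip(x, f)) / sum(f)


mu = grouped_mean(midpoints, frequencies)
variance = grouped_population_variance(midpoints, frequencies)
sigma = sqrt(variance)
cv = sigma / mu * 100

print("mean:", mu)
print("variance:", variance)
print("standard deviation:", sigma)
print("CV [%]:", cv)
```

For the sample version, the only change is dividing by `sum(f) - 1` instead of `sum(f)` in the variance.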

Remember: within a class, all individuals are assumed to have the mid-value of that class.
Mid-value of the class = class mean

Commuting time    # employees    Class mean    ...
< 10 min          4              5             20
10-20 min         9
20-30 min         6
30-40 min         4
40-50 min         2
Total
[The remaining column headers, involving x and μ, are garbled in the transcription]

Mean commuting time: $\mu = \ldots$
Variance: $\sigma^2 = \frac{\sum_{j=1}^{k} f_j (x_j - \mu)^2}{\sum_{j=1}^{k} f_j} = \frac{4(5 - \ldots)^2 + 9(15 - \ldots)^2 + \ldots}{4 + 9 + 6 + 4 + 2} = \ldots$
Standard deviation: $\sigma = \ldots$
Coefficient of variation: $CV = \frac{\sigma}{\mu} \times 100 = \ldots$

Use of standard deviation
Connects the mean with the standard deviation.
Chebyshev's theorem: for any $k > 1$, at least $1 - 1/k^2$ of the data lie within $k$ standard deviations of the mean.
Example: if $k = 2$, then $1 - 1/k^2 = 1 - 1/4 = 0.75$, or 75%. This means that at least 75% of the data values are within two standard deviations of the mean.
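Chebyshev's bound can be checked empirically with a small sketch. The data are hypothetical, the function names are illustrative, and the population standard deviation is used for the check.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations: 1 - 1/k^2."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def fraction_within(values, k):
    """Observed fraction of values within k population std devs of the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    inside = [x for x in values if abs(x - mu) <= k * sigma]
    return len(inside) / len(values)


# Hypothetical data, not taken from the slides.
data = [2, 4, 4, 4, 5, 5, 7, 9]

for k in (2, 3):
    print(k, chebyshev_lower_bound(k), fraction_within(data, k))
```

The theorem only gives a lower bound, so the observed fraction is usually well above it.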

Measures of distribution shape

Skewness: a measure of the asymmetry of the frequency distribution
$\text{skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / (n - 1)}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) \right]^{3/2}}$

Kurtosis: a measure of the "peakedness" of the frequency distribution
$\text{kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4 / (n - 1)}{\left[ \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) \right]^{2}} - 3$

Measures of position
Locate the relative position of an observation within a data set.

Percentiles
Divide the data set into 100 groups with an equal number of observations.
Indicate the position of an individual in a group.
Fields of use: education, health-related industry, life sciences.
$\text{percentile} = \frac{(\#\text{ observations less than } x) + 0.5}{\text{total } \#\text{ observations}} \times 100$ [%]

[Figure: percentile charts]
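The shape and position measures above can be sketched as follows. The formulas are garbled in the transcription, so this code assumes moments with an $n - 1$ divisor, an exponent of 3/2 in the skewness denominator, and kurtosis reported as excess kurtosis (minus 3). The data and function names are hypothetical.

```python
def moment(values, power):
    """Sum of (x - x_bar)^power divided by n - 1 (assumed divisor)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** power for x in values) / (n - 1)


def skewness(values):
    """Third moment divided by the second moment raised to 3/2."""
    return moment(values, 3) / moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth moment divided by the squared second moment, minus 3."""
    return moment(values, 4) / moment(values, 2) ** 2 - 3


def percentile_rank(values, x):
    """((# observations less than x) + 0.5) / total # observations * 100."""
    below = sum(1 for v in values if v < x)
    return (below + 0.5) / len(values) * 100


# Hypothetical data, not taken from the slides.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 4, 5, 18]

print("skewness (symmetric):   ", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("kurtosis (symmetric):   ", kurtosis(symmetric))
print("percentile of 30:       ", percentile_rank([10, 20, 30, 40, 50], 30))
```

Statistical packages often use slightly different divisors or bias corrections, so their skewness and kurtosis values may not match this sketch exactly.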

Standard scores
Compare the relative position of observations within their defining data set.
Standard score, or z-score:
$z = \frac{\text{observation's value} - \text{mean}}{\text{standard deviation}} = \frac{x - \bar{x}}{s}$
Allows comparison of different data sets or different types of data.

Standard score example
A student received 92% in Statistics and 75% in English. Was the student's overall performance bad?
Additional information: the mean grade was 85 for Statistics and 70 for English; the variance was 36 for Statistics and 9 for English.
Compute the z-scores:
$z_{\text{Statistics}} = \frac{x - \bar{x}}{s} = \ldots$
$z_{\text{English}} = \frac{x - \bar{x}}{s} = \ldots$
Conclusion: ...

Population vs. statistics
Various numerical measures (mean, median, variance, coefficient of variation) can be computed for the population as well as for a sample.
When a measure is computed for the entire population, it is called a population parameter, or simply a PARAMETER.
When a measure is computed for a portion of the population (namely, a sample), it is called a sample statistic, or simply a STATISTIC.
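The z-score example can be worked through with a short sketch. It assumes the Statistics grade is 92 (that digit is unclear in the transcription) and takes the standard deviation as the square root of each stated variance; the function name is illustrative.

```python
from math import sqrt


def z_score(value, mean, variance):
    """Standard score: (value - mean) / standard deviation."""
    return (value - mean) / sqrt(variance)


# Grades, class means, and variances from the example.
# The Statistics grade of 92 is an assumed reading of the source.
z_statistics = z_score(92, 85, 36)
z_english = z_score(75, 70, 9)

print("z (Statistics):", z_statistics)
print("z (English):   ", z_english)
```

Under these assumptions both z-scores are positive, so the student is above the class mean in both subjects, and the English grade sits further above its class mean in standard-deviation units even though the raw score is lower.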