Business Statistics, 5th ed. by Ken Black

Business Statistics, 5th ed. by Ken Black Chapter 3 Discrete Distributions Descriptive Statistics PowerPoint presentations prepared by Lloyd Jaisingh, Morehead State University

Learning Objectives
- Distinguish between measures of central tendency, measures of variability, measures of shape, and measures of association.
- Understand the meanings of mean, median, mode, quartile, percentile, and range.
- Compute mean, median, mode, percentile, quartile, range, variance, standard deviation, and mean absolute deviation on ungrouped data.
- Differentiate between sample and population variance and standard deviation.

Learning Objectives -- Continued
- Understand the meaning of standard deviation as it is applied by using the empirical rule and Chebyshev's theorem.
- Compute the mean, mode, standard deviation, and variance on grouped data.
- Understand skewness, kurtosis, and box and whisker plots.
- Compute a coefficient of correlation and interpret it.

Measures of Central Tendency: Ungrouped Data
- Measures of central tendency yield information about the center, or middle part, of a group of numbers.
- Common measures of central tendency: mode, median, mean, percentiles, quartiles

Mode
- The most frequently occurring value in a data set
- Applicable to all levels of data measurement (nominal, ordinal, interval, and ratio)
- Bimodal -- Data sets that have two modes
- Multimodal -- Data sets that contain more than two modes

Mode -- Example
The mode is 44: it is the most frequently occurring data value.

    35  41  44  45
    37  41  44  46
    37  43  44  46
    39  43  44  46
    40  43  44  46
    40  43  45  48
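
As a minimal sketch in Python (the function name `modes` is illustrative and not part of the slides), the mode can be found by counting occurrences and keeping every value tied for the highest count. Returning a list also covers the bimodal and multimodal cases.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# The 24 values from the mode example.
data = [35, 37, 37, 39, 40, 40, 41, 41, 43, 43, 43, 43,
        44, 44, 44, 44, 44, 45, 45, 46, 46, 46, 46, 48]

print(modes(data))             # [44]
print(modes([1, 1, 2, 2, 3]))  # [1, 2], a bimodal data set
```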

Median
- Middle value in an ordered array of numbers
- Applicable for ordinal, interval, and ratio data
- Not applicable for nominal data
- Unaffected by extremely large and extremely small values

Median: Computational Procedure
First Procedure
- Arrange the observations in an ordered array.
- If there is an odd number of terms, the median is the middle term of the ordered array.
- If there is an even number of terms, the median is the average of the middle two terms.
Second Procedure
- The median's position in an ordered array is given by (n+1)/2.

Median: Example with an Odd Number of Terms
Ordered Array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21 22
- There are 17 terms in the ordered array.
- Position of median = (n+1)/2 = (17+1)/2 = 9
- The median is the 9th term, which is 15.
- If the 22 is replaced by 100, the median is 15.
- If the 3 is replaced by -103, the median is 15.

Median: Example with an Even Number of Terms
Ordered Array: 3 4 5 7 8 9 11 14 15 16 16 17 19 19 20 21
- There are 16 terms in the ordered array.
- Position of median = (n+1)/2 = (16+1)/2 = 8.5
- The median is between the 8th and 9th terms: (14 + 15)/2 = 14.5.
- If the 21 is replaced by 100, the median is 14.5.
- If the 3 is replaced by -88, the median is 14.5.
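
The first procedure can be sketched in Python as follows. The name `median_of` is an illustrative choice. The last two checks repeat the slides' point that replacing an extreme value leaves the median unchanged.

```python
def median_of(values):
    """Median via the ordered-array procedure: middle term, or mean of the two middle terms."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd n: the middle term
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of the middle two


odd = [3, 4, 5, 7, 8, 9, 11, 14, 15, 16, 16, 17, 19, 19, 20, 21, 22]
even = odd[:-1]  # the same array without the 22

print(median_of(odd))               # 15
print(median_of(even))              # 14.5
print(median_of(odd[:-1] + [100]))  # 15, the extreme value has no effect
print(median_of([-88] + even[1:]))  # 14.5
```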

Arithmetic Mean
- Commonly called the mean
- Is the average of a group of numbers
- Applicable for interval and ratio data
- Not applicable for nominal or ordinal data
- Affected by each value in the data set, including extreme values
- Computed by summing all values in the data set and dividing the sum by the number of values in the data set

Population Mean
$$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N} = \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$$

Sample Mean
$$\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + X_3 + \dots + X_n}{n} = \frac{57 + 86 + 42 + 38 + 90 + 66}{6} = \frac{379}{6} = 63.167$$
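
The population mean and the sample mean use the same arithmetic; only the notation differs. A short Python sketch (with `mean_of` as an illustrative name) reproduces both worked examples and shows how one extreme value shifts the mean.

```python
def mean_of(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


population = [24, 13, 19, 26, 11]
sample = [57, 86, 42, 38, 90, 66]

print(mean_of(population))        # 18.6
print(round(mean_of(sample), 3))  # 63.167

# Unlike the median, the mean moves when one value becomes extreme.
print(mean_of([24, 13, 19, 26, 1100]))  # 236.4
```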

Percentiles
- Measures of central tendency that divide a group of data into 100 parts
- At least n% of the data lie below the nth percentile, and at most (100 - n)% of the data lie above the nth percentile
- Example: 90th percentile indicates that at least 90% of the data lie below it, and at most 10% of the data lie above it
- The median and the 50th percentile have the same value.
- Applicable for ordinal, interval, and ratio data
- Not applicable for nominal data

Percentiles: Computational Procedure
- Organize the data into an ascending ordered array.
- Calculate the percentile location: $i = \frac{P}{100}(n)$, where P = percentile, i = percentile location, n = sample size.
- Determine the percentile's location and its value.
- If i is a whole number, the percentile is the average of the values at the i and (i + 1) positions.
- If i is not a whole number, the percentile is at the whole number part of (i + 1) in the ordered array.

Percentiles: Example
- Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
- Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
- Location of 30th percentile: $i = \frac{30}{100}(8) = 2.4$
- The location index, i, is not a whole number; i + 1 = 2.4 + 1 = 3.4; the whole number portion is 3; the 30th percentile is at the 3rd location of the array; the 30th percentile is 13.
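
The location rule can be sketched in Python as below. The function name `percentile` and the restriction to integer P with 0 < P < 100 are assumptions made for this illustration. Other software may use different percentile conventions, so results can differ from library defaults.

```python
def percentile(values, p):
    """Value at the p-th percentile using the location rule i = (p / 100) * n.

    Assumes p is an integer with 0 < p < 100.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Integer divmod avoids floating-point trouble when testing
    # whether i is a whole number.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is whole: average the values at positions i and i + 1 (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not whole: use position int(i + 1) (1-based), i.e. index `whole`.
    return ordered[whole]


raw = [14, 12, 19, 23, 5, 13, 28, 17]
print(percentile(raw, 30))  # 13
```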

Quartiles
- Measures of central tendency that divide a group of data into four subgroups
- Q1: 25% of the data set is below the first quartile
- Q2: 50% of the data set is below the second quartile
- Q3: 75% of the data set is below the third quartile
- Q1 is equal to the 25th percentile
- Q2 is located at the 50th percentile and equals the median
- Q3 is equal to the 75th percentile
- Quartile values are not necessarily members of the data set

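Quartiles: Example
Ordered array: 106, 109, 114, 116, 121, 122, 125, 129
- Q1: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
- Q2: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
- Q3: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$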

Variability
[Figure: two cash-flow panels, each marked with its mean: "No Variability in Cash Flow (same amounts)" and "Variability in Cash Flow (different amounts)".]

Measures of Variability: Ungrouped Data
- Measures of variability describe the spread or the dispersion of a set of data.
- Common measures of variability: range, interquartile range, mean absolute deviation, variance, standard deviation, Z scores, coefficient of variation

Range
- The difference between the largest and the smallest values in a set of data
- Simple to compute
- Ignores all data points except the two extremes
- Example: Range = Largest - Smallest = 48 - 35 = 13

    35  41  44  45
    37  41  44  46
    37  43  44  46
    39  43  44  46
    40  43  44  46
    40  43  45  48

Interquartile Range
- Range of values between the first and third quartiles
- Range of the middle 50% of the ordered data set
- Less influenced by extremes
$$\text{Interquartile Range} = Q_3 - Q_1$$
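
A small Python sketch contrasts the range with the interquartile range, using the eight values from the quartiles example. It assumes the quartiles are located with the percentile-location rule given earlier in these slides. The helper names are illustrative.

```python
def located_percentile(ordered, p):
    """Percentile of an already sorted list via the rule i = (p / 100) * n."""
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]


def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def interquartile_range(values):
    """Q3 - Q1, the spread of the middle 50% of the ordered data."""
    ordered = sorted(values)
    return located_percentile(ordered, 75) - located_percentile(ordered, 25)


data = [106, 109, 114, 116, 121, 122, 125, 129]
print(value_range(data))          # 23
print(interquartile_range(data))  # 12.0

# One extreme value changes the range but not the IQR.
print(value_range(data[:-1] + [500]))          # 394
print(interquartile_range(data[:-1] + [500]))  # 12.0
```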

Deviation from the Mean
- Data set: 5, 9, 16, 17, 18
- Mean: $\mu = 13$
- Deviations from the mean, $X - \mu$: -8, -4, +3, +4, +5
[Figure: number line from 0 to 20 showing the deviations -8, -4, +3, +4, +5 around the mean.]

Mean Absolute Deviation
Average of the absolute deviations from the mean

    X        X - μ    |X - μ|
    5         -8        +8
    9         -4        +4
    16        +3        +3
    17        +4        +4
    18        +5        +5
    Total      0        24

$$M.A.D. = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$$
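
The same calculation as a short Python sketch (the function name is illustrative). The raw deviations always sum to zero, which is why absolute values are taken before averaging.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the mean."""
    mu = sum(values) / len(values)
    return sum(abs(x - mu) for x in values) / len(values)


data = [5, 9, 16, 17, 18]
mu = sum(data) / len(data)

print(sum(x - mu for x in data))      # 0.0
print(mean_absolute_deviation(data))  # 4.8
```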

Population Variance
Average of the squared deviations from the arithmetic mean

    X        X - μ    (X - μ)²
    5         -8        64
    9         -4        16
    16        +3         9
    17        +4        16
    18        +5        25
    Total      0       130

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$

Population Standard Deviation
Square root of the variance
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} = 5.1$$

Sample Variance
Average of the squared deviations from the arithmetic mean

    X         X - X̄     (X - X̄)²
    2,398      625      390,625
    1,844       71        5,041
    1,539     -234       54,756
    1,311     -462      213,444
    7,092        0      663,866

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$

Sample Standard Deviation
Square root of the sample variance
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{663{,}866}{3} = 221{,}288.67$$
$$S = \sqrt{S^2} = \sqrt{221{,}288.67} = 470.41$$
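
The population and sample formulas differ only in the divisor (N versus n - 1). The sketch below makes that explicit with a `sample` flag, which is an illustrative design choice. Python's standard `statistics` module offers `pvariance`, `pstdev`, `variance`, and `stdev` for the same purpose.

```python
from math import sqrt


def variance(values, sample=False):
    """Sum of squared deviations from the mean, divided by N (population) or n - 1 (sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def standard_deviation(values, sample=False):
    """Square root of the corresponding variance."""
    return sqrt(variance(values, sample))


population = [5, 9, 16, 17, 18]
print(variance(population))                      # 26.0
print(round(standard_deviation(population), 1))  # 5.1

observations = [2398, 1844, 1539, 1311]
print(round(variance(observations, sample=True), 2))            # 221288.67
print(round(standard_deviation(observations, sample=True), 2))  # 470.41
```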

Uses of Standard Deviation
- Indicator of financial risk
- Quality control
  - construction of quality control charts
  - process capability studies
- Comparing populations
  - household incomes in two cities
  - employee absenteeism at two plants

Standard Deviation as an Indicator of Financial Risk

                          Annualized Rate of Return
    Financial Security        μ          σ
    A                        15%         3%
    B                        15%         7%

Coefficient of Variation
- Ratio of the standard deviation to the mean, expressed as a percentage
- Measurement of relative dispersion
$$CV = \frac{\sigma}{\mu}(100)$$

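Coefficient of Variation
$$\mu_1 = 29, \quad \sigma_1 = 4.6 \qquad \mu_2 = 84, \quad \sigma_2 = 10$$
$$CV_1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$$
$$CV_2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$$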
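
A brief Python sketch of the comparison (the function name is illustrative). The second data set has the larger standard deviation but the smaller relative dispersion.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100


cv_1 = coefficient_of_variation(4.6, 29)
cv_2 = coefficient_of_variation(10, 84)

print(round(cv_1, 2))  # 15.86
print(round(cv_2, 2))  # 11.9
```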

Empirical Rule
Data are normally distributed (or approximately normal)

    Distance from the Mean    Percentage of Values Falling Within Distance
    μ ± 1σ                    68
    μ ± 2σ                    95
    μ ± 3σ                    99.7

Chebyshev's Theorem
Applies to all distributions
$$P(\mu - k\sigma < X < \mu + k\sigma) \ge 1 - \frac{1}{k^2} \quad \text{for } k > 1$$

Chebyshev's Theorem
Applies to all distributions

    Number of Standard Deviations    Distance from the Mean    Minimum Proportion of Values Falling Within Distance
    K = 2                            μ ± 2σ                    1 - 1/2² = 0.75
    K = 3                            μ ± 3σ                    1 - 1/3² = 0.89
    K = 4                            μ ± 4σ                    1 - 1/4² = 0.94
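
The bound is simple enough to tabulate directly. In this Python sketch the function name `chebyshev_minimum` is illustrative, and rejecting k <= 1 is a choice made here because the theorem is stated only for k > 1.

```python
def chebyshev_minimum(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is stated for k > 1")
    return 1 - 1 / k ** 2


for k in (2, 3, 4):
    print(k, round(chebyshev_minimum(k), 2))
# 2 0.75
# 3 0.89
# 4 0.94
```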

Measures of Central Tendency and Variability: Grouped Data
- Measures of central tendency: mean, median, mode
- Measures of variability: variance, standard deviation

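Mean of Grouped Data
- Weighted average of class midpoints
- Class frequencies are the weights
$$\mu_{grouped} = \frac{\sum fM}{\sum f} = \frac{\sum fM}{N} = \frac{f_1 M_1 + f_2 M_2 + f_3 M_3 + \dots + f_i M_i}{f_1 + f_2 + f_3 + \dots + f_i}$$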

Calculation of Grouped Mean

    Class Interval    Frequency (f)    Class Midpoint (M)    fM
    20-under 30             6                 25             150
    30-under 40            18                 35             630
    40-under 50            11                 45             495
    50-under 60            11                 55             605
    60-under 70             3                 65             195
    70-under 80             1                 75              75
    Total                  50                               2150

$$\mu = \frac{\sum fM}{\sum f} = \frac{2150}{50} = 43.0$$
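
As a Python sketch, each class can be held as a `(lower, upper, frequency)` tuple. That layout and the function name are assumptions made for this illustration.

```python
def grouped_mean(classes):
    """Weighted average of class midpoints, with class frequencies as weights."""
    total_f = sum(f for _, _, f in classes)
    total_fm = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return total_fm / total_f


# (lower limit, upper limit, frequency) for each class interval
classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

print(grouped_mean(classes))  # 43.0
```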

Median of Grouped Data
$$\text{Median} = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W)$$
Where:
L = the lower limit of the median class
cf_p = cumulative frequency of class preceding the median class
f_med = frequency of the median class
W = width of the median class
N = total of frequencies

Median of Grouped Data -- Example

    Class Interval    Frequency    Cumulative Frequency
    20-under 30            6                6
    30-under 40           18               24
    40-under 50           11               35
    50-under 60           11               46
    60-under 70            3               49
    70-under 80            1               50
                      N = 50

$$Md = L + \frac{\frac{N}{2} - cf_p}{f_{med}}(W) = 40 + \frac{\frac{50}{2} - 24}{11}(10) = 40.909$$
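
The formula can be sketched in Python by walking the cumulative frequencies until the class containing the N/2 position is reached. The `(lower, upper, frequency)` tuple layout and the function name are illustrative assumptions.

```python
def grouped_median(classes):
    """Median = L + ((N/2 - cf_p) / f_med) * W for the class holding the N/2 position."""
    n = sum(f for _, _, f in classes)
    cumulative = 0  # cf_p: cumulative frequency of the preceding classes
    for lower, upper, f in classes:
        if cumulative + f >= n / 2:
            width = upper - lower
            return lower + (n / 2 - cumulative) / f * width
        cumulative += f
    raise ValueError("classes must contain at least one observation")


classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

print(round(grouped_median(classes), 3))  # 40.909
```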

Mode of Grouped Data
- Midpoint of the modal class
- Modal class has the greatest frequency

    Class Interval    Frequency
    20-under 30            6
    30-under 40           18
    40-under 50           11
    50-under 60           11
    60-under 70            3
    70-under 80            1

$$\text{Mode} = \frac{30 + 40}{2} = 35$$

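Variance and Standard Deviation of Grouped Data
Population: $\sigma^2 = \frac{\sum f (M - \mu)^2}{N}$, $\sigma = \sqrt{\sigma^2}$
Sample: $S^2 = \frac{\sum f (M - \bar{X})^2}{n - 1}$, $S = \sqrt{S^2}$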

Population Variance and Standard Deviation of Grouped Data

    Class Interval     f     M      fM    M - μ    (M - μ)²    f(M - μ)²
    20-under 30        6    25     150     -18        324        1944
    30-under 40       18    35     630      -8         64        1152
    40-under 50       11    45     495       2          4          44
    50-under 60       11    55     605      12        144        1584
    60-under 70        3    65     195      22        484        1452
    70-under 80        1    75      75      32       1024        1024
    Total             50          2150                           7200

$$\sigma^2 = \frac{\sum f (M - \mu)^2}{N} = \frac{7200}{50} = 144$$
$$\sigma = \sqrt{\sigma^2} = \sqrt{144} = 12$$
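
A Python sketch of the grouped population variance follows, again assuming `(lower, upper, frequency)` tuples as an illustrative representation of the classes.

```python
from math import sqrt


def grouped_population_variance(classes):
    """Sum of f * (M - mu)^2 over all classes, divided by N."""
    n = sum(f for _, _, f in classes)
    midpoints = [((lower + upper) / 2, f) for lower, upper, f in classes]
    mu = sum(f * m for m, f in midpoints) / n
    return sum(f * (m - mu) ** 2 for m, f in midpoints) / n


classes = [(20, 30, 6), (30, 40, 18), (40, 50, 11),
           (50, 60, 11), (60, 70, 3), (70, 80, 1)]

variance = grouped_population_variance(classes)
print(variance)        # 144.0
print(sqrt(variance))  # 12.0
```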

Measures of Shape
- Skewness
  - Absence of symmetry
  - Extreme values in one side of a distribution
- Kurtosis
  - Peakedness of a distribution
  - Leptokurtic: high and thin
  - Mesokurtic: normal shape
  - Platykurtic: flat and spread out
- Box and Whisker Plots
  - Graphic display of a distribution
  - Reveals skewness

Symmetry and Skewness
[Figure: three distributions labeled "Symmetrical", "Right or Positively Skewed", and "Left or Negatively Skewed".]

Relationship of Mean, Median and Mode

Coefficient of Skewness
Summary measure for skewness
$$S_k = \frac{3(\mu - M_d)}{\sigma}$$
- If $S_k < 0$, the distribution is negatively skewed (skewed to the left).
- If $S_k = 0$, the distribution is symmetric (not skewed).
- If $S_k > 0$, the distribution is positively skewed (skewed to the right).

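Coefficient of Skewness
$$\mu_1 = 23, \; M_{d1} = 26, \; \sigma_1 = 12.3: \quad S_1 = \frac{3(\mu_1 - M_{d1})}{\sigma_1} = \frac{3(23 - 26)}{12.3} = -0.73$$
$$\mu_2 = 26, \; M_{d2} = 26, \; \sigma_2 = 12.3: \quad S_2 = \frac{3(\mu_2 - M_{d2})}{\sigma_2} = \frac{3(26 - 26)}{12.3} = 0$$
$$\mu_3 = 29, \; M_{d3} = 26, \; \sigma_3 = 12.3: \quad S_3 = \frac{3(\mu_3 - M_{d3})}{\sigma_3} = \frac{3(29 - 26)}{12.3} = +0.73$$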
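
The three cases can be checked with a few lines of Python. The function names are illustrative, and the direction labels follow the sign rules stated for the coefficient.

```python
def skewness_coefficient(mean, median, std_dev):
    """S_k = 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev


def skew_direction(s_k):
    """Interpret the sign of the coefficient."""
    if s_k < 0:
        return "negatively skewed"
    if s_k > 0:
        return "positively skewed"
    return "symmetric"


for mean in (23, 26, 29):
    s_k = skewness_coefficient(mean, 26, 12.3)
    print(mean, round(s_k, 2), skew_direction(s_k))
```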

Types of Kurtosis
[Figure: three curves labeled "Platykurtic Distribution", "Leptokurtic Distribution", and "Mesokurtic Distribution".]

Box and Whisker Plot
Five specific values are used:
- Median, Q2
- First quartile, Q1
- Third quartile, Q3
- Minimum value in the data set
- Maximum value in the data set
Inner Fences
- IQR = Q3 - Q1
- Lower inner fence = Q1 - 1.5 IQR
- Upper inner fence = Q3 + 1.5 IQR
Outer Fences
- Lower outer fence = Q1 - 3.0 IQR
- Upper outer fence = Q3 + 3.0 IQR
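
The fence formulas translate directly into code. This Python sketch borrows Q1 = 111.5 and Q3 = 123.5 from the earlier quartiles example purely as illustrative inputs. The function name and the dictionary keys are assumptions.

```python
def box_plot_fences(q1, q3):
    """Inner fences at 1.5 IQR and outer fences at 3.0 IQR beyond the quartiles."""
    iqr = q3 - q1
    return {
        "lower_outer": q1 - 3.0 * iqr,
        "lower_inner": q1 - 1.5 * iqr,
        "upper_inner": q3 + 1.5 * iqr,
        "upper_outer": q3 + 3.0 * iqr,
    }


fences = box_plot_fences(111.5, 123.5)
print(fences)
# {'lower_outer': 75.5, 'lower_inner': 93.5, 'upper_inner': 141.5, 'upper_outer': 159.5}
```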

Box and Whisker Plot
[Figure: box and whisker plot marking, from left to right, Minimum, Q1, Q2, Q3, and Maximum.]

Measures of Association
- Measures of association are statistics that yield information about the relatedness of numerical variables.
- Correlation is a measure of the degree of relatedness of variables.

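Pearson Product-Moment Correlation Coefficient
$$r = \frac{SSXY}{\sqrt{(SSX)(SSY)}} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}}$$
$$-1 \le r \le 1$$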

Three Degrees of Correlation
[Figure: three panels labeled r < 0, r > 0, and r = 0.]

Computation of r for the Economics Example (Part 1)

    Day    Interest X    Futures Index Y      X²         Y²          XY
     1        7.43            221           55.205     48,841     1,642.03
     2        7.48            222           55.950     49,284     1,660.56
     3        8.00            226           64.000     51,076     1,808.00
     4        7.75            225           60.063     50,625     1,743.75
     5        7.60            224           57.760     50,176     1,702.40
     6        7.63            223           58.217     49,729     1,701.49
     7        7.68            223           58.982     49,729     1,712.64
     8        7.67            226           58.829     51,076     1,733.42
     9        7.59            226           57.608     51,076     1,715.34
    10        8.07            235           65.125     55,225     1,896.45
    11        8.03            233           64.481     54,289     1,870.99
    12        8.00            241           64.000     58,081     1,928.00
    Summations  92.93         2,725        720.220    619,207    21,115.07

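Computation of r for the Economics Example (Part 2)
$$r = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left[\sum X^2 - \frac{(\sum X)^2}{n}\right]\left[\sum Y^2 - \frac{(\sum Y)^2}{n}\right]}} = \frac{21{,}115.07 - \frac{(92.93)(2{,}725)}{12}}{\sqrt{\left[720.22 - \frac{(92.93)^2}{12}\right]\left[619{,}207 - \frac{(2{,}725)^2}{12}\right]}} = .815$$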
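
The computational form of r can be sketched in Python using the twelve interest and futures index pairs from the economics example. The function and variable names are illustrative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r from the sums of X, Y, XY, X^2, and Y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    return ss_xy / sqrt(ss_x * ss_y)


interest = [7.43, 7.48, 8.00, 7.75, 7.60, 7.63,
            7.68, 7.67, 7.59, 8.07, 8.03, 8.00]
futures_index = [221, 222, 226, 225, 224, 223,
                 223, 226, 226, 235, 233, 241]

print(round(pearson_r(interest, futures_index), 3))  # 0.815
```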

Scatter Plot and Correlation Matrix for the Economics Example
[Figure: scatter plot of Futures Index (vertical axis, 220 to 245) against Interest (horizontal axis, 7.40 to 8.20).]

                     Interest    Futures Index
    Interest         1
    Futures Index    0.815254    1

Copyright 2008 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information herein.