Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

1 Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

2 Statistics as a Tool for LIS Research Importance of statistics in research Summarize observations to provide answers to research questions and hypotheses Make general conclusions based on specific study observations Objectively evaluate reliability of study conclusions

3 Statistics as a Tool for LIS Research Main purposes of statistics in research Describe central point in a set of data/observations Describe how broad, diversified, or variable the data in a set is Indicate whether specific features of a set of data are related, and how closely they are related Indicate probability of features of data being influenced by factors other than simply chance

4 Statistics as a Tool for LIS Research Two main types or branches of statistics Descriptive statistics Characterizing or summarizing data set Presenting data in charts and tables to clarify characteristics No inference, just describing a particular group of observations Inferential statistics Using sample data to make generalizations (inferences) or estimates about a population Statements made in terms of probability

5 Statistics as a Tool for LIS Research Descriptive and inferential statistics not mutually exclusive Overlap in what can be called descriptive and what can be called inferential Intent is important: Group of observations intended to describe an event: descriptive Group of observations collected from a sample and intended to predict what a larger population is like: inferential

6 Statistics as a Tool for LIS Research Choosing statistical methods Type of data collected largely determines choice of statistical analysis techniques Decisions about how and what type of data is collected will determine the specific statistical tests that can be performed to analyze the data Data collected should determine statistical tests used, not the other way around But consideration of how you want to analyze data should be done as part of research design to ensure study can produce the type of conclusions you want to make

7 Descriptive Statistics Commonly used in LIS research Cannot test causal relationships Primary strength is describing and summarizing data: Describing data in terms of frequency distributions Describing most typical value in data set - measures of central tendency Describing variability of data - measures of dispersion

8 Frequency Distributions Describing data in terms of frequency distributions Counts of totals by value or category for each measured variable Can be presented as absolute totals, cumulative totals, percentages, grouped totals Often a first step in statistical analysis of data Usually presented in tables or charts (histogram, bar graph, etc.) [Chart: books checked out, by age group]
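As a rough illustration of the tallies this slide lists, the Python sketch below builds absolute, cumulative, and percentage counts for one variable. The `ages` list and the `frequency_table` helper are hypothetical, invented only for this example; they are not data from the slides.

```python
from collections import Counter

# Hypothetical observations, invented for illustration only.
ages = [13, 14, 14, 15, 15, 15, 16, 16, 17]


def frequency_table(values):
    """Return rows of (value, count, cumulative count, percent of total)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value, counts[value], cumulative, 100 * counts[value] / total))
    return rows


for value, count, cumulative, percent in frequency_table(ages):
    print(f"{value:>5} {count:>5} {cumulative:>5} {percent:6.1f}%")
```

The same rows could feed a histogram or bar graph; the table form is usually the first look at the data.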

9 Describing most typical value in data set - measures of central tendency Mean is often referred to as average though average can be any of these measures of central tendency: Mean (arithmetic average) Median Mode

10 Mean Most popular statistic for summarizing data Can be used for interval or ratio data Based on all observations of the data set Arithmetic average of a set of observations Example: mean of 5, 10, and 30 is 15, since 45 ÷ 3 = 15 Mean of a set of numbers can be a number not in set Example: mean of 1, 2, 3, and 4 is 2.5, since 10 ÷ 4 = 2.5
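The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch; the helper name `arithmetic_mean` is an assumption made for illustration.

```python
def arithmetic_mean(observations):
    """Sum every observation, then divide by the number of observations."""
    return sum(observations) / len(observations)


print(arithmetic_mean([5, 10, 30]))   # 45 / 3
print(arithmetic_mean([1, 2, 3, 4]))  # 10 / 4, a value that is not in the set
```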

11 Formula of mean
Sample mean: $\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$
Population mean: $\mu = \frac{\sum X}{N}$
where $\bar{X}$ = mean of sample, $\mu$ = mean of population, $\sum$ = sigma, sum of values, $X$ = set of observations, $X_i$ = specific observations, $n$ or $N$ = number of observations
Gary Geisler, Simmons College, LIS 403, Spring 2004

12 Median Value that is above the lower one-half and below the upper one-half of the values -- middle value of set of observations when they have been arranged in order Can be used for ordinal, interval or ratio data Most central measure of a distribution Every data set has a median that is unique Calculated differently for sets with an odd number of observations than for sets with an even number: the single middle value when odd, the midpoint of the two middle values when even Example: median of the five observations 1, 3, 15, 16, and 17 = 15 Example: median of the six observations 1, 2, 3, 5, 8, and 9 = 4 (midpoint of 3 and 5)
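The odd and even cases can be sketched in Python as follows. The function name `median_of` is hypothetical; the standard library's `statistics.median` applies the same rule.

```python
def median_of(observations):
    """Middle value of the sorted data; midpoint of the two middle values if n is even."""
    ordered = sorted(observations)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median_of([1, 3, 15, 16, 17]))   # odd count: the third of five values
print(median_of([1, 2, 3, 5, 8, 9]))   # even count: midpoint of 3 and 5
```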

13 Mode Can be used for any type of data Most frequently occurring value among a set of observations Examples: Mode of the observations 1, 2, 2, 3, 4, 5 = 2 Set of observations 1, 2, 3, 4, 5 has no mode Set of observations 1, 2, 3, 3, 4, 5, 5 has no single mode, but can be considered to have two modes, or is bi-modal
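A small Python sketch of the three cases on this slide follows. The helper `modes` is hypothetical; it returns an empty list when no value repeats, matching the slide's "no mode" convention. Python's own `statistics.multimode` would instead return every value in that case.

```python
from collections import Counter


def modes(observations):
    """Return the most frequent value(s), or [] if no value occurs more than once."""
    counts = Counter(observations)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # every value occurs once: treated here as having no mode
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3, 4, 5]))      # one mode
print(modes([1, 2, 3, 4, 5]))         # no mode
print(modes([1, 2, 3, 3, 4, 5, 5]))   # bi-modal
```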

14 Advantages of mean Always exists Is unique Can always be calculated by a simple formula Disadvantages of mean Mean value for a data set is not necessarily one of the values of the data set Sensitive to extreme scores, either high or low Easily distorted by extremely large or extremely small values among the set of observations Example: mean of 1, 2, and 1,000,000 is 333,334.33

15 Advantages of median Not affected by extreme scores Useful way of describing sets of observations that are skewed by including extremely large or small values Disadvantages of median Median is not necessarily one of the values of the data set Defined differently for odd and even numbers of observations
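To make the contrast between the two measures concrete, the sketch below reuses the slide's 1, 2, 1,000,000 example and compares the mean with the median using Python's standard `statistics` module.

```python
from statistics import mean, median

observations = [1, 2, 1_000_000]

# The single extreme value drags the mean far from the typical observation,
# while the median stays at the middle value.
print(round(mean(observations), 2))
print(median(observations))
```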

16 Advantages of mode Can be used with any scale of measurement If a set of observations has a mode, the mode usefully characterizes the set For example, a set of observations noting the result of rolling two dice will have a mode of 7 Disadvantages of mode Many sets of observations lack a mode because no observed value occurs more than once Other sets of observations may have several different most frequent values Doesn't characterize set beyond most frequently occurring value

17 Calculating mean
Age  Frequency
13   3
14   4
15   6
16   8
17   4
18   3
19   3

18 Calculating mean
Age x Frequency
13 x 3 = 39
14 x 4 = 56
15 x 6 = 90
16 x 8 = 128
17 x 4 = 68
18 x 3 = 54
19 x 3 = 57
N = 31 Sum of X = 492 492/31 = 15.87 Mean = 15.87
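The calculation on this slide can be sketched in Python. The frequencies (3, 4, 6, 8, 4, 3, 3) appear in the transcription; the ages 13 through 19 are an assumption inferred from the first product (13 x 3) and the last (57 = 19 x 3), and they reproduce the stated N = 31 and mean of 15.87.

```python
# Frequencies are from the slide; the age labels 13-19 are inferred (assumption).
age_frequency = {13: 3, 14: 4, 15: 6, 16: 8, 17: 4, 18: 3, 19: 3}

n = sum(age_frequency.values())                                       # number of observations
sum_of_x = sum(age * count for age, count in age_frequency.items())   # age x frequency, summed
mean_age = sum_of_x / n

print(n, sum_of_x, round(mean_age, 2))
```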

19 Calculating mode Age 16 has the highest frequency (8) Mode = 16

20 Calculating median Non-grouped data Age 13 fills positions 1-3, age 14 positions 4-7, age 15 positions 8-13, age 16 positions 14-21 N = 31 so midpoint is 16th value Median = 16

21 Calculating median Grouped data: Each value is somewhere within its age range Values are assumed to be equally distributed within range N = 31 so midpoint is 16th value Median = 16.31
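One common way to compute a grouped median is linear interpolation inside the class that contains the midpoint. The sketch below is an assumption about the method, not a confirmed account of the slide's calculation: it treats each age a as the interval from a up to a + 1 and interpolates to position N/2, which yields 16.3125 and so agrees with the 16.31 reported in the deck's summary slide. The function name `grouped_median` and the age labels 13-19 are hypothetical.

```python
def grouped_median(lower_limits, frequencies, width=1.0):
    """Interpolate the median within the class that contains position N/2."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("frequency table is empty")
    half = n / 2
    cumulative = 0
    for lower, freq in zip(lower_limits, frequencies):
        if freq > 0 and cumulative + freq >= half:
            # Values are assumed to be spread evenly across the class.
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("median class not found")


ages = [13, 14, 15, 16, 17, 18, 19]   # assumed lower class limits
freqs = [3, 4, 6, 8, 4, 3, 3]         # frequencies from the slides

print(round(grouped_median(ages, freqs), 2))
```

With other class boundaries (for example 15.5 to 16.5) the same formula gives a different value, so the choice of limits should be stated alongside the result.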

22 Mean = 15.87 Mode = 16 Median = 16.31

23 Normal distribution Normal curve, bell-shaped curve, Gaussian distribution Many types of data are normally distributed in a population Histogram of data approximates a bell-shaped, symmetrical curve Concentration of scores in the middle, with fewer and fewer scores as you approach extremes Example: heights of people in a population are normally distributed

24 Skewness Not all sets of data will exhibit properties of a normal distribution Some data sets are asymmetrical around a central point Majority of scores are closer to one extreme or the other: skewed distribution In a skewed distribution, the mean does not equal the median

25 Positively skewed distribution: tail goes to the right, median is less than the mean Example: annual income of population Negatively skewed distribution: tail goes to the left, mean is less than the median

26 Special case of skewness: J-Curve Extreme skewness Proposed by Allport to describe conforming behavior in groups of people Large majority of scores fall at end representing socially acceptable behavior, small minority represent deviation from norm Example: amount of time drivers who park in No Parking zone stay there [Chart: time parked, in bins from < 5 to > 25]

27 Determining when a distribution is skewed too much to be considered normal General rule of thumb: values beyond 2 standard errors of skewness (ses) are probably significantly skewed ses = $\sqrt{6/N}$, or use ses statistic from software (SPSS, for example) output Example: if sample size = 30 and skewness statistic is .9814: ses = $\sqrt{6/30}$ = $\sqrt{.20}$ = .4472, and .4472 x 2 = .8944 Skewness statistic of .9814 is beyond 2 ses, so is significantly skewed Other factors (histograms, normal probability plots, type of test to be used) should influence decision, depending on exact circumstances of analysis
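The rule of thumb on this slide can be sketched in Python. The function names are hypothetical, and the square root of 6/N is the approximation the slide uses, not the exact standard error that statistical software reports.

```python
import math


def standard_error_of_skewness(n):
    """Approximate standard error of skewness: square root of 6 / N."""
    return math.sqrt(6 / n)


def is_significantly_skewed(skewness, n):
    """Rule of thumb: flag skewness beyond 2 standard errors in either direction."""
    return abs(skewness) > 2 * standard_error_of_skewness(n)


ses = standard_error_of_skewness(30)
print(round(ses, 4), round(2 * ses, 4))
print(is_significantly_skewed(0.9814, 30))
```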

28 Kurtosis - amount of peakedness or flatness of the distribution Mesokurtic - normal Leptokurtic - peaked, many scores around middle Platykurtic - flat, many scores dispersed from middle Non-normal kurtosis determined by similar process to skewness Non-normal kurtosis only a concern with some statistical tests

29 Selecting appropriate measure of central tendency Interactive selection at Selecting Statistics by William M.K. Trochim: Rules below can be bent, depending on situation
Distribution and data type                   | Measure
Unimodal, ratio or interval data, skewed     | median
Unimodal, ratio or interval data, not skewed | mean
Unimodal, ordinal                            | median
Unimodal, nominal                            | mode
Bi-modal or multi-modal distribution         | mode
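The selection rules on this slide amount to a small decision table, which could be sketched in Python as below. The function name and its parameters are hypothetical, and as the slide notes, the rules can be bent, so the return value is a starting suggestion only.

```python
def suggested_measure(scale, unimodal=True, skewed=False):
    """Suggest a measure of central tendency.

    scale is one of 'nominal', 'ordinal', 'interval', 'ratio'.
    """
    if scale not in ("nominal", "ordinal", "interval", "ratio"):
        raise ValueError(f"unknown scale: {scale!r}")
    if not unimodal:
        return "mode"      # bi-modal or multi-modal distribution
    if scale == "nominal":
        return "mode"
    if scale == "ordinal":
        return "median"
    return "median" if skewed else "mean"   # interval or ratio data


print(suggested_measure("ratio", skewed=True))
print(suggested_measure("interval"))
print(suggested_measure("nominal"))
```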

30 Measures of Dispersion Variability is a fundamental characteristic of most data sets, but is not addressed by measures of central tendency Measures of central tendency are not enough to accurately describe a data set Also need to be able to describe the variability or dispersion of the data Dispersion: scatteredness or fluctuation of scores around average score Several types of measures of dispersion Range Standard deviation Variance

31 Measures of Dispersion Range Distance between the smallest and largest observations in a set of data Examples: Range of the set of observations 2, 4, 7 is 5 Range of the set -10, -3, 4 is 14

32 Measures of Dispersion Interquartile range Simplified version: ignore the top and bottom 25% after sorting Difference between the remaining largest and smallest numbers is interquartile range Addresses the problem of outliers Other methods of calculating interquartile range are slightly more complicated but take into account more data
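A minimal Python sketch of the range and of the simplified interquartile range described here follows. The helper names and the `scores` list are hypothetical, and rounding the 25% cut down to a whole number of observations is an assumption; other quartile conventions give slightly different results.

```python
def value_range(observations):
    """Distance between the smallest and largest observations."""
    return max(observations) - min(observations)


def simple_iqr(observations):
    """Drop the bottom and top 25% after sorting, then take the range of the rest."""
    ordered = sorted(observations)
    k = len(ordered) // 4                  # assumption: round the 25% cut down
    trimmed = ordered[k:len(ordered) - k]
    return trimmed[-1] - trimmed[0]


print(value_range([2, 4, 7]), value_range([-10, -3, 4]))

# Hypothetical scores with one outlier, invented for illustration.
scores = [1, 2, 3, 4, 5, 6, 7, 100]
print(value_range(scores), simple_iqr(scores))
```

In the hypothetical `scores` list the outlier dominates the range but has no effect on the simplified interquartile range.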

33 Measures of Dispersion Standard deviation Measures the variability or the degree of dispersion of the data set Square root of the average squared deviations from the mean Roughly speaking, standard deviation is the average distance between the individual observations and the center of the set of observations

34 Measures of Dispersion Calculating standard deviation
1. Subtract each observation from the sample/population mean and square the difference
2. Add the squared distances
3. Divide the sum by n - 1 (sample) or N (population) to get the adjusted mean of squared distances
4. Take the square root of the mean squared distance
SD of sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
SD of population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
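The four steps can be followed one by one in Python. This is a sketch under stated assumptions: the function name and the `data` list are hypothetical, and the `sample` flag chooses between the n - 1 and N divisors.

```python
import math


def standard_deviation(observations, sample=True):
    """Standard deviation computed step by step."""
    n = len(observations)
    mean = sum(observations) / n
    squared = [(x - mean) ** 2 for x in observations]   # step 1: squared distances from the mean
    total = sum(squared)                                 # step 2: add them
    divisor = n - 1 if sample else n                     # step 3: n - 1 for a sample, N for a population
    return math.sqrt(total / divisor)                    # step 4: square root


# Hypothetical observations, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data, sample=False))
print(round(standard_deviation(data), 3))
```

Python's `statistics.pstdev` and `statistics.stdev` implement the population and sample versions respectively.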

35 Measures of Dispersion Variance Square of standard deviation Not used for descriptive statistics, but is important for specific inferential statistics tests Variance of sample: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$ Variance of population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

36 Measures of Dispersion Advantages of range as measure of dispersion Very simple to calculate Provides a meaningful characteristic of a set of observations (total spread of the observations) Disadvantages of range as measure of dispersion Extreme values distort range Only measures the total spread; tells us nothing about the pattern of data distribution Examples: Data set 1, 2, 3, 4, 5, 6, 7, 8, 9 has a range of 8 Data set 1, 9, 9, 9, 9, 9, 9, 9, 9 also has range of 8, though clearly less scattered

37 Measures of Dispersion Advantages of standard deviation as measure of dispersion Can always be calculated Meaningful characteristic of a set of observations; takes every observation into account to express the scatteredness of observations Examples: Set of observations 1, 2, 3, 4, 5, 6, 7, 8, 9 has a standard deviation s = 2.74 Set of observations 1, 9, 9, 9, 9, 9, 9, 9, 9 has a standard deviation s = 2.67 Range doesn't distinguish difference in scatteredness of sets, but standard deviation does Disadvantage of standard deviation as measure of dispersion is that it is more complicated to calculate -- though not for computers
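The two example sets can be compared directly using Python's standard `statistics.stdev`, which uses the sample (n - 1) divisor. The variable names are assumptions made for illustration.

```python
from statistics import stdev

set_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
set_b = [1, 9, 9, 9, 9, 9, 9, 9, 9]

for label, observations in (("evenly spread", set_a), ("clustered", set_b)):
    spread = max(observations) - min(observations)   # range is the same for both sets
    print(label, spread, round(stdev(observations), 2))
```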


More information

Descriptive Statistics. Frequency Distributions and Their Graphs 2.1. Frequency Distributions. Chapter 2

Descriptive Statistics. Frequency Distributions and Their Graphs 2.1. Frequency Distributions. Chapter 2 Chapter Descriptive Statistics.1 Frequency Distributions and Their Graphs Frequency Distributions A frequency distribution is a table that shows classes or intervals of data with a count of the number

More information

Introduction; Descriptive & Univariate Statistics

Introduction; Descriptive & Univariate Statistics Introduction; Descriptive & Univariate Statistics I. KEY COCEPTS A. Population. Definitions:. The entire set of members in a group. EXAMPLES: All U.S. citizens; all otre Dame Students. 2. All values of

More information

Measures of Central Tendency and Variability: Summarizing your Data for Others

Measures of Central Tendency and Variability: Summarizing your Data for Others Measures of Central Tendency and Variability: Summarizing your Data for Others 1 I. Measures of Central Tendency: -Allow us to summarize an entire data set with a single value (the midpoint). 1. Mode :

More information

Statistics GCSE Higher Revision Sheet

Statistics GCSE Higher Revision Sheet Statistics GCSE Higher Revision Sheet This document attempts to sum up the contents of the Higher Tier Statistics GCSE. There is one exam, two hours long. A calculator is allowed. It is worth 75% of the

More information

Measures of Center Section 3-2 Definitions Mean (Arithmetic Mean)

Measures of Center Section 3-2 Definitions Mean (Arithmetic Mean) Measures of Center Section 3-1 Mean (Arithmetic Mean) AVERAGE the number obtained by adding the values and dividing the total by the number of values 1 Mean as a Balance Point 3 Mean as a Balance Point

More information

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

THE BINOMIAL DISTRIBUTION & PROBABILITY

THE BINOMIAL DISTRIBUTION & PROBABILITY REVISION SHEET STATISTICS 1 (MEI) THE BINOMIAL DISTRIBUTION & PROBABILITY The main ideas in this chapter are Probabilities based on selecting or arranging objects Probabilities based on the binomial distribution

More information

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

More information

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

More information

3.1 Measures of central tendency: mode, median, mean, midrange Dana Lee Ling (2012)

3.1 Measures of central tendency: mode, median, mean, midrange Dana Lee Ling (2012) 3.1 Measures of central tendency: mode, median, mean, midrange Dana Lee Ling (2012) Mode The mode is the value that occurs most frequently in the data. Spreadsheet programs such as Microsoft Excel or OpenOffice.org

More information

2. Filling Data Gaps, Data validation & Descriptive Statistics

2. Filling Data Gaps, Data validation & Descriptive Statistics 2. Filling Data Gaps, Data validation & Descriptive Statistics Dr. Prasad Modak Background Data collected from field may suffer from these problems Data may contain gaps ( = no readings during this period)

More information

Descriptive statistics parameters: Measures of centrality

Descriptive statistics parameters: Measures of centrality Descriptive statistics parameters: Measures of centrality Contents Definitions... 3 Classification of descriptive statistics parameters... 4 More about central tendency estimators... 5 Relationship between

More information

3: Summary Statistics

3: Summary Statistics 3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes

More information

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS

BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi-110 012 seema@iasri.res.in Genomics A genome is an organism s

More information

Data Exploration Data Visualization

Data Exploration Data Visualization Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

More information

STAT 155 Introductory Statistics. Lecture 5: Density Curves and Normal Distributions (I)

STAT 155 Introductory Statistics. Lecture 5: Density Curves and Normal Distributions (I) The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL STAT 155 Introductory Statistics Lecture 5: Density Curves and Normal Distributions (I) 9/12/06 Lecture 5 1 A problem about Standard Deviation A variable

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

SKEWNESS. Measure of Dispersion tells us about the variation of the data set. Skewness tells us about the direction of variation of the data set.

SKEWNESS. Measure of Dispersion tells us about the variation of the data set. Skewness tells us about the direction of variation of the data set. SKEWNESS All about Skewness: Aim Definition Types of Skewness Measure of Skewness Example A fundamental task in many statistical analyses is to characterize the location and variability of a data set.

More information

Module 2 Project Maths Development Team Draft (Version 2)

Module 2 Project Maths Development Team Draft (Version 2) 5 Week Modular Course in Statistics & Probability Strand 1 Module 2 Analysing Data Numerically Measures of Central Tendency Mean Median Mode Measures of Spread Range Standard Deviation Inter-Quartile Range

More information

Central Tendency. n Measures of Central Tendency: n Mean. n Median. n Mode

Central Tendency. n Measures of Central Tendency: n Mean. n Median. n Mode Central Tendency Central Tendency n A single summary score that best describes the central location of an entire distribution of scores. n Measures of Central Tendency: n Mean n The sum of all scores divided

More information

Each exam covers lectures from since the previous exam and up to the exam date.

Each exam covers lectures from since the previous exam and up to the exam date. Sociology 301 Exam Review Liying Luo 03.22 Exam Review: Logistics Exams must be taken at the scheduled date and time unless 1. You provide verifiable documents of unforeseen illness or family emergency,

More information

Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test. The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

More information

STATISTICS FOR PSYCH MATH REVIEW GUIDE

STATISTICS FOR PSYCH MATH REVIEW GUIDE STATISTICS FOR PSYCH MATH REVIEW GUIDE ORDER OF OPERATIONS Although remembering the order of operations as BEDMAS may seem simple, it is definitely worth reviewing in a new context such as statistics formulae.

More information

Descriptive Data Summarization

Descriptive Data Summarization Descriptive Data Summarization (Understanding Data) First: Some data preprocessing problems... 1 Missing Values The approach of the problem of missing values adopted in SQL is based on nulls and three-valued

More information

Module 4: Data Exploration

Module 4: Data Exploration Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Without data, all you are is just another person with an opinion.

Without data, all you are is just another person with an opinion. OCR Statistics Module Revision Sheet The S exam is hour 30 minutes long. You are allowed a graphics calculator. Before you go into the exam make sureyou are fully aware of the contents of theformula booklet

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Research Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement

Research Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement Chapter 4: Data & the Nature of Graziano, Raulin. Research Methods, a Process of Inquiry Presented by Dustin Adams Research Variables Variable Any characteristic that can take more than one form or value.

More information

Chapter 15 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately

More information

What are Data? The Research Question (Randomised Controlled Trials (RCTs)) The Research Question (Non RCTs)

What are Data? The Research Question (Randomised Controlled Trials (RCTs)) The Research Question (Non RCTs) What are Data? Quantitative Data o Sets of measurements of objective descriptions of physical and behavioural events; susceptible to statistical analysis Qualitative data o Descriptive, views, actions

More information

Chapter 2: Exploring Data with Graphs and Numerical Summaries. Graphical Measures- Graphs are used to describe the shape of a data set.

Chapter 2: Exploring Data with Graphs and Numerical Summaries. Graphical Measures- Graphs are used to describe the shape of a data set. Page 1 of 16 Chapter 2: Exploring Data with Graphs and Numerical Summaries Graphical Measures- Graphs are used to describe the shape of a data set. Section 1: Types of Variables In general, variable can

More information

Variables. Exploratory Data Analysis

Variables. Exploratory Data Analysis Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

More information

Homework 3. Part 1. Name: Score: / null

Homework 3. Part 1. Name: Score: / null Name: Score: / Homework 3 Part 1 null 1 For the following sample of scores, the standard deviation is. Scores: 7, 2, 4, 6, 4, 7, 3, 7 Answer Key: 2 2 For any set of data, the sum of the deviation scores

More information

13.2 Measures of Central Tendency

13.2 Measures of Central Tendency 13.2 Measures of Central Tendency Measures of Central Tendency For a given set of numbers, it may be desirable to have a single number to serve as a kind of representative value around which all the numbers

More information

Descriptive Statistics

Descriptive Statistics Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

More information

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

More information

Box plots & t-tests. Example

Box plots & t-tests. Example Box plots & t-tests Box Plots Box plots are a graphical representation of your sample (easy to visualize descriptive statistics); they are also known as box-and-whisker diagrams. Any data that you can

More information