Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion


 Bernadette McCoy
 2 years ago
1 Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion
2 Statistics as a Tool for LIS Research Importance of statistics in research Summarize observations to provide answers to research questions and hypotheses Make general conclusions based on specific study observations Objectively evaluate reliability of study conclusions
3 Statistics as a Tool for LIS Research Main purposes of statistics in research Describe central point in a set of data/observations Describe how broad, diversified, or variable the data in a set is Indicate whether specific features of a set of data are related, and how closely they are related Indicate probability of features of data being influenced by factors other than simply chance
4 Statistics as a Tool for LIS Research Two main types or branches of statistics Descriptive statistics Characterizing or summarizing data set Presenting data in charts and tables to clarify characteristics No inference, just describing a particular group of observations Inferential statistics Using sample data to make generalizations (inferences) or estimates about a population Statements made in terms of probability
5 Statistics as a Tool for LIS Research Descriptive and inferential statistics not mutually exclusive Overlap in what can be called descriptive and what can be called inferential Intent is important: Group of observations intended to describe an event: descriptive Group of observations collected from a sample and intended to predict what a larger population is like: inferential
6 Statistics as a Tool for LIS Research Choosing statistical methods Type of data collected largely determines choice of statistical analysis techniques Decisions about how and what type of data is collected will determine the specific statistical tests that can be performed to analyze the data Data collected should determine statistical tests used, not the other way around But consideration of how you want to analyze data should be done as part of research design to ensure study can produce the type of conclusions you want to make
7 Descriptive Statistics Commonly used in LIS research Cannot test causal relationships Primary strength is describing and summarizing data: Describing data in terms of frequency distributions Describing most typical value in data set: measures of central tendency Describing variability of data: measures of dispersion
8 Frequency Distributions Describing data in terms of frequency distributions Counts of totals by value or category for each measured variable Can be presented as absolute totals, cumulative totals, percentages, grouped totals Often a first step in statistical analysis of data Usually presented in tables or charts (histogram, bar graph, etc.) [Chart: books checked out by age group]
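The presentations listed on this slide (absolute totals, cumulative totals, percentages) can be sketched in a few lines of Python. The age-group labels and observations below are hypothetical and are not taken from the slides.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical observations: the age group recorded for each patron.
ages = ["13-15", "16-18", "13-15", "19-21", "16-18", "16-18"]

counts = Counter(ages)                       # absolute total per category
categories = sorted(counts)                  # fixed display order
absolute = [counts[c] for c in categories]
cumulative = list(accumulate(absolute))      # running totals
percent = [100 * a / len(ages) for a in absolute]

print("group  count  cumul  percent")
for c, a, cum, p in zip(categories, absolute, cumulative, percent):
    print(f"{c:>5}  {a:>5}  {cum:>5}  {p:6.1f}%")
```

The last cumulative total always equals the number of observations, which is a quick check that no value was dropped.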
9 Describing most typical value in data set: measures of central tendency Mean is often referred to as the average, though average can refer to any of these measures of central tendency: Mean (arithmetic average) Median Mode
10 Mean Most popular statistic for summarizing data Can be used for interval or ratio data Based on all observations of the data set Arithmetic average of a set of observations Example: mean of 5, 10, and 30 is 15, since 45 ÷ 3 = 15 Mean of a set of numbers can be a number not in set Example: mean of 1, 2, 3, and 4 is 2.5, since 10 ÷ 4 = 2.5
11 Formula of mean
Sample mean: $\bar{X} = \frac{\sum X}{n} = \frac{X_1 + X_2 + \dots + X_n}{n}$
Population mean (mean of a variable): $\mu = \frac{\sum X}{N}$
where $\sum$ = sigma, sum of values; $X$ = set of observations; $X_i$ = specific observations; $n$ or $N$ = number of observations
Gary Geisler, Simmons College, LIS 403, Spring 2004
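As a quick check of the mean as "sum of the observations divided by the number of observations", the following minimal Python sketch recomputes the two small examples from the Mean slide. The function name `arithmetic_mean` is only illustrative.

```python
def arithmetic_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

print(arithmetic_mean([5, 10, 30]))    # 45 / 3
print(arithmetic_mean([1, 2, 3, 4]))   # 10 / 4, a value that is not in the set
```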
12 Median Value that is above the lower one-half and below the upper one-half of the values: the middle value of a set of observations when they have been arranged in order Can be used for ordinal, interval or ratio data Most central measure of a distribution Every data set has a median that is unique Calculated differently for sets with an odd number of observations (the middle value) than for sets with an even number (the mean of the two middle values) Example: median of the five observations 1, 3, 15, 16, and 17 = 15 Example: median of the six observations 1, 2, 3, 5, 8, and 9 = 4, since (3 + 5) ÷ 2 = 4
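The odd and even cases can be sketched as follows. This is a minimal illustration for numeric data; the function name `median_of` is hypothetical.

```python
def median_of(values):
    """Middle value of the sorted observations.

    Odd count: the single middle value.
    Even count: the mean of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_of([1, 3, 15, 16, 17]))    # five observations
print(median_of([1, 2, 3, 5, 8, 9]))    # six observations
```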
13 Mode Can be used for any type of data Most frequently occurring value among a set of observations Examples: Mode of the observations 1, 2, 2, 3, 4, 5 = 2 Set of observations 1, 2, 3, 4, 5 has no mode Set of observations 1, 2, 3, 3, 4, 5, 5 has no single mode, but can be considered to have two modes, or is bimodal
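A small sketch of the three cases on this slide, assuming the slide's convention that a set has no mode when no value occurs more than once. The helper `modes` is hypothetical and returns a list so that bimodal sets can be reported.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:                 # no value repeats, so there is no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 2, 3, 4, 5]))       # one mode
print(modes([1, 2, 3, 4, 5]))          # no mode
print(modes([1, 2, 3, 3, 4, 5, 5]))    # two modes (bimodal)
```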
14 Advantages of mean Always exists Is unique Can always be calculated by a simple formula Disadvantages of mean Mean value for a data set is not necessarily one of the values of the data set Sensitive to extreme scores, either high or low Easily distorted by extremely large or extremely small values among the set of observations. Example: mean of 1, 2, and 1,000,000 is 333,334.33
15 Advantages of median Not affected by extreme scores Useful way of describing sets of observations that are skewed by including extremely large or small values Disadvantages of median Median is not necessarily one of the values of the data set Defined differently for odd and even numbers of observations
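The contrast between the two measures can be seen on the extreme-value example used for the mean (1, 2, and 1,000,000). This sketch uses Python's standard `statistics` module.

```python
from statistics import mean, median

observations = [1, 2, 1_000_000]

mean_value = mean(observations)       # pulled far upward by the extreme value
median_value = median(observations)   # stays at the middle observation

print(f"mean   = {mean_value:,.2f}")
print(f"median = {median_value}")
```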
16 Advantages of mode Can be used with any scale of measurement If a set of observations has a mode, the mode usefully characterizes the set For example, a large set of observations noting the result of rolling two dice will tend to have a mode of 7 Disadvantages of mode Many sets of observations lack a mode because no observed value occurs more than once Other sets of observations may have several different most frequent values Doesn't characterize set beyond most frequently occurring value
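The two-dice example can be checked by counting all 36 equally likely outcomes rather than simulating rolls. This sketch shows why 7 is the expected mode; an actual small sample of rolls may of course have a different mode.

```python
from collections import Counter
from itertools import product

# Every ordered outcome of two six-sided dice, tallied by the sum shown.
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))

most_common_sum, ways = sums.most_common(1)[0]
print(most_common_sum, ways)   # the sum with the most combinations, and how many
```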
17 Calculating mean
Age   Frequency
13    3
14    4
15    6
16    8
17    4
18    3
19    3
18 Calculating mean
Age   Frequency   Age × Frequency
13    3           39
14    4           56
15    6           90
16    8           128
17    4           68
18    3           54
19    3           57
N = 31 Sum of X = 492 492 ÷ 31 = 15.87 Mean = 15.87
19 Calculating mode Age 16 has the greatest frequency (8) Mode = 16
20 Calculating median Nongrouped data N = 31 so midpoint is 16th value Cumulative frequency reaches 13 at age 15 and 21 at age 16, so the 16th value is 16 Median = 16
21 Calculating median Grouped data: Each value is somewhere within each age range Values are assumed to be equally distributed within range N = 31 so midpoint is 16th value Median = 16.31
22 Mean = 15.87 Mode = 16 Median = 16.31
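The following sketch recomputes the four figures from a frequency table. The table itself is an assumption: ages 13 to 19 with frequencies 3, 4, 6, 8, 4, 3, 3, which is consistent with N = 31, a last product of 57, and the stated mean of 15.87. The grouped median uses linear interpolation and treats each age a as the interval from a up to a + 1; that interval convention is also an assumption, chosen because it reproduces 16.31.

```python
# Assumed frequency table (age -> number of observations).
freq = {13: 3, 14: 4, 15: 6, 16: 8, 17: 4, 18: 3, 19: 3}

n = sum(freq.values())
mean_age = sum(age * f for age, f in freq.items()) / n
mode_age = max(freq, key=freq.get)          # age with the greatest frequency

def median_from_freq(table):
    """Nongrouped median for an odd number of observations."""
    target = (sum(table.values()) + 1) // 2  # position of the middle value
    running = 0
    for value in sorted(table):
        running += table[value]
        if running >= target:
            return value

def grouped_median(table, width=1):
    """Interpolated median, assuming values are spread evenly in each interval."""
    half = sum(table.values()) / 2
    below = 0                                # observations below current interval
    for lower in sorted(table):
        f = table[lower]
        if below + f >= half:
            return lower + (half - below) / f * width
        below += f

print(n, round(mean_age, 2), mode_age)
print(median_from_freq(freq), round(grouped_median(freq), 2))
```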
23 Normal distribution Normal curve, bell-shaped curve, Gaussian distribution Many types of data are approximately normally distributed in a population Histogram of data approximates a bell-shaped, symmetrical curve Concentration of scores in the middle, with fewer and fewer scores as you approach extremes Example: heights of people in a population are approximately normally distributed
24 Skewness Not all sets of data will exhibit properties of a normal distribution Some data sets are asymmetrical around a central point Majority of scores are closer to one extreme or the other: skewed distribution In a skewed distribution, the mean does not equal the median
25 Positively skewed distribution: tail goes to the right, median is less than the mean Example: annual income of a population Negatively skewed distribution: tail goes to the left, mean is less than the median
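A small illustration of how the mean and median separate under skew. Both data sets are hypothetical: one has a long right tail (like incomes), the other a long left tail.

```python
from statistics import mean, median

incomes = [20, 25, 30, 35, 200]    # hypothetical, in thousands; tail to the right
scores = [10, 85, 90, 95, 100]     # hypothetical; tail to the left

for name, data in (("incomes", incomes), ("scores", scores)):
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

In the first set the single large value pulls the mean above the median; in the second the single small value pulls it below.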
26 Special case of skewness: J-curve Extreme skewness Proposed by Allport to describe conforming behavior in groups of people Large majority of scores fall at end representing socially acceptable behavior, small minority represent deviation from norm Example: amount of time drivers who park in No Parking zone stay there [Chart: time parked, in bins from < 5 to > 25]
27 Determining when a distribution is skewed too much to be considered normal General rule of thumb: values beyond 2 standard errors of skewness (ses) are probably significantly skewed ses = $\sqrt{6/N}$, or use ses statistic from software (SPSS, for example) output Example: if sample size = 30 and skewness statistic is .9814: ses = $\sqrt{6/30}$ = $\sqrt{.20}$ = .4472; 2 × ses = .8944; skewness statistic of .9814 is beyond 2 ses, so is significantly skewed Other factors (histograms, normal probability plots, type of test to be used) should influence decision, depending on exact circumstances of analysis
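The rule of thumb can be written as a short check. This sketch uses the approximation ses = sqrt(6/N) from the slide; the function names are illustrative, and statistical packages may report a slightly different, exact standard error.

```python
import math

def standard_error_of_skewness(n):
    """Approximate standard error of skewness: sqrt(6 / N)."""
    return math.sqrt(6 / n)

def is_significantly_skewed(skewness, n):
    """Rule of thumb: skewness beyond 2 standard errors in either direction."""
    return abs(skewness) > 2 * standard_error_of_skewness(n)

ses = standard_error_of_skewness(30)
print(round(ses, 4), round(2 * ses, 4))
print(is_significantly_skewed(0.9814, 30))
```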
28 Kurtosis: amount of peakedness or flatness of the distribution Mesokurtic: normal Leptokurtic: peaked, many scores around middle Platykurtic: flat, many scores dispersed from middle Non-normal kurtosis determined by similar process to skewness Non-normal kurtosis only a concern with some statistical tests
29 Selecting appropriate measure of central tendency Interactive selection at Selecting Statistics by William M.K. Trochim: Rules below can be bent, depending on situation
Distribution and data type                      Measure
Unimodal, ratio or interval data, skewed        median
Unimodal, ratio or interval data, not skewed    mean
Unimodal, ordinal                               median
Unimodal, nominal                               mode
Bimodal or multimodal distribution              mode
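The selection rules can be expressed as a small lookup function. This is only a sketch of the rules as listed on the slide; the function name and its parameters are hypothetical, and, as the slide notes, the rules can be bent depending on the situation.

```python
def suggest_central_tendency(scale, unimodal=True, skewed=False):
    """Suggest a measure of central tendency from the slide's rules.

    scale: "nominal", "ordinal", "interval", or "ratio"
    """
    if not unimodal:                      # bimodal or multimodal distribution
        return "mode"
    if scale == "nominal":
        return "mode"
    if scale == "ordinal":
        return "median"
    if scale in ("interval", "ratio"):
        return "median" if skewed else "mean"
    raise ValueError(f"unknown scale of measurement: {scale!r}")

print(suggest_central_tendency("ratio", skewed=True))
print(suggest_central_tendency("interval"))
print(suggest_central_tendency("ordinal"))
```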
30 Measures of Dispersion Variability is a fundamental characteristic of most data sets, but is not addressed by measures of central tendency Measures of central tendency are not enough to accurately describe a data set Also need to be able to describe the variability or dispersion of the data Dispersion: scatteredness or fluctuation of scores around average score Several types of measures of dispersion Range Standard deviation Variance
31 Measures of Dispersion Range Distance between the smallest and largest observations in a set of data Examples: Range of the set of observations 2, 4, 7 is 5 Range of the set -10, 3, 4 is 14
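The range is simply the largest observation minus the smallest. In this sketch the second example set is taken to be -10, 3, 4, since that is the set for which the stated range of 14 holds; the function name is illustrative.

```python
def value_range(values):
    """Distance between the smallest and largest observations."""
    return max(values) - min(values)

print(value_range([2, 4, 7]))
print(value_range([-10, 3, 4]))   # assumed set; 4 - (-10)
```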
32 Measures of Dispersion Interquartile range Simplified version: ignore the top and bottom 25% after sorting Difference between the remaining largest and smallest numbers is interquartile range Addresses the problem of outliers Other methods of calculating interquartile range are slightly more complicated but take into account more data
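One possible reading of the simplified version is sketched below: sort the data, drop a quarter of the observations (rounded down) from each end, and take the range of what remains. The rounding choice and the sample data are assumptions; quartile-based methods used by statistical software can give slightly different values.

```python
def simplified_iqr(values):
    """Range of the middle portion after trimming about 25% from each end."""
    ordered = sorted(values)
    k = len(ordered) // 4                   # assumption: floor(n / 4) per end
    middle = ordered[k:len(ordered) - k]
    return middle[-1] - middle[0]

data = [1, 2, 3, 4, 5, 6, 7, 8, 100]        # hypothetical data with one outlier
print(max(data) - min(data))                # plain range, dominated by the outlier
print(simplified_iqr(data))                 # spread of the middle values only
```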
33 Measures of Dispersion Standard deviation Measures the variability or the degree of dispersion of the data set Square root of the average squared deviations from the mean Roughly speaking, standard deviation is the average distance between the individual observations and the center of the set of observations
34 Measures of Dispersion Calculating standard deviation
1. Subtract each observation from the sample/population mean and square the difference
2. Add the squared distances
3. Divide the sum by n - 1 (sample) or N (population) to get the adjusted mean of squared distances
4. Take the square root of the mean squared distance
SD of sample: $s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$
SD of population: $\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
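The four steps can be followed literally in code. This is a minimal sketch with hypothetical data; the result is compared against `statistics.stdev` and `statistics.pstdev` from the Python standard library.

```python
import math
import statistics

def standard_deviation(values, sample=True):
    n = len(values)
    center = sum(values) / n
    squared = [(x - center) ** 2 for x in values]   # step 1: squared distances
    total = sum(squared)                            # step 2: add them
    divisor = n - 1 if sample else n                # step 3: n - 1 or N
    return math.sqrt(total / divisor)               # step 4: square root

data = [2, 4, 4, 4, 5, 5, 7, 9]                     # hypothetical observations
print(standard_deviation(data, sample=False))       # population SD
print(standard_deviation(data, sample=True))        # sample SD
```

Dividing by n - 1 instead of N makes the sample value slightly larger than the population value for the same data.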
35 Measures of Dispersion Variance Square of standard deviation Not used for descriptive statistics, but is important for specific inferential statistics tests
Variance of sample: $s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$
Variance of population: $\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
36 Measures of Dispersion Advantages of range as measure of dispersion Very simple to calculate Provides a meaningful characteristic of a set of observations (total spread of the observations) Disadvantages of range as measure of dispersion Extreme values distort range Only measures the total spread; tells us nothing about the pattern of data distribution Examples: Data set 1, 2, 3, 4, 5, 6, 7, 8, 9 has a range of 8 Data set 1, 9, 9, 9, 9, 9, 9, 9, 9 also has range of 8, though clearly less scattered
37 Measures of Dispersion Advantages of standard deviation as measure of dispersion Can always be calculated Meaningful characteristic of a set of observations; takes every observation into account to express the scatteredness of observations Examples: Set of observations 1, 2, 3, 4, 5, 6, 7, 8, 9 has a standard deviation s = 2.74 Set of observations 1, 9, 9, 9, 9, 9, 9, 9, 9 has a standard deviation s = 2.67 Range doesn't distinguish difference in scatteredness of sets, but standard deviation does Disadvantage of standard deviation as measure of dispersion is that it is more complicated to calculate, though not for computers
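The two example sets can be compared directly. This sketch uses `statistics.stdev`, which computes the sample standard deviation (dividing by n - 1).

```python
from statistics import stdev

spread_out = [1, 2, 3, 4, 5, 6, 7, 8, 9]
clustered = [1, 9, 9, 9, 9, 9, 9, 9, 9]

for name, data in (("spread_out", spread_out), ("clustered", clustered)):
    data_range = max(data) - min(data)
    print(f"{name}: range = {data_range}, s = {stdev(data):.2f}")
```

Both sets have the same range, while the standard deviation is smaller for the set whose values are mostly identical.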
More informationDescriptive statistics parameters: Measures of centrality
Descriptive statistics parameters: Measures of centrality Contents Definitions... 3 Classification of descriptive statistics parameters... 4 More about central tendency estimators... 5 Relationship between
More information3: Summary Statistics
3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes
More informationBASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS
BASIC STATISTICAL METHODS FOR GENOMIC DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi110 012 seema@iasri.res.in Genomics A genome is an organism s
More informationData Exploration Data Visualization
Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select
More informationSTAT 155 Introductory Statistics. Lecture 5: Density Curves and Normal Distributions (I)
The UNIVERSITY of NORTH CAROLINA at CHAPEL HILL STAT 155 Introductory Statistics Lecture 5: Density Curves and Normal Distributions (I) 9/12/06 Lecture 5 1 A problem about Standard Deviation A variable
More informationExploratory Data Analysis
Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction
More informationSKEWNESS. Measure of Dispersion tells us about the variation of the data set. Skewness tells us about the direction of variation of the data set.
SKEWNESS All about Skewness: Aim Definition Types of Skewness Measure of Skewness Example A fundamental task in many statistical analyses is to characterize the location and variability of a data set.
More informationModule 2 Project Maths Development Team Draft (Version 2)
5 Week Modular Course in Statistics & Probability Strand 1 Module 2 Analysing Data Numerically Measures of Central Tendency Mean Median Mode Measures of Spread Range Standard Deviation InterQuartile Range
More informationCentral Tendency. n Measures of Central Tendency: n Mean. n Median. n Mode
Central Tendency Central Tendency n A single summary score that best describes the central location of an entire distribution of scores. n Measures of Central Tendency: n Mean n The sum of all scores divided
More informationEach exam covers lectures from since the previous exam and up to the exam date.
Sociology 301 Exam Review Liying Luo 03.22 Exam Review: Logistics Exams must be taken at the scheduled date and time unless 1. You provide verifiable documents of unforeseen illness or family emergency,
More informationVariables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.
The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide
More informationStatistics Review PSY379
Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses
More informationDescriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics
Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),
More informationSTATISTICS FOR PSYCH MATH REVIEW GUIDE
STATISTICS FOR PSYCH MATH REVIEW GUIDE ORDER OF OPERATIONS Although remembering the order of operations as BEDMAS may seem simple, it is definitely worth reviewing in a new context such as statistics formulae.
More informationDescriptive Data Summarization
Descriptive Data Summarization (Understanding Data) First: Some data preprocessing problems... 1 Missing Values The approach of the problem of missing values adopted in SQL is based on nulls and threevalued
More informationModule 4: Data Exploration
Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x  x) B. x 3 x C. 3x  x D. x  3x 2) Write the following as an algebraic expression
More informationWithout data, all you are is just another person with an opinion.
OCR Statistics Module Revision Sheet The S exam is hour 30 minutes long. You are allowed a graphics calculator. Before you go into the exam make sureyou are fully aware of the contents of theformula booklet
More informationCALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 15 scale to 0100 scores When you look at your report, you will notice that the scores are reported on a 0100 scale, even though respondents
More informationResearch Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement
Chapter 4: Data & the Nature of Graziano, Raulin. Research Methods, a Process of Inquiry Presented by Dustin Adams Research Variables Variable Any characteristic that can take more than one form or value.
More informationChapter 15 Multiple Choice Questions (The answers are provided after the last question.)
Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately
More informationWhat are Data? The Research Question (Randomised Controlled Trials (RCTs)) The Research Question (Non RCTs)
What are Data? Quantitative Data o Sets of measurements of objective descriptions of physical and behavioural events; susceptible to statistical analysis Qualitative data o Descriptive, views, actions
More informationChapter 2: Exploring Data with Graphs and Numerical Summaries. Graphical Measures Graphs are used to describe the shape of a data set.
Page 1 of 16 Chapter 2: Exploring Data with Graphs and Numerical Summaries Graphical Measures Graphs are used to describe the shape of a data set. Section 1: Types of Variables In general, variable can
More informationVariables. Exploratory Data Analysis
Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is
More informationHomework 3. Part 1. Name: Score: / null
Name: Score: / Homework 3 Part 1 null 1 For the following sample of scores, the standard deviation is. Scores: 7, 2, 4, 6, 4, 7, 3, 7 Answer Key: 2 2 For any set of data, the sum of the deviation scores
More information13.2 Measures of Central Tendency
13.2 Measures of Central Tendency Measures of Central Tendency For a given set of numbers, it may be desirable to have a single number to serve as a kind of representative value around which all the numbers
More informationDescriptive Statistics
Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize
More informationThe right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median
CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box
More informationBox plots & ttests. Example
Box plots & ttests Box Plots Box plots are a graphical representation of your sample (easy to visualize descriptive statistics); they are also known as boxandwhisker diagrams. Any data that you can
More information