DESCRIPTIVE STATISTICS


Dr Najib Majdi bin Yaacob, MD, MPH, DrPH (Epidemiology) USM
Unit of Biostatistics & Research Methodology, School of Medical Sciences, Universiti Sains Malaysia

Content
- Introduction to statistics
- Descriptive vs. inferential statistics
- Variables
- Types of variables
- Organizing and displaying data for categorical variables
- Organizing and displaying data for numerical variables

INTRODUCTION TO STATISTICS

DATA VS. STATISTIC VS. STATISTICS

Data & Statistic
- Data: a collection of items of information.
- Statistic: a summary value of some attribute of a sample, usually but not necessarily an estimator of some population parameter. It is calculated by applying a function to the values of the items of the sample.
(Porta, M. (2014). A Dictionary of Epidemiology. Oxford University Press, USA)

Statistics
- The science of collecting, summarizing, and analyzing data. Data may or may not be subject to random variation.
- The data themselves and summarizations of the data.
(Porta, M. (2008). A Dictionary of Epidemiology. Oxford University Press, USA)
- A branch of applied mathematics concerned with the collection and interpretation of quantitative data and the use of probability theory to estimate population parameters.

Example: Data
[Table: ID, Gender, Height (m). Gender column in row order, starting from ID 1: Male, Male, Female, Male, Female, Female, Female.]

Example: Statistic
- 4 (57.1%) female, 3 (42.9%) male
- Mean height = 1.62 m
- Standard deviation for height = 0.06 m

Example: Statistics
- The process of calculating the statistic.
- How to calculate the frequency and percentage for gender, and how to calculate the mean and standard deviation for height.

Why use statistics?
- Modern society is concerned with reading and writing.
- Statistics is used to draw the strongest possible conclusions from a limited amount of data.
- A more thorough understanding of the research literature leads to improved patient care.

BRANCHES OF STATISTICS: DESCRIPTIVE VS. INFERENTIAL

Descriptive statistics
- Describe and summarize a dataset.
- Involve the collection, organization, analysis, interpretation and presentation of sample data.
- Can be presented in tables, graphs or narrative format.

[Figure: How to describe this population?]

Purpose
- Describe the characteristics of study participants
- Understand the data
- Answer the research questions in a descriptive study
- Detect outliers or extreme values
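The "statistic" example above (frequency and percentage for gender, mean and standard deviation for height) can be sketched in a few lines of Python. The gender values are the seven listed in the example data; the heights are hypothetical, chosen only to be consistent with the reported mean and SD, because the original height column is not available here. The helper name `frequency_table` is likewise an illustration, not part of the lecture.

```python
from collections import Counter
from statistics import mean, stdev

# Gender column as listed in the example data (7 participants).
gender = ["Male", "Male", "Female", "Male", "Female", "Female", "Female"]

# Hypothetical heights in metres (assumption: the slide's values are not available).
height_m = [1.70, 1.68, 1.55, 1.66, 1.58, 1.60, 1.57]


def frequency_table(values):
    """Return {category: (count, percent)} with percent rounded to 1 decimal."""
    counts = Counter(values)
    n = len(values)
    return {cat: (k, round(100 * k / n, 1)) for cat, k in counts.items()}


gender_summary = frequency_table(gender)
height_mean = mean(height_m)
height_sd = stdev(height_m)  # sample SD: divides by n - 1

for cat, (k, pct) in gender_summary.items():
    print(f"{cat}: {k} ({pct}%)")
print(f"Mean height = {height_mean:.2f} m, SD = {height_sd:.2f} m")
```

Here the data are the two lists, each printed number is a statistic, and the calculation itself is the "statistics" step described in the example.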

[Figure: How to describe this population? Samples are drawn from the population; descriptive statistics describe the samples.]

Descriptive statistics
- Frequency distribution
- Measures of central tendency
- Measures of dispersion
- Measures of position
- Exploratory data analysis
- Measures of shape of distribution: graphs, skewness, kurtosis

Inferential statistics
- Estimation
- Hypothesis testing: reach a decision
  - Parametric statistics
  - Non-parametric statistics (distribution-free statistics)
- Modelling, predicting

[Figure: How to draw conclusions from this population? Samples are drawn from the population; inferential statistics infer the findings to the population.]

VARIABLE

Variables
Any quantity that has different values across individuals or other study units.
(Porta, M. (2014). A Dictionary of Epidemiology. Oxford University Press, USA)

Variables
- Independent
- Dependent

Independent variable
- A variable that is hypothesized to influence an event or state (the dependent variable).
- The independent variable is not influenced by the event but may cause (or contribute to the occurrence of) the event, or contribute to a change in the (psychological, environmental, socioeconomic) status.

Dependent variable
- A variable whose value depends on the effect of another variable or variables (the independent variables) in the relationship under study.
- A manifestation or outcome whose variation we seek to explain or account for by the influence of independent variables.

Example: effect of sunlight on plant growth
- Independent variable: sunlight
- Dependent variable: plant growth
- In a plot, the X axis carries the independent variable and the Y axis carries the dependent variable.

Controlled variable(s)
- Everything you want to remain constant and unchanged during the study period.

Example: investigating the effect of sunlight exposure duration (hours/day) on plant growth
- Independent variable: duration of sunlight exposure
- Dependent variable: plant height
- Controlled variables: type of plant, size of pot, amount of water, type of soil, etc.

TYPES OF VARIABLES: MEASUREMENT SCALE

Measurement scale
- Classification of data.
- Different types of scale are measured differently.
- Knowledge of the measurement scale helps in deciding how to organize, analyse and present the data.
- Four fundamental scales: nominal, ordinal, interval, ratio.

Data
- Categorical (qualitative): nominal, ordinal
- Numerical (quantitative): interval, ratio
- Information content increases from nominal (less information) to ratio (more information).

Categorical data: nominal scales
- Names or categories that are mutually exclusive.
- Do not imply any ordering of responses.
- Examples: sex (male, female); race (Malay, Chinese, Indian, others).
- Lowest and least informative level of measurement.

Categorical data: ordinal scales
- Names or categories that are mutually exclusive, and the order is meaningful.
- Examples: severity (mild, moderate, severe); socioeconomic status (low, middle, high).
- Limitation: we cannot assume that the differences between adjacent scale values are equal, even if the labels are numbers.

Numerical data: interval scales
- Names or categories; the order is meaningful and the intervals are equal.
- Examples: Fahrenheit temperature scale, Celsius temperature scale.
- Problem: no true zero point (the zero point is arbitrary). Zero does not mean complete absence of temperature.

Numerical data: ratio scales
- Highest and most informative scale.
- Contains the qualities of the nominal, ordinal and interval scales, with the addition of an absolute zero point.
- Examples: amount of money, age, blood pressure.
- Values can be multiplied or divided.
- Zero on the Kelvin scale is the absolute absence of thermal energy, so the Kelvin scale is considered a ratio scale.

Numerical data
- Interval and ratio variables are sometimes indistinguishable and are handled the same way in data analysis.
- Both can be converted to categorical data.
- Converting numerical to categorical data causes loss of information.

Summary of data types and scale measurement

Provides                                         Nominal  Ordinal  Interval  Ratio
Counts/frequency of distribution                 Yes      Yes      Yes       Yes
Mode                                             Yes      Yes      Yes       Yes
Median                                           No       Yes      Yes       Yes
The order of values is known                     No       Yes      Yes       Yes
Can quantify the difference between each value   No       No       Yes       Yes
Can add or subtract values                       No       No       Yes       Yes
Can multiply and divide values                   No       No       No        Yes
Has true zero                                    No       No       No        Yes
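Two of the points above lend themselves to a small illustration: ratios are only meaningful on a scale with a true zero, and converting a numerical variable to categories discards information. The sketch below is illustrative only; the temperatures, the ages and the cut-offs in the hypothetical helper `age_group` are assumptions, not values from the lecture.

```python
# Interval vs. ratio: the same two temperatures on two scales.
c_cold, c_warm = 10.0, 20.0                             # degrees Celsius (interval scale)
ratio_celsius = c_warm / c_cold                         # 2.0, but 20 C is not "twice as hot"
ratio_kelvin = (c_warm + 273.15) / (c_cold + 273.15)    # about 1.035 on the Kelvin (ratio) scale


def age_group(age_years):
    """Map age (ratio scale) to an ordinal category; cut-offs are illustrative only."""
    if age_years < 18:
        return "child"
    if age_years < 60:
        return "adult"
    return "older adult"


ages = [25, 41, 59]                      # three distinct numerical values (hypothetical)
groups = [age_group(a) for a in ages]    # all collapse into one category

print(ratio_celsius, round(ratio_kelvin, 3))
print(ages, "->", groups)
```

After categorization the three participants are indistinguishable, which is the loss of information the slide refers to.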

ORGANIZING & DISPLAYING DATA FOR CATEGORICAL VARIABLE

Organizing & displaying data for categorical variable
- Table: frequency table
  - Frequency
  - Relative frequency (percentage)
  - Cumulative frequency (cumulative percentage)
- Graphical: bar chart, pie chart

Frequency table
[Figure: output from SPSS]

Bar chart
Characteristics:
1. Y axis represents frequency
2. X axis represents the categories of the variable
3. Bars have equal width
4. Bars are separated by equal gaps
5. Height represents frequency or percent

Pie chart
Characteristics:
1. Size of each slice represents frequency or percent
2. Each slice represents one category
3. All slices together must add up to 100%

Excellent graphical presentation of data
- Accuracy: proper data entry; not misleading, distorted or susceptible to misinterpretation
- Clarity: the ideas and concepts conveyed are clearly understood
- Simplicity: straightforward; avoid gridlines or odd lettering
- Appearance: should be appealing
- Well-designed structure: pattern highlighted, lettering horizontal
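As a rough sketch of the frequency table described above (frequency, relative frequency and their cumulative versions), the following Python builds the rows by hand. The severity data and the helper name `frequency_rows` are hypothetical; in the lecture this table is produced with SPSS.

```python
from collections import Counter


def frequency_rows(values, order):
    """Return rows of (category, n, %, cumulative n, cumulative %) in the given order."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for category in order:
        k = counts.get(category, 0)
        running += k
        rows.append((category, k, round(100 * k / total, 1),
                     running, round(100 * running / total, 1)))
    return rows


# Hypothetical ordinal variable: disease severity for 10 patients.
severity = ["mild"] * 5 + ["moderate"] * 3 + ["severe"] * 2
table = frequency_rows(severity, order=["mild", "moderate", "severe"])

print(f"{'Category':<10}{'n':>4}{'%':>8}{'Cum n':>8}{'Cum %':>8}")
for category, k, pct, cum_k, cum_pct in table:
    print(f"{category:<10}{k:>4}{pct:>8}{cum_k:>8}{cum_pct:>8}")
```

Cumulative columns are most meaningful for ordinal categories, where the order passed in `order` has a natural interpretation.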

ORGANIZING & DISPLAYING DATA FOR NUMERICAL DATA

Organizing & displaying data for numerical data
- Central tendency
- Dispersion
- Exploratory data analysis
  1. Stem & leaf displays
  2. Box and whisker plots
- Frequency
  1. Histogram
  2. Frequency polygon
  3. Cumulative frequency
- Shape of distribution

Measures of central tendency
1. Mean
2. Median
3. Mode

Measures of central tendency: 1. Mean
- Sample average.
- Sum all values, then divide by the number of values.
- Sensitive to extreme values.

$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

Example: What is the mean height of these 9 students?
[Table: id, height (cm)]

Measures of central tendency: 2. Median
- Middle value.
- Not sensitive to extreme values.
- Used to summarize skewed data.
- When n is odd, median = [(n+1)/2]th value.
- When n is even, median = average of the (n/2)th and [(n/2)+1]th values.

Example: What is the median height of these 9 students?
[Table: id, height (cm)]

Measures of central tendency: 2. Median (example, continued)
- Sort the values.
- n = 9, so median = [(9+1)/2]th value = 5th value.

Measures of central tendency: 3. Mode
- The observation that occurs most frequently.
- Less useful in describing data.

Measures of dispersion
1. Range
2. Variance
3. Standard deviation
4. Coefficient of variation
5. Interquartile range

Measures of dispersion: 1. Range
- Largest value minus smallest value (max - min).
- Sensitive to extreme values.

Measures of dispersion: 2. Variance
- Measures the amount of spread or variability of the observations from the mean.
- The sample variance ($s^2$) is the average of the squared deviations about the sample mean (population variance = $\sigma^2$).
- Not usually reported in descriptive statistics because squared units are difficult to interpret.

$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$

Measures of dispersion: 3. Standard deviation
- Square root of the variance.
- Most widely used and a better measure of variability.
- The smaller the value, the closer the observations are to the mean.
- Sensitive to extreme values.

$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
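The mode, range, sample variance and standard deviation defined above can be computed as follows. The sketch writes the variance formula out explicitly (dividing by n - 1) rather than calling a library routine, so the correspondence with the formula is visible; the heights are hypothetical.

```python
from math import sqrt
from statistics import multimode


def sample_variance(values):
    """Sum of squared deviations about the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_sd(values):
    """Square root of the sample variance."""
    return sqrt(sample_variance(values))


# Hypothetical heights (cm) of 9 students; 165 occurs twice.
heights_cm = [150, 158, 160, 162, 165, 165, 168, 170, 171]

modes = multimode(heights_cm)                     # list, since data can have several modes
value_range = max(heights_cm) - min(heights_cm)   # largest minus smallest

print("mode(s):", modes)
print("range:", value_range, "cm")
print(f"variance: {sample_variance(heights_cm):.2f} cm^2")
print(f"SD: {sample_sd(heights_cm):.2f} cm")
```

The variance is in squared centimetres while the SD is back in centimetres, which is why the SD is the one usually reported.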

Measures of dispersion: 4. Coefficient of variation
- Ratio of the standard deviation to the mean.
- Expressed as a percentage.
- Also known as relative standard deviation.
- Shows the extent of variability in relation to the mean.

$CoV = \frac{s}{\bar{X}}$

Hands-on
Calculate the range, variance, standard deviation and coefficient of variation for the numerical variables in the given data file. (5 minutes)
[Table: id, height (cm)]

Measures of dispersion: 5. Interquartile range (IQR)
- Data can be divided into quarters (four equal parts):
  Q1 = 25th percentile
  Q2 = 50th percentile
  Q3 = 75th percentile
- IQR is the distance from Q1 to Q3.
- The most common interpercentile measure.
- Not sensitive to extreme values (outliers).
- Usually reported together with the median in skewed distributions.

[Figure: observations ordered from Min to Max]
[Figure: In SPSS]
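A minimal sketch of the coefficient of variation and the interquartile range, using Python's standard `statistics` module and hypothetical heights. One caveat: software packages use different conventions for interpolating percentiles, so quartiles from another program may differ slightly from the values `statistics.quantiles` returns with its default settings.

```python
from statistics import mean, quantiles, stdev

# Hypothetical heights (cm) of 9 students.
heights_cm = [150, 158, 160, 162, 165, 166, 168, 170, 171]

# Coefficient of variation: SD relative to the mean, expressed as a percentage.
cov_pct = 100 * stdev(heights_cm) / mean(heights_cm)

# Quartiles: n=4 splits the data into four parts and returns Q1, Q2, Q3.
q1, q2, q3 = quantiles(heights_cm, n=4)
iqr = q3 - q1   # distance from Q1 to Q3

print(f"CoV = {cov_pct:.1f}%")
print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Because the CoV is unit-free, it can be used to compare variability between variables measured in different units, for example height in centimetres and weight in kilograms.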

GRAPHICAL VISUALIZATION/PRESENTATION FOR NUMERICAL DATA

Exploratory data analysis
1. Stem & leaf displays
2. Box and whisker plots

Exploratory data analysis: stem & leaf displays
- Allow easier identification of individual values in the sample.
[Table: id, height (cm)]
[SPSS output: height Stem-and-Leaf Plot, columns Frequency and Stem & Leaf; 1.00 Extremes (=<162); stem width: 10; each leaf: 1 case(s)]

Exploratory data analysis: box and whisker plots
- Graphical display of percentiles.
- Also known as the 5-number summary plot (min, Q1, Q2, Q3, max).
- Provide information on central tendency and on the variability of the middle 50% of the distribution.
- The box represents the 25th to 75th percentile.
- Observations more than 1.5 times the IQR away from the edge of the box are outliers.
- Observations more than 3 times the IQR away are extreme outliers.
- Whiskers extend to the smallest and largest values that are not outliers.
- Continuous data in multiple groups can be displayed side by side.
[Figure: box and whisker plots]
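The box plot rules above (outliers beyond 1.5 x IQR from the box, extreme outliers beyond 3 x IQR, whiskers at the most extreme values that are not outliers) can be sketched as a small function. The data and the helper name `box_plot_summary` are hypothetical, and quartile conventions vary between packages, so the fences computed by other software may differ slightly.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Classify observations using the 1.5 x IQR and 3 x IQR rules."""
    q1, q2, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # outlier fences
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr       # extreme-outlier fences

    extremes = [x for x in values if x < outer_low or x > outer_high]
    outliers = [x for x in values
                if (x < inner_low or x > inner_high) and x not in extremes]
    regular = [x for x in values if inner_low <= x <= inner_high]
    return {
        "box": (q1, q2, q3),
        "whiskers": (min(regular), max(regular)),  # ends of the whiskers
        "outliers": outliers,
        "extremes": extremes,
    }


# Hypothetical heights (cm): one unusually short and one unusually tall student.
heights_cm = [140, 158, 160, 162, 165, 166, 168, 170, 200]
summary = box_plot_summary(heights_cm)
print(summary)
```

With these data the box runs from 159 to 169, so 140 falls between the two lower fences (an outlier) and 200 lies beyond the upper outer fence (an extreme outlier).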

[Figure: box and whisker plots]

Measures of frequency of distribution: graphs
1. Histogram
2. Frequency polygon
3. Cumulative frequency

Measures of frequency of distribution: histogram
- Graphical representation of the frequency distribution of a variable.
- Bar height represents frequency or percent.
- Bar width represents the class interval.
- No gaps between the class intervals.
- Gives an idea of the distribution: normal or skewed.
[Figure: histogram]

Measures of frequency of distribution: frequency polygon
- A graph that displays the data using lines to connect points plotted for the frequencies.
- The frequencies correspond to the heights of the vertical bars in the histogram.
[Figure: frequency polygon]
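The numbers behind a histogram and a frequency polygon are just class counts. The sketch below computes them for hypothetical heights with equal-width classes; the class start (150 cm) and width (5 cm) are arbitrary choices made for illustration. It also accumulates the counts, which gives the cumulative frequency.

```python
from itertools import accumulate


def class_counts(values, start, width, n_classes):
    """Count observations in equal-width classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_classes
    for x in values:
        k = int((x - start) // width)
        if 0 <= k < n_classes:
            counts[k] += 1
    return counts


# Hypothetical heights (cm) of 9 students.
heights_cm = [150, 158, 160, 162, 165, 166, 168, 170, 171]
start, width, n_classes = 150, 5, 5

counts = class_counts(heights_cm, start, width, n_classes)           # histogram bar heights
midpoints = [start + width * (k + 0.5) for k in range(n_classes)]    # polygon x-coordinates
cumulative = list(accumulate(counts))                                # cumulative frequency

for k in range(n_classes):
    low = start + k * width
    print(f"{low}-<{low + width}: {'*' * counts[k]}  (cumulative {cumulative[k]})")
```

Plotting `counts` against `midpoints` and joining the points with lines gives the frequency polygon; drawing adjacent bars of height `counts` over each class gives the histogram.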

Measures of frequency of distribution: cumulative frequency
- Used to determine the number of observations that lie below or above a particular value.
- Calculated using a frequency distribution table.
- Can be constructed from stem and leaf plots or directly from the data.
[Figure: cumulative frequency]

Measures of shape of distribution
- Skewness
- Kurtosis

Measures of shape of distribution: skewness
- A measure of the asymmetry of a distribution around its mean.
- Examined graphically by plotting a normal curve on the histogram.
- Negative skewness: the left tail is more pronounced than the right tail.
- Positive skewness: the right tail is more pronounced than the left tail.
[Figure: skewness]

Measures of shape of distribution: kurtosis
- Relative peakedness or flatness of a distribution compared with the normal distribution.
- Visualised by plotting a normal curve on the histogram.
- Types:
  - Distribution with a high peak: leptokurtic
  - Distribution with a flat-topped curve: platykurtic
  - Normal distribution: mesokurtic
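Skewness and kurtosis can also be expressed as numbers. The sketch below uses the simple moment-based definitions (third and fourth central moments scaled by the variance); this is an assumption made for illustration, since statistical packages often apply small-sample corrections and so may report somewhat different values. Kurtosis is given as excess kurtosis, so a normal distribution scores about 0, a peaked (leptokurtic) one is positive and a flat (platykurtic) one is negative.

```python
def central_moment(values, k):
    """k-th central moment: the mean of (x - mean)**k."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so the normal distribution is the zero point."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]          # hypothetical data, symmetric about 3
right_skewed = [1, 1, 1, 2, 10]      # hypothetical data with a long right tail
left_skewed = [-x for x in right_skewed]

print(skewness(symmetric), skewness(right_skewed), skewness(left_skewed))
print(excess_kurtosis(symmetric))
```

Both functions divide by the variance, so they are undefined for a sample in which every value is identical.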

[Figure: kurtosis]

HOW TO PRESENT

General rule
- Can be presented in graphical, table or text format.
- Categorical variable: n (%)
- Numerical variable:
  - Symmetric data: mean (standard deviation)
  - Skewed data: median (IQR)

How to decide symmetric or skewed? Statistical
- Mean = median = mode
- Skewness
- Kurtosis
- Kolmogorov-Smirnov test (p > 0.05)
- Shapiro-Wilk test (p > 0.05)

How to decide symmetric or skewed? Graphical
- Histogram
- Stem and leaf plot
- Box and whisker plot

Table presentation

Table 1: Characteristics of study participants (n = 30)

Variable          Mean (SD)    n (%)
Age (yrs)
Sex
  Female
  Male
Race
  Malay
  Chinese
  Indian
Education
  Primary
  Secondary
  Tertiary
BMI (kg/m²)
DBP (mmHg)
SBP (mmHg)
*median (IQR)
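The general rule above translates into a small formatting helper: n (%) for a categorical variable, mean (SD) for symmetric numerical data and median (IQR) for skewed numerical data. In this sketch the decision whether a variable is symmetric is passed in by the analyst after inspecting the data as described above; the function names and the example values are hypothetical.

```python
from statistics import mean, median, quantiles, stdev


def summarize_categorical(values, category):
    """Format one category as 'n (%)'."""
    k = sum(1 for v in values if v == category)
    return f"{k} ({100 * k / len(values):.1f}%)"


def summarize_numeric(values, symmetric):
    """Format as 'mean (SD)' if symmetric, otherwise 'median (IQR)'."""
    if symmetric:
        return f"{mean(values):.1f} ({stdev(values):.1f})"
    q1, _, q3 = quantiles(values, n=4)
    return f"{median(values):.1f} ({q3 - q1:.1f})"


# Gender values from the earlier example; the ages are hypothetical.
sex = ["Male", "Male", "Female", "Male", "Female", "Female", "Female"]
age_years = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print("Female:", summarize_categorical(sex, "Female"))
print("Age, mean (SD):", summarize_numeric(age_years, symmetric=True))
print("Age, median (IQR):", summarize_numeric(age_years, symmetric=False))
```

In a table such as Table 1, variables summarized with median (IQR) are flagged with a footnote marker so the reader knows which summary was used.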

THANK YOU.


More information

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple. Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

Graphical and Tabular. Summarization of Data OPRE 6301

Graphical and Tabular. Summarization of Data OPRE 6301 Graphical and Tabular Summarization of Data OPRE 6301 Introduction and Re-cap... Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information

More information

The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

More information

Table 2-1. Sucrose concentration (% fresh wt.) of 100 sugar beet roots. Beet No. % Sucrose. Beet No.

Table 2-1. Sucrose concentration (% fresh wt.) of 100 sugar beet roots. Beet No. % Sucrose. Beet No. Chapter 2. DATA EXPLORATION AND SUMMARIZATION 2.1 Frequency Distributions Commonly, people refer to a population as the number of individuals in a city or county, for example, all the people in California.

More information

Statistical Analysis I

Statistical Analysis I CTSI BERD Research Methods Seminar Series Statistical Analysis I Lan Kong, PhD Associate Professor Department of Public Health Sciences December 22, 2014 Biostatistics, Epidemiology, Research Design(BERD)

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information

CH.6 Random Sampling and Descriptive Statistics

CH.6 Random Sampling and Descriptive Statistics CH.6 Random Sampling and Descriptive Statistics Population vs Sample Random sampling Numerical summaries : sample mean, sample variance, sample range Stem-and-Leaf Diagrams Median, quartiles, percentiles,

More information

Mathematics. Probability and Statistics Curriculum Guide. Revised 2010

Mathematics. Probability and Statistics Curriculum Guide. Revised 2010 Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

More information

Chapter 2 - Graphical Summaries of Data

Chapter 2 - Graphical Summaries of Data Chapter 2 - Graphical Summaries of Data Data recorded in the sequence in which they are collected and before they are processed or ranked are called raw data. Raw data is often difficult to make sense

More information

10-3 Measures of Central Tendency and Variation

10-3 Measures of Central Tendency and Variation 10-3 Measures of Central Tendency and Variation So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used.

More information

909 responses responded via telephone survey in U.S. Results were shown by political affiliations (show graph on the board)

909 responses responded via telephone survey in U.S. Results were shown by political affiliations (show graph on the board) 1 2-1 Overview Chapter 2: Learn the methods of organizing, summarizing, and graphing sets of data, ultimately, to understand the data characteristics: Center, Variation, Distribution, Outliers, Time. (Computer

More information

MIDTERM EXAMINATION Spring 2009 STA301- Statistics and Probability (Session - 2)

MIDTERM EXAMINATION Spring 2009 STA301- Statistics and Probability (Session - 2) MIDTERM EXAMINATION Spring 2009 STA301- Statistics and Probability (Session - 2) Question No: 1 Median can be found only when: Data is Discrete Data is Attributed Data is continuous Data is continuous

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

More information

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

More information

vs. relative cumulative frequency

vs. relative cumulative frequency Variable - what we are measuring Quantitative - numerical where mathematical operations make sense. These have UNITS Categorical - puts individuals into categories Numbers don't always mean Quantitative...

More information

Nominal Scaling. Measures of Central Tendency, Spread, and Shape. Interval Scaling. Ordinal Scaling

Nominal Scaling. Measures of Central Tendency, Spread, and Shape. Interval Scaling. Ordinal Scaling Nominal Scaling Measures of, Spread, and Shape Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning The lowest level of

More information

There are some general common sense recommendations to follow when presenting

There are some general common sense recommendations to follow when presenting Presentation of Data The presentation of data in the form of tables, graphs and charts is an important part of the process of data analysis and report writing. Although results can be expressed within

More information

LEARNING OBJECTIVES SCALES OF MEASUREMENT: A REVIEW SCALES OF MEASUREMENT: A REVIEW DESCRIBING RESULTS DESCRIBING RESULTS 8/14/2016

LEARNING OBJECTIVES SCALES OF MEASUREMENT: A REVIEW SCALES OF MEASUREMENT: A REVIEW DESCRIBING RESULTS DESCRIBING RESULTS 8/14/2016 UNDERSTANDING RESEARCH RESULTS: DESCRIPTION AND CORRELATION LEARNING OBJECTIVES Contrast three ways of describing results: Comparing group percentages Correlating scores Comparing group means Describe

More information

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data

Introduction to Environmental Statistics. The Big Picture. Populations and Samples. Sample Data. Examples of sample data A Few Sources for Data Examples Used Introduction to Environmental Statistics Professor Jessica Utts University of California, Irvine jutts@uci.edu 1. Statistical Methods in Water Resources by D.R. Helsel

More information

Using SPSS, Chapter 2: Descriptive Statistics

Using SPSS, Chapter 2: Descriptive Statistics 1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

More information

Quantitative Research Methods II. Vera E. Troeger Office: Office Hours: by appointment

Quantitative Research Methods II. Vera E. Troeger Office: Office Hours: by appointment Quantitative Research Methods II Vera E. Troeger Office: 0.67 E-mail: v.e.troeger@warwick.ac.uk Office Hours: by appointment Quantitative Data Analysis Descriptive statistics: description of central variables

More information

How to interpret scientific & statistical graphs

How to interpret scientific & statistical graphs How to interpret scientific & statistical graphs Theresa A Scott, MS Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott 1 A brief introduction Graphics:

More information

9 Descriptive and Multivariate Statistics

9 Descriptive and Multivariate Statistics 9 Descriptive and Multivariate Statistics Jamie Price Donald W. Chamberlayne * S tatistics is the science of collecting and organizing data and then drawing conclusions based on data. There are essentially

More information

What is Statistics? Statistics is about Collecting data Organizing data Analyzing data Presenting data

What is Statistics? Statistics is about Collecting data Organizing data Analyzing data Presenting data Introduction What is Statistics? Statistics is about Collecting data Organizing data Analyzing data Presenting data What is Statistics? Statistics is divided into two areas: descriptive statistics and

More information

Statistics Chapter 3 Averages and Variations

Statistics Chapter 3 Averages and Variations Statistics Chapter 3 Averages and Variations Measures of Central Tendency Average a measure of the center value or central tendency of a distribution of values. Three types of average: Mode Median Mean

More information

Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality

Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality Quantitative Data Analysis: Choosing a statistical test Prepared by the Office of Planning, Assessment, Research and Quality 1 To help choose which type of quantitative data analysis to use either before

More information

Describing and presenting data

Describing and presenting data Describing and presenting data All epidemiological studies involve the collection of data on the exposures and outcomes of interest. In a well planned study, the raw observations that constitute the data

More information

Basic Biostatistics for Clinical Research. Ramses F Sadek, PhD GRU Cancer Center

Basic Biostatistics for Clinical Research. Ramses F Sadek, PhD GRU Cancer Center Basic Biostatistics for Clinical Research Ramses F Sadek, PhD GRU Cancer Center 1 1. Basic Concepts 2. Data & Their Presentation Part One 2 1. Basic Concepts Statistics Biostatistics Populations and samples

More information

Comments 2 For Discussion Sheet 2 and Worksheet 2 Frequency Distributions and Histograms

Comments 2 For Discussion Sheet 2 and Worksheet 2 Frequency Distributions and Histograms Comments 2 For Discussion Sheet 2 and Worksheet 2 Frequency Distributions and Histograms Discussion Sheet 2 We have studied graphs (charts) used to represent categorical data. We now want to look at a

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Biostatistics 101: Data Presentation

Biostatistics 101: Data Presentation B a s i c S t a t i s t i c s F o r D o c t o r s Singapore Med J 2003 Vol 44(6) : 280-285 Biostatistics 101: Data Presentation Y H Chan Clinical trials and Epidemiology Research Unit 226 Outram Road Blk

More information

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

More information

Histogram. Graphs, and measures of central tendency and spread. Alternative: density (or relative frequency ) plot /13/2004

Histogram. Graphs, and measures of central tendency and spread. Alternative: density (or relative frequency ) plot /13/2004 Graphs, and measures of central tendency and spread 9.07 9/13/004 Histogram If discrete or categorical, bars don t touch. If continuous, can touch, should if there are lots of bins. Sum of bin heights

More information

Module 2: Introduction to Quantitative Data Analysis

Module 2: Introduction to Quantitative Data Analysis Module 2: Introduction to Quantitative Data Analysis Contents Antony Fielding 1 University of Birmingham & Centre for Multilevel Modelling Rebecca Pillinger Centre for Multilevel Modelling Introduction...

More information

Chapter Four: Univariate Statistics

Chapter Four: Univariate Statistics Chapter Four: Univariate Statistics Univariate analysis, looking at single variables, is typically the first procedure one does when examining data for the first time. There are a number of reasons why

More information

PROPERTIES OF MEAN, MEDIAN

PROPERTIES OF MEAN, MEDIAN PROPERTIES OF MEAN, MEDIAN In the last class quantitative and numerical variables bar charts, histograms(in recitation) Mean, Median Suppose the data set is {30, 40, 60, 80, 90, 120} X = 70, median = 70

More information

Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

More information

4. DESCRIPTIVE STATISTICS. Measures of Central Tendency (Location) Sample Mean

4. DESCRIPTIVE STATISTICS. Measures of Central Tendency (Location) Sample Mean 4. DESCRIPTIVE STATISTICS Descriptive Statistics is a body of techniques for summarizing and presenting the essential information in a data set. Eg: Here are daily high temperatures for Jan 6, 29 in U.S.

More information

Describing Data. Carolyn J. Anderson EdPsych 580 Fall Describing Data p. 1/42

Describing Data. Carolyn J. Anderson EdPsych 580 Fall Describing Data p. 1/42 Describing Data Carolyn J. Anderson EdPsych 580 Fall 2005 Describing Data p. 1/42 Describing Data Numerical Descriptions Single Variable Relationship Graphical displays Single variable. Relationships in

More information

How To: Analyse & Present Data

How To: Analyse & Present Data INTRODUCTION The aim of this How To guide is to provide advice on how to analyse your data and how to present it. If you require any help with your data analysis please discuss with your divisional Clinical

More information

1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics)

1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics) 1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics) As well as displaying data graphically we will often wish to summarise it numerically particularly if we wish to compare two or more data sets.

More information