# 4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

Size: px
Start display at page:

Download "4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?""

Transcription

1 Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses of the same dataset may reflect or represent different aspects of the underlying data structure. Once a plan has been established, it could contain any combination of the following types of data analysis strategies: Exploratory: This type of data analysis often occurs when a program is new, and it is unclear what to expect from the data. Descriptive: The most common type of data analysis, this approach will summarize your findings and describe the sample. Inferential: Inferential statistics allow us to draw conclusions about the larger population from which the sample is drawn. These powerful techniques may be able demonstrate if a change has occurred as a result of your program. PDA's Statistics Division specializes in many sophisticated data analysis techniques. 4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?" Exploratory data analysis uses numerical and graphical methods to display important features of your data set. Exploratory data analysis helps us to highlight general features of your data to direct future analyses. It also pinpoints problem areas in the data. For example: Should outliers be included or excluded in the analyses? Do the data need cleaning for consistency? How much missing data is there and how should it be handled? What do the distributions look like for key variables? Distribution of the data: What's the "shape" of the data? Where do most of the values lie? Are they clumped around a central value, and if so, are there roughly as many above this value as below it? We look at the distribution for each variable to determine which analyses would be most appropriate. Sometimes it is necessary to examine distributions of data partitioned by other key variables. Missing Values: In a survey, missing values correspond to skipped questions or unendorsed options. A discussion between analyst and client should take place in determining how missing values should be handled. In some cases, missing values might be perfectly normal (e.g. the variable "result of pregnancy test" for a male is blank). However, in some cases missing values for important variables might exclude a record from certain analyses. Sometimes it is appropriate to place normalized values in place of missing values. Outliers: "Unusually" large or small values that are dramatically separated from the rest of the data might be: 'out-of-range' or physically impossible values that resulted from entry or processing error. Merely "weird" values might represent entry error, Page 1 of 13

2 random fluctuation, or members of a population other than the one you want to study. Data Cleaning: Exploratory data analyses might indicate a particular need to clean data. Data cleaning is extremely important when the data collection method allows inconsistencies. All data cleaning work must be carefully documented and available in a report. Date cleaning includes the following activities as needed: Removal of outliers: Invalid, impossible, or extreme values may be removed from the dataset. Outliers might also be marked for exclusion for the purpose of certain analyses. Labeling missing values: It may be necessary to label each missing value with the reason it is considered missing in order to guarantee accurate bases for analysis. 4.2 Descriptive Data Analysis: Descriptive statistics tell you how your data look, and what the relationships are between the different variables in your data set. Together with simple graphics analysis, they form the basis of virtually every quantitative analysis of data. Descriptive Statistics are used to present quantitative descriptions in a manageable form. Each descriptive statistic reduces lots of data into a simpler summary. For instance, consider a simple number used to summarize how well a batter is performing in baseball, the batting average. This single number is simply the number of hits divided by the number of times at bat (reported to three significant digits). A batter who is hitting.333 is getting a hit one time in every three at bats. One batting.250 is hitting one time in four. The single number describes a large number of discrete events. Every time you try to describe a large set of observations with a single indicator you run the risk of distorting the original data or losing important detail. The batting average doesn't tell you whether the batter is hitting home runs or singles. It doesn't tell whether she's been in a slump or on a streak. Even given these limitations, descriptive statistics provide a powerful summary that may enable comparisons across people or other units. Univariate Analysis. Univariate analysis involves the examination across cases of one variable at a time. There are three major characteristics of a single variable that we tend to look at: the distribution the central tendency the dispersion In most situations, we would describe all three of these characteristics for each of the variables in our study. The Distribution. The distribution is a summary of the frequency of individual values or ranges of values for a variable. The simplest distribution would list every value of a variable and the number of persons who had each value. For instance, a typical way to describe the distribution of college students is by year in college, listing the number or percent of students at each of the four years. Or, we describe gender by Page 2 of 13

3 listing the number or percent of males and females. In these cases, the variable has few enough values that we can list each one and summarize how many sample cases had the value. With variables that can have a large number of possible values, with relatively few people having each one, we group the raw scores into categories according to ranges of values. Table 1. Frequency distribution table. One of the most common ways to describe a single variable is with a frequency distribution. Depending on the particular variable, all of the data values may be represented, or you may group the values into categories first (e.g., with age, price, or temperature variables, it would usually not be sensible to determine the frequencies for each value. Rather, the values are grouped into ranges and the frequencies determined.). Frequency distributions can be depicted in two ways, as a table or as a graph. Table 1 shows an age frequency distribution with five categories of age ranges defined. The same frequency distribution can be depicted in a graph as shown in Figure 2. This type of graph is often referred to as a histogram or bar chart. Table 2. Frequency distribution bar chart. Distributions may also be displayed using percentages. For example, you could use percentages to describe the: percentage of people in different income levels percentage of people in different age ranges percentage of people in different ranges of standardized test scores Page 3 of 13

4 Central Tendency. The central tendency of a distribution is an estimate of the "center" of a distribution of values. There are three major types of estimates of central tendency: Mean Median Mode The Mean or average is probably the most commonly used method of describing central tendency. To compute the mean all you do is add up all the values and divide by the number of values. For example, the mean or average quiz score is determined by summing all the scores and dividing by the number of students taking the exam. For example, consider the test score values: 15, 20, 21, 20, 36, 15, 25, 15 The sum of these 8 values is 167, so the mean is 167/8 = The Median is the score found at the exact middle of the set of values. One way to compute the median is to list all scores in numerical order, and then locate the score in the center of the sample. For example, if there are 500 scores in the list, score #250 would be the median. If we order the 8 scores shown above, we would get: 15,15,15,20,20,21,25,36 There are 8 scores and score #4 and #5 represent the halfway point. Since both of these scores are 20, the median is 20. If the two middle scores had different values, you would have to interpolate to determine the median. The mode is the most frequently occurring value in the set of scores. To determine the mode, you might again order the scores as shown above, and then count each one. The most frequently occurring value is the mode. In our example, the value 15 occurs three times and is the model. In some distributions there is more than one modal value. For instance, in a bimodal distribution there are two values that occur most frequently. Notice that for the same set of 8 scores we got three different values , 20, and for the mean, median and mode respectively. If the distribution is truly normal (i.e., bell-shaped), the mean, median and mode are all equal to each other. Dispersion. Dispersion refers to the spread of the values around the central tendency. There are two common measures of dispersion, the range and the standard deviation. The range is simply the highest value minus the lowest value. In our example distribution, the high value is 36 and the low is 15, so the range is = 21. The Standard Deviation is a more accurate and detailed estimate of dispersion because an outlier can greatly exaggerate the range (as was true in this example where the single outlier value of 36 stands apart from the rest of the values. The Standard Deviation shows the relation that set of scores has to the mean of the sample. Again lets take the set of scores: Page 4 of 13

5 15,20,21,20,36,15,25,15 To compute the standard deviation, we first find the distance between each value and the mean. We know from above that the mean is So, the differences from the mean are: = = = = = = = = Notice that values that are below the mean have negative discrepancies and values above it have positive ones. Next, we square each discrepancy: * = * = * = * = * = * = * = * = Now, we take these "squares" and sum them to get the Sum of Squares (SS) value. Here, the sum is Next, we divide this sum by the number of scores minus 1. Here, the result is / 7 = This value is known as the variance. To get the standard deviation, we take the square root of the variance (remember that we squared the deviations earlier). This would be SQRT(50.125) = Although this computation may seem convoluted, it's actually quite simple. To see this, consider the formula for the standard deviation: In the top part of the ratio, the numerator, we see that each score has the the mean subtracted from it, the difference is squared, and the squares are summed. In the bottom part, we take the number of scores minus 1. The ratio is the variance and the square root is the standard deviation. In English, we can describe the standard deviation as: the square root of the sum of the squared deviations from the mean divided by the number of scores minus one. Although we can calculate these univariate statistics by hand, it gets quite tedious when you have more than a few values and variables. Every statistics program is capable of calculating them easily for you. For instance, I put the eight scores into a Page 5 of 13

6 statistical software program (eg, Epi Info, Excel, SPPS, Stata, SAS) and got the following table as a result, which confirms the calculations done by hand: N 8 Mean Median Mode Std. Deviation Variance Range The standard deviation allows us to reach some conclusions about specific scores in our distribution. Assuming that the distribution of scores is normal or bell-shaped (or close to it!), the following conclusions can be reached: approximately 68% of the scores in the sample fall within one standard deviation of the mean approximately 95% of the scores in the sample fall within two standard deviations of the mean approximately 99% of the scores in the sample fall within three standard deviations of the mean For instance, since the mean in our example is and the standard deviation is , we can from the above statement estimate that approximately 95% of the scores will fall in the range of (2*7.0799) to (2*7.0799) or between and This kind of information is a critical stepping stone to enabling us to compare the performance of an individual on one variable with their performance on another, even when the variables are measured on entirely different scales. 4.3 Inferential Statistics: Inferential statistics test hypotheses about the data and may permit you to generalize beyond your dataset. Examples include comparing means (averages) for a given measurement between several different groups or for the same individuals across time: Trends Analysis: If your study involves repeating the same measurements at different points in time, repeated measures or time series analyses can help find changes or patterns over time. Analysis Of Variance Models: Analysis of variance is used to compare average scores for different groups. Many different ANOVA models are available to tease out the effect of covariates (factors expected to relate to the outcome, e.g., stressful life events and depression) and handle multiple dependent variables (outcomes). Multiple Regression & Correlation: Correlation measures the strength of the relationship between different variables in your data (e.g., there is a strong relationship between height and shoe size, a weak relationship between height and IQ). Multiple regression examines how well one set of variables (e.g. hours studied, IQ, interest in topic) predicts an outcome variable (grade on an exam) and specifies the unique contributions of each predictor (e.g., Page 6 of 13

7 hours studied is more important than IQ in predicting grade). Canonical correlation analysis relates one set of variables to another set. Page 7 of 13

8 References AO Foundation (n.d.). Step-by-step guide to DOING clinical research. Retrieved 09 October 2006 from Professional Data Analysts (n.d.). Stage 3: Data Analysis.Retrieved 09 October 2006 from Trochim, W.M.K. (2006).Descriptive Statistics. Retrieved 09 October 2006 from Wikipedia (n.d.). Statistical Bias. Retrieved on 09 October 2006 from Page 8 of 13

### DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics

### Northumberland Knowledge

Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Analyzing and interpreting data Evaluation resources from Wilder Research

Wilder Research Analyzing and interpreting data Evaluation resources from Wilder Research Once data are collected, the next step is to analyze the data. A plan for analyzing your data should be developed

### MEASURES OF VARIATION

NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### Statistics. Measurement. Scales of Measurement 7/18/2012

Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does

### Classify the data as either discrete or continuous. 2) An athlete runs 100 meters in 10.5 seconds. 2) A) Discrete B) Continuous

Chapter 2 Overview Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Classify as categorical or qualitative data. 1) A survey of autos parked in

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

### 3: Summary Statistics

3: Summary Statistics Notation Let s start by introducing some notation. Consider the following small data set: 4 5 30 50 8 7 4 5 The symbol n represents the sample size (n = 0). The capital letter X denotes

### BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### Lesson 4 Measures of Central Tendency

Outline Measures of a distribution s shape -modality and skewness -the normal distribution Measures of central tendency -mean, median, and mode Skewness and Central Tendency Lesson 4 Measures of Central

EC151.02 Statistics for Business and Economics (MWF 8:00-8:50) Instructor: Chiu Yu Ko Office: 462D, 21 Campenalla Way Phone: 2-6093 Email: kocb@bc.edu Office Hours: by appointment Description This course

### Descriptive Statistics and Measurement Scales

Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

### consider the number of math classes taken by math 150 students. how can we represent the results in one number?

ch 3: numerically summarizing data - center, spread, shape 3.1 measure of central tendency or, give me one number that represents all the data consider the number of math classes taken by math 150 students.

### Describing Data: Measures of Central Tendency and Dispersion

100 Part 2 / Basic Tools of Research: Sampling, Measurement, Distributions, and Descriptive Statistics Chapter 8 Describing Data: Measures of Central Tendency and Dispersion In the previous chapter we

### 1.3 Measuring Center & Spread, The Five Number Summary & Boxplots. Describing Quantitative Data with Numbers

1.3 Measuring Center & Spread, The Five Number Summary & Boxplots Describing Quantitative Data with Numbers 1.3 I can n Calculate and interpret measures of center (mean, median) in context. n Calculate

### Statistics Revision Sheet Question 6 of Paper 2

Statistics Revision Sheet Question 6 of Paper The Statistics question is concerned mainly with the following terms. The Mean and the Median and are two ways of measuring the average. sumof values no. of

### Measurement & Data Analysis. On the importance of math & measurement. Steps Involved in Doing Scientific Research. Measurement

Measurement & Data Analysis Overview of Measurement. Variability & Measurement Error.. Descriptive vs. Inferential Statistics. Descriptive Statistics. Distributions. Standardized Scores. Graphing Data.

### Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

### Descriptive statistics parameters: Measures of centrality

Descriptive statistics parameters: Measures of centrality Contents Definitions... 3 Classification of descriptive statistics parameters... 4 More about central tendency estimators... 5 Relationship between

### Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

### STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

### DESCRIPTIVE STATISTICS & DATA PRESENTATION*

Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

### Statistics Chapter 2

Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students

### Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1 Review 1. As part of survey of college students a researcher is interested in the variable class standing. She records a 1 if the student is a freshman, a 2 if the student

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### How To Write A Data Analysis

Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

### Variables. Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

### COMPARISON MEASURES OF CENTRAL TENDENCY & VARIABILITY EXERCISE 8/5/2013. MEASURE OF CENTRAL TENDENCY: MODE (Mo) MEASURE OF CENTRAL TENDENCY: MODE (Mo)

COMPARISON MEASURES OF CENTRAL TENDENCY & VARIABILITY Prepared by: Jess Roel Q. Pesole CENTRAL TENDENCY -what is average or typical in a distribution Commonly Measures: 1. Mode. Median 3. Mean quantified

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### 2. Filling Data Gaps, Data validation & Descriptive Statistics

2. Filling Data Gaps, Data validation & Descriptive Statistics Dr. Prasad Modak Background Data collected from field may suffer from these problems Data may contain gaps ( = no readings during this period)

### Exploratory Data Analysis. Psychology 3256

Exploratory Data Analysis Psychology 3256 1 Introduction If you are going to find out anything about a data set you must first understand the data Basically getting a feel for you numbers Easier to find

### DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

DATA ANALYSIS QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University Quantitative Research What is Statistics? Statistics (as a subject) is the science

### Descriptive Statistics and Exploratory Data Analysis

Descriptive Statistics and Exploratory Data Analysis Dean s s Faculty and Resident Development Series UT College of Medicine Chattanooga Probasco Auditorium at Erlanger January 14, 2008 Marc Loizeaux,

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### EXPLORING SPATIAL PATTERNS IN YOUR DATA

EXPLORING SPATIAL PATTERNS IN YOUR DATA OBJECTIVES Learn how to examine your data using the Geostatistical Analysis tools in ArcMap. Learn how to use descriptive statistics in ArcMap and Geoda to analyze

### Data Exploration Data Visualization

Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

3.2 Measures of Spread In some data sets the observations are close together, while in others they are more spread out. In addition to measures of the center, it's often important to measure the spread

### Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

### Geostatistics Exploratory Analysis

Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

### The Normal Distribution

Chapter 6 The Normal Distribution 6.1 The Normal Distribution 1 6.1.1 Student Learning Objectives By the end of this chapter, the student should be able to: Recognize the normal probability distribution

### Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

### Descriptive Statistics

Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web

### Chapter 5 Analysis of variance SPSS Analysis of variance

Chapter 5 Analysis of variance SPSS Analysis of variance Data file used: gss.sav How to get there: Analyze Compare Means One-way ANOVA To test the null hypothesis that several population means are equal,

### 9. Sampling Distributions

9. Sampling Distributions Prerequisites none A. Introduction B. Sampling Distribution of the Mean C. Sampling Distribution of Difference Between Means D. Sampling Distribution of Pearson's r E. Sampling

### How To: Analyse & Present Data

INTRODUCTION The aim of this How To guide is to provide advice on how to analyse your data and how to present it. If you require any help with your data analysis please discuss with your divisional Clinical

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### Directions for using SPSS

Directions for using SPSS Table of Contents Connecting and Working with Files 1. Accessing SPSS... 2 2. Transferring Files to N:\drive or your computer... 3 3. Importing Data from Another File Format...

### Center: Finding the Median. Median. Spread: Home on the Range. Center: Finding the Median (cont.)

Center: Finding the Median When we think of a typical value, we usually look for the center of the distribution. For a unimodal, symmetric distribution, it s easy to find the center it s just the center

### January 26, 2009 The Faculty Center for Teaching and Learning

THE BASICS OF DATA MANAGEMENT AND ANALYSIS A USER GUIDE January 26, 2009 The Faculty Center for Teaching and Learning THE BASICS OF DATA MANAGEMENT AND ANALYSIS Table of Contents Table of Contents... i

### When to use Excel. When NOT to use Excel 9/24/2014

Analyzing Quantitative Assessment Data with Excel October 2, 2014 Jeremy Penn, Ph.D. Director When to use Excel You want to quickly summarize or analyze your assessment data You want to create basic visual

### 4. Descriptive Statistics: Measures of Variability and Central Tendency

4. Descriptive Statistics: Measures of Variability and Central Tendency Objectives Calculate descriptive for continuous and categorical data Edit output tables Although measures of central tendency and

### Organizing Your Approach to a Data Analysis

Biost/Stat 578 B: Data Analysis Emerson, September 29, 2003 Handout #1 Organizing Your Approach to a Data Analysis The general theme should be to maximize thinking about the data analysis and to minimize

### Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools

Data Mining Techniques Chapter 5: The Lure of Statistics: Data Mining Using Familiar Tools Occam s razor.......................................................... 2 A look at data I.........................................................

### Midterm Review Problems

Midterm Review Problems October 19, 2013 1. Consider the following research title: Cooperation among nursery school children under two types of instruction. In this study, what is the independent variable?

### Chapter 2: Descriptive Statistics

Chapter 2: Descriptive Statistics **This chapter corresponds to chapters 2 ( Means to an End ) and 3 ( Vive la Difference ) of your book. What it is: Descriptive statistics are values that describe the

### " Y. Notation and Equations for Regression Lecture 11/4. Notation:

Notation: Notation and Equations for Regression Lecture 11/4 m: The number of predictor variables in a regression Xi: One of multiple predictor variables. The subscript i represents any number from 1 through

### Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques

### Measures of Central Tendency and Variability: Summarizing your Data for Others

Measures of Central Tendency and Variability: Summarizing your Data for Others 1 I. Measures of Central Tendency: -Allow us to summarize an entire data set with a single value (the midpoint). 1. Mode :

### DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1

DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1 OVERVIEW STATISTICS PANIK...THE THEORY AND METHODS OF COLLECTING, ORGANIZING, PRESENTING, ANALYZING, AND INTERPRETING DATA SETS SO AS TO DETERMINE THEIR ESSENTIAL

### Correlation key concepts:

CORRELATION Correlation key concepts: Types of correlation Methods of studying correlation a) Scatter diagram b) Karl pearson s coefficient of correlation c) Spearman s Rank correlation coefficient d)

### Survey Research Data Analysis

Survey Research Data Analysis Overview Once survey data are collected from respondents, the next step is to input the data on the computer, do appropriate statistical analyses, interpret the data, and

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

### MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

Final Exam Review MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) A researcher for an airline interviews all of the passengers on five randomly

### Data exploration with Microsoft Excel: univariate analysis

Data exploration with Microsoft Excel: univariate analysis Contents 1 Introduction... 1 2 Exploring a variable s frequency distribution... 2 3 Calculating measures of central tendency... 16 4 Calculating

### HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

Mathematics Revision Guides Histograms, Cumulative Frequency and Box Plots Page 1 of 25 M.K. HOME TUITION Mathematics Revision Guides Level: GCSE Higher Tier HISTOGRAMS, CUMULATIVE FREQUENCY AND BOX PLOTS

### A Correlation of. to the. South Carolina Data Analysis and Probability Standards

A Correlation of to the South Carolina Data Analysis and Probability Standards INTRODUCTION This document demonstrates how Stats in Your World 2012 meets the indicators of the South Carolina Academic Standards

There are three kinds of people in the world those who are good at math and those who are not. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Positive Views The record of a month

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### TIPS FOR DOING STATISTICS IN EXCEL

TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you

### Def: The standard normal distribution is a normal probability distribution that has a mean of 0 and a standard deviation of 1.

Lecture 6: Chapter 6: Normal Probability Distributions A normal distribution is a continuous probability distribution for a random variable x. The graph of a normal distribution is called the normal curve.

### Chapter 7: Simple linear regression Learning Objectives

Chapter 7: Simple linear regression Learning Objectives Reading: Section 7.1 of OpenIntro Statistics Video: Correlation vs. causation, YouTube (2:19) Video: Intro to Linear Regression, YouTube (5:18) -

### When to Use a Particular Statistical Test

When to Use a Particular Statistical Test Central Tendency Univariate Descriptive Mode the most commonly occurring value 6 people with ages 21, 22, 21, 23, 19, 21 - mode = 21 Median the center value the

### IBM SPSS Statistics for Beginners for Windows

ISS, NEWCASTLE UNIVERSITY IBM SPSS Statistics for Beginners for Windows A Training Manual for Beginners Dr. S. T. Kometa A Training Manual for Beginners Contents 1 Aims and Objectives... 3 1.1 Learning

### Module 4: Data Exploration

Module 4: Data Exploration Now that you have your data downloaded from the Streams Project database, the detective work can begin! Before computing any advanced statistics, we will first use descriptive

### Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

### A Picture Really Is Worth a Thousand Words

4 A Picture Really Is Worth a Thousand Words Difficulty Scale (pretty easy, but not a cinch) What you ll learn about in this chapter Why a picture is really worth a thousand words How to create a histogram

### Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

### NCSS Statistical Software Principal Components Regression. In ordinary least squares, the regression coefficients are estimated using the formula ( )

Chapter 340 Principal Components Regression Introduction is a technique for analyzing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS INTRODUCTION TO STATISTICS MATH 2050 Class Hours: 2.0 Credit Hours: 3.0 Laboratory Hours: 2.0 Date Revised: Fall 2013 Catalog Course Description: Descriptive

### AP * Statistics Review. Descriptive Statistics

AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

### Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing. C. Olivia Rud, VP, Fleet Bank

Data Mining: An Overview of Methods and Technologies for Increasing Profits in Direct Marketing C. Olivia Rud, VP, Fleet Bank ABSTRACT Data Mining is a new term for the common practice of searching through