Data Evaluation: Why Important?


1 Data Evaluation
- Why important?
- Questions answered depend on data collected
- Data format & storage (electronic / hard copies)
- Where to begin with examining the raw data: exploratory analysis
- Dealing with the real world (missing, below detection limit, non-normal, normal, autocorrelation)
- Natural variability (e.g., season, hydrology / meteorology)
- Statistical approaches for assessment and detecting trends
- Data analysis resources in notebook

2 Data Evaluation: Why Important?
- Get informational value out of the data collected
- Communicate results in a summary format that relates to the questions you want to answer
- Make correct analyses and interpretations of the data
- Make sure everyone is on the same page with respect to the information to be gained from monitoring
- Move future monitoring forward in the direction that best meets objectives

3 Questions Answered Match Data Collected
For example:

Frequency | Explanatory variables? (e.g., flow / stage / rainfall / season / land use) | Question answered
1 grab sample | None | Screening for potential follow-up
Synoptic | Yes (essential) | Snapshot watershed assessment
Even interval for multi-years (e.g., biweekly, quarterly) | No | Not much
Even interval for multi-years (e.g., biweekly, quarterly) | Yes | Long-term trends (e.g., adjusted concentrations, biological health)
Storm water samples | Yes | Loads / watershed assessments (long-term trends if sampling sustained)

4 Questions Asked (Con't): Linking Water Quality and Land Treatment / Use
- Watershed experimental design essential
- Land treatment / land use and water quality monitoring
- Explanatory variables to isolate water quality trends due to BMPs
- Match land treatment (LT) and water quality (WQ) data:
  - Hydrologic (spatial)
  - Time basis (temporal), multi-year

5 Data Evaluation: Data Format and Storage
- Collect data in a format / layout similar to computer entry (and vice versa), e.g., forms
- Include date, location, time, etc. on ALL records (not just in the file name)
  - Allows for more analysis flexibility
  - Minimizes errors in data identification
- Make unique data fields that you can sort on (e.g., site, date, comments / data flags, TP)
- Don't combine fields: an entry such as "<.01" is not good, because it mixes a flag and a value in one field
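The advice against combined fields can be illustrated with a minimal sketch that separates a legacy entry such as "<.01" into a numeric value and a flag. The function name `split_value_and_flag` and the sample entries are hypothetical, not part of the original slides.

```python
from typing import Optional, Tuple

def split_value_and_flag(raw: str) -> Tuple[Optional[float], str]:
    """Split an entry such as '<.01' into a numeric value and a flag."""
    text = raw.strip()
    if not text:
        return None, "missing"      # blank cell: no value, explicit flag
    flag = ""
    if text[0] in "<>":             # leading censoring symbol becomes the flag
        flag = text[0]
        text = text[1:].strip()
    return float(text), flag

# Hypothetical keyed-in entries, including one combined field.
records = ["0.05", "<.01", ">2400", ""]
parsed = [split_value_and_flag(r) for r in records]
print(parsed)
```

Stored this way, the value column stays numeric and sortable, and the flag column records which values are censored.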

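6 Example Spread Sheet Data Entry
Column headings: Site, NO2+3, TKN, TP, TSS, TS, FC, FC_flag, FS
Units row: mg/l, mg/l, mg/l, mg/l, mg/l, ---, mpn/100ml
Rows: one record per sampling date at site E, roughly weekly from April to June 1993 (Apr-93, 13-Apr-93, 21-Apr-93, 27-Apr-93, 04-May-93, 11-May-93, 18-May-93, 25-May-93, 02-Jun-93, 08-Jun-93), with flag entries such as n, e, and >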

7 Data Evaluation: Data Format and Storage (Con't)
- Build in data entry QA (e.g., allowable minimum / maximum values, character vs. numeric)
- Keep hard copies (remember the card readers, or the CP/M operating system and 8-inch floppies)
- Have data entry fields for field observations or narrative
- Back-ups, back-ups, back-ups
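A data entry QA rule of the kind described above might look like the following sketch. The allowable ranges in `ALLOWED_RANGES` are made-up placeholders; real limits would come from the monitoring program's own QA plan.

```python
# Hypothetical allowable ranges (mg/l); replace with project-specific limits.
ALLOWED_RANGES = {"TP": (0.0, 10.0), "TSS": (0.0, 5000.0)}

def check_entry(field, raw):
    """Return a list of problems found for one keyed-in value."""
    try:
        value = float(raw)          # character vs. numeric check
    except ValueError:
        return [f"{field}: '{raw}' is not numeric"]
    low, high = ALLOWED_RANGES[field]
    if not low <= value <= high:    # minimum / maximum check
        return [f"{field}: {value} outside allowed range {low}-{high}"]
    return []

print(check_entry("TP", "0.12"))    # passes both checks
print(check_entry("TP", "abc"))     # fails the numeric check
print(check_entry("TSS", "99999"))  # fails the range check
```

A value that fails a check is flagged for review rather than silently corrected, which keeps the hard copy and the electronic record in agreement.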

8 Data Evaluation: Exploratory Data Analysis
Check for data entry errors:
- Minimum / maximum / average values, to check for exceptionally high / low values ("outliers")
- Box and whisker plots (box plots), to check for exceptionally high / low values and highly skewed data
- Time plots or time series plots: plot data values vs. time to visually examine for unreasonable data
- Skewness tests (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
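As a rough sketch of the first two checks, the function below reports minimum, maximum, mean, and median, and lists values beyond the common 1.5 x IQR box plot fences. The 1.5 multiplier is the usual box plot convention rather than something stated on the slide, and the TSS values are invented for illustration.

```python
import statistics

def screen_values(values):
    """Summarize a series and list points beyond the usual box plot whiskers."""
    q1, _, q3 = statistics.quantiles(values, n=4)   # quartiles
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "suspect": [v for v in values if v < low_fence or v > high_fence],
    }

tss = [12, 15, 9, 14, 11, 13, 10, 1400, 16, 12]     # made-up values, mg/l
summary = screen_values(tss)
print(summary)
```

A value listed under `suspect` is only a prompt to go back to the field sheet or lab report; it is not by itself a reason to delete the record.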

9 Data Evaluation: Exploratory Data Analysis (Con't)
Check for data distribution attributes:
- Normality tests: test for departure from the normal distribution or bell-shaped curve (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
- Skewness tests: test for long tails (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
- Time plots or time series plots: visually examine for seasonality and autocorrelation
- Autocorrelation tests (e.g., PROC AUTOREG in SAS)
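For readers without SAS, two of these attributes can be approximated by hand. The sketch below computes a moment-based skewness coefficient and a lag-1 autocorrelation coefficient; both are simple descriptive statistics, not the formal tests that PROC UNIVARIATE or PROC AUTOREG report, and the function names are assumptions made for this example.

```python
import statistics

def skewness(values):
    """Moment-based sample skewness (population form, no bias correction)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def lag1_autocorrelation(values):
    """Lag-1 serial correlation of an evenly spaced series."""
    mean = statistics.fmean(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

print(skewness([1, 1, 1, 1, 10]))                       # long right tail
print(lag1_autocorrelation([1, 2, 3, 4, 5, 6, 7, 8]))   # steadily rising series
```

A skewness well above zero suggests a long right tail, and a lag-1 coefficient far from zero suggests that consecutive samples are not independent.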

10 Real World: Dealing with Outliers
- Do you throw them out? Perhaps, if you can trace the error back to a data entry, lab, or field QA/QC problem
- Else KEEP: these may be where the real information is held

11 Real World: Dealing with Below-DL Values
- BEST: Use the actual instrument values (could be negative); this reflects variability and distribution at the lower range. (In hard copy reports, use DL values with a "less than DL" flag.)
- If <20% below DL: can substitute ½ the value of the DL (e.g., if the DL is 0.01 mg/l, then substitute 0.005 mg/l). BUT, if the value is really 0.01, do not change it (this is the value of a flag variable for the DL).
- Else: use alternative statistical analysis, e.g., frequency analysis
- Else: generate synthetic data that mimics the distribution at the tail
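The half-DL rule can be sketched as follows, assuming the data already carry a separate flag column in which "<" marks a below-DL result. The function name, the flag convention, and the sample values are illustrative assumptions.

```python
def substitute_half_dl(values, flags, dl, max_fraction=0.20):
    """Replace values flagged '<' with half the detection limit.

    Refuses to substitute when the censored share reaches max_fraction,
    because other methods are recommended in that case.
    """
    censored = sum(1 for f in flags if f == "<")
    if censored / len(values) >= max_fraction:
        raise ValueError("too many below-DL values for simple substitution")
    # Only flagged values change; a real measurement equal to the DL is kept.
    return [dl / 2 if f == "<" else v for v, f in zip(values, flags)]

tp = [0.03, 0.01, 0.05, 0.01, 0.02, 0.04]   # made-up TP values, mg/l
flags = ["", "<", "", "", "", ""]           # only the second value is censored
adjusted = substitute_half_dl(tp, flags, dl=0.01)
print(adjusted)
```

The fourth value is a genuine 0.01 mg/l measurement with no flag, so it is left unchanged, which is exactly the distinction the flag variable exists to preserve.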

12 Real World: Dealing with Missing Values
- BEST: Have sufficient data frequency and use the rest of the data values for analysis
- Substitution, e.g., regression analysis: plot values of TP concentration vs. stream flow. If there is a good correlation, calculate estimated values for missing TP concentrations when discharge is known. USE SPARINGLY
- Aggregation: combine data over time intervals (e.g., weekly averages, annual averages)
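The regression substitution can be sketched with ordinary least squares on the paired observations. The helper names `fit_line` and `fill_missing_tp`, and the flow and TP values, are hypothetical; in practice the strength of the correlation should be checked before any estimate is used.

```python
def fit_line(x, y):
    """Ordinary least squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fill_missing_tp(flow, tp):
    """Estimate TP where it is None, using the TP-versus-flow line."""
    pairs = [(q, c) for q, c in zip(flow, tp) if c is not None]
    intercept, slope = fit_line([q for q, _ in pairs], [c for _, c in pairs])
    return [c if c is not None else intercept + slope * q
            for q, c in zip(flow, tp)]

flow = [1.0, 2.0, 3.0, 4.0, 5.0]        # made-up discharge values
tp = [0.1, 0.2, None, 0.4, 0.5]         # made-up TP, one value missing
filled = fill_missing_tp(flow, tp)
print(filled)
```

Estimated values should carry their own flag in the data set so that later analyses can tell them apart from measurements.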

13 Real World: Dealing with Non-Normality
- Data transformation: Log(X). The log-normal distribution (i.e., the log-transformed data have a normal distribution) is very common for water quality pollutant concentration data. An attribute of such data is that there are a few high values in the tail.
- Utilize non-parametric statistical analyses. However, this doesn't cure all problems.
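A small sketch of the log transformation, using invented TP concentrations with one high value in the tail. It compares the arithmetic mean with the geometric mean (the back-transformed mean of the logs), which is far less influenced by the tail. Base-10 logs are an arbitrary choice here; the slide only says Log(X).

```python
import math
import statistics

conc = [0.02, 0.03, 0.03, 0.04, 0.05, 0.06, 0.08, 0.90]  # made-up TP, mg/l
log_conc = [math.log10(c) for c in conc]                  # requires all values > 0

arithmetic_mean = statistics.fmean(conc)
geometric_mean = 10 ** statistics.fmean(log_conc)         # back-transformed log mean

print(round(arithmetic_mean, 4), round(geometric_mean, 4))
```

Zero or negative values cannot be log-transformed, which is one reason the handling of below-DL values has to be settled first.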

14 Parametric vs. Nonparametric

Parametric
- Mean = central tendency
- Symmetrical distribution about the mean (usually normal); lognormal and slightly skewed OK
- Must adjust for:
  - Autocorrelation (easy)
  - Seasonal differences (easy)
  - Variance heterogeneity (doable)
  - Hydrology, flow (easy)
- Versatile, excellent for:
  - Assessments of variability
  - Step trends
  - Linear trends
  - Ramp trends

Nonparametric
- Median = central tendency
- Normality not required; skewed and outlier data OK
- Must adjust for:
  - Autocorrelation (doable)
  - Seasonal differences (easy)
  - Variance heterogeneity (difficult)
  - Hydrology, flow (2 steps)
- Excellent for:
  - Assessments of variability
  - Step trends
  - Monotonic trends

15 Real World: Dealing with Autocorrelation
- Time series analysis, e.g., PROC AUTOREG in SAS (appropriate for weekly or monthly data)
  - Useful in regression relationships (e.g., time trends, correlation between sites such as paired watersheds or upstream/downstream)
  - Note: spatial autocorrelation analysis methods are available
- Aggregate data: average into larger time steps (e.g., quarterly, annually). Problem: potential loss of degrees of freedom.
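The aggregation option can be sketched as a quarterly average of dated samples. The function name and the sample records are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date
import statistics

def quarterly_means(samples):
    """Average (date, value) pairs into (year, quarter) bins."""
    bins = defaultdict(list)
    for day, value in samples:
        quarter = (day.month - 1) // 3 + 1
        bins[(day.year, quarter)].append(value)
    return {key: statistics.fmean(vals) for key, vals in sorted(bins.items())}

samples = [                      # made-up weekly TSS values, mg/l
    (date(1993, 4, 13), 2.0),
    (date(1993, 5, 4), 4.0),
    (date(1993, 7, 6), 6.0),
]
means = quarterly_means(samples)
print(means)
```

Three samples collapse into two quarterly values here, which shows the loss of degrees of freedom that the slide warns about.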

16 Real World: Dealing with Seasonality
- Use explanatory variable (covariate) adjustment, with measured variables describing hydrologic / meteorological changes, to adjust for seasonal changes, e.g.:
  - TP concentration adjusted for stream discharge by including discharge as an X variable in trend analysis
  - Normalize: e.g., adjust the load value to the average storm discharge level to allow comparison across storms
- Model seasonal cycles into the analysis, e.g.:
  - Indicator variables (0 or 1) for each month/season
  - Sinusoidal models
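The two ways of modeling seasonal cycles can be sketched as extra X variables built from the sampling month. The choice of January as the reference month and of a single 12-month sine/cosine pair are assumptions of this example, not prescriptions from the slides.

```python
import math

def seasonal_terms(month):
    """Sine and cosine terms for a 12-month cycle (month is 1-12)."""
    angle = 2 * math.pi * (month - 1) / 12
    return math.sin(angle), math.cos(angle)

def month_indicators(month):
    """Eleven 0/1 indicators, with January as the reference month."""
    return [1 if month == m else 0 for m in range(2, 13)]

for month in (1, 4, 7):
    print(month, seasonal_terms(month), month_indicators(month))
```

The indicator approach spends eleven degrees of freedom on the seasonal pattern, while the sinusoidal approach spends two but assumes a smooth annual cycle.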

17 Natural Variability: What's in a MEAN
- Central tendency
  - Good summary statistic
  - Doesn't tell the full story
- The fallacy of the mean
  - Doesn't show range or variability
  - Hard to show statistically significant differences between mean values without variance
  - Non-robust to extremes

18 Natural Variability: Variability is our Friend
- Use to determine Minimal Detectable Changes (MDC) or differences
- Find the "goods" and the "bads"
- Avoid unrealistic expectations of good or bad conditions
- Recognize that year-to-year variability can be LARGE

19 Natural Variability
Utilize explanatory variables / covariates (to minimize unexplained variability and assist with making correct data interpretations), such as:
- Land use
- Stream flow / discharge / stage height
- Precipitation
- Ground water table depth
- Temperature
- Season
- Upstream conditions


22 Waukegan River, Illinois: IBI (e.g., IBG, "I B Guessing")
[Figure: IBI (Y axis) versus elapsed months; labels: Pre, Treatment, Post, Control; series S1 and S2]

23 Statistical Analysis Toolbox
- No witch hunts allowed: pre-planned questions only
- Utilize statistical test(s) that address the questions / objectives (e.g., assessments of central tendency and variability, step change, gradual change)
- Utilize multiple statistical approaches and graphical presentations

24 Statistical Distribution Assessment
- Box and whisker plots
- Mean & variance / standard deviation
- Median & percentile analysis
- Frequency distribution analysis (e.g., percent of data in the 25th, 50th, and 75th percentiles)
- Percent exceedance of a standard
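Percentiles and percent exceedance are straightforward to compute. In the sketch below, the fecal coliform counts and the threshold of 200 are made-up illustration values, not a regulatory standard.

```python
import statistics

def percent_exceedance(values, standard):
    """Percentage of samples strictly above a numeric standard."""
    return 100.0 * sum(1 for v in values if v > standard) / len(values)

fc = [20, 45, 110, 230, 80, 400, 15, 60, 900, 35]   # made-up counts, mpn/100ml
p25, p50, p75 = statistics.quantiles(fc, n=4)        # 25th, 50th, 75th percentiles
exceed = percent_exceedance(fc, standard=200)        # 200 is an assumed threshold

print(p25, p50, p75, exceed)
```

Reporting the percentiles alongside the exceedance percentage shows both where the bulk of the data lies and how often the high tail crosses the threshold.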

25 BMP Effectiveness: An Example Across Sites / Studies (e.g., multiple watersheds)
[Figure: Changes in sediment load, conservation tillage: range and mean (lowest, highest, mean) of % load reduction]
[Figure: Changes in total P load, conservation tillage: range and mean (lowest, highest, mean) of % load reduction]

26 Correlation Between Variables (e.g., TSS and Turbidity, Long Creek)
[Figure: Correlation between TSS and turbidity: TSS (observed Y and predicted Y) versus turbidity]
[Figure: TSS vs. turbidity, Long Creek, Site E: log(TSS) (observed Y and predicted Y) versus log(turbidity)]
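The comparison shown in the two plots, correlation on the raw scale and on the log scale, can be sketched with a hand-written Pearson coefficient. The turbidity and TSS values below are invented; they are not the Long Creek data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

turbidity = [5, 12, 30, 75, 160]    # made-up turbidity values
tss = [8, 20, 55, 140, 310]         # made-up TSS values, mg/l

r_raw = pearson_r(turbidity, tss)
r_log = pearson_r([math.log10(v) for v in turbidity],
                  [math.log10(v) for v in tss])
print(round(r_raw, 3), round(r_log, 3))
```

With real data, the log-log relationship is often the more useful one because a few very high storm values no longer dominate the fit.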

27 Statistical Approaches: Comparisons Between Locations
- Parametric:
  - t-test (compare mean values between 2 groups)
  - Analysis of variance, ANOVA (compare more than 2 groups)
  - Analysis of covariance (addition of an explanatory variable, which can be a continuous variable such as stream flow)
- Non-parametric:
  - Wilcoxon Rank Sum (~t-test)
  - Kruskal-Wallis k-sample (~ANOVA)
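As an illustration of the non-parametric two-group comparison, here is a minimal Wilcoxon rank sum sketch with a normal approximation. It uses midranks for ties but applies neither a tie correction nor a continuity correction, so the p-value is approximate; for small samples an exact table or a statistics package is the better choice. The site values are made up.

```python
import math
from statistics import NormalDist

def rank_sum_test(group_a, group_b):
    """Wilcoxon rank sum test: returns (W for group A, z, two-sided p)."""
    pooled = sorted(group_a + group_b)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # midrank of positions i+1..j
        i = j
    n_a, n_b = len(group_a), len(group_b)
    w = sum(ranks[v] for v in group_a)       # rank sum of group A
    mean_w = n_a * (n_a + n_b + 1) / 2       # expected W under no difference
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p

site_1 = [1.0, 2.0, 3.0]    # made-up concentrations at one location
site_2 = [4.0, 5.0, 6.0]    # made-up concentrations at another location
w, z, p = rank_sum_test(site_1, site_2)
print(w, round(z, 3), round(p, 4))
```

Because only ranks enter the calculation, a single extreme value at either site has no more influence than any other largest value.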

28 Statistical Approaches: Step Trend
Comparison between 2 time periods. Parametric tests:
- t-test (non-paired or paired); paired is usually more powerful
- Analysis of variance
- Analysis of covariance
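The paired t-test reduces to a one-sample test on the differences. The sketch below returns only the t statistic and its degrees of freedom, which would then be compared with a t table or passed to a statistics package; the before and after values are invented.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic and degrees of freedom for paired differences (after - before)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1

before = [10, 12, 9, 11]    # made-up pre-BMP values at four paired sampling points
after = [8, 9, 8, 8]        # made-up post-BMP values at the same points
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```

Pairing removes the variability shared by each pair, which is why the paired form is usually the more powerful of the two.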

29 Statistical Approaches: Step Trend
Non-parametric tests:
- Step trend (non-paired):
  - Wilcoxon Rank Sum Test
  - Seasonal Wilcoxon Rank Sum Test
  - Kruskal-Wallis k-sample (compares more than 2 groups, ~analysis of variance)
- Step trend (paired differences):
  - Wilcoxon Signed Rank Test

30 Statistical Approaches: Continuous Trend
Parametric tests:
- Linear regression
  - Add explanatory variables (covariates) where appropriate
  - Can ADD a dummy variable to mimic a ramp
- Analysis of covariance (e.g., adjustment for upstream concentration or control watershed)
- Time series analysis

31 Statistical Approaches: Continuous Trend
Non-parametric tests:
- Correlation: Spearman's Rank Correlation (Spearman's rho)
- Monotonic trends:
  - Kendall's tau (Mann-Kendall, Kendall Rank Correlation)
  - Seasonal Kendall Test
  - 2-step process: 1) calculate residuals from a linear regression of concentration vs. discharge; 2) utilize the residuals (adjusted values) in one of the above tests
- Contingency table (e.g., Cochran-Mantel-Haenszel (CMH) statistics)
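A minimal Mann-Kendall sketch follows. It assumes no tied values and no serial correlation, uses the standard large-sample variance without a tie correction, and applies the usual continuity correction of one unit to S; the input series is invented. For flow-adjusted trends, the input would be the residuals from step 1 of the 2-step process rather than raw concentrations.

```python
import math
from statistics import NormalDist

def mann_kendall(values):
    """Mann-Kendall S, Kendall's tau (tau-a), and an approximate two-sided p-value."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)     # +1 for an increase, -1 for a decrease
    tau = s / (n * (n - 1) / 2)
    var_s = n * (n - 1) * (2 * n + 5) / 18   # valid only without ties
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return s, tau, p

series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]     # made-up, strictly increasing
s, tau, p = mann_kendall(series)
print(s, tau, p)
```

The test detects any monotonic trend, not only a linear one, because it depends solely on the signs of the pairwise differences.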

32 Long Creek, NC 319 NMP
83%, 76%, 78%, and 33% reductions in sediment, TP, TKN, and nitrate-N loads, respectively (upstream/downstream before/after design)

33 Long Creek, NC
(See NWQEP NOTES, July 1999, Figure 7 for a SAS program to test for trends in downstream after adjusting for upstream)
[Figure: log weekly TSS load (lbs) versus week; series: downstream (treatment) and upstream (control); partial annotation: "of BMPs"]

34 Section 319 NMP Projects
Morro Bay, California: 4-H Watershed Model, Youth Education


More information

EXPLORATORY DATA ANALYSIS: GETTING TO KNOW YOUR DATA

EXPLORATORY DATA ANALYSIS: GETTING TO KNOW YOUR DATA EXPLORATORY DATA ANALYSIS: GETTING TO KNOW YOUR DATA Michael A. Walega Covance, Inc. INTRODUCTION In broad terms, Exploratory Data Analysis (EDA) can be defined as the numerical and graphical examination

More information

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST

EPS 625 INTERMEDIATE STATISTICS FRIEDMAN TEST EPS 625 INTERMEDIATE STATISTICS The Friedman test is an extension of the Wilcoxon test. The Wilcoxon test can be applied to repeated-measures data if participants are assessed on two occasions or conditions

More information

ADD-INS: ENHANCING EXCEL

ADD-INS: ENHANCING EXCEL CHAPTER 9 ADD-INS: ENHANCING EXCEL This chapter discusses the following topics: WHAT CAN AN ADD-IN DO? WHY USE AN ADD-IN (AND NOT JUST EXCEL MACROS/PROGRAMS)? ADD INS INSTALLED WITH EXCEL OTHER ADD-INS

More information

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem) NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

More information

HYPOTHESIS TESTING WITH SPSS:

HYPOTHESIS TESTING WITH SPSS: HYPOTHESIS TESTING WITH SPSS: A NON-STATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER

More information

1 Quality Assurance and Quality Control Project Plan

1 Quality Assurance and Quality Control Project Plan 1 Quality Assurance and Quality Control Project Plan The purpose of this section is to describe the quality assurance/quality control program that will be used during the system specific field testing

More information

SAS/STAT. 9.2 User s Guide. Introduction to. Nonparametric Analysis. (Book Excerpt) SAS Documentation

SAS/STAT. 9.2 User s Guide. Introduction to. Nonparametric Analysis. (Book Excerpt) SAS Documentation SAS/STAT Introduction to 9.2 User s Guide Nonparametric Analysis (Book Excerpt) SAS Documentation This document is an individual chapter from SAS/STAT 9.2 User s Guide. The correct bibliographic citation

More information

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management

KSTAT MINI-MANUAL. Decision Sciences 434 Kellogg Graduate School of Management KSTAT MINI-MANUAL Decision Sciences 434 Kellogg Graduate School of Management Kstat is a set of macros added to Excel and it will enable you to do the statistics required for this course very easily. To

More information

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries

Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Basic Statistics and Data Analysis for Health Researchers from Foreign Countries Volkert Siersma siersma@sund.ku.dk The Research Unit for General Practice in Copenhagen Dias 1 Content Quantifying association

More information

consider the number of math classes taken by math 150 students. how can we represent the results in one number?

consider the number of math classes taken by math 150 students. how can we represent the results in one number? ch 3: numerically summarizing data - center, spread, shape 3.1 measure of central tendency or, give me one number that represents all the data consider the number of math classes taken by math 150 students.

More information

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples

Statistics. One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples Statistics One-two sided test, Parametric and non-parametric test statistics: one group, two groups, and more than two groups samples February 3, 00 Jobayer Hossain, Ph.D. & Tim Bunnell, Ph.D. Nemours

More information

Recall this chart that showed how most of our course would be organized:

Recall this chart that showed how most of our course would be organized: Chapter 4 One-Way ANOVA Recall this chart that showed how most of our course would be organized: Explanatory Variable(s) Response Variable Methods Categorical Categorical Contingency Tables Categorical

More information

Instructions for SPSS 21

Instructions for SPSS 21 1 Instructions for SPSS 21 1 Introduction... 2 1.1 Opening the SPSS program... 2 1.2 General... 2 2 Data inputting and processing... 2 2.1 Manual input and data processing... 2 2.2 Saving data... 3 2.3

More information

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: "What do the data look like?"

4.1 Exploratory Analysis: Once the data is collected and entered, the first question is: What do the data look like? Data Analysis Plan The appropriate methods of data analysis are determined by your data types and variables of interest, the actual distribution of the variables, and the number of cases. Different analyses

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Rank-Based Non-Parametric Tests

Rank-Based Non-Parametric Tests Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs

More information

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics

More information

Come scegliere un test statistico

Come scegliere un test statistico Come scegliere un test statistico Estratto dal Capitolo 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc. (disponibile in Iinternet) Table

More information

containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.

containing Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics. Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment

More information

Measures of Central Tendency and Variability: Summarizing your Data for Others

Measures of Central Tendency and Variability: Summarizing your Data for Others Measures of Central Tendency and Variability: Summarizing your Data for Others 1 I. Measures of Central Tendency: -Allow us to summarize an entire data set with a single value (the midpoint). 1. Mode :

More information

EXPLORING SPATIAL PATTERNS IN YOUR DATA

EXPLORING SPATIAL PATTERNS IN YOUR DATA EXPLORING SPATIAL PATTERNS IN YOUR DATA OBJECTIVES Learn how to examine your data using the Geostatistical Analysis tools in ArcMap. Learn how to use descriptive statistics in ArcMap and Geoda to analyze

More information

Univariate Regression

Univariate Regression Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is

More information

SPSS Guide How-to, Tips, Tricks & Statistical Techniques

SPSS Guide How-to, Tips, Tricks & Statistical Techniques SPSS Guide How-to, Tips, Tricks & Statistical Techniques Support for the course Research Methodology for IB Also useful for your BSc or MSc thesis March 2014 Dr. Marijke Leliveld Jacob Wiebenga, MSc CONTENT

More information

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of

More information

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate

More information

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r), Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables

More information

Nonparametric Statistics

Nonparametric Statistics Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

More information

Research Methods & Experimental Design

Research Methods & Experimental Design Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and

More information

Statistical And Trend Analysis Of Rainfall And River Discharge: Yala River Basin, Kenya

Statistical And Trend Analysis Of Rainfall And River Discharge: Yala River Basin, Kenya Statistical And Trend Analysis Of Rainfall And River Discharge: Yala River Basin, Kenya Githui F. W.* +, A. Opere* and W. Bauwens + *Department of Meteorology, University of Nairobi, P O Box 30197 Nairobi,

More information

3.2 Statistical Analysis Procedures

3.2 Statistical Analysis Procedures 3.2 Statistical Analysis Procedures There are many different types of statistical analysis that can be performed on water quality data sets for reporting and interpretation purposes. Many inferences can

More information