Trend Analysis and Presentation




Trend Monitoring: What is it and why do we do it?

Trend monitoring looks for changes in environmental parameters over time (e.g., the last 10 years) or in space (e.g., as you move downstream).

[Figure: Box plot of workshop participants' concentration vs. time of day. Y-axis: percent total concentration; x-axis: time of day, 08:00 to 17:00. Created with StatPlus software.]

How to find a trend

"Be vewy, vewy quiet. I'm hunting fow a twend." (http://www.barbneal.com/elmer.asp)

Visual
- Good news: Graphing or mapping data for people to see is the easiest way to communicate trends, especially to a non-technical crowd.
- Bad news: There is no way of measuring whether there is a trend or how big it is.

Statistical
- Good news: Can identify hard-to-see trends and gives a number that is defensible and repeatable.
- Bad news: Easy to do wrong and hard to figure out.

Selecting How to Analyze and Present a Trend

- Did your data meet your data quality objectives? Trend monitoring requires strict monitoring protocols.
- Who is your audience, and what type of analysis or presentation is appropriate for them?
- If a statistical test is required, do the data satisfy all of the statistic's assumptions?
- What is the trend telling you? Determining the cause of a trend is more difficult than detecting the trend.
- Have you considered all the exogenous parameters that could influence your trends (flow, time of day, etc.)?

Visual methods for displaying trends

- Time/area series scatter plot [Figure: Average 7-day max for July 2000 by river mile]
- Bar graphs [Figure: Summer 2000 E. coli concentrations (MPN/100 mL) by site number and month, May to October]
- Time series smoothed scatter plot*
- Time/area series box plot of statistics

* From Helsel, D.R. and R.M. Hirsch, 1991. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the USGS, Book 4, Chapter A3, p. 287.

The Trouble with Graphs

- People tend to focus on the outliers and not on more subtle changes.
- Gradual trends are hard to detect by eye.
- Seasonal variation and exogenous variables can mask trends in a parameter.
- Viewers sometimes see what they want to see.

http://dnr.metrokc.gov/wlr/waterres/streams/northtrend.htm

Statistical Trend Analysis Methods

Picking a statistic can be tricky. Review your data for the following characteristics:

- Normal distribution
- Abrupt changes
- Cycles
- Outliers
- Missing values
- Censored data
- Serial correlation

Select a statistic that is appropriate for the type of data set you have.

Statistical Trend Analysis Methods: the type of statistic depends on data characteristics

To test whether an existing data set is normally distributed, you may (a) plot a histogram of the results; or (b) conduct a test for normality, such as the Shapiro-Wilk W test, Filliben's statistic, or the studentized range test (EPA, 2000, p. 4-6).

Parametric: Statistics for normally distributed data. Includes regression of a parameter against time or a spatial measure, like river mile. Parametric statistics are rarely appropriate for environmental samples unless the data are transformed first. See the USGS guide by Helsel and Hirsch for descriptions of using regression.

Nonparametric: Statistics that are not as dependent on assumptions about the data distribution. These are generally the safer statistics to use, but they are not as readily available. Tests include the Kendall test for the presence of a consistent trend, the Sen slope for the magnitude of the trend, and Wilcoxon-Mann-Whitney step trend analysis.
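As a rough illustration of the studentized range screen named above, the sketch below computes the ratio of the sample range to the sample standard deviation, which is the quantity that test compares against tabulated critical values. The function name is hypothetical, the input reuses the five yearly values tabulated later in this transcript, and no critical values are reproduced here; they would have to be looked up in the EPA guide.

```python
import statistics

def studentized_range_ratio(values):
    """Return (max - min) / s, where s is the sample standard deviation."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    spread = max(values) - min(values)
    s = statistics.stdev(values)  # uses the n - 1 denominator
    if s == 0:
        raise ValueError("all values are identical; ratio is undefined")
    return spread / s

# Five yearly values from the worked example in this transcript.
yearly_values = [5, 6, 11, 8, 10]
print(f"range / s = {studentized_range_ratio(yearly_values):.3f}")
```

The ratio alone does not decide anything; it only becomes a normality check once compared with the critical values for the sample size in question.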

Things to watch out for with statistical analysis

- Verify and document that all assumptions have been tested and met. Failure to do so may lead to misleading results.
- The easiest test to do, OLS regression, is usually not appropriate.
- More robust non-parametric tests, like the seasonal Kendall test, are not readily available and can be computationally demanding.
- Finding no trend may only mean your data were insufficient to find the trend.
- Infrequent use of statistical methods often means relearning fairly complicated procedures every year. If you do use a statistical package, keep excellent notes of what you do and why.

[Figure: Four OLS regression charts with the same R², slope, and intercept.*]

* From Helsel, D.R. and R.M. Hirsch, 1991. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the USGS, Book 4, Chapter A3, p. 245.

Statistical Methods for Determining Trends: non-parametric, distribution-independent methods

- Seasonal Kendall test: Compares the relationship between points at separate time periods or seasons and determines if there is a trend. Highly robust and relatively powerful; the recommended method for most water quality trend monitoring (Aroner, 2001). *
- Sen slope or Kendall-Theil: Computes the slope (change in value over change in time) between each pair of points on the chart and takes the median slope as a summary statistic describing the magnitude of the trend. **
- Wilcoxon-Mann-Whitney step trend: Seasonal or non-seasonal test to identify changes that take place at a distinct time. Combine with the Hodges-Lehmann estimator to determine the magnitude of the step (Aroner, 2001).

* From Helsel, D.R. and R.M. Hirsch, 1991. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the USGS, Book 4, Chapter A3, p. 266.
** Created with WQHydro Water Quality/Hydrology Graphics/Analysis System software.
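To make the step-trend idea concrete, here is a minimal sketch of the two quantities involved: the Mann-Whitney U count, and the Hodges-Lehmann estimate of the step size, taken as the median of all pairwise "after minus before" differences. The function names and the before/after values are hypothetical, chosen only for illustration; a significance test on U is not shown.

```python
import statistics

def mann_whitney_u(before, after):
    """Count pairs where the 'after' value exceeds the 'before' value.

    Tied pairs contribute 0.5 each.
    """
    u = 0.0
    for b in before:
        for a in after:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

def hodges_lehmann_shift(before, after):
    """Median of all pairwise differences (after - before)."""
    diffs = [a - b for b in before for a in after]
    return statistics.median(diffs)

# Hypothetical results before and after a management change.
before = [4.0, 5.0, 6.0]
after = [6.5, 7.0, 9.0]

print("U =", mann_whitney_u(before, after))                  # 9.0: every 'after' value is larger
print("step size =", hodges_lehmann_shift(before, after))    # 2.5
```

Using the median of pairwise differences, rather than the difference of means, keeps a single extreme result from dominating the estimated step.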

How to Calculate Non-Parametric Statistics

Kendall test to determine if a significant trend exists: see EPA guide QA/G-9, pages 4-16 to 4-20.

Sen slope or Kendall-Theil line to measure the magnitude of the trend: see USGS Statistical Methods in Water Resources, pages 266-267.

Example data:

Year    Units
2000    5
2001    6
2002    11
2003    8
2004    10

[Figure: XY plot of data vs. time. X-axis: year (1998 to 2006); y-axis: yearly value, in some units (0 to 12).]
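The two calculations referenced above can be sketched for the five yearly values in the example. This is an illustrative sketch, not the procedure as laid out in either guide: the function names are hypothetical, the exact p-value is obtained by brute-force enumeration of all orderings (practical only for very small data sets with no tied values), and the test shown is one-sided for an upward trend.

```python
from itertools import combinations, permutations
import statistics

def kendall_s(values):
    """Mann-Kendall S: count of later > earlier pairs minus later < earlier pairs."""
    s = 0
    for earlier, later in combinations(values, 2):
        s += (later > earlier) - (later < earlier)
    return s

def exact_one_sided_p(values):
    """P(S >= observed S) if every ordering of the values were equally likely."""
    observed = kendall_s(values)
    count = total = 0
    for perm in permutations(values):
        total += 1
        count += kendall_s(perm) >= observed
    return count / total

def sen_slope(times, values):
    """Median of the slopes between every pair of points (Kendall-Theil)."""
    points = list(zip(times, values))
    slopes = [
        (y2 - y1) / (t2 - t1)
        for (t1, y1), (t2, y2) in combinations(points, 2)
        if t2 != t1
    ]
    return statistics.median(slopes)

years = [2000, 2001, 2002, 2003, 2004]
units = [5, 6, 11, 8, 10]

s = kendall_s(units)
n_pairs = len(units) * (len(units) - 1) // 2
print("S =", s)                                               # 6
print("tau =", s / n_pairs)                                   # 0.6
print("one-sided p =", round(exact_one_sided_p(units), 3))    # 0.117
print("Sen slope =", sen_slope(years, units), "units/year")   # 1.125
```

With only five points, the upward tendency (8 increasing pairs against 2 decreasing) is not significant at the 5% level, even though the median slope is about 1.1 units per year. This illustrates the earlier caution that finding no trend may only mean the data were insufficient.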

What type of trend analysis is right for you to use?

- What would you need for temperature trend monitoring?
- Is there an exogenous variable?
- What about changes in other water quality parameters?
- How could changes in monitoring schedule impact trends?

Example: Bob has been monitoring DO in the morning at a site for 3 years. He adds 4 new sites to his route and now reaches his old site in the afternoon on every visit, so samples there are now collected in the p.m. What impact will this have on your ability to track trends? What could you do to correct the problem?

[Figure: DO saturation in the Willamette R. at Portland S.P. & S. Railroad Bridge]

Statistical Methods for Determining Trends: regressions

Regression methods draw a line as close to all the data as possible. Use regressions only if you can satisfy the data requirements. Helsel and Hirsch (1991) give a good description of the steps needed to satisfy the assumptions, and of data transformation tools to bring your data within the limits of the assumptions. *

* From Helsel, D.R. and R.M. Hirsch, 1991. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the USGS, Book 4, Chapter A3, p. 279.

Trend Monitoring... The devil is in the details

Issues to consider when doing trend monitoring (1)

- Consistent data quality: Trend monitoring assumes that the same or equivalent methods and protocols are used for all the monitoring.
- Time frame and number of samples: 5 years of monthly data for water quality monotonic trend analysis (Lettenmaier et al., 1982); for step trends, at least 2 years of monthly data before and after the management change (Hirsch, 1982). (The statistic should be predefined in the QAPP to make sure you have everything you need.)
- Seasonality: Parameters that vary naturally in different seasons of the year require special statistics or de-seasonalization.
- Type of data distribution (are your data normal?)*: If your data are not normally distributed, you need to use a non-parametric statistical test.

[Figure: Comparison of Trends: Annual vs. Monthly Data, Willamette River at Newberg Bridge. Series: OWQI.month and OWQI.anual; x-axis: date.]

(1) Based on Aroner, E.R., 2000. WQHYDRO Environmental Data Analysis Technical Appendix, p. 146.
* Issue most applicable to statistical analysis
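One way to picture the "special statistics" that seasonality calls for is the seasonal Kendall approach: compute the Kendall S statistic separately within each season, so that January is only ever compared with other Januaries, then add the results. The sketch below assumes no tied values and treats the seasons as independent, which real monitoring data may not satisfy; the function names and the two short series are hypothetical, and the normal approximation is rough for so few years.

```python
from itertools import combinations
import math

def kendall_s(values):
    """Mann-Kendall S for one time-ordered series."""
    return sum((later > earlier) - (later < earlier)
               for earlier, later in combinations(values, 2))

def seasonal_kendall(series_by_season):
    """Sum S and its no-ties variance over seasons; return (S, variance, Z)."""
    total_s = 0
    total_var = 0.0
    for values in series_by_season.values():
        n = len(values)
        total_s += kendall_s(values)
        total_var += n * (n - 1) * (2 * n + 5) / 18.0
    if total_s > 0:
        z = (total_s - 1) / math.sqrt(total_var)   # continuity correction
    elif total_s < 0:
        z = (total_s + 1) / math.sqrt(total_var)
    else:
        z = 0.0
    return total_s, total_var, z

# Hypothetical values for the same two seasons over four consecutive years.
data = {
    "winter": [10, 11, 12, 14],
    "summer": [20, 19, 22, 23],
}
s, var, z = seasonal_kendall(data)
print(f"S = {s}, Var(S) = {var:.2f}, Z = {z:.2f}")
```

Comparing only like seasons keeps the large winter-to-summer difference from being mistaken for, or hiding, a year-to-year trend.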

Issues to consider when doing trend monitoring (1), continued

- Similar variance across data set*: Statistical methods require consistent sampling frequency (i.e., a constant number of samples per period, equally spaced across the period) to minimize variance.
- Minimal missing observations*: Missing observations may invalidate some tests by changing the variance.
- Outliers: Statistical tests designed for normal distributions are very sensitive to outliers.
- Serial correlation*: Each result must be independent of other samples (across time and space).
- Exogenous variables: Parameters like conductivity and pH can be greatly impacted by exogenous parameters, like stream discharge or time of day, respectively. Changes in these exogenous variables may impact the trends you measure in the parameter of interest. To remove these impacts, develop a regression between the two variables, calculate the residuals (the difference between the measured value and the value predicted by the regression), and then run the trend analysis on the residuals.
- Differences in detection limits*: Changing detection limits might fool the statistic into thinking concentrations are decreasing (make sure you have a way to statistically handle changing detection limits).

(1) Based on Aroner, E.R., 2000. WQHYDRO Environmental Data Analysis Technical Appendix, p. 146.
* Issue most applicable to statistical analysis
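The exogenous-variable adjustment described above can be sketched as a simple least-squares fit followed by a residual calculation. This is a minimal illustration under stated assumptions: a straight-line relationship is assumed (real discharge relationships often need a transformation first), the function names are hypothetical, and the discharge and conductivity values are made up for the example.

```python
def ols_fit(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def residuals(x, y):
    """Measured value minus the value predicted from the exogenous variable."""
    slope, intercept = ols_fit(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# Hypothetical paired observations, listed in time order.
discharge = [12.0, 30.0, 18.0, 45.0, 25.0, 40.0]          # exogenous variable
conductivity = [210.0, 150.0, 195.0, 120.0, 180.0, 140.0]  # parameter of interest

flow_adjusted = residuals(discharge, conductivity)
for value in flow_adjusted:
    print(f"{value:+.2f}")
```

The residuals stay in time order, so they can be passed directly to whichever trend test was selected in place of the raw measurements.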

References

Aroner, E.R., 2000. WQHYDRO Environmental Data Analysis Technical Appendix.

Berk, K.N. and P. Carey, 2000. Data Analysis with Microsoft Excel. Duxbury Press, Pacific Grove, CA. pp. 155-162.

EPA, 2000 (EPA/600/R-96/084). Guidance for Data Quality Assessment: Practical Methods for Data Analysis, EPA QA/G-9, QA00 Update.

Helsel, D.R. and R.M. Hirsch, 1991. Statistical Methods in Water Resources. Techniques of Water-Resources Investigations of the USGS, Book 4, Chapter A3.

Hirsch, R.M., J.R. Slack and R.A. Smith, 1982. Techniques of Trend Analysis for Monthly Water Quality Data. Water Resources Research, Vol. 18(1), pp. 107-121, February.

Hirsch, R.M., R.B. Alexander and R.A. Smith, 1991. Selection of Methods for the Detection and Estimation of Trends in Water Quality. Water Resources Research, Vol. 27(5), pp. 803-813, May.

Lettenmaier, D.P., L.L. Conquest and J.P. Hughes, 1982. Routine Streams and Rivers Water Quality Trend Monitoring Review. C.W. Harris Hydraulics Laboratory, Technical Report No. 75, Seattle, WA.