Data Evaluation Why Important??
- Piers Francis
- 7 years ago
Transcription
1 Data Evaluation
- Why important?
- Questions answered depend on data collected
- Data format and storage (electronic / hard copies)
- Where to begin with examining the raw data: exploratory analysis
- Dealing with the real world (missing, below detection limit, non-normal, normal, autocorrelation)
- Natural variability (e.g., season, hydrology / meteorology)
- Statistical approaches for assessment and detecting trends
- Data analysis resources in notebook
2 Data Evaluation: Why Important?
- Get informational value out of the data collected
- Communicate results in a summary format that relates to the questions you want to answer
- Make correct analyses and interpretations of the data
- Make sure everyone is on the same page with respect to the information to be gained from monitoring
- Move future monitoring forward in the direction that best meets objectives
3 Questions Answered Match Data Collected
For example (sampling frequency; explanatory variables such as flow / stage / rainfall / season / land use; question answered):
- 1 grab sample; explanatory variables: none; question answered: screening for potential follow-up
- Synoptic; explanatory variables: yes (essential); question answered: snapshot watershed assessment
- Even interval for multiple years (e.g., biweekly, quarterly); explanatory variables: no; question answered: not much
- Even interval for multiple years (e.g., biweekly, quarterly); explanatory variables: yes; question answered: long-term trends (e.g., adjusted concentrations, biological health)
- Storm water samples; explanatory variables: yes; question answered: loads / watershed assessments (long-term trends if sampling is sustained)
4 Questions Asked (Con't): Linking Water Quality and Land Treatment / Use
- Watershed experimental design is essential
- Land treatment / land use and water quality monitoring
- Explanatory variables to isolate water quality trends due to BMPs
- Match land treatment (LT) and water quality (WQ) data:
  - Hydrologic basis (spatial)
  - Time basis (temporal)
- Multi-year
5 Data Evaluation: Data Format and Storage
- Collect data in a format / layout similar to computer entry (and vice versa), e.g., forms
- Include date, location, time, etc. on ALL records (not just in the file name)
  - Allows for more analysis flexibility
  - Minimizes errors in data identification
- Make unique data fields that can be sorted on (e.g., site, date, comments / data flags, TP)
- Don't combine fields: an entry such as <.01 in a single field is not good
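The point about not combining fields can be illustrated with a minimal Python sketch. It assumes a raw entry such as "<.01" has already been keyed into one cell; the function name `split_censored` and the convention of storing "<" or ">" as the flag are illustrative assumptions, not part of the slides.

```python
def split_censored(raw):
    """Split a raw entry such as '<.01' into a numeric value and a flag.

    Hypothetical helper: the flag is '<', '>', or '' (no qualifier).
    """
    text = raw.strip()
    flag = ""
    if text and text[0] in "<>":
        flag = text[0]
        text = text[1:].strip()
    return float(text), flag


# The value and its qualifier now live in separate, sortable fields.
print(split_censored("<.01"))
print(split_censored("0.25"))
```

Keeping the qualifier in its own column means the numeric column can be sorted, plotted, and summarized without string handling.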
6 Example Spreadsheet Data Entry
Columns: Site, NO2+3, TKN, TP, TSS, TS, FC, FC_flag, FS
Units row: mg/l, mg/l, mg/l, mg/l, mg/l, ---, mpn/100ml
Rows: one dated record per sample at site E (Apr-93, 13-Apr-93, 21-Apr-93, 27-Apr-93, 04-May-93, 11-May-93, 18-May-93, 25-May-93, 02-Jun-93, 08-Jun-93), with flag values such as n, e, and >
7 Data Evaluation: Data Format and Storage (con't)
- Build in data entry QA, e.g.:
  - Allowable minimum / maximum values
  - Character vs. numeric
- Keep hard copies (remember the card readers, or the CP/M operating system with 8-inch floppies)
- Have data entry fields for field observations or narrative
- Back-ups, back-ups, back-ups
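As a rough sketch of built-in data entry QA, the check below rejects non-numeric entries and values outside an allowable range. The `ALLOWED` limits and the function name `check_entry` are made up for illustration; real limits depend on the parameter and the lab method.

```python
# Hypothetical allowable ranges (mg/l); not taken from the slides.
ALLOWED = {"TP": (0.0, 10.0), "TSS": (0.0, 5000.0)}


def check_entry(field, raw):
    """Return (ok, message) for one keyed-in value."""
    try:
        value = float(raw)  # character vs. numeric check
    except ValueError:
        return False, f"{field}: '{raw}' is not numeric"
    low, high = ALLOWED[field]
    if not low <= value <= high:  # minimum / maximum check
        return False, f"{field}: {value} outside allowed range {low}-{high}"
    return True, "ok"


print(check_entry("TP", "0.12"))
print(check_entry("TP", "o.12"))   # letter 'o' typed instead of zero
print(check_entry("TSS", "99999"))
```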
8 Data Evaluation: Exploratory Data Analysis
Check for data entry errors:
- Minimum / maximum / average values, to check for exceptionally high / low values ("outliers")
- Box and whisker plots (box plots), to check for exceptionally high / low values and highly skewed data
- Time plots or time series plots: plot data values vs. time to visually examine for unreasonable data
- Skewness tests (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
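The screening steps above can be sketched in Python using only the standard library. This is an illustrative stand-in for the SAS and Excel tools named on the slide, not a reproduction of them: the 1.5 x IQR fences are the usual box-plot whisker rule, the skewness is the simple moment coefficient, and the TP values are invented.

```python
import statistics


def screen(values):
    """Summary statistics plus box-plot style flags for unusual values."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumes the data are not all identical
    skew = sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean,
        "skewness": skew,
        "flagged": [v for v in values if v < low_fence or v > high_fence],
    }


# Made-up TP concentrations (mg/l); 0.90 might be a keying error for 0.09.
tp = [0.05, 0.06, 0.04, 0.07, 0.05, 0.06, 0.08, 0.05, 0.90]
print(screen(tp))
```

A flagged value is only a prompt to go back to the field sheet or lab report; it is not by itself a reason to delete the record.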
9 Data Evaluation: Exploratory Data Analysis (Con't)
Check for data distribution attributes:
- Normality tests: test for departure from the normal distribution or "bell-shaped curve" (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
- Skewness tests: test for long tails (e.g., PROC UNIVARIATE in SAS or Data Analysis Tools in Excel)
- Time plots or time series plots: visually examine for seasonality and autocorrelation
- Autocorrelation tests (e.g., PROC AUTOREG in SAS)
10 Real World: Dealing with Outliers
Do you throw them out?
- Perhaps, if and only if you can trace the error back to a data entry, lab, or field QA/QC problem
- ELSE KEEP: these may be where the real information is held
11 Real World: Dealing with Below-DL Values
- BEST: Use the actual instrument values (could be negative); this reflects variability and distribution at the lower range. (In hard copy reports, use DL values with a "less than DL" flag.)
- If <20% below DL: Can substitute ½ the value of the DL (e.g., if the DL is 0.01 mg/l, then substitute 0.005 mg/l). BUT, if a value is really 0.01, do not change it (that is the value of a flag variable for DL).
- Else: Use alternative statistical analysis, e.g., frequency analysis
- Else: Generate synthetic data that mimics the distribution at the tail
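The half-DL rule can be sketched as follows. This assumes each value carries a separate flag ("<" meaning reported below the detection limit), which is why a measured 0.01 is left alone; the function name and the choice to raise an error at 20% or more are illustrative assumptions.

```python
def substitute_half_dl(values, flags, dl):
    """Replace flagged below-DL values with DL/2 when fewer than 20% are censored."""
    below = sum(1 for f in flags if f == "<")
    fraction = below / len(values)
    if fraction >= 0.20:
        # Slide guidance: switch to another approach (e.g., frequency analysis).
        raise ValueError(f"{fraction:.0%} of values are below DL; do not substitute")
    # Only flagged records change; a real measurement equal to the DL is kept.
    return [dl / 2 if f == "<" else v for v, f in zip(values, flags)]


tp = [0.01, 0.03, 0.01, 0.05, 0.02, 0.04]   # mg/l, made-up data
flags = ["<", "", "", "", "", ""]           # first value was reported as <0.01
print(substitute_half_dl(tp, flags, dl=0.01))
```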
12 Real World: Dealing with Missing Values
- BEST: Have sufficient data frequency and use the rest of the data values for analysis
- Substitution, e.g., regression analysis: plot values of TP concentration vs. stream flow. If there is a good correlation, calculate estimated values for missing TP concentrations when discharge is known. USE SPARINGLY
- Aggregation: Combine data over time intervals (e.g., weekly averages, annual averages)
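A minimal sketch of the regression substitution, assuming a simple straight-line fit of TP on flow. The cutoff `min_r=0.8` for a "good correlation" is an arbitrary illustrative choice, and the data are invented.

```python
def fit_line(x, y):
    """Ordinary least squares fit; returns slope, intercept, and Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / (sxx * syy) ** 0.5


def fill_missing(flow, tp, min_r=0.8):
    """Estimate missing TP (None) from flow, but only if the correlation is strong."""
    pairs = [(q, c) for q, c in zip(flow, tp) if c is not None]
    slope, intercept, r = fit_line([p[0] for p in pairs], [p[1] for p in pairs])
    if abs(r) < min_r:
        return list(tp), r  # correlation too weak: leave the gaps alone
    filled = [c if c is not None else intercept + slope * q
              for q, c in zip(flow, tp)]
    return filled, r


flow = [1.0, 2.0, 3.0, 4.0, 5.0]      # stream discharge, made-up units
tp = [0.1, 0.2, None, 0.4, 0.5]       # one missing TP concentration
print(fill_missing(flow, tp))
```

Estimated values should stay flagged as estimates in the data set so they can be excluded from later analyses if needed.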
13 Real World: Dealing with Non-Normality
- Data transformation: Log(X). The log-normal distribution (i.e., the log-transformed data have a normal distribution) is very common for water quality pollutant concentration data. An attribute of such data is that there are a few high values in the tail.
- Utilize non-parametric statistical analyses. However, this doesn't cure all problems.
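To show what the log transformation does to a skewed sample, here is a small Python sketch. It assumes strictly positive concentrations and base-10 logs; the TSS numbers are invented.

```python
import math
import statistics


def log_summary(conc):
    """Compare the arithmetic mean with the mean computed on the log scale."""
    logs = [math.log10(c) for c in conc]      # requires values > 0
    mean_log = statistics.fmean(logs)
    return {
        "arith_mean": statistics.fmean(conc),
        "geo_mean": 10 ** mean_log,           # back-transformed mean of the logs
        "median": statistics.median(conc),
    }


tss = [5, 8, 10, 12, 15, 20, 400]             # mg/l, one storm-driven high value
print(log_summary(tss))
```

With a long right tail the arithmetic mean is pulled toward the few high values, while the back-transformed log mean stays closer to the median.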
14 Parametric vs. Nonparametric
Parametric:
- Mean = central tendency
- Symmetrical distribution about the mean (usually normal); lognormal and slightly skewed OK
- Must adjust for: autocorrelation (easy), seasonal differences (easy), variance heterogeneity (doable), hydrology / flow (easy)
- Versatile; excellent for: assessments of variability, step trends, linear trends, ramp trends
Nonparametric:
- Median = central tendency
- Normality not required; skewed and outlier data OK
- Must adjust for: autocorrelation (doable), seasonal differences (easy), variance heterogeneity (difficult), hydrology / flow (2 steps)
- Excellent for: assessments of variability, step trends, monotonic trends
15 Real World: Dealing with Autocorrelation
- Time series analysis, e.g., PROC AUTOREG in SAS (appropriate for weekly, monthly data)
  - Useful in regression relationships (e.g., time trends, correlation between sites such as paired watersheds or upstream/downstream)
  - Note: spatial autocorrelation analysis methods are available
- Aggregate data: average into larger time steps (e.g., quarterly, annually). The problem is a potential loss of degrees of freedom.
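As a rough illustration (not a substitute for a time series procedure such as PROC AUTOREG), the sketch below computes the lag-1 autocorrelation coefficient of an evenly spaced series and averages the series into larger time steps. Both function names are illustrative.

```python
def lag1_autocorr(series):
    """Lag-1 sample autocorrelation of an evenly spaced series."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    return numer / denom


def aggregate(series, step):
    """Average consecutive blocks of `step` values; a trailing partial block is dropped."""
    return [sum(series[i:i + step]) / step
            for i in range(0, len(series) - step + 1, step)]


weekly = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]      # made-up, strongly persistent series
print(lag1_autocorr(weekly))
print(aggregate(weekly, 4))                    # fewer values, so fewer degrees of freedom
```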
16 Real World: Dealing with Seasonality
- Use explanatory variable (covariate) adjustment with measured variables dealing with hydrologic / meteorological changes to adjust for seasonal changes, e.g.:
  - TP concentration adjusted for stream discharge by including discharge as an X variable in the trend analysis
  - Normalize: e.g., adjust the load value to an average storm discharge level to allow comparison across storms
- Model seasonal cycles into the analysis, e.g.:
  - Indicator variables (0 or 1) for each month/season
  - Sinusoidal models
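The two ways of modeling seasonal cycles can be sketched as extra X variables for a regression. The sketch assumes months numbered 1 to 12, a single annual sine/cosine pair, and January as the baseline month for the indicators; all of these are illustrative choices.

```python
import math


def seasonal_terms(month):
    """Sine and cosine terms for a one-cycle-per-year sinusoidal model."""
    angle = 2 * math.pi * month / 12
    return math.sin(angle), math.cos(angle)


def month_indicators(month):
    """Eleven 0/1 indicator variables for February..December (January is the baseline)."""
    return [1 if month == m else 0 for m in range(2, 13)]


for month in (1, 3, 7):
    print(month, seasonal_terms(month), month_indicators(month))
```

Either set of columns would be added alongside time and discharge in the trend regression; the sinusoidal form uses two parameters, the indicator form uses eleven.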
17 Natural Variability: What's in a MEAN
- Central tendency
  - Good summary statistic
  - Doesn't tell the full story
- The fallacy of the mean
  - Doesn't show range or variability
  - Hard to show statistically significant differences between mean values without variance
  - Non-robust to extremes
18 Natural Variability: Variability Is Our Friend
- Use to determine Minimal Detectable Changes (MDC) or differences
- Find the "goods" and the "bads"
- Avoid unrealistic expectations of good or bad conditions
- Recognize that year-to-year variability can be LARGE
19 Natural Variability
Utilize explanatory variables / covariates (to minimize unexplained variability and assist with making correct data interpretations), such as:
- Land use
- Stream flow / discharge / stage height
- Precipitation
- Ground water table depth
- Temperature
- Season
- Upstream conditions
22 Waukegan River, Illinois: IBI (e.g., IBG, "I B Guessing")
[Figure: IBI vs. elapsed months, pre- and post-treatment, treatment and control; series S1 and S2]
23 Statistical Analysis Toolbox
- No "witch hunts" allowed: pre-planned questions only
- Utilize statistical test(s) that address the questions / objectives (e.g., assessments of central tendency and variability, step change, gradual change)
- Utilize multiple statistical approaches and graphical presentations
24 Statistical Distribution Assessment
- Box and whisker plots
- Mean and variance / standard deviation
- Median and percentile analysis
- Frequency distribution analysis (e.g., percent of data in the 25th percentile, 50th percentile, 75th percentile)
- Percent exceedance of a standard
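Two of the summaries listed above, percentiles and percent exceedance of a standard, can be sketched with the Python standard library. The fecal coliform counts and the standard of 200 are invented for illustration, and the "inclusive" quantile method is one of several reasonable conventions.

```python
import statistics


def percent_exceedance(values, standard):
    """Percent of observations strictly above a water quality standard."""
    return 100.0 * sum(1 for v in values if v > standard) / len(values)


def percentile_summary(values):
    """25th, 50th, and 75th percentiles (inclusive interpolation method)."""
    p25, p50, p75 = statistics.quantiles(values, n=4, method="inclusive")
    return {"p25": p25, "p50": p50, "p75": p75}


fc = [100, 250, 50, 400, 180]          # made-up counts per 100 ml
print(percent_exceedance(fc, standard=200))
print(percentile_summary(fc))
```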
25 BMP Effectiveness: An Example Across Sites / Studies (e.g., multiple watersheds)
[Figure: Changes in sediment load with conservation tillage; % load reduction, range and mean (lowest, highest, mean)]
[Figure: Changes in total P load with conservation tillage; % load reduction, range and mean (lowest, highest, mean)]
26 Correlation Between Variables (e.g., TSS and Turbidity, Long Creek)
[Figure: Correlation between TSS and turbidity; TSS vs. turbidity, observed Y and predicted Y]
[Figure: TSS vs. turbidity, Long Creek, Site E; log(TSS) vs. log(turbidity), observed Y and predicted Y]
27 Statistical Approaches: Comparisons Between Locations
Parametric:
- T-test (compare mean values between 2 groups)
- Analysis of variance, ANOVA (compare more than 2 groups)
- Analysis of covariance (addition of an explanatory variable, which can be a continuous variable such as stream flow)
Non-parametric:
- Wilcoxon Rank Sum (~T-test)
- Kruskal-Wallis k-sample (~ANOVA)
28 Statistical Approaches: Step Trend
Comparison between 2 time periods
Parametric tests:
- T-test (non-paired or paired); paired is usually more powerful
- Analysis of variance
- Analysis of covariance
29 Statistical Approaches: Step Trend
Non-parametric tests:
- Step trend (non-paired)
  - Wilcoxon Rank Sum Test
  - Seasonal Wilcoxon Rank Sum Test
  - Kruskal-Wallis k-sample (compares more than 2 groups, ~analysis of variance)
- Step trend (paired differences)
  - Wilcoxon Signed Rank Test
30 Statistical Approaches: Continuous Trend
Parametric tests:
- Linear regression
  - Add explanatory variables (covariates) where appropriate
  - Can add a dummy variable to mimic a ramp
- Analysis of covariance (e.g., adjustment for upstream concentration or control watershed)
- Time series analysis
31 Statistical Approaches: Continuous Trend
Non-parametric tests:
- Correlation: Spearman's rank correlation (Spearman's rho)
- Monotonic trends
  - Kendall's tau (Mann-Kendall, Kendall rank correlation)
  - Seasonal Kendall test
- Flow adjustment is a 2-step process: 1) calculate residuals from a linear regression of concentration vs. discharge; 2) utilize the residuals (adjusted values) in one of the above tests
- Contingency table (e.g., Cochran-Mantel-Haenszel (CMH) statistics)
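The 2-step flow adjustment followed by a Mann-Kendall test can be sketched in plain Python. This is a simplified illustration under stated assumptions: the variance formula has no correction for tied values, the p-value uses the normal approximation (usually considered reasonable for about 10 or more observations), and serial correlation and seasonality are ignored. The data are invented.

```python
import math


def flow_adjusted_residuals(conc, flow):
    """Step 1: residuals from a least squares regression of concentration on discharge."""
    n = len(conc)
    mq, mc = sum(flow) / n, sum(conc) / n
    slope = (sum((q - mq) * (c - mc) for q, c in zip(flow, conc))
             / sum((q - mq) ** 2 for q in flow))
    intercept = mc - slope * mq
    return [c - (intercept + slope * q) for q, c in zip(flow, conc)]


def mann_kendall(values):
    """Step 2: Mann-Kendall S statistic, Z score, and two-sided p-value (no tie correction)."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)          # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)            # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))          # two-sided normal approximation
    return s, z, p


# Made-up series in time order: TP (mg/l) and discharge for 10 sampling dates.
flow = [3.0, 5.0, 2.0, 6.0, 4.0, 3.0, 5.0, 2.0, 6.0, 4.0]
tp = [0.40, 0.52, 0.33, 0.55, 0.41, 0.34, 0.44, 0.26, 0.46, 0.33]
print(mann_kendall(flow_adjusted_residuals(tp, flow)))
```

A negative S on the residuals suggests concentrations are declining over time after the effect of discharge has been removed.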
32 Long Creek, NC 319 NMP
83%, 76%, 78%, and 33% reductions in sediment, TP, TKN, and nitrate-N loads, respectively (upstream/downstream before/after design)
33 Long Creek, NC
(See NWQEP NOTES, July 1999, Figure 7 for SAS program to test for trends in downstream after adjusting for upstream ... of BMPs)
[Figure: log weekly TSS load (lbs) vs. week; downstream (treatment) and upstream (control)]
34 Section 319 NMP Projects: Morro Bay, California
4-H Watershed Model, Youth Education
More informationDiagrams and Graphs of Statistical Data
Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in
More informationBusiness Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.
Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing
More informationRank-Based Non-Parametric Tests
Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs
More informationDESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics
More informationCome scegliere un test statistico
Come scegliere un test statistico Estratto dal Capitolo 37 of Intuitive Biostatistics (ISBN 0-19-508607-4) by Harvey Motulsky. Copyright 1995 by Oxfd University Press Inc. (disponibile in Iinternet) Table
More informationcontaining Kendall correlations; and the OUTH = option will create a data set containing Hoeffding statistics.
Getting Correlations Using PROC CORR Correlation analysis provides a method to measure the strength of a linear relationship between two numeric variables. PROC CORR can be used to compute Pearson product-moment
More informationMeasures of Central Tendency and Variability: Summarizing your Data for Others
Measures of Central Tendency and Variability: Summarizing your Data for Others 1 I. Measures of Central Tendency: -Allow us to summarize an entire data set with a single value (the midpoint). 1. Mode :
More informationEXPLORING SPATIAL PATTERNS IN YOUR DATA
EXPLORING SPATIAL PATTERNS IN YOUR DATA OBJECTIVES Learn how to examine your data using the Geostatistical Analysis tools in ArcMap. Learn how to use descriptive statistics in ArcMap and Geoda to analyze
More informationUnivariate Regression
Univariate Regression Correlation and Regression The regression line summarizes the linear relationship between 2 variables Correlation coefficient, r, measures strength of relationship: the closer r is
More informationSPSS Guide How-to, Tips, Tricks & Statistical Techniques
SPSS Guide How-to, Tips, Tricks & Statistical Techniques Support for the course Research Methodology for IB Also useful for your BSc or MSc thesis March 2014 Dr. Marijke Leliveld Jacob Wiebenga, MSc CONTENT
More informationINTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA)
INTERPRETING THE ONE-WAY ANALYSIS OF VARIANCE (ANOVA) As with other parametric statistics, we begin the one-way ANOVA with a test of the underlying assumptions. Our first assumption is the assumption of
More informationHYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
More informationChapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
More informationNonparametric Statistics
Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics
More informationResearch Methods & Experimental Design
Research Methods & Experimental Design 16.422 Human Supervisory Control April 2004 Research Methods Qualitative vs. quantitative Understanding the relationship between objectives (research question) and
More informationStatistical And Trend Analysis Of Rainfall And River Discharge: Yala River Basin, Kenya
Statistical And Trend Analysis Of Rainfall And River Discharge: Yala River Basin, Kenya Githui F. W.* +, A. Opere* and W. Bauwens + *Department of Meteorology, University of Nairobi, P O Box 30197 Nairobi,
More information3.2 Statistical Analysis Procedures
3.2 Statistical Analysis Procedures There are many different types of statistical analysis that can be performed on water quality data sets for reporting and interpretation purposes. Many inferences can
More information