Peer review on manuscript "Predicting environmental gradients with..." by Peer 410



Similar documents
Writing Thesis Defense Papers

Session 7 Bivariate Data and Analysis

McKinsey Problem Solving Test Top Tips

Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),

Chapter 8: Quantitative Sampling

Predicting Box Office Success: Do Critical Reviews Really Matter? By: Alec Kennedy Introduction: Information economics looks at the importance of

A Basic Introduction to Missing Data

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Lesson 4 Measures of Central Tendency

CALCULATIONS & STATISTICS

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared.

Simple Regression Theory II 2010 Samuel L. Baker

Module 5: Multiple Regression Analysis

360 feedback. Manager. Development Report. Sample Example. name: date:

Simple linear regression

Module 3: Correlation and Covariance

Writing an essay. This seems obvious - but it is surprising how many people don't really do this.

Planning and Writing Essays

the Median-Medi Graphing bivariate data in a scatter plot

Linear functions Increasing Linear Functions. Decreasing Linear Functions

Student Writing Guide. Fall Lab Reports

The Effect of Dropping a Ball from Different Heights on the Number of Times the Ball Bounces

M-Series (BAS) Geolocators Short Manual

UK GDP is the best predictor of UK GDP, literally.

Practical Research. Paul D. Leedy Jeanne Ellis Ormrod. Planning and Design. Tenth Edition

Pre-Algebra Lecture 6

GUIDELINES FOR PROPOSALS: QUALITATIVE RESEARCH Human Development and Family Studies

First Affirmative Speaker Template 1

EUROPEAN COMMISSION. Better Regulation "Toolbox" This Toolbox complements the Better Regulation Guideline presented in in SWD(2015) 111

Planning a Critical Review ELS. Effective Learning Service

The fundamental question in economics is 2. Consumer Preferences

Writing a field report

Finding Job Openings on the Internet. A Special Report

HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

Introduction to k Nearest Neighbour Classification and Condensed Nearest Neighbour Data Reduction

Why Sample? Why not study everyone? Debate about Census vs. sampling

Completing a Peer Review

0.8 Rational Expressions and Equations

Thought for the Day Master Lesson

Test Bias. As we have seen, psychological tests can be well-conceived and well-constructed, but

WRITING EFFECTIVE REPORTS AND ESSAYS

Chapter Seven. Multiple regression An introduction to multiple regression Performing a multiple regression on SPSS

Evaluation of Forest Road Network Planning According to Environmental Criteria

Climate and Weather. This document explains where we obtain weather and climate data and how we incorporate it into metrics:

Linear Programming Notes VII Sensitivity Analysis

An Experience Curve Based Model for the Projection of PV Module Costs and Its Policy Implications

Predicting daily incoming solar energy from weather data

Significant Figures, Propagation of Error, Graphs and Graphing

Writing Journal Articles

Answer: C. The strength of a correlation does not change if units change by a linear transformation such as: Fahrenheit = 32 + (5/9) * Centigrade

Geotechnical Measurements and Explorations Prof. Nihar Ranjan Patra Department of Civil Engineering Indian Institute of Technology, Kanpur

Market Prices of Houses in Atlanta. Eric Slone, Haitian Sun, Po-Hsiang Wang. ECON 3161 Prof. Dhongde

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy

Does pay inequality within a team affect performance? Tomas Dvorak*

Exploratory Data Analysis -- Introduction

PEER REVIEW HISTORY ARTICLE DETAILS VERSION 1 - REVIEW. Tatyana A Shamliyan. I do not have COI. 30-May-2012

The Phase Modulator In NBFM Voice Communication Systems

Determination of g using a spring

The next day, Ana calls the manager of the apartment complex with the news.

I m Randy Swaty, ecologist on The Nature Conservancy s LANDFIRE team. In the next half hour, I ll introduce LANDFIRE to you, talk about how we

Partial Estimates of Reliability: Parallel Form Reliability in the Key Stage 2 Science Tests

Distances, Clustering, and Classification. Heatmaps

Publishing papers in international journals

Constructing a TpB Questionnaire: Conceptual and Methodological Considerations

HM REVENUE & CUSTOMS. Child and Working Tax Credits. Error and fraud statistics

Power Analysis for Correlation & Multiple Regression

Measurement and Metrics Fundamentals. SE 350 Software Process & Product Quality

UNIVERSITY OF WAIKATO. Hamilton New Zealand

The basic principle is that one should not think of the properties of the process by means of the properties of the product

Physics Lab Report Guidelines

5. Correlation. Open HeightWeight.sav. Take a moment to review the data file.

Literature Discussion Strategies

Building a Dichotomous Key: Take home Assignment. - Copy of Aliens Handout - Question Sheet - Dichotomous Key Sheet

WRITING A CRITICAL ARTICLE REVIEW

A Short Tour of the Predictive Modeling Process

Introduction to Regression and Data Analysis

Reflections on Probability vs Nonprobability Sampling

Experiment #1, Analyze Data using Excel, Calculator and Graphs.

An introduction to impact measurement

to Become a Better Reader and Thinker

Poverty among ethnic groups

CHI-SQUARE: TESTING FOR GOODNESS OF FIT

1/9. Locke 1: Critique of Innate Ideas

Economics of Strategy (ECON 4550) Maymester 2015 Applications of Regression Analysis

1.7 Graphs of Functions

Chapter 5. Conditional CAPM. 5.1 Conditional CAPM: Theory Risk According to the CAPM. The CAPM is not a perfect model of expected returns.

Multivariate Analysis of Ecological Data

2. Incidence, prevalence and duration of breastfeeding

Course Objective This course is designed to give you a basic understanding of how to run regressions in SPSS.

Social Return on Investment

Symbolizing your data

Designer: Nathan Kimball. Stage 1 Desired Results

9. Momentum and Collisions in One Dimension*

Data, Measurements, Features

APPENDIX N. Data Validation Using Data Descriptors

CURVE FITTING LEAST SQUARES APPROXIMATION

Characteristics of Binomial Distributions

Transcription:

Peer review on manuscript "Predicting environmental gradients with..." by Peer 410 ADDED INFO ABOUT FEATURED PEER REVIEW This peer review is written by Dr. Richard Field, Associate Professor of Biogeography at Faculty of Social Sciences, University of Nottingham PEQ = 5.0 / 5 Peer reviewed by 2 Peers Introduction Revision Recommendations This study tests the proposal that... species can indicate cation concentration and the proportions of Question: Data: Accept sand, silt and clay in... It finds strong correlations between observed and predicted cation concentration (good QUALITATIVE predictions: correlation coefficients [calculated from the r2 values given] varying from 0.81 to 0.87), though how Inference: Writing: QUANTITATIVELY accurate the predictions were is not very clear (see Critique). The predictions were much poorer for particle size classes, both qualitatively (r=0.37 to 0.69) and quantitatively (RMSE values around half the typical values of the variables). Interestingly, using...... data for...... did not improve the predictions over......... data. When the analysis was done the other way round,...... species composition was strongly related to soil cation concentration (the flip-side of the above), and most species were typically found in quite a narrow range of soil cation concentrations - preferences that appear to align quite well with those in............... Therefore, many of the species can be used as indicators of soil cation levels. Methods: Major Merits The study uses a large dataset, with reasonable coverage across......... The main analyses are appropriate in my opinion, though can be improved in their application and reporting (see Critique). The conclusions are mostly well supported by the results. The predictions are done using two quite different approaches, which yield very similar results, reassuringly. The manuscript builds on a good body of research on this topic in the region and is written with some authority. It is also quite well written, apart from lack of clarity in some places, and

some specifics. The authors make clear the intended practical application of the research, as well as its novelty. I see no major flaws. Critique I could not discover whether the RMSE values for cation concentration in Table 2 are in units of log10(sum of bases) or untransformed sum of bases. (Line 136 says that these values were logtransformed before analysis.) This means the reader cannot tell how big the typical prediction error was for this variable. This is important because it is the main result, and the second conclusion stated in the abstract is "Species composition in a plot can provide good quantitative estimation of nutrient concentration in the soil." It may be that although the correlation between predicted and measured cation concentration was strong, the exact predictions were not very accurate, in which case the quantitative estimation was not as good as claimed. The predicted values may also have systematic errors, such as underprediction at high cation levels and overprediction at low levels. Or a non-linear relationship with the measured values. If an ultimate aim is to produce a map of important soil properties (final sentence of Discussion) then these things matter. The predictions were generated using a leave-one-out approach, at the plot level. But the 305 plots analysed were aggregated within about 38 'locations' (I count 38 triangles on Fig.1), with 5-30 plots per location (line 91). Each plot was at least 1km from all others (line 88), but I suspect many were in quite uniform terrain. Particularly for the K-NN analysis, which used the 4 nearest neighbours (in terms of species composition) to predict values (line 548), this could lead to predictions that are unduly good. That is, sites close together geographically may tend to be very similar in both species composition and soil properties. While that would not invalidate the results, it would suggest that to get the sorts of prediction strengths reported in the manuscript, you need known values from sites within a few km of the focal one - so the usefulness for low-cost, broadscale mapping is diminished. In fairness, the WA analysis is probably much less affected by this concern (but it could be quite strongly affected if the key indicator species have small geographic ranges). It is not clear how large the geographic ranges of the 54..... species are. Overall, for a study explicitly aimed at an eventual mapping outcome, I was surprised at the lack of analysis of spatial structure in the data and errors. All four soil variables were positively skewed. Cation concentration was log-transformed but there is no mention of transformation of any of the other variables. This could affect many of the analyses, particularly the r2s of the regressions of predictions vs measured values (Table 2; the main results). Finally, it seems possible that...... alter the soil properties, rather than only responding to them. Was there any consideration of this in the soil sampling? Perhaps worth discussing. Discussion

Some environmental variables are quite well mapped, even for areas with little field sampling (eg from satellite imagery). Soil variables are much more difficult because they usually require intensive, quite costly field sampling (and laboratory work). At least for ecologically important soil variables, a good proxy is potentially very useful, for example for conservation planning (as the authors state), or other purposes. The manuscript can make useful advances in this respect. The results reported show that cation concentration is important for......, where the climate is mostly 'benign' and not strongly limiting. Whether other soil variables are needed for other taxonomic groups remains to be seen, as the authors note; thus it is not clear whether the usefulness is restricted to......, or is much wider. Some suggestions for improving the research within this manuscript: *Report and interpret the slope and intercept of the fit line for measured vs predicted soil values. *Use the leave-one-out approach at the 'location' level. That is, leave out all the plots from the focal location when generating the predictions for each plot. *Divide...... geographically into a few 'regions' - Fig.1 and Appendix S1 suggest some quite 'natural' clusters of sampling points - and try predicting from one region to others. This may really help in establishing the limits to prediction (especially for mapping purposes), and thus inform data-collection needs for the future. *Examine the distributions of the errors in geographic and environmental space, try nonlinear fits, etc. Finally, a possible strength of this approach is not mentioned. Species' presence reflects conditions over a much longer period than the moment of the field sampling, and over a broader area than the exact soil-sample location. So, a bit like using invertebrate species to indicate water quality, using...... to indicate soil properties may have advantages over direct soil measurements. References [1] Anonymous authors (2013) Predicting environmental gradients with......... (unpublished manuscript) - Peerage of Science Additional comments for authors Will you publish the...... plot data? I hope so because they could be useful for other purposes. The equivalents of Fig. 3 for the particle size variables would be nice to publish in supplementary material.

There are various typos and very minor errors, which I have not listed here (e.g. in line 22 it should be 'Main conclusions'). The last sentence of the abstract (lines 26-27) is incomplete. Lines 41-42 -- this sentence needs rewriting because at the moment it says that biotic heterogeneity is an important determinant of biotic heterogeneity! Lines 81-82 -- the word 'if' should be used as a conditional. Here the correct word is 'whether'. Also, data are plural; this sentence has both singular and plural. So the sentence should read 'Finally, we assess whether species abundance data are needed to obtain useful predictions, or whether the more easily obtainable presence-absence data are adequate.' Lines 93-94 -- if the idea was NOT to sample any environmental gradient then why were transects used? 'Data collection' section (and rest of manuscript): I could find no mention of WHEN the floristic data were collected. Line 224 -- latitude and longitude were tested, apparently. Why is there no mention of these in the methods or Table 1, or anywhere else in the manuscript? Was anything else tested but not reported? Line 229 -- I do not follow "All twenty indicator species of the richer soils". In Fig.2, I count 18 indicators (and 21 at the second level for the richer soils). I have no idea what is being referred to here. Line 230-231 - I do not follow the end of the same sentence! It does not seem to relate to results reported in Fig.2. Is it referring to a possible alternative first-level split? Line 252 -- "a minimum of four neighbouring plots". I do not understand. From lines 172-174 I thought that the final predictions would use a fixed number of neighbouring plots, once the 'best' k had been established, NOT a variable number with a fixed minimum. Also Table 2 legend suggests it is k=4 not k>=4. Line 270 -- better to join this paragraph to the previous one so the first paragraph of the discussion summarises the key findings. Lines 312-313 -- Surely you can investigate this by sub-sampling your data. I suggest you either do this section properly or remove it. From what you say on page 15 the better option seems to be to keep the section and do the extra analysis.

References -- various formatting inconsistencies. Table 1 -- how appropriate is it to report standard deviations for the variables that are highly rightskewed? Table 2 -- as well as saying whether the RMSEs are on logged values or not, the formatting needs fixing. Figure 1 -- the legend says "crosses" but they look like triangles to me. Figure 2 -- why is the figure repeated?