NeuralEnsembles: a neural network based ensemble forecasting program for habitat and bioclimatic suitability analysis

Jesse R. O'Hanley
Kent Business School, University of Kent, Canterbury, Kent CT2 7PE, United Kingdom
j.ohanley@kent.ac.uk

Draft: March 2008

Keywords: habitat suitability modeling, bioclimatic envelope modeling, ensemble forecasting, artificial neural networks, species presence/absence data

ABSTRACT

NeuralEnsembles is an integrated modeling and assessment tool for predicting areas of species habitat/bioclimatic suitability based on presence/absence data. This free, Windows-based program, which comes with a friendly graphical user interface, generates predictions using ensembles of artificial neural networks. Models can quickly and easily be produced for multiple species and subsequently be extrapolated either to new regions or under different future climate scenarios. An array of options is provided for optimizing the construction and training of ensemble models. Main outputs of the program include text files of suitability predictions, maps, and various statistical measures of model performance and accuracy.

SUMMARY

Over the past few decades, there has been a proliferation of work in the ecological and climate change literature dealing with both methodological aspects of, and applied studies involving, species habitat/bioclimatic suitability models. Such models attempt to predict the potential occurrence of a species across some area of interest based on inferred correlations between observations of species presence (and possibly absence) and a small set of environmental or climatic variables, which are normally perceived to be biologically important in determining the species' distribution pattern (Guisan and Zimmermann 2000). Having parameterized a model for a chosen species, the model can subsequently be used to predict areas of likely presence either within the sampling area or extrapolated to an entirely new, unobserved region (Fielding and Haworth 1995). Alternatively, as is often the case in climate change studies, models can also be used to project areas of potentially suitable climate space either in the past or, more commonly, into the future under different climate change scenarios (Pearson and Dawson 2003).

More recently, a number of automated software programs have been developed to expediently build and output results of species habitat/suitability models. With the exception of a few R-based programs like BIOMOD (Thuiller 2003), GRASP (Lehmann et al. 2002) and PresenceAbsence (Freeman and Moisen 2008), which are specifically designed for modeling species presence/absence data [1], most, including the more widely used ones with friendly graphical user interfaces like Maxent (Phillips et al. 2006), BIOMAPPER (Hirzel et al. 2002), GARP (Stockwell and Peters 1999), DOMAIN (Carpenter et al. 1993), and BIOCLIM (Nix 1986), have been specially designed for modeling presence-only datasets [2].

[1] A type of dataset comprising a set of locations with both confirmed presence and confirmed absence of a species.
[2] Datasets comprising a set of confirmed presence locations for a species but no confirmed absence locations. Given a set of presence locations, suitability is then normally defined in a relative manner to either background environment data or a set of pseudo-absence locations (Pearce and Boyce 2006).

In this software note, I present a new, freely available Windows-based program called NeuralEnsembles version 1.0 (<http://purl.oclc.org/neuralensembles>), which comes replete with an easy-to-use graphical user interface, for predicting areas of habitat/bioclimatic suitability based on species presence/absence data. Primary estimates of suitability are derived in NeuralEnsembles by means of training and running artificial neural networks (ANNs). ANNs are non-linear statistical models, inspired by the structure and function of the nervous system, that have the ability to learn underlying patterns of correlation between observed input (environmental/climatic variables) and target (species presence/absence) data. ANNs have been used with great success in a variety of species habitat/bioclimatic suitability analyses (Araújo et al. 2005; Berry et al. 2007; Pearson et al. 2002; Segurado and Araújo 2004; Thuiller 2003), in addition to a great many other environmental fields including remote sensing (Gopal and Woodcock 1996), climatology (Cavazos 1997), hydrology (Dawson and Wilby 2001), and geology (Lee et al. 2004). ANNs are implemented in NeuralEnsembles using a modification of the open source FANN (fast artificial neural network) library version 2.1.0 written in C (D. Oberhoff, pers. comm.; Nissen 2008).

In comparison to other software applications, the most important distinguishing feature of NeuralEnsembles is its use of an ensemble (or committee) of multiple ANN submodels for making predictions about species habitat/bioclimatic suitability. Although the use of ensemble forecasting is still rather nascent within the ecological modeling literature (Araújo and New 2007), there is a large body of evidence from theory and practical work which clearly demonstrates the superiority of an ensemble model over any single model (Sharkey 1999; Granitto et al. 2005). In practical terms, ensemble forecasting offers improved precision, as measured by the statistical accuracy of the combined predictions.

There are two basic execution modes for NeuralEnsembles, which can be set on the primary user interface (Figure 1). In standard training and projection mode, a user must input one or more species presence/absence data files and a single environmental/climatic training data file. The program uses these data during an iterative training (calibration) phase to produce a set of parameterized ANN submodels for each species. Projections from the individual ANNs are then combined to produce an ensemble forecast for the observed set of presence/absence locations within the study area. Note that the set of locations in a species presence/absence file need only be a subset of the full list of points provided in the environmental training file. The program automatically pairs each observed presence/absence location with its matching array of observed environmental data. As an option, a user can also produce multiple projections of habitat/bioclimatic suitability for different spatial regions or different time periods (e.g., under alternative future climate change scenarios) by simply loading one or more environmental projection files. This is particularly useful when the user wishes to produce a projection for an entire area for which only a subset of locations has the observed presence/absence records on which a model is trained.

The other basic execution mode for NeuralEnsembles is projection only. This is useful when a trained model has already been developed for a particular species and the user later wishes to use the model to make new projections. In this mode, no model training is carried out. Instead of inputting one or more species presence/absence files and an environmental training file, a user is prompted to input, for each species, the main directory where the outputs from the previous training run have been stored, along with any new set of environmental projection files on which the model is to be run.
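To make the coordinate-matching step concrete, below is a minimal illustrative sketch in Python (not part of the NeuralEnsembles source; the function name and sample values are hypothetical) of how presence/absence records might be paired with environmental records that share the same (x, y) coordinates.

```python
# Illustrative only: pair each observed presence/absence location with its
# matching row of environmental data, keyed on the shared (x, y) coordinates.
def pair_observations(pa_rows, env_rows):
    """pa_rows: list of (x, y, pres); env_rows: list of (x, y, val_1, ..., val_n)."""
    env_lookup = {(x, y): vals for (x, y, *vals) in env_rows}
    paired = []
    for x, y, pres in pa_rows:
        # presence/absence points may be only a subset of the environmental points
        if (x, y) in env_lookup:
            paired.append((env_lookup[(x, y)], pres))
    return paired

# Hypothetical example: two environmental points, one observed presence
env = [(0.0, 0.0, 0.21, 0.67), (0.5, 0.0, 0.40, 0.55)]
pa = [(0.0, 0.0, 1)]
print(pair_observations(pa, env))  # [([0.21, 0.67], 1)]
```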

Both types of input files should be formatted as space or tab delimited text files, with or without a header line. The presence/absence file should have a row for each presence and absence location and a total of 3 columns (x y pres): the first two (x and y) defining a pair of geographic coordinates (e.g., longitude and latitude or easting and northing) and the third (pres) being either a 1 or a 0, depending on the observed presence or absence of the species, respectively. Similarly, the environmental data file (as well as the environmental scenario files) should have a row for each environmental coordinate and a total of 2 + n columns (x y val_1 val_2 ... val_n): the first two (x and y) again corresponding to a pair of geographic coordinates and the remaining n columns (val_1 val_2 ... val_n) being a set of n environmental/climatic predictor variables. Although not strictly necessary, it is strongly suggested that the environmental/climatic variables be normalized before being loaded into NeuralEnsembles (e.g., by computing z-scores or by rescaling onto a 0-1 range using minimum and maximum values).
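As a concrete illustration of this layout and of the suggested 0-1 rescaling, the following short Python sketch (illustrative only, not NeuralEnsembles code; the file name is an assumption) reads a whitespace-delimited environmental file and normalizes each predictor column using its minimum and maximum values.

```python
# Illustrative only: read a space/tab delimited environmental file laid out as
# "x y val_1 ... val_n" and rescale each predictor column onto a 0-1 range.
def load_table(path, header=False):
    with open(path) as fh:
        rows = [line.split() for line in fh if line.strip()]
    return [[float(v) for v in row] for row in (rows[1:] if header else rows)]

def minmax_normalize(rows):
    """Rescale columns 2..n+1 (the predictors), leaving the x, y coordinates as-is."""
    for j in range(2, len(rows[0])):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        for r in rows:
            r[j] = (r[j] - lo) / (hi - lo) if hi > lo else 0.0
    return rows

env = minmax_normalize(load_table("env_training.txt"))  # hypothetical file name
n_predictors = len(env[0]) - 2                           # n = columns minus x and y
```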

When in training and projection mode, various parameter settings are available in the options setting window (Figure 1) for controlling and optimizing the architecture and training of the ANN submodels. By default, individual ANN submodels are constructed in NeuralEnsembles as fully connected, feed-forward neural networks containing a single hidden layer with ⅔(n + 1) hidden units, where n is the number of input variables. The use of sigmoid transfer functions in both the hidden and output layers ensures that the outputs from the ANNs range between 0 and 1 and can thus be interpreted as conditional probability estimates of species presence (Bishop 1995). As an option, both the number of hidden layers and the number of hidden units in each layer can be freely adjusted by the user. Additionally, instead of using a fixed architecture, a user can opt for an evolving network structure based on the Cascade 2 training algorithm (Fahlman et al. 1996), which iteratively adds hidden units/layers to the network in order to optimize the hidden structure of the ANN. When using an evolving topology, hidden units are added to the network according to Akaike's Information Criterion (AIC) (Ren and Zhao 2002).

Although it is possible in NeuralEnsembles to train each ANN submodel independently using a simple bagging type procedure (Breiman 1996), a much more elaborate ensemble construction procedure called SECA (Granitto et al. 2005) is available and used by default. SECA, which stands for stepwise ensemble construction algorithm, attempts to optimize the performance of the entire ensemble through the sequential training and aggregation of the individual ANNs. This is accomplished by first generating for each ANN a separate calibration dataset via bootstrapping of the available data, while using the complement set of unsampled data to form a matching validation dataset. In successive fashion, each ANN is then trained until the combined error of the current ANN and any previous-stage aggregate ensemble reaches an approximate minimum [3] in terms of total error on the current calibration and validation datasets. At each new stage, only the weights of the ANN currently being added are updated in the usual manner, while the weights of any previous-stage ANNs are kept constant. Once training is complete, the newly trained ANN is combined with the previous-stage aggregate model.

Ensemble models are by default constructed as a weighted average of the individual ANN submodels. While a simple unweighted average can also be used, weighting has the added benefit of placing greater weight on statistically better performing submodels, which in turn serves to increase the prediction power of the full ensemble. Following Granitto et al.'s (2005) weighted version of SECA (W-SECA), individual weights are computed by normalizing the inverse squared classification error of each ANN on the full dataset of available observations.

[3] Training is stopped when the combined error fails to decrease below a pre-set tolerance for a given number of training epochs or until a maximum number of training epochs has been reached.
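The weighting rule can be stated compactly. The sketch below (illustrative Python, not the program's implementation; the example error values are made up) normalizes the inverse squared classification errors of the submodels to obtain W-SECA style weights and then forms the ensemble prediction for a single location as the weighted average of the submodel outputs.

```python
# Illustrative only: W-SECA style weighting as described in the text.
def wseca_weights(errors):
    """errors: classification error of each ANN on the full set of observations."""
    inv_sq = [1.0 / (e * e) for e in errors]   # assumes every error is strictly positive
    total = sum(inv_sq)
    return [w / total for w in inv_sq]

def ensemble_predict(submodel_outputs, weights):
    """Weighted average of per-submodel suitability values for one location."""
    return sum(w * y for w, y in zip(weights, submodel_outputs))

# Hypothetical example: three submodels with classification errors 0.10, 0.20 and 0.15
weights = wseca_weights([0.10, 0.20, 0.15])
print(ensemble_predict([0.82, 0.65, 0.74], weights))
```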

Classification error is determined in NeuralEnsembles by means of the cross-entropy (CE) error function (Bishop 1995), making this another important distinguishing feature of the program. The most common error measure used in ANN training is the standard mean squared error (MSE). MSE derives from the maximum-likelihood principle when the target data follow a Gaussian distribution. While this is generally appropriate for regression type problems, it is not the best error measure to use for binary targets like species presence/absence data. In contrast, CE, which derives from the maximum-likelihood function for Bernoulli random variables, is a more natural error measure to use when dealing with classification-type problems (e.g., presence vs. absence). The main benefit of using a CE error measure is an improved level of prediction accuracy as measured by Kappa and AUC (see below).

Other options include setting: (1) the number of submodels to be used in the ensemble; (2) several stopping conditions for controlling the duration of network training; (3) the type of training algorithm (standard back propagation, batch back propagation, Rprop and Quickprop) and associated learning parameter; (4) the initial random weights in the network; (5) the number of training runs for each ANN in order to minimize the error of a given submodel; and (6) random shuffling of training patterns on or off.

Key outputs of the model include: (1) text files of all model results for further analysis or manipulation inside a GIS; (2) maps of observed presence and predicted suitability within the study area; (3) various statistics for evaluating a model's accuracy based on discrimination ability and calibration; and (4) saved ANN parameter settings files for making any subsequent projections in projection only mode. Note that if additional environmental projection files have also been loaded, then maps of predicted suitability will also be generated for each possible scenario.
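As a concrete point of comparison for the two error measures contrasted above, the following illustrative Python functions (not the FANN/NeuralEnsembles implementation; the small epsilon guard is an assumption) compute the cross-entropy and mean squared errors for a set of binary presence/absence targets and ANN outputs.

```python
import math

# Illustrative only: cross-entropy (CE) and mean squared error (MSE) for
# binary presence/absence targets t and ANN suitability outputs y in (0, 1).
def cross_entropy(targets, outputs, eps=1e-12):
    """CE = -sum[ t*ln(y) + (1 - t)*ln(1 - y) ]; eps guards against log(0)."""
    return -sum(t * math.log(max(y, eps)) + (1 - t) * math.log(max(1 - y, eps))
                for t, y in zip(targets, outputs))

def mean_squared_error(targets, outputs):
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)

t = [1, 0, 1, 1]           # observed presence/absence
y = [0.9, 0.2, 0.7, 0.6]   # ANN outputs
print(cross_entropy(t, y), mean_squared_error(t, y))
```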

Projection outputs include, for each location, the mean suitability value produced by the ensemble model as well as the standard error and 95% confidence interval half-width [4] for the estimated mean. Mean suitability values, ranging between 0 and 1, are calculated as an unweighted or weighted average (see above) of the individual projections produced by each ANN submodel. Additionally, a binary prediction defining a location as either suitable (1) or unsuitable (0) is produced by applying a user-specified threshold to the mean suitability value. Options for the threshold include the maximum Kappa cutoff value, the sensitivity-specificity cross-over point defined on a receiver operating characteristic (ROC) plot, or the 99%, 95% or 90% sensitivity values (see below for details).

Plotting of maps (an optional setting) is carried out by running an automated program script written in R. R is a free and widely used programming language and software environment for statistical computing and graphics (R Development Core Team 2008). Consequently, map production requires that R version 2.6.0 or higher already be installed on the user's computer (see the R website for instructions on downloading and installing). The two basic types of maps (Figure 2) produced for the study area and any environmental projection scenarios are (1) a suitability surface map showing mean suitability values for each location and (2) a suitability distribution map showing areas of potentially suitable or unsuitable habitat/bioclimatic space. For the study area, a map of observed presence locations is also plotted. As an option, the user can have the observed presence locations overlaid on the suitability distribution map in order to facilitate a simple visual inspection of model performance.

Key statistical outputs include a calibration plot showing the numerical accuracy of the predicted values (Vaughan and Ormerod 2005) and the two most common measures of discrimination accuracy used in species distribution modeling, described below.

[4] Based on a Student's t statistic for ensembles of 100 or fewer submodels and standardized z-values for ensembles of more than 100 submodels.
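A per-location summary of the kind described above (mean suitability, standard error, confidence half-width and the thresholded binary call) could be computed as in the following sketch (illustrative Python, not program output code; the threshold value and the fixed critical value of 1.96 are assumptions corresponding to the large-ensemble z case, whereas the program uses a Student's t statistic for smaller ensembles).

```python
import statistics

# Illustrative only: per-location mean suitability, standard error, 95%
# confidence half-width and binary suitable/unsuitable call from a threshold.
def summarize_location(submodel_values, threshold=0.5, crit=1.96):
    mean = statistics.fmean(submodel_values)
    se = statistics.stdev(submodel_values) / len(submodel_values) ** 0.5
    half_width = crit * se                    # 95% confidence interval half-width
    suitable = 1 if mean >= threshold else 0  # binary prediction
    return mean, se, half_width, suitable

# Hypothetical example: four submodel projections for one location
print(summarize_location([0.82, 0.65, 0.74, 0.71], threshold=0.6))
```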

These two discrimination accuracy measures [5] are Cohen's Kappa statistic (K) and the Area Under the receiver operating characteristic Curve (AUC). Kappa provides a measure of similarity between spatial patterns, adjusted for chance agreement (Cohen 1960). Values of Kappa range from 0, indicating no agreement between observed and projected distributions, to 1 for perfect agreement. Because Kappa must be computed given a threshold for distinguishing presence from absence points, the maximum value of Kappa is calculated by iteratively adjusting the threshold from 0 to 1 in increments of 1 × 10⁻⁴. AUC is determined from a plot of the Receiver Operating Characteristic (ROC) curve, which measures the model's sensitivity (the proportion of correctly predicted presences to the total number of observed presences) versus its false positive fraction (the proportion of falsely predicted presences to the total number of observed absences) for all possible classification thresholds. AUC provides an unbiased measure of a model's predictive accuracy that is independent of both species prevalence and classification threshold (Fielding and Bell 1997). Values for AUC range from 0.5 for models with no discrimination ability to 1 for models with perfect discrimination.

Besides reporting confidence intervals and one-tailed p-values of significance for both the Kappa and AUC statistics, the statistics summary also provides the CE error of the full ensemble and the average CE error of the individual ANN submodels. Under fairly general conditions, it can be shown that the CE error of the full ensemble should normally be less than or equal to the average of the individual ANNs (Bishop 1995). Hence, any positive difference between the two gives a clear measure of the benefit of using an ensemble forecast compared to any single model.

[5] Testing is performed by only combining predictions from networks that have not been trained on a particular input/target pattern.
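To illustrate the maximum-Kappa threshold search, the sketch below (illustrative Python only; a far coarser step than the 1 × 10⁻⁴ increment is used to keep the example short, and the sample data are made up) computes Cohen's Kappa from the 2 × 2 confusion table at each candidate threshold and retains the threshold giving the largest value.

```python
# Illustrative only: Cohen's Kappa and a simple threshold sweep to find its maximum.
def kappa(obs, pred):
    n = len(obs)
    tp = sum(o == 1 and p == 1 for o, p in zip(obs, pred))
    tn = sum(o == 0 and p == 0 for o, p in zip(obs, pred))
    fp = sum(o == 0 and p == 1 for o, p in zip(obs, pred))
    fn = sum(o == 1 and p == 0 for o, p in zip(obs, pred))
    po = (tp + tn) / n                                              # observed agreement
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2   # chance agreement
    return (po - pe) / (1 - pe) if pe < 1 else 0.0

def max_kappa(obs, suitability, step=0.01):
    best = (0.0, 0.0)   # (kappa, threshold)
    t = 0.0
    while t <= 1.0:
        k = kappa(obs, [1 if s >= t else 0 for s in suitability])
        best = max(best, (k, t))
        t += step
    return best

# Hypothetical example
print(max_kappa([1, 1, 0, 0, 1], [0.9, 0.7, 0.4, 0.2, 0.6]))
```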

To cite NeuralEnsembles or acknowledge its use, please use the following, substituting the version of the application you are using for Version 1.0 along with the appropriate access date:

O'Hanley, J.R. 2008. NeuralEnsembles: a neural network based ensemble forecasting program for habitat and bioclimatic suitability analysis (Version 1.0). [Online] Available at: <http://purl.oclc.org/neuralensembles> (Access Date).

ACKNOWLEDGEMENTS

Partial funding for the NeuralEnsembles program was provided by the MONARCH and BRANCH projects. I would especially like to thank Daniel Oberhoff from the Fraunhofer Institute for Applied Information Technology for sharing his modified version of the FANN C library, which implements the cross-entropy error function. This has significantly added to the quality of the end product.

REFERENCES

Araújo, M.B. and New, M. 2007. Ensemble forecasting of species distributions. - Trends in Ecology and Evolution 22: 42-47.
Araújo, M.B., Pearson, R.G., Thuiller, W. and Erhard, M. 2005. Validation of species-climate impact models under climate change. - Global Change Biology 11: 1504-1513.
Berry, P.M., O'Hanley, J.R., Thomson, C.L., Harrison, P.A., Masters, G.J. and Dawson, T.P. (eds.) 2007. Modelling Natural Resource Responses to Climate Change (MONARCH): MONARCH 3 Contract report. - UKCIP Technical Report, Oxford.
Bishop, C.M. 1995. Neural networks for pattern recognition. - Oxford University Press, Oxford.
Breiman, L. 1996. Bagging predictors. - Machine Learning 24: 123-140.
Cavazos, T. 1997. Downscaling large-scale circulation to local winter rainfall in north-eastern Mexico. - International Journal of Climatology 17: 1069-1082.
Cohen, J. 1960. A coefficient of agreement for nominal scales. - Educational and Psychological Measurement 20: 37-46.
Dawson, C.W. and Wilby, R.L. 2001. Hydrological modelling using artificial neural networks. - Progress in Physical Geography 25: 80-108.
Fahlman, S.E., Baker, L.D. and Boyan, J.A. 1996. The cascade 2 learning architecture. - Technical Report CMU-CS-TR96-184, Carnegie Mellon University.
Fielding, A.H. and Bell, J.F. 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. - Environmental Conservation 24: 38-49.
Fielding, A.H. and Haworth, P.F. 1995. Testing the generality of bird-habitat models. - Conservation Biology 9: 1466-1481.
Freeman, E.A. and Moisen, G. 2008. PresenceAbsence: an R package for presence-absence analysis. - Journal of Statistical Software 23: 1-31.
Gopal, S. and Woodcock, C. 1996. Remote sensing of forest change using artificial neural networks. - IEEE Transactions on Geoscience and Remote Sensing 34: 398-404.
Granitto, P.M., Verdes, P.F. and Ceccatto, H.A. 2005. Neural network ensembles: evaluation of aggregation algorithms. - Artificial Intelligence 163: 139-162.
Guisan, A. and Zimmermann, N.E. 2000. Predictive habitat distribution models in ecology. - Ecological Modelling 135: 147-186.
Lee, S., Ryu, J.H., Won, J.S. and Park, H.J. 2004. Determination and application of the weights for landslide susceptibility mapping using an artificial neural network. - Engineering Geology 71: 289-302.
Nissen, S. 2008. Fast Artificial Neural Network Library (FANN). <http://leenissen.dk/fann> (March 2008).
Nix, H.A. 1986. A biogeographic analysis of Australian Elapid Snakes. - In: Longmore, R. (ed.), Atlas of Elapid Snakes of Australia. Australian Flora and Fauna Series Number 7, Australian Government Publishing Service, Canberra, pp. 4-15.
Pearce, J.L. and Boyce, M.S. 2006. Modelling distribution and abundance with presence-only data. - Journal of Applied Ecology 43: 405-412.
Pearson, R.G. and Dawson, T.E. 2003. Predicting the impacts of climate change on the distribution of species: are bioclimatic envelope models useful? - Global Ecology and Biogeography 12: 361-371.
Pearson, R.G., Dawson, T.E., Berry, P.M. and Harrison, P.A. 2002. SPECIES: a spatial evaluation of climate impact on the envelope of species. - Ecological Modelling 154: 289-300.
Phillips, S.J., Anderson, R.P. and Schapire, R.E. 2006. Maximum entropy modeling of species geographic distributions. - Ecological Modelling 190: 231-259.
R Development Core Team. 2008. R: A language and environment for statistical computing. - R Foundation for Statistical Computing, Vienna. <http://www.r-project.org> (March 2008).
Ren, L. and Zhao, Z. 2002. An optimal neural network and concrete strength modeling. - Advances in Engineering Software 33: 117-130.
Segurado, P. and Araújo, M.B. 2004. An evaluation of methods for modelling species distributions. - Journal of Biogeography 31: 1555-1568.
Sharkey, A.J.C. 1999. Combining artificial neural nets. - Springer, London.
Thuiller, W. 2003. BIOMOD - optimizing predictions of species distributions and projecting potential future shifts under global change. - Global Change Biology 9: 1353-1362.
Vaughan, I.P. and Ormerod, S.J. 2005. The continuing challenges of testing species distribution models. - Journal of Applied Ecology 42: 720-730.

Figure 1. The NeuralEnsembles main graphical user interface and options settings windows.

Figure 2. Sample suitability surface and suitability distribution maps for Boloria euphrosyne (Pearl-bordered Fritillary).