TauDEM and CyberGIS
Johnathan Rush
CyberGIS Center for Advanced Digital and Spatial Studies
cybergis.illinois.edu
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
CyberInfrastructure and Geospatial Information Laboratory (CIGI)
cigi.illinois.edu
Department of Geography and Geographic Information Science
School of Earth, Society and Environment
Intro to TauDEM
TauDEM (Terrain Analysis Using Digital Elevation Models) is a suite of Digital Elevation Model (DEM) tools for the extraction and analysis of hydrologic information from topography as represented by a DEM.
TauDEM provides the following capabilities:
o Development of hydrologically correct (pit removed) DEMs using the flooding approach
o Calculation of flow paths (directions) and slopes
o Calculation of contributing area using single and multiple flow direction methods
o Multiple methods for the delineation of stream networks, including topographic form-based methods sensitive to spatially variable drainage density
o Objective methods for determining the channel network delineation threshold based on stream drops
o Delineation of watersheds and subwatersheds draining to each stream segment, and association between watershed and segment attributes for setting up hydrologic models
http://hydrology.usu.edu/taudem/
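The flow direction and slope calculation listed above can be illustrated with a minimal single-direction (D8) sketch. It assumes the DEM is a plain list of rows, square cells, and the direction codes 1 = east through 8 = southeast counted counter-clockwise; the function name `d8_direction` is hypothetical and this is not TauDEM's own code.

```python
import math

# Assumed D8 codes: 1=E, 2=NE, 3=N, 4=NW, 5=W, 6=SW, 7=S, 8=SE.
# Offsets are (row, column); the row index grows southward.
D8_OFFSETS = {
    1: (0, 1), 2: (-1, 1), 3: (-1, 0), 4: (-1, -1),
    5: (0, -1), 6: (1, -1), 7: (1, 0), 8: (1, 1),
}

def d8_direction(dem, row, col, cell_size=1.0):
    """Return (code, slope) of steepest descent, or (None, 0.0) for a pit or flat cell."""
    best_code, best_slope = None, 0.0
    for code, (dr, dc) in D8_OFFSETS.items():
        r, c = row + dr, col + dc
        if not (0 <= r < len(dem) and 0 <= c < len(dem[0])):
            continue  # neighbour lies outside the grid
        distance = cell_size * math.hypot(dr, dc)  # diagonals are longer
        slope = (dem[row][col] - dem[r][c]) / distance  # drop over distance
        if slope > best_slope:
            best_code, best_slope = code, slope
    return best_code, best_slope
```

A cell that returns `(None, 0.0)` has no lower neighbour, which is the situation pit removal is meant to eliminate before flow directions are computed.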
Intro to TauDEM
Specialized functions for terrain analysis, including:
o Calculation of the slope/area ratio that is the basis for the topographic wetness index
o Calculation of the distance up to ridges and down to streams in horizontal, vertical, along-slope and direct variants
o Mapping of upslope locations where activities have an effect on a downslope location
o Evaluation of upslope contribution subject to decay or attenuation
o Calculation of accumulation where the uptake is subject to concentration limitations
o Calculation of accumulation where the uptake is subject to transport limitations
o Evaluation of reverse accumulation
o Evaluation of potential avalanche runout areas
http://hydrology.usu.edu/taudem/
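The relationship between the slope/area ratio and the topographic wetness index can be sketched as follows. The sketch assumes slope is expressed as drop over distance, that the area term is specific catchment area, and that the index is taken as ln(area / slope); the function names are hypothetical.

```python
import math

def slope_area_ratio(slope, sca):
    """Slope divided by specific catchment area."""
    if sca <= 0:
        raise ValueError("specific catchment area must be positive")
    return slope / sca

def wetness_index(slope, sca):
    """ln(sca / slope); None on flat cells, where the ratio is undefined."""
    if slope <= 0:
        return None
    return math.log(sca / slope)
```

Under these assumptions, gentle slopes with large upslope area give high index values, which is why the index is used to flag likely wet areas.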
TauDEM Platforms
o Windows command line
o Unix command line
o ArcGIS Toolbox
o Compute Clusters
o CyberGIS Gateway
CyberGIS
CyberGIS is Geographic Information Science and Systems based on advanced cyberinfrastructure.
Cyberinfrastructure includes:
o (high performance) computing systems
o data storage systems
o advanced instruments
o data repositories
o visualization environments
o people
linked by high speed networks
Big Data
Big Compute
Big Collaboration
Big Problems
Research Application Focus Areas:
o Agriculture and Food
o Disaster and Emergency
o Earth and Environment
o Energy and Water Resources
o Health and Wellness
[Figure: CyberGIS architecture diagram. Main components: CyberGIS Gateway, GISolve Middleware, CyberGIS Toolkit. Other labels include: DB, Controller, Data Storage, Computing Environment, Job Panel, Data Selection, Job Wrappers, Analysis Input Panels, Geo-Input Editing, Data Visualization, Data Retrieval, Workflow, Mapping, Geo Data Processing, Sharing, Data Servers, Mapping Servers, Metadata Servers, External Data Sources, Execution Setup, Parallel Computing, Post-processing, Geo-visualization.]
CyberGIS Toolkit
cybergis.cigi.uiuc.edu/cybergiswiki/doku.php/ct
GISolve Middleware
www.cigi.illinois.edu/dokuwiki/doku.php/projects/gisolve/index
Infrastructure
o NSF Blue Waters @ UofI: 13,300 TFlop/s
o NSF XSEDE: SDSC Trestles: 100 TFlop/s
o CyberGIS ROGER: ~60 TFlop/s
CyberGIS Gateway
gateway.cybergis.org
sandbox.cybergis.org
CyberGIS Gateway Applications
TauDEM on Gateway
Workflows
TauDEM Functions
o Hydrologically Conditioned Elevation Grid (Pit Remove)
o D8 Flow Direction
o D8 Slope
o D8 Contributing Area
o D-Infinity Flow Direction
o D-Infinity Slope
o D-Infinity Specific Catchment Area
o Contributing Area Stream Raster
o Peuker Douglas Stream Raster
o Stream Network And Subwatersheds
o Topographic Wetness Index
http://hydrology.usu.edu/taudem/taudem5/documentation.html
TauDEM Functions
o Move Outlets To Streams
o Gage Subwatersheds
o Drop Analysis
o Slope-Area Stream Raster
o D8 Flow Accumulation
o D8 Distance To Streams
o D-Infinity Flow Accumulation
o D-Infinity Distance Down, D-Infinity Distance Up
o Grid Network: Path Length, Strahler Order, Total Length
o Slope Area Ratio
o Slope Average Down
http://hydrology.usu.edu/taudem/taudem5/documentation.html
Workflow Provenance
Visualization Tool
Job Time and ROGER
o TauDEM currently runs on 64 cores on Trestles
o ROGER has around 1040 cores available
Optimization
Data size increases as elevation data resolution increases
National Elevation Dataset
  o 30 meters: 175 GB
  o 10 meters: 500 GB
  o 1 meter: 50 TB to 4-5 PB
OpenTopography Lidar-derived DEM data
Other Lidar sources, including ISGS
XSEDE Extended Collaborative Support Service (ECSS)
Improve the productivity of the XSEDE user community through both successful, meaningful collaborations and well-planned training activities in order to optimize their applications, improve their work and data flows, and increase their effective use of the XSEDE digital infrastructure; engage with members of underrepresented demographic communities and in underrepresented domain areas to broadly expand the XSEDE user base.
TauDEM Collaboration
[Figure: collaboration diagram. Labels: TauDEM 5.0; Lidar-derived DEMs; OpenTopography; Scalability Enhancement (XSEDE ECSS); TauDEM 5.x; CyberGIS; DEMs; USGS NED; OT User DEMs; OT TauDEM Services; CyberGIS-TauDEM App; TauDEM-enabled Research.]
Optimization
Parallel programming model: Message Passing Interface (MPI)
Spatial data decomposition
  o Each process reads a sub-region for processing
  o MPI communication for exchanging runtime hydrological information
  o Each process writes a sub-region defined by output data decomposition
Parallel input/output (IO)
  o In-house GeoTIFF library (no support for files >4GB)
  o MPI IO for DEM read and write
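The spatial data decomposition above can be sketched without MPI as a row-range calculation per process. This is a minimal illustration that assumes horizontal stripes balanced to within one row and a one-row halo shared with each neighbour; the function names are hypothetical and the actual partitioning in TauDEM may differ.

```python
def stripe_bounds(total_rows, num_procs, rank):
    """Half-open row range [start, stop) owned by process `rank`."""
    base, extra = divmod(total_rows, num_procs)
    # The first `extra` ranks each take one additional row.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

def halo_bounds(total_rows, num_procs, rank, halo=1):
    """Row range a process would read, including neighbouring halo rows."""
    start, stop = stripe_bounds(total_rows, num_procs, rank)
    return max(0, start - halo), min(total_rows, stop + halo)
```

In an MPI program, `rank` and `num_procs` would come from the communicator, and the halo rows are the values exchanged between neighbouring processes at run time.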
Multi-File Input
mpiexec -n 5 pitremove
Uses five processes, splitting the domain into horizontal stripes.
Input files (red rectangles) may be arbitrarily positioned and may overlap or not fill the domain completely.
All files in the folder are taken to comprise the domain.
The 4 GB file limit corresponds to roughly 32000 x 32000 rows and columns.
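The link between the 4 GB limit and a grid of roughly 32000 x 32000 cells can be checked with a short calculation. It assumes 4-byte cell values, ignores header overhead, and takes the limit to be 2^32 bytes (the ceiling imposed by 32-bit offsets in classic TIFF); the names are hypothetical.

```python
import math

FILE_LIMIT_BYTES = 2**32  # assumed limit: 32-bit file offsets

def grid_bytes(rows, cols, bytes_per_cell=4):
    """Raw size of the cell data, ignoring headers."""
    return rows * cols * bytes_per_cell

def max_square_side(bytes_per_cell=4, limit=FILE_LIMIT_BYTES):
    """Largest square grid side whose cell data fits under the limit."""
    return math.isqrt(limit // bytes_per_cell)
```

Under these assumptions, `grid_bytes(32000, 32000)` is 4,096,000,000 bytes and `max_square_side()` is 32768, consistent with the figure quoted on the slide.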
Multi-File Output 3 -mf 3 2 Multifile option allows each input stripe being output as multiple files, in this case, 3 columns and two rows. 5 2 27
CyberGIS Improvements
Improved file IO increased scalability.
Fan et al. 2014, http://dx.doi.org/10.1145/2616498.2616510
CyberGIS Improvements
Before the ECSS round of improvements, the largest DEM that could be processed was 6 GB
Test results on 36 GB: [figure not recovered]
Target is the 500 GB 10 m NED with 4000+ cores
Acknowledgements
o XSEDE (NSF Grant No. 1053575)
o This material is based in part upon work supported by NSF under Grant Numbers 0846655 and 1047916
o TauDEM team work is supported by the US Army Research and Development Center contract No. W912HZ-11-P-0338 and the NSF Grant Number 1135482.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.
Thanks!
Johnathan Rush: jfr@illinois.edu
Shaowen Wang: shaowen@illinois.edu
Sign up on the mailing list or visit our webpage to find out about upcoming speakers, training, and other events!
lists.illinois.edu/lists/subscribe/cybergis-center
cybergis.illinois.edu