CyberGIS Toolkit: A Software Toolbox Built for Scalable CyberGIS Spatial Analysis and Modeling


1 CyberGIS Toolkit: A Software Toolbox Built for Scalable CyberGIS Spatial Analysis and Modeling
Yan Liu (1,2), Michael Finn (4), Hao Hu (1), Jay Laura (3), David Mattli (4), Anand Padmanabhan (1,2), Serge Rey (3), Eric Shook (5), Kornelijus Survila (1), and Shaowen Wang (1,2)
(1) CyberInfrastructure and Geospatial Information Laboratory (CIGI)
(2) National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
(3) GeoDa Center for Spatial Analysis and Computation, Arizona State University
(4) Center of Excellence for Geospatial Information Science, U.S. Geological Survey
(5) Department of Geography, Kent State University
CyberGIS All-Hands Meeting 2013, Seattle, WA, September 15, 2013

2 Outline
- Purposes
- Software integration approach
  - Software component selection
  - Software engineering
  - Scalability analysis
- Progress update
  - CyberGIS Toolkit 0.5-alpha release
  - Continuous integration framework
  - Deployment on advanced cyberinfrastructure
- Case study: Parallel PySAL (ppysal)
  - Parallel PySAL project
  - Illustration of parallelization strategies
- Future work and concluding discussion

3 Six Major Goals of the NSF CyberGIS Project
1. Engage multidisciplinary communities through a participatory approach to evolving CyberGIS software requirements;
2. Integrate and sustain a core set of composable, interoperable, manageable, and reusable CyberGIS software elements based on community-driven and open source strategies;
3. Empower high-performance and scalable CyberGIS by exploiting spatial characteristics of data and analytical operations for achieving unprecedented capabilities for geospatial scientific discoveries;
4. Enhance an online geospatial problem solving environment to allow for the contribution, sharing and learning of CyberGIS software by numerous users, which will foster the development of crosscutting education, outreach and training programs with significant broad impacts;
5. Deploy and test CyberGIS software by linking with national and international cyberinfrastructure to achieve scalability to significant sizes of geospatial problems, amounts of cyberinfrastructure resources, and number of users; and
6. Evaluate and improve the CyberGIS framework through domain science applications and vibrant partnerships to gain better understanding of the complexity of coupled human-natural systems.
Callouts: Software as deliverables; Building blocks; Computational solutions

4 CyberGIS Software Environment
- CyberGIS Gateway
- CyberGIS Toolkit
- GISolve Middleware

5 CyberGIS Toolkit, a deep approach to CyberGIS software integration research and development, is focused on developing and leveraging innovative computational strategies needed to solve significant geospatial scientific problems by exploiting high-end cyberinfrastructure (CI) resources.
-- NSF CyberGIS Project, 3rd Year Annual Report (2013)

6 Objectives
- Identify and integrate a set of loosely coupled scalable geospatial software components into the CyberGIS Toolkit
- Establish and sustain the CyberGIS Toolkit as a reliable software toolbox through an open and rigorous software building, testing, packaging, and deployment framework
- Capture computational and spatial characteristics of a software element, focusing on computational performance, scalability, and portability in various CI environments
  - XSEDE (Extreme Science and Engineering Discovery Environment)
  - Open Science Grid
  - Extreme-scale supercomputers
- Provide a software environment for computational and data scientists to easily configure and use CyberGIS Toolkit components

7 Software Integration Approach
- Software component selection
  - Open source strategy
  - Community-driven component identification
  - Benefits to related science areas
- Software engineering
  - Complexity in software integration
  - A holistic framework for streamlined component integration
- CI-based scalability evaluation and enhancement
  - Computational intensity analysis: theoretical approach
  - Performance analysis and profiling: experimental approach

8 Open Source Strategy
- Components
  - Each component must be open sourced
  - No restriction on any particular open source license
- Toolkit
  - Improves the accessibility of integrated software capabilities through the establishment of a software toolbox
  - Focuses on the robustness, portability, compatibility, and scalability of each component within advanced CI environments

9 Component Selection Example: prasterblaster
- Need for scalable map reprojection in CyberGIS analytics
  - Spatial analysis and modeling
    - Distance calculation on raster cells requires appropriate projection
  - Visualization
    - Reprojection for faster visualization on Web Mercator base maps
- prasterblaster integration in CyberGIS Toolkit and Gateway
  - Software componentization: librasterblaster, prasterblaster, MapIMG
  - Build, test, and documentation
  - Gateway user interface
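The point about needing an appropriate projection for distance calculations can be illustrated with the spherical Web Mercator formulas. The sketch below is not taken from prasterblaster; the function names are hypothetical, and it only shows the standard forward projection together with the scale factor that inflates distances away from the equator.

```python
import math

# Sphere radius used by Web Mercator (the WGS84 semi-major axis, in meters).
EARTH_RADIUS_M = 6378137.0
# Latitude at which the projected map becomes square; inputs are clamped to it.
MAX_LATITUDE_DEG = 85.051128779806604


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in meters."""
    lat_deg = max(-MAX_LATITUDE_DEG, min(MAX_LATITUDE_DEG, lat_deg))
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_scale_factor(lat_deg):
    """Factor by which Mercator stretches local distances at a given latitude."""
    return 1.0 / math.cos(math.radians(lat_deg))


if __name__ == "__main__":
    print(lonlat_to_web_mercator(-122.33, 47.61))  # roughly Seattle, WA
    print(mercator_scale_factor(47.61))
```

Because the scale factor grows with latitude, cell-to-cell distances measured directly on a Web Mercator raster are overstated away from the equator, which is why analysis and visualization may call for different projections.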

10 prasterblaster Component View
[Diagram labels: prasterblaster; librasterblaster; MapIMG; Cyberinfrastructure Service Providers; Developers; End Users]

11 Integration Challenges
- Diversity and heterogeneity of software components
  - Programming languages
  - Application vs. library
  - Dependent libraries
  - Code availability
  - Programming models
  - Cyberinfrastructure resources
- Multiple levels of integration
  - Desktop level
  - Heterogeneous software environments
  - Cyberinfrastructure
- Software distribution
  - CI deployment
  - Packaging for broader distribution

12 Strategies
- Build and test
  - Establish a streamlined process to build and test software codes
- Computational intensity analysis
  - Computational bottleneck evaluation
  - Scalability analysis
  - Generalize solutions as computational and spatial knowledge
- Packaging and distribution
  - Package software for download and build in user computing environments
  - Provide common distribution packages (Debian, RPM, Windows installer, etc.)
  - Establish CyberGIS Toolkit as a configurable software module that can be loaded/unloaded on supercomputers
- Documentation and training
  - Build a CyberGIS Toolkit web site to host the software suite, user guide, development documentation, education materials, user feedback, and forums
  - Develop online training materials

13 Continuous Integration Framework
- Developer-level Testing Support (e.g., Travis CI)
- NMI-based Continuous Integration
- Continuous Integration Management Service (e.g., Jenkins)
- Scalability Analysis and Enhancement
NMI: National Middleware Initiative (

14 Component Integration Process
- Selection
- Parallelization
- Development of test cases
- Development of build and test plans
- NMI regular build and test
- Scalability analysis on CI
  - Identification of potential computational bottlenecks
  - Scalability to the number of processors
  - Scalability to problem size
- Release in the CyberGIS Toolkit
- Accessibility in CyberGIS
  - Cyberinfrastructure deployment for high-end users
  - Lowering access barriers for community users through Gateway and GISolve
  - Incorporation in advanced application workflow
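Scalability to the number of processors is commonly summarized with speedup and parallel efficiency. The sketch below shows one way such a table might be computed from wall-clock timings. The function name is hypothetical and the timings in the usage example are made-up placeholders, not measurements from the CyberGIS Toolkit.

```python
def strong_scaling_table(timings):
    """Compute speedup and efficiency for a fixed-size problem.

    timings maps a processor count to wall-clock seconds. The smallest
    processor count present is used as the baseline run.
    Returns a list of (processors, seconds, speedup, efficiency) tuples.
    """
    base_procs = min(timings)
    base_time = timings[base_procs]
    rows = []
    for procs in sorted(timings):
        speedup = base_time / timings[procs]
        # Efficiency compares achieved speedup with the ideal (linear) speedup.
        efficiency = speedup / (procs / base_procs)
        rows.append((procs, timings[procs], speedup, efficiency))
    return rows


if __name__ == "__main__":
    # Placeholder timings, for illustration only.
    example = {1: 100.0, 2: 50.0, 4: 40.0}
    for procs, seconds, speedup, efficiency in strong_scaling_table(example):
        print(f"{procs:4d} procs  {seconds:8.1f} s  speedup {speedup:5.2f}  efficiency {efficiency:5.2f}")
```

A drop in efficiency as processors are added is one signal that a computational bottleneck, such as serial I/O or communication, deserves a closer look. Scalability to problem size can be examined the same way by growing the input along with the processor count.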

15 Demo: CyberGIS Toolkit 0.5-alpha release

16 Software Components
- Parallel Agent-Based Modeling
  - PABM
  - HPC models: MPI, parallel I/O
  - Contributor: UIUC team
- Parallel PySAL
  - Parallel Python implementation
  - Contributor: ASU team
- Parallel map reprojection
  - prasterblaster
  - HPC models: MPI, parallel I/O
  - Contributor: High-performance mapping group, CEGIS, USGS
- SpatialText
  - Full-text geocoding of massive social media and text data
  - Contributor: Kalev Leetaru and UIUC team

17 Dependency Graph: CyberGIS Toolkit 0.5-alpha
[Graph nodes: PAPI; MPI; GEOS; Proj4; Perl; Python; Numpy, Scipy; IPM; Lustre; GDAL; PySAL; PABM; prasterblaster; SpatialText; ppysal]
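A dependency graph like this one determines the order in which components must be built. The edges of the slide's graph did not survive in this transcript, so the edges below are illustrative assumptions rather than the toolkit's actual dependencies. The sketch uses the standard-library graphlib module (Python 3.9 or later) to derive a valid build order.

```python
from graphlib import TopologicalSorter

# Maps each package to the packages it is ASSUMED to depend on.
# These edges are guesses for illustration, not read from the slide.
ASSUMED_DEPS = {
    "Numpy": {"Python"},
    "Scipy": {"Numpy"},
    "PySAL": {"Numpy", "Scipy"},
    "ppysal": {"PySAL"},
    "GDAL": {"GEOS", "Proj4"},
    "prasterblaster": {"MPI", "GDAL"},
    "PABM": {"MPI"},
    "SpatialText": {"Perl"},
}


def build_order(deps):
    """Return the packages ordered so every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    for step, package in enumerate(build_order(ASSUMED_DEPS), start=1):
        print(step, package)
```

TopologicalSorter raises graphlib.CycleError if the graph contains a cycle, which makes it a cheap sanity check on a dependency specification before any build is attempted.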

18 Toolkit Deployment on XSEDE
- XSEDE resources
  - Stampede@TACC: 10 petaflops; cluster computing, multi-threaded, GPU computing
  - Lonestar@TACC: 0.3 petaflops; cluster computing, multi-threaded, large memory
  - Trestles@SDSC: 0.1 petaflops; cluster computing, data I/O intensive computing
  - Gordon@SDSC: 0.34 petaflops; cluster computing, data I/O intensive computing, large memory
  - Blacklight@PSC: petaflops; shared memory
  - Keeneland@NICS: cluster computing, GPU computing
- Compiling and installation
  - Compilers: Intel, GNU, PGI
  - MPI: Open MPI, mvapich2, mpich2, PGI, Intel MPI
- Software environment configuration
  - Environment Modules
  - Adaptive software module loading/unloading
- Demo

19 Case Study: Parallel PySAL
Serge Rey, Jay Laura

20 Concluding Discussion
- The first set of selected software components has been integrated into the CyberGIS Toolkit 0.5-alpha
- A prototype integration framework has been established to allow sophisticated multi-level integration tests
- Scalability analysis has been effective in identifying computational bottlenecks and improving the performance of several components
- CyberGIS Toolkit has been deployed on XSEDE for community access
  - Through GISolve Open Service API
- Community evaluation and feedback are critical for building high-quality and scientifically sound software

21 Acknowledgements
- NSF Software Infrastructure for Sustained Innovation (SI2) Program
- This material is based in part upon work supported by NSF under Grant Number OCI
- Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation

22 We need your feedback! Contact: Thanks!