Big Data Challenges in Simulation-based Science


Transcription

1 Big Data Challenges in Simulation-based Science. Manish Parashar*, Rutgers Discovery Informatics Institute (RDI2), Department of Computer Science, Rutgers University. *With Hoang Bui, Tong Jin, Qian Sun, Fan Zhang, the ADIOS Team, and other ex-students and collaborators.

2 Outline: data grand challenges; data challenges of simulation-based science; rethinking the simulations -> insights pipeline; the DataSpaces project; conclusion.

3 Data-driven Discovery in Science. Nearly every field of discovery is transitioning from data poor to data rich. Oceanography: OOI. Astronomy: LSST. Physics: LHC. Biology: sequencing. Sociology: the Web. Neuroscience: EEG, fMRI. Economics: POS terminals. (Credit: Ed Lazowska, Univ. of WA)

4 Scientific Discovery through Simulations. Scientific simulations running on high-end computing systems generate huge amounts of data! If a single core produces 2 MB/minute on average, one of these machines could generate ~170 TB of simulation data per hour, which is ~4 PB per day and ~1.4 EB per year. Successful scientific discovery depends on a comprehensive understanding of this enormous simulation data. How do we enable computational scientists to efficiently manage and explore extreme-scale data, that is, to find the needles in the haystack?
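The data-volume estimate on this slide can be checked with a few lines of arithmetic. The sketch below is only illustrative: the slide gives the per-core rate (2 MB/minute) but not the core count, so the 1.4 million cores used here is an assumption chosen to land near the quoted ~170 TB per hour. With these assumptions the hourly figure comes out near 170 TB, the daily figure near 4 PB, and the yearly figure near 1.5 EB.

```python
# Back-of-the-envelope check of the simulation output volume.
# ASSUMPTION: the core count is not stated on the slide; 1.4 million
# is a hypothetical value that reproduces roughly 170 TB per hour.
MB, TB, PB, EB = 10**6, 10**12, 10**15, 10**18


def simulation_output_bytes(cores, mb_per_core_per_minute=2.0, minutes=60):
    """Bytes produced by `cores` cores over `minutes` minutes."""
    return cores * mb_per_core_per_minute * MB * minutes


cores = 1_400_000  # hypothetical machine size
per_hour = simulation_output_bytes(cores, minutes=60)
per_day = per_hour * 24
per_year = per_day * 365

print(f"per hour: {per_hour / TB:.0f} TB")
print(f"per day:  {per_day / PB:.2f} PB")
print(f"per year: {per_year / EB:.2f} EB")
```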

5 Scientific Discovery through Simulations. Complex workflows integrate coupled models, data management/processing, and analytics: tight/loose coupling, data-driven execution, ensembles; advanced numerical methods (e.g., Adaptive Mesh Refinement); integrated (online) analytics and uncertainty quantification. These workflows involve complex, heterogeneous components; large data volumes and data rates; data redistribution (MxNxP) and data transformations; dynamic data exchange patterns; and strict performance/overhead constraints.

6 Traditional Simulation -> Insight Pipelines Break Down. [Figure: traditional data analysis pipeline, from simulation machines through raw data on storage servers to an analysis/visualization cluster.] The traditional simulation -> insight pipeline: run large-scale simulation workflows on large supercomputers; dump data to parallel disk systems; export data to archives; move data (usually selected subsets) to users' sites; perform data manipulations and analysis on mid-size clusters. For validation: collect experimental/observational data; move it to analysis sites; compare the experimental/observational data with the simulation data.

7 Challenges Faced by Traditional HPC Data Pipelines. Data analysis challenge: can current data mining, manipulation, and visualization algorithms still work effectively on extreme-scale machines? I/O and storage challenge: the performance gap is increasing, as disks are outpaced by computing speed. Data movement challenge: lots of data movement between simulation and analysis machines, and between coupled multi-physics simulation components, leads to longer latencies; improving data locality is critical: do work where the data resides! Energy challenge: future extreme-scale systems are designed to have low-power chips; however, much greater power consumption will be due to memory and data movement. The costs of data movement are increasing and dominating!

8 The Cost of Data Movement. Moving data between node memory and persistent storage is slow (the performance gap), and the energy cost of moving data is a significant concern: $E_{\text{move data}} = \dfrac{\text{bitrate} \times \text{length}^2}{\text{cross-section area of wire}}$ (From K. Yelick, Software and Algorithms for Exascale: Ten Ways to Waste an Exascale Computer)
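To make the scaling in the energy relation concrete, here is a minimal sketch that evaluates it in arbitrary relative units. It assumes the relation reads as bitrate times wire length squared, divided by the wire's cross-section area; the function name and the unit-less inputs are illustrative choices, not part of the cited talk.

```python
def relative_move_energy(bitrate, length, cross_section_area):
    """Relative energy of moving data over a wire (arbitrary units).

    ASSUMPTION: energy ~ bitrate * length**2 / cross_section_area.
    """
    return bitrate * length**2 / cross_section_area


base = relative_move_energy(bitrate=1.0, length=1.0, cross_section_area=1.0)
twice_as_far = relative_move_energy(bitrate=1.0, length=2.0, cross_section_area=1.0)
thinner_wire = relative_move_energy(bitrate=1.0, length=1.0, cross_section_area=0.5)

# Doubling the distance quadruples the energy; halving the wire
# cross-section doubles it.
print(twice_as_far / base, thinner_wire / base)
```

The quadratic dependence on distance is one way to read the slide's advice to do work where the data resides.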

9 Rethinking the Data Management Pipeline: Hybrid Staging + In-Situ & In-Transit Execution. Issues/challenges: programming abstractions/systems; mapping and scheduling; control and data flow; autonomic runtime; link with the external workflow. Systems: Glean, Darshan, FlexPath, ...

10 Design space of possible workflow architectures.
Location of the compute resources: same cores as the simulation (in situ); some (dedicated) cores on the same nodes; some dedicated nodes on the same machine; dedicated nodes on an external resource.
Data access, placement, and persistence: direct access to simulation data structures; shared memory access via hand-off / copy; shared memory access via non-volatile near-node storage (NVRAM); data transfer to dedicated nodes or external resources.
Synchronization and scheduling: execute synchronously with the simulation every n-th simulation time step; execute asynchronously.
[Figure: analysis tasks sharing cores with the simulation, using distinct cores on the same node, or processing data on remote staging nodes reached over the network; staging options 1 to 3 are marked in the staging node's memory and storage hierarchy (DRAM, NVRAM, SSD, hard disk).]

11 DataSpaces: A Scalable Shared Space Abstraction for Hybrid Data Staging [JCC12]. Shared-space programming abstraction over hybrid staging: simple API for coordination, interaction, and messaging; provides a global-view programming abstraction; distributed, associative, in-memory object store; online data indexing and flexible querying; exposed as a persistent service. Autonomic cross-layer runtime management. High-throughput/low-latency asynchronous data transport. [Figure: shared space model, in which the simulation and applications put()/pub() and get()/sub() data objects (d) and meta-data objects (m).] [Figure: aggregate data redistribution throughput (GB/s) for 2 GB to 256 GB of data; DataSpaces scalability on ORNL Titan.]
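The shared-space idea, where producers put() regions of a named variable and consumers get() whatever overlaps the region they need, can be sketched in a few lines. This is a single-process toy under stated assumptions: the class and method names (`ToySharedSpace`, `Box`) are hypothetical and are not the DataSpaces API, and there is no distribution, indexing, or asynchronous transport here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region with inclusive lower and upper corners."""
    lo: tuple
    hi: tuple

    def overlaps(self, other):
        return all(a_lo <= b_hi and b_lo <= a_hi
                   for a_lo, a_hi, b_lo, b_hi
                   in zip(self.lo, self.hi, other.lo, other.hi))


class ToySharedSpace:
    """Hypothetical in-memory stand-in for a shared-space store."""

    def __init__(self):
        self._objects = {}  # (name, version) -> list of (Box, payload)

    def put(self, name, version, box, payload):
        self._objects.setdefault((name, version), []).append((box, payload))

    def get(self, name, version, query):
        """Return every stored (box, payload) that overlaps `query`."""
        stored = self._objects.get((name, version), [])
        return [(box, data) for box, data in stored if box.overlaps(query)]


space = ToySharedSpace()
# A "simulation" publishes two blocks of the field for time step 0.
space.put("temperature", 0, Box((0, 0), (3, 3)), "block A")
space.put("temperature", 0, Box((4, 0), (7, 3)), "block B")
# An "analysis" code asks only for the region it needs.
hits = space.get("temperature", 0, Box((5, 1), (6, 2)))
print(hits)
```

The producer and consumer never name each other; they agree only on the variable name, version, and coordinates, which is what lets coupled codes with different decompositions exchange data.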

12 DataSpaces Abstraction: Indexing + DHT [HPDC10]. A structured overlay is constructed dynamically using the hybrid staging cores. The index is constructed online using space-filling curve (SFC) mappings and application attributes, e.g., application domain, data, field values of interest. A DHT is used to maintain meta-data information, e.g., geometric descriptors for the shared data, FastBit indices. Data objects are load-balanced separately across staging cores.
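A minimal sketch of the SFC-plus-DHT idea: map multi-dimensional coordinates to a one-dimensional index with a space-filling curve, then split the index range into contiguous intervals, one per staging node. The choice of a 2-D Z-order (Morton) curve and the equal-interval partitioning are assumptions made for brevity; the slide does not say which curve or partitioning is used.

```python
def morton2d(x, y, bits=8):
    """Interleave the low `bits` bits of x and y into a Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


def owner(x, y, num_nodes, bits=8):
    """Staging node responsible for cell (x, y).

    ASSUMPTION: the 1-D index space is split into `num_nodes`
    equal, contiguous intervals.
    """
    span = 1 << (2 * bits)
    return morton2d(x, y, bits) * num_nodes // span


# On a 4x4 domain (bits=2) with 4 staging nodes, each node owns one quadrant.
for y in range(4):
    print([owner(x, y, num_nodes=4, bits=2) for x in range(4)])
```

Because nearby cells tend to receive nearby index values, a query for a small region usually touches only a few nodes.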

13 DataSpaces: Enabling Coupled Scientific Workflows at Extreme Scales.
Multiphysics Code Coupling at Extreme Scales [CCGrid10]; PGAS Extensions for Code Coupling [CCPE13].
Data-centric Mappings for In-Situ Workflows [IPDPS12]. [Figure elements: compute nodes running the simulation and in-situ analytics; a pool of computation buckets in the staging area running in-transit analytics; DART on both sides; "bucket-ready" and "data-ready" events; a resource scheduler that assigns task and data descriptors; data interaction and coordination through DataSpaces.]
Dynamic Code Deployment In-Staging [IPDPS11]. [Figure elements: data_kernels.c compiled with gcc -c to data_kernels.o and linked with the application objects into the application executable; aprun on the compute nodes and on the staging nodes; rexec() and return() between them; the runtime execution system (Rexec) loading the kernel; data_kernels.lua as an alternative kernel source.]
Example kernel in data_kernels.c: kernel_min { for i = 1, n; for j = 1, m; for k = 1, p: if (min > A(i, j, k)) min = A(i, j, k) }
The same kernel in data_kernels.lua: for i = 0, ni-1 do for j = 0, nj-1 do for k = 0, nk-1 do val = input:get_val(i,j,k) if min > val then min = val end end end end
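The minimum-finding kernel shown on this slide is small enough to write out in full. The sketch below restates it in Python and adds a toy stand-in for running a kernel where the data lives; `run_in_staging` and the nested-list block are hypothetical illustrations, not the Rexec interface.

```python
def kernel_min(block):
    """Return the smallest value in a 3-D block given as nested lists."""
    current = float("inf")
    for plane in block:
        for row in plane:
            for val in row:
                if current > val:
                    current = val
    return current


def run_in_staging(kernel, staged_block):
    """Hypothetical stand-in for in-staging execution.

    The kernel is applied to the staged block and only the (small)
    result is returned, rather than the block itself.
    """
    return kernel(staged_block)


# A 2 x 2 x 2 block of field values held on a "staging node".
staged = [[[3.0, 2.0], [5.0, 1.0]],
          [[4.0, 7.0], [9.0, 8.0]]]
result = run_in_staging(kernel_min, staged)
print(result)
```

The point of shipping the kernel is that the reply is a single number, while the block it summarizes never leaves the staging area.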

14 Integrating In-situ and In-transit Analytics [SC 12]. S3D: first-principles direct numerical simulation. The simulation resolves features on the order of 10 simulation time steps, but currently only on the order of every 400th time step can be written to disk, so temporal fidelity is compromised when analysis is done as a post-process. [Figure: recent data sets generated by S3D, developed at the Combustion Research Facility, Sandia National Laboratories.]

15 Integrating In-situ and In-transit Analytics: Design. Couple concurrent data analysis (statistics, volume rendering, topology) with the simulation to identify features of interest. *J. C. Bennett et al., Combining In-Situ and In-Transit Processing to Enable Extreme-Scale Scientific Analysis, SC 12, Salt Lake City, Utah, November.

16 In-situ/In-transit Data Analytics. Online data analytics for large-scale simulation workflows: combine in-situ and in-transit placement to reduce the impact on the simulation, reduce network data movement, and reduce total execution time. Adaptive runtime management: optimize the performance of the simulation-analysis workflow through adaptive data and analysis placement and data-centric task mapping. [Figure: overview of the in-situ/in-transit data analysis framework. On the primary resources, the simulation and in-situ processing work on shared data structures and signal "data ready"; data moves asynchronously from in-situ to in-transit; on the secondary resources, in-transit task executors pull task and data descriptors from the workflow manager.] [Figure: system architecture for cross-layer runtime adaptation.]

17 Simulation case study with S3D: timing results for 4896 cores and analysis every simulation time step. [Bar chart, seconds (axis 0 to 16): data movement 1.69; hybrid statistics 1.7 and 0.73; hybrid visualization 5.07; hybrid topology 2.72 and 2.06; simulation 16.85.]

18 Simulation case study with S3D: timing results for 4896 cores and analysis every simulation time step. [Same bar chart with percentages: data movement 1.69 (10.0%); hybrid statistics 1.7 (10.1%) and 0.73 (4.3%); hybrid visualization 5.07, with one bar labelled .04%; hybrid topology 2.72 and 2.06, with one bar labelled 16.1%; simulation 16.85 seconds.]

19 Simulation case study with S3D: timing results for 4896 cores and analysis every 10th simulation time step. [Bar chart, seconds (axis 0 to 160), with series for in-situ, data movement, and in-transit, and bars for in-situ statistics, hybrid statistics, in-situ visualization, hybrid visualization, and hybrid topology; percentage labels 1.0%, 1.0%, .43%, .004%, 1.61%; simulation 168.5.]

20 Simulation case study with S3D: timing results for 4896 cores and analysis every 100th simulation time step. [Bar chart, seconds (axis 0 to 1600), with series for in-situ, data movement, and in-transit, and bars for in-situ statistics, hybrid statistics, in-situ visualization, hybrid visualization, and hybrid topology; percentage labels .1%, .1%, .04%, .004%, .16%; simulation 1685.]
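The percentage labels in the S3D timing charts appear to be analysis cost relative to the simulation time between two analyses. The sketch below recomputes a few of them under that reading, which is an inference from the numbers rather than something the slides state: simulation time is taken as 16.85 seconds per step (168.5 for 10 steps, 1685 for 100), and the costs are the per-analysis times quoted on the charts.

```python
SIM_SECONDS_PER_STEP = 16.85  # simulation bar in the every-time-step chart


def overhead_percent(cost_seconds, analysis_interval,
                     sim_seconds_per_step=SIM_SECONDS_PER_STEP):
    """Analysis cost as a percentage of simulation time between analyses.

    ASSUMPTION: one analysis of `cost_seconds` runs every
    `analysis_interval` simulation steps.
    """
    return 100.0 * cost_seconds / (sim_seconds_per_step * analysis_interval)


for interval in (1, 10, 100):
    row = [round(overhead_percent(cost, interval), 2)
           for cost in (1.69, 1.7, 0.73, 2.72)]
    print(f"analysis every {interval} step(s): {row} %")
```

Under this reading, analyzing every 10th step instead of every step cuts each relative overhead by a factor of ten, which matches the drop from about 10% to about 1% in the charts.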

21 In-Situ Feature Extraction and Tracking using Decentralized Online Clustering (DISC 12). DOC workers are executed in situ on the simulation machines. [Figure: DOC overlay across the simulation compute nodes; within one compute node, processor cores run either the simulation or a DOC worker.] Benefits of runtime feature extraction and tracking: (1) scientists can follow the events (or data) of interest; (2) scientists can monitor running simulations in real time.

22 In-situ visualization and monitoring with staging (SC 12). [Figure: Pixie3D (1024 cores), Pixplot (8 cores), a ParaView server (4 cores), and Pixmon (1 core, login node) coupled through DataSpaces, exchanging record.bp and pixie3d.bp.]

23 Scalable Online Data Indexing and Querying (BDAC 14, CCPE 15). Enable query-driven data analytics to extract insights from the large volume of scientific simulation data. [Figure: conceptual view of programmable online indexing and querying using multiple indexes. Coordinate-based queries (Query1, Query2) and value-based queries (Query3, Query4) are hashed (H1 to H4) to Index-1 through Index-4, which include a Hilbert SFC index, a Z-order SFC index, a bitmap index, and a tree index, spread over nodes N1 to N4.] [Figure: value-based online index and query framework using FastBit.]
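A value-based query such as "which cells have a value between 2 and 3" can be answered from a binned bitmap index without scanning every value. The following is a minimal sketch of that general technique, not the FastBit library or its API: the class name, the use of Python integers as bitmaps, and the bin edges are all illustrative assumptions.

```python
from bisect import bisect_right


class BinnedBitmapIndex:
    """Toy binned bitmap index: one bitmap (a Python int) per value bin."""

    def __init__(self, values, bin_edges):
        self.values = list(values)
        self.edges = list(bin_edges)  # ascending interior bin edges
        self.bitmaps = [0] * (len(self.edges) + 1)
        for pos, v in enumerate(self.values):
            # Bit `pos` is set in the bitmap of the bin that holds v.
            self.bitmaps[bisect_right(self.edges, v)] |= 1 << pos

    def query_range(self, lo, hi):
        """Positions whose value satisfies lo <= value < hi."""
        first = bisect_right(self.edges, lo)
        last = bisect_right(self.edges, hi)
        candidates = 0
        for b in range(first, last + 1):
            candidates |= self.bitmaps[b]
        # Only candidate positions are checked against the raw values.
        return [pos for pos in range(len(self.values))
                if (candidates >> pos) & 1 and lo <= self.values[pos] < hi]


index = BinnedBitmapIndex([0.5, 2.5, 1.2, 3.9, 2.0], bin_edges=[1.0, 2.0, 3.0])
print(index.query_range(2.0, 3.0))
```

OR-ing the bitmaps of the bins touched by the range gives a candidate set; the final comparison against raw values is needed only because a bin at the edge of the range may hold values outside it.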

24 Multi-tiered Data Staging with Autonomic Data Placement Adaptation: Overview (IPDPS 15). A multi-tiered data staging approach that uses both DRAM and SSD to support data-intensive simulation workflows with large data coupling requirements. An efficient application-aware data placement mechanism that leverages user-provided data access pattern information. [Figure: conceptual framework for realizing multi-tiered staging for coupled data-intensive simulation workflows.]
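One simple way to picture application-aware placement across DRAM and SSD is a greedy policy driven by user-supplied access hints. The sketch below is an assumption-laden illustration, not the mechanism from the cited work: the hint format (expected number of reads per object), the greedy hottest-first rule, and the function name are all hypothetical.

```python
def place_objects(objects, dram_capacity):
    """Assign each staged object to "DRAM" or "SSD".

    objects: dict mapping name -> (size, expected_reads), where
    expected_reads is a user-provided access hint (hypothetical format).
    Objects expected to be read most often get DRAM first.
    """
    placement = {}
    free = dram_capacity
    # Hottest first; ties broken by name so the result is deterministic.
    ranked = sorted(objects.items(), key=lambda kv: (-kv[1][1], kv[0]))
    for name, (size, _reads) in ranked:
        if size <= free:
            placement[name] = "DRAM"
            free -= size
        else:
            placement[name] = "SSD"
    return placement


hints = {
    "pressure": (4, 10),     # size 4 units, read 10 times
    "velocity": (6, 3),
    "temperature": (3, 7),
}
placement = place_objects(hints, dram_capacity=8)
print(placement)
```

A real system would also have to adapt placement at run time as access patterns change, which this static sketch does not attempt.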

25 Wide-area Staging (Demo at SC 14). A wide-area shared staging abstraction is achieved by interconnecting multiple staging instances. It leverages SDN for provisioning, along with workflow/Grid technologies. Data placement is optimized based on demand, availability, locality, etc. [Figure: applications issue put()/pub() and get()/sub() to ADIOS/DataSpaces instances; a DataSpaces wide-area control node coordinates them over TCP/IP; data Tx/Rx nodes move data using RDMA and GridFTP.]

26 Combustion: S3D uses a chemical model to simulate the combustion process in order to understand the ignition characteristics of bio-fuels for automotive usage. Publication: SC.
Fusion: the EPSi simulation workflow provides insight into edge plasma physics in magnetic fusion devices by simulating the movement of millions of particles. Publications: CCGrid 2010, Cluster 2014.
Subsurface modeling: IPARS uses multi-physics, multi-model formulations and coupled multi-block grids to support oil reservoir simulation for realistic, high-resolution reservoir studies with a million or more grid cells. Publication: CCGrid 2011.
Computational mechanics: FEM coupled with AMR is often used to solve heat transfer or fluid dynamics problems. AMR performs refinements only on important regions of the mesh. (work in progress)
Material science: the SNS workflow enables integration of data, analysis, and modeling tools for materials science experiments to accelerate scientific discoveries, which requires a query capability for beam-line data. (work in progress)
Material science: QMCPack uses quantum Monte Carlo (QMC) methods for studies of heterogeneous catalysis of transition metal nanoparticles, phase transitions, properties of materials under pressure, and strongly correlated materials. (work in progress)

27 Summary & Conclusions. Complex applications running on high-end systems generate extreme amounts of data that must be managed and analyzed to gain insights. Data costs (performance, latency, energy) are quickly dominating. Traditional data management/analytics pipelines are breaking down. Hybrid data staging, in-situ workflow execution, etc. can address these challenges, allowing users to efficiently intertwine applications, libraries, and middleware for complex analytics. Many challenges remain: programming, mapping and scheduling, control and data flow, and autonomic runtime management. The DataSpaces project explores solutions at various levels.

28 Thank You! Manish Parashar. WWW: parashar.rutgers.edu. WWW: dataspaces.org



More information

Data, Computation, and the Fate of the Universe

Data, Computation, and the Fate of the Universe Data, Computation, and the Fate of the Universe Saul Perlmutter University of California, Berkeley Lawrence Berkeley National Laboratory NERSC Lunchtime Nobel Keynote Lecture June 2014 Atomic Matter 4%

More information

Panasas High Performance Storage Powers the First Petaflop Supercomputer at Los Alamos National Laboratory

Panasas High Performance Storage Powers the First Petaflop Supercomputer at Los Alamos National Laboratory Customer Success Story Los Alamos National Laboratory Panasas High Performance Storage Powers the First Petaflop Supercomputer at Los Alamos National Laboratory June 2010 Highlights First Petaflop Supercomputer

More information

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

More information

Kriterien für ein PetaFlop System

Kriterien für ein PetaFlop System Kriterien für ein PetaFlop System Rainer Keller, HLRS :: :: :: Context: Organizational HLRS is one of the three national supercomputing centers in Germany. The national supercomputing centers are working

More information

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste

for my computation? Stefano Cozzini Which infrastructure Which infrastructure Democrito and SISSA/eLAB - Trieste Which infrastructure Which infrastructure for my computation? Stefano Cozzini Democrito and SISSA/eLAB - Trieste Agenda Introduction:! E-infrastructure and computing infrastructures! What is available

More information

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions

Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions Simplifying Big Data Deployments in Cloud Environments with Mellanox Interconnects and QualiSystems Orchestration Solutions 64% of organizations were investing or planning to invest on Big Data technology

More information

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21)

CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) CYBERINFRASTRUCTURE FRAMEWORK FOR 21 st CENTURY SCIENCE AND ENGINEERING (CIF21) Goal Develop and deploy comprehensive, integrated, sustainable, and secure cyberinfrastructure (CI) to accelerate research

More information

Collaborative Project in Cloud Computing

Collaborative Project in Cloud Computing Collaborative Project in Cloud Computing Project Title: Virtual Cloud Laboratory (VCL) Services for Education Transfer of Results: from North Carolina State University (NCSU) to IBM Technical Description:

More information

Performance Monitoring of Parallel Scientific Applications

Performance Monitoring of Parallel Scientific Applications Performance Monitoring of Parallel Scientific Applications Abstract. David Skinner National Energy Research Scientific Computing Center Lawrence Berkeley National Laboratory This paper introduces an infrastructure

More information

Supercomputing on Windows. Microsoft (Thailand) Limited

Supercomputing on Windows. Microsoft (Thailand) Limited Supercomputing on Windows Microsoft (Thailand) Limited W hat D efines S upercom puting A lso called High Performance Computing (HPC) Technical Computing Cutting edge problems in science, engineering and

More information

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage

Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage White Paper Scaling Objectivity Database Performance with Panasas Scale-Out NAS Storage A Benchmark Report August 211 Background Objectivity/DB uses a powerful distributed processing architecture to manage

More information

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study

DISTRIBUTED SYSTEMS AND CLOUD COMPUTING. A Comparative Study DISTRIBUTED SYSTEMS AND CLOUD COMPUTING A Comparative Study Geographically distributed resources, such as storage devices, data sources, and computing power, are interconnected as a single, unified resource

More information

Next-Generation Networking for Science

Next-Generation Networking for Science Next-Generation Networking for Science ASCAC Presentation March 23, 2011 Program Managers Richard Carlson Thomas Ndousse Presentation

More information

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE

WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE WITH A FUSION POWERED SQL SERVER 2014 IN-MEMORY OLTP DATABASE 1 W W W. F U S I ON I O.COM Table of Contents Table of Contents... 2 Executive Summary... 3 Introduction: In-Memory Meets iomemory... 4 What

More information

BlobSeer: Towards efficient data storage management on large-scale, distributed systems

BlobSeer: Towards efficient data storage management on large-scale, distributed systems : Towards efficient data storage management on large-scale, distributed systems Bogdan Nicolae University of Rennes 1, France KerData Team, INRIA Rennes Bretagne-Atlantique PhD Advisors: Gabriel Antoniu

More information

Jean-Pierre Panziera Teratec 2011

Jean-Pierre Panziera Teratec 2011 Technologies for the future HPC systems Jean-Pierre Panziera Teratec 2011 3 petaflop systems : TERA 100, CURIE & IFERC Tera100 Curie IFERC 1.25 PetaFlops 256 TB ory 30 PB disk storage 140 000+ Xeon cores

More information

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System

Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System Pentaho High-Performance Big Data Reference Configurations using Cisco Unified Computing System By Jake Cornelius Senior Vice President of Products Pentaho June 1, 2012 Pentaho Delivers High-Performance

More information

Data Centric Interactive Visualization of Very Large Data

Data Centric Interactive Visualization of Very Large Data Data Centric Interactive Visualization of Very Large Data Bruce D Amora, Senior Technical Staff Gordon Fossum, Advisory Engineer IBM T.J. Watson Research/Data Centric Systems #OpenPOWERSummit Data Centric

More information

Conquering the Astronomical Data Flood through Machine

Conquering the Astronomical Data Flood through Machine Conquering the Astronomical Data Flood through Machine Learning and Citizen Science Kirk Borne George Mason University School of Physics, Astronomy, & Computational Sciences http://spacs.gmu.edu/ The Problem:

More information

Application of Grid-Enabled Technologies for Solving Optimization Problems in Data Driven Reservoir Systems

Application of Grid-Enabled Technologies for Solving Optimization Problems in Data Driven Reservoir Systems Application of Grid-Enabled Technologies for Solving Optimization Problems in Data Driven Reservoir Systems M. Parashar, H. Klie, U. Catalyurek, T. Kurc, V. Matossian, J. Saltz, M.F. Wheeler ITR Collaborators

More information

Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo

Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo Sensor Network Messaging Service Hive/Hadoop Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp Department of Civil Engineering The University of Tokyo Contents 1 Introduction 2 What & Why Sensor Network

More information

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution

Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Analyzing Big Data with Splunk A Cost Effective Storage Architecture and Solution Jonathan Halstuch, COO, RackTop Systems JHalstuch@racktopsystems.com Big Data Invasion We hear so much on Big Data and

More information

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago

Globus Striped GridFTP Framework and Server. Raj Kettimuthu, ANL and U. Chicago Globus Striped GridFTP Framework and Server Raj Kettimuthu, ANL and U. Chicago Outline Introduction Features Motivation Architecture Globus XIO Experimental Results 3 August 2005 The Ohio State University

More information

WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression

WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression WHITE PAPER Improving Storage Efficiencies with Data Deduplication and Compression Sponsored by: Oracle Steven Scully May 2010 Benjamin Woo IDC OPINION Global Headquarters: 5 Speen Street Framingham, MA

More information

Cloud Computing and the Future of Internet Services. Wei-Ying Ma Principal Researcher, Research Area Manager Microsoft Research Asia

Cloud Computing and the Future of Internet Services. Wei-Ying Ma Principal Researcher, Research Area Manager Microsoft Research Asia Cloud Computing and the Future of Internet Services Wei-Ying Ma Principal Researcher, Research Area Manager Microsoft Research Asia Computing as Utility Grid Computing Web Services in the Cloud What is

More information

JOURNAL OF OBJECT TECHNOLOGY

JOURNAL OF OBJECT TECHNOLOGY JOURNAL OF OBJECT TECHNOLOGY Online at www.jot.fm. Published by ETH Zurich, Chair of Software Engineering JOT, 2008 Vol. 7, No. 8, November-December 2008 What s Your Information Agenda? Mahesh H. Dodani,

More information

The Lattice Project: A Multi-Model Grid Computing System. Center for Bioinformatics and Computational Biology University of Maryland

The Lattice Project: A Multi-Model Grid Computing System. Center for Bioinformatics and Computational Biology University of Maryland The Lattice Project: A Multi-Model Grid Computing System Center for Bioinformatics and Computational Biology University of Maryland Parallel Computing PARALLEL COMPUTING a form of computation in which

More information

The Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT): A Vision for Large-Scale Climate Data

The Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT): A Vision for Large-Scale Climate Data The Ultra-scale Visualization Climate Data Analysis Tools (UV-CDAT): A Vision for Large-Scale Climate Data Lawrence Livermore National Laboratory? Hank Childs (LBNL) and Charles Doutriaux (LLNL) September

More information

Distributed Systems LEEC (2005/06 2º Sem.)

Distributed Systems LEEC (2005/06 2º Sem.) Distributed Systems LEEC (2005/06 2º Sem.) Introduction João Paulo Carvalho Universidade Técnica de Lisboa / Instituto Superior Técnico Outline Definition of a Distributed System Goals Connecting Users

More information

Appendix 1 ExaRD Detailed Technical Descriptions

Appendix 1 ExaRD Detailed Technical Descriptions Appendix 1 ExaRD Detailed Technical Descriptions Contents 1 Application Foundations... 3 1.1 Co- Design... 3 1.2 Applied Mathematics... 5 1.3 Data Analytics and Visualization... 8 2 User Experiences...

More information

How To Write A Data Mining Program On A Large Data Set

How To Write A Data Mining Program On A Large Data Set On the Role of Indexing for Big Data in Scientific Domains Arie Shoshani Lawrence Berkeley National Lab BIGDATA and EXTREME-SCALE COMPUTING April 3-May, 23 Outline q Examples of indexing needs in scientific

More information

2. Research and Development on the Autonomic Operation. Control Infrastructure Technologies in the Cloud Computing Environment

2. Research and Development on the Autonomic Operation. Control Infrastructure Technologies in the Cloud Computing Environment R&D supporting future cloud computing infrastructure technologies Research and Development on Autonomic Operation Control Infrastructure Technologies in the Cloud Computing Environment DEMPO Hiroshi, KAMI

More information

HPCOR 2014. Data Management Policies Co-Chairs: Bill Allcock, Julia White. HPCOR 2014, June 18-19, Oakland, CA 1

HPCOR 2014. Data Management Policies Co-Chairs: Bill Allcock, Julia White. HPCOR 2014, June 18-19, Oakland, CA 1 HPCOR 2014 Data Management Policies Co-Chairs: Bill Allcock, Julia White HPCOR 2014, June 18-19, Oakland, CA 1 Contributors Bill Allcock, ANL Sasha Ames, LLNL Ashley Barker, ORNL Laura Biven, DOE Jeff

More information

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales

Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes. Anthony Kenisky, VP of North America Sales Appro Supercomputer Solutions Best Practices Appro 2012 Deployment Successes Anthony Kenisky, VP of North America Sales About Appro Over 20 Years of Experience 1991 2000 OEM Server Manufacturer 2001-2007

More information

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database

Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database WHITE PAPER Improve Business Productivity and User Experience with a SanDisk Powered SQL Server 2014 In-Memory OLTP Database 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents Executive

More information

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC2013 - Denver

The PHI solution. Fujitsu Industry Ready Intel XEON-PHI based solution. SC2013 - Denver 1 The PHI solution Fujitsu Industry Ready Intel XEON-PHI based solution SC2013 - Denver Industrial Application Challenges Most of existing scientific and technical applications Are written for legacy execution

More information

Bringing Big Data Modelling into the Hands of Domain Experts

Bringing Big Data Modelling into the Hands of Domain Experts Bringing Big Data Modelling into the Hands of Domain Experts David Willingham Senior Application Engineer MathWorks david.willingham@mathworks.com.au 2015 The MathWorks, Inc. 1 Data is the sword of the

More information

Bandwidth Modeling in Large Distributed Systems for Big Data Applications

Bandwidth Modeling in Large Distributed Systems for Big Data Applications Bandwidth Modeling in Large Distributed Systems for Big Data Applications Bahman Javadi School of Computing, Engineering and Mathematics University of Western Sydney, Australia Email: b.javadi@uws.edu.au

More information

Reference Architecture, Requirements, Gaps, Roles

Reference Architecture, Requirements, Gaps, Roles Reference Architecture, Requirements, Gaps, Roles The contents of this document are an excerpt from the brainstorming document M0014. The purpose is to show how a detailed Big Data Reference Architecture

More information

www.xenon.com.au STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING

www.xenon.com.au STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING VISUALISATION GPU COMPUTING www.xenon.com.au STORAGE HIGH SPEED INTERCONNECTS HIGH PERFORMANCE COMPUTING GPU COMPUTING VISUALISATION XENON Accelerating Exploration Mineral, oil and gas exploration is an expensive and challenging

More information

NA-ASC-128R-14-VOL.2-PN

NA-ASC-128R-14-VOL.2-PN L L N L - X X X X - X X X X X Advanced Simulation and Computing FY15 19 Program Notebook for Advanced Technology Development & Mitigation, Computational Systems and Software Environment and Facility Operations

More information

Manjrasoft Market Oriented Cloud Computing Platform

Manjrasoft Market Oriented Cloud Computing Platform Manjrasoft Market Oriented Cloud Computing Platform Innovative Solutions for 3D Rendering Aneka is a market oriented Cloud development and management platform with rapid application development and workload

More information

Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform

Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform 212 IEEE 26th International Parallel and Distributed Processing Symposium Enabling In-situ Execution of Coupled Scientific Workflow on Multi-core Platform Fan Zhang, Ciprian Docan, Manish Parashar Center

More information

Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery

Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery Center for Information Services and High Performance Computing (ZIH) Managing Complexity in Distributed Data Life Cycles Enhancing Scientific Discovery Richard Grunzke*, Jens Krüger, Sandra Gesing, Sonja

More information

IO Informatics The Sentient Suite

IO Informatics The Sentient Suite IO Informatics The Sentient Suite Our software, The Sentient Suite, allows a user to assemble, view, analyze and search very disparate information in a common environment. The disparate data can be numeric

More information