Hank Childs, University of Oregon




Exascale Analysis & Visualization: Get Ready For a Whole New World
Sept. 16, 2015
Hank Childs, University of Oregon

Before I forget: VisIt, visualization and analysis for very big data. (DOE Workshop for Data Management, Analysis, and Visualization)

What I will be talking about
Theme: looking ahead to the exascale era, and thinking about visualization and analysis.
- Talking about research challenges
- (Occasionally) discussing recent research results

Visualization & analysis is a key aspect of the simulation process.
Three main phases to the simulation process, each producing lots and lots of data:
- Problem setup (i.e., meshing)
- Simulation
- Visualization / Analysis
Visualization & analysis is used in three distinct ways:
- Scientists confirm their simulation is running correctly.
- Scientists explore data, leading to new insights.
- Scientists communicate simulation results to an audience.

Post-Hoc Analysis (Post-processing)
Diagram: problem setup (i.e., meshing), then simulation, then visualization, spread over several weeks, with data passing through the filesystem between the simulation and visualization stages.
Note: simulations produce data every cycle, but only store data infrequently (e.g., every Nth cycle).
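The "store data infrequently" point can be made concrete with a toy loop; all names here are illustrative, not a real simulation API:

```python
def run_with_infrequent_io(n_cycles, stride):
    """Toy time-stepping loop: the simulation advances every cycle,
    but state reaches the filesystem only every `stride`-th cycle."""
    filesystem = []
    state = 0
    for cycle in range(n_cycles):
        state += 1                       # stand-in for one simulation cycle
        if cycle % stride == 0:
            filesystem.append((cycle, state))
    return filesystem

# 100 cycles, stored every 10th cycle: only 10 snapshots reach disk,
# so 90% of the computed states are never available for post-processing.
snapshots = run_with_infrequent_io(100, 10)
```

This is exactly why post-hoc analysis sees a temporally sparse view of the simulation.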

Starting point for an exascale talk: I believe some of the things that scared us about exascale will happen well before exascale.

The Slide of Doom:

System Parameter     | 2011        | 2018               | Factor Change
System Peak          | 2 PetaFLOPS | 1 ExaFLOP          | 500
Power                | 6 MW        | 20 MW              | 3
System Memory        | 0.3 PB      | 32-64 PB           | 100-200
Total Concurrency    | 225K        | 1B x 10 - 1B x 100 | 40,000-400,000
Node Performance     | 125 GF      | 1 TF - 10 TF       | 8-80
Node Concurrency     | 12          | 1,000-10,000       | 83-830
Network BW           | 1.5 GB/s    | 100-1000 GB/s      | 66-660
System Size (nodes)  | 18,700      | 100,000-1,000,000  | 50-500
I/O Capacity         | 15 PB       | 300-1000 PB        | 20-67
I/O BW               | 0.2 TB/s    | 20-60 TB/s         | 100-300

credit: Ken Moreland, Sandia

I/O bandwidth: 0.2 TB/s → 20-60 TB/s
Argument #1: I don't believe it.

Machine            | Year | Time to write memory
ASCI Red           | 1997 | 300 sec
ASCI Blue Pacific  | 1998 | 400 sec
ASCI White         | 2001 | 660 sec
ASCI Red Storm     | 2004 | 660 sec
ASCI Purple        | 2005 | 500 sec
Jaguar XT4         | 2007 | 1400 sec
Roadrunner         | 2008 | 1600 sec
Jaguar XT5         | 2008 | 1250 sec

For comparison: ORNL Titan (2012): 1 TB/s; ORNL Summit (2017): 1 TB/s.
c/o Jeremy Meredith & David Pugmire, ORNL
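The "time to write memory" figures follow directly from memory capacity divided by I/O bandwidth; a minimal sketch (function name mine, inputs taken from the Slide of Doom's 2011 column):

```python
def time_to_write_memory(memory_pb, bandwidth_tb_s):
    """Seconds needed to dump all of system memory to disk.
    1 PB = 1024 TB, so convert memory to TB before dividing by TB/s."""
    return memory_pb * 1024 / bandwidth_tb_s

# 2011-era system: 0.3 PB memory at 0.2 TB/s -> about 1536 seconds.
t_2011 = time_to_write_memory(0.3, 0.2)
# Optimistic exascale projection: 64 PB at 60 TB/s still takes ~18 minutes.
t_exa = time_to_write_memory(64, 60)
```

The point of the table stands out here: even with the projected 100-300x bandwidth increase, dumping memory does not get meaningfully faster, because memory grows nearly as fast.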

I/O bandwidth: 0.2 TB/s → 20-60 TB/s
Argument #2: it will be even worse than expected because of technology changes.
Important distinction: WRITE bandwidth vs. READ bandwidth.
- Old model: parallel file system (PFS)
- New model: SSDs on nodes, backed by a parallel file system
What are the implications?
- If bandwidth(SSDs) > bandwidth(PFS), then write bandwidth > read bandwidth.
- If bandwidth(SSDs) >> bandwidth(PFS), then write bandwidth >> read bandwidth.

I/O and vis/analysis
- Parallel vis & analysis on a supercomputer is almost always >50% I/O, and sometimes 98%.
- Two big factors: (1) how much data you have to read, and (2) how fast you can read it.
- The amount of data to visualize is typically O(total memory).
Summary of the last few slides: the slowest part of post-processing (I/O) is getting slower relative to FLOPs and memory on future machines, so performance will suffer greatly. This is bad.
→ Relative I/O (the ratio of total memory to I/O bandwidth) is key.
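The ">50% and sometimes 98%" claim is just arithmetic on read time versus compute time; a sketch with made-up but plausible numbers:

```python
def io_fraction(data_tb, read_bw_tb_s, compute_s):
    """Fraction of a post-processing job spent reading data from disk."""
    io_s = data_tb / read_bw_tb_s
    return io_s / (io_s + compute_s)

# Illustrative numbers: read O(total memory) = 15 TB at 0.2 TB/s (75 s of I/O),
# then spend 60 s actually computing the visualization.
frac = io_fraction(15.0, 0.2, 60.0)   # I/O already dominates at ~56%
```

Shrink the compute time or grow the data, and the fraction climbs rapidly toward the 98% regime.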

Result: the community is moving toward in situ processing.
In situ defined: produce vis & analysis without writing to disk.
Multiple ways to accomplish it: loosely coupled vs. tightly coupled.
Common perceptions of in situ:
Pros:
- No I/O & plenty of compute
- Access to more data
Cons:
- Very memory constrained
- Some operations not possible
- Once the simulation has advanced, you cannot go back and analyze it
- User must know what to look at a priori
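In the tightly coupled flavor, the analysis runs inside the simulation's own loop and address space. A toy sketch (every name here is illustrative, not a real simulation or vis API):

```python
def run_simulation(n_cycles, in_situ_hook):
    """Toy time-stepping loop with a tightly coupled in situ hook.
    The hook sees the live state in memory; nothing is written to disk."""
    state = 0.0
    findings = []
    for cycle in range(n_cycles):
        state += 1.0                              # stand-in for advancing the physics
        findings.append(in_situ_hook(cycle, state))
    return findings

# e.g. the "analysis" just records the state each cycle
observed = run_simulation(5, lambda cycle, state: state)
```

The cons on the slide are visible even in this toy: the hook must fit in the simulation's memory, and once the loop moves on, earlier states are gone unless the hook explicitly kept something.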

Do we have our use cases covered? (thinking in broad strokes)
Three primary use cases:
- Exploration. Examples: scientific discovery, debugging.
- Confirmation. Examples: data analysis, images/movies.
- Communication. Examples: data analysis, images/movies, comparison.
In situ covers confirmation and communication; exploration is the question mark.

Enabling exploration via in situ processing
Diagram: an exascale simulation, with an in situ data reduction routine in each MPI task, writes reduced data sets to disk; those reduced data sets enable exploration via post hoc analysis (post-processing).
It is not clear what the best way is to use in situ processing to enable exploration with post-hoc analysis; it is only clear that we need to do it.
Requirement: we must transform the data in a way that both reduces it and enables meaningful exploration.

Select Current Activity #1: Wavelet Compression
S. Li, K. Gruchalla, K. Potter, J. Clyne, and H. Childs. Evaluating the Efficacy of Wavelet Configurations on Turbulent-Flow Data. In Proceedings of the IEEE Symposium on Large Data Analysis and Visualization, Oct. 2015. To appear.

Select Current Activity #1: Wavelet Compression
- Analysis #1: error is too big beyond 8:1 compression.
- Analysis #2: error is acceptable up to 256:1 compression.
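The compression-versus-error tradeoff can be illustrated with a hand-rolled one-level Haar transform, a toy stand-in for the wavelet configurations the paper actually evaluates (all function names mine):

```python
import math

def haar_forward(x):
    """One level of the orthonormal Haar wavelet transform; len(x) must be even."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x

def compress(x, keep_fraction):
    """Lossy compression: keep only the largest `keep_fraction` of the
    detail coefficients, zero the rest, then reconstruct."""
    approx, detail = haar_forward(x)
    k = max(1, int(len(detail) * keep_fraction))
    threshold = sorted(abs(d) for d in detail)[-k]
    detail = [d if abs(d) >= threshold else 0.0 for d in detail]
    return haar_inverse(approx, detail)
```

Discarding detail coefficients buys storage at the cost of reconstruction error, which is exactly the tradeoff the two analyses on the slide quantify for turbulent-flow data.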

Select Current Activity #2: Particle Advection
Displace a massless particle based on the velocity field:
- S(t) = position of the curve at time t
- S(t0) = p0, where t0 is the initial time and p0 is the initial position
- S'(t) = v(t, S(t)), where v(t, p) is the velocity at time t and position p, and S'(t) is the derivative of the integral curve at time t
Analyzing particle trajectories (i.e., advection) is foundational to almost all flow visualization and analysis techniques.
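The definition above maps directly to a numerical integrator. A minimal sketch using classic RK4 (the integrator named later in the talk), with a scalar position for brevity where real flow visualization uses 2D/3D fields:

```python
def advect_rk4(v, t0, p0, dt, n_steps):
    """Trace the integral curve S(t) with classic fourth-order Runge-Kutta:
    S'(t) = v(t, S(t)), S(t0) = p0. Returns the sampled trajectory."""
    t, p = t0, p0
    path = [(t, p)]
    for _ in range(n_steps):
        k1 = v(t, p)
        k2 = v(t + dt / 2, p + dt / 2 * k1)
        k3 = v(t + dt / 2, p + dt / 2 * k2)
        k4 = v(t + dt, p + dt * k3)
        p = p + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t = t + dt
        path.append((t, p))
    return path

# Sanity check: v(t, p) = p gives S(t) = e^t, so S(1) should be close to e.
trajectory = advect_rk4(lambda t, p: p, 0.0, 1.0, 0.1, 10)
```

Each RK4 step evaluates the velocity field four times; with time-varying data those evaluations require interpolation in both space and time, which is the crux of the post hoc method discussed next.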

Advection is the basis for many flow visualization techniques. (Images courtesy Garth, Kaiserslautern)
Takeaways:
- Used for diverse analyses
- Diverse computational loads

Select Current Activity #2: Particle Advection
Post hoc exploratory particle advection of time-varying data. This means:
- Save time-varying data from the simulation
- After the simulation, explore the data using particle advection techniques
- IMPORTANT: the locations of desired particles are not known a priori
A. Agranovsky, D. Camp, C. Garth, E. W. Bethel, K. I. Joy, and H. Childs. Improved Post Hoc Flow Analysis Via Lagrangian Representations. In Proceedings of the IEEE Symposium on Large Data Analysis and Visualization (LDAV), pages 67-75, Paris, France, Nov. 2014. Best paper award.

Post Hoc Exploratory Particle Advection: Traditional Method
- Save time slices of the vector field (Eulerian representations v at, e.g., T=0, 10, 20, 30)
- Integrate (RK4) using interpolation, both spatially and temporally

Our Algorithm: Two Phases
- Phase 1: in situ extraction of basis trajectories
- Phase 2: post-hoc analysis, where new trajectories are interpolated from the basis
Pipeline: Simulation → In Situ Extraction of Basis Trajectories → Storage (Basis Trajectories) → Post Hoc Interpolation of Basis Trajectories
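A 1D sketch of the two phases (hypothetical names; the in situ phase uses a simple Euler integrator as a stand-in for the simulation's own full-resolution integration, and phase 2 uses linear interpolation between the two nearest basis seeds):

```python
def extract_basis(v, seeds, t0, t1, dt):
    """Phase 1 (in situ): integrate seed particles through the velocity
    field v, storing only the endpoint flow map seed -> position at t1."""
    flow_map = {}
    for p0 in seeds:
        t, p = t0, p0
        while t < t1 - 1e-12:
            p = p + dt * v(t, p)    # full-resolution access to v, every step
            t = t + dt
        flow_map[p0] = p
    return flow_map

def interpolate_trajectory(flow_map, p):
    """Phase 2 (post hoc): endpoint for a NEW particle at p, interpolated
    from the stored basis trajectories; v is no longer needed."""
    seeds = sorted(flow_map)
    for a, b in zip(seeds, seeds[1:]):
        if a <= p <= b:
            w = (p - a) / (b - a)
            return (1 - w) * flow_map[a] + w * flow_map[b]
    raise ValueError("p lies outside the seeded domain")
```

The key property is on display: phase 1 touches every time step of the velocity field (possible in situ, impossible post hoc), while phase 2 answers queries for particles whose seed points were never known a priori, using only the small stored basis.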

Figure: over times T=0 to T=24, the traditional method stores full vector fields F0, F6, F12, F18, F24, while the new technique stores basis flow maps between consecutive stored times: F0->6, F6->12, F12->18, F18->24.

Our New Method Provides Significant Improvement Over The Traditional Method
- Accuracy increases of up to 10X, with the same storage as traditional techniques
- Storage decreases of up to 64X, with the same accuracy as traditional techniques
- Speed: interaction up to 80X faster
Keys to success:
- We can access more data in situ than we could post hoc
- The accepted method (post hoc) was producing poor answers

Summary
Exascale ramifications for visualization & analysis:
- Slow I/O will change our modus operandi, so we will need to run in situ.
- And exploration is important, so we need to think about operators that transform and reduce data in situ, and how they trade off between accuracy and data reduction.
Many uncovered topics: SW for many-core architectures, power consumption, specific forms of in situ, fault tolerance, outputs from multiphysics codes, ensemble analysis, etc.