HPC in the Age of (Big) Data
The Fusion of Supercomputing and Big, Fast Data Analytics
Ugo Varetto (uvaretto@cray.com)
Cray EMEA Research Lab
Agenda
- Outlook
- Requirements
- Implementation
Outlook
Cray's Vision: The Fusion of Supercomputing and Big, Fast Data
- Compute (Math Models): modeling and simulation augmented with data to provide the highest-fidelity virtual-reality results
- Store (Data-Intensive Processing): high-throughput data capture, processing, retention, and management
- Analyze (Data Models): integration of datasets and math models for search, analysis, predictive modeling, and knowledge discovery
An Enormous Opportunity
Big Data draws on many sources: scientific data, historical data, sensor data, research & demographics, social media, public data, 3rd-party data, enterprise data, operational data, and transactions.
Data Processing: Rising Demand
HPC: New Large-Scale Data-Intensive Projects
- Whole-system design for interactive supercomputing (PCP)
- 100 petaflops for data processing
- Establish platforms to preserve and share data
COMPUTE | STORE | ANALYZE
HPC: Data-Centric Systems Development
- Accommodate new data-centric system architectures
- Handle data motion that is highly variable in:
  - size
  - access patterns
  - temporal behavior
- System-level co-design required across I/O, compute, interconnect fabric, storage, reliability, and software
Next-Generation Data-Centric HPC: Implications
- Industry: "systems of insight", a tightly integrated design/modeling/simulation loop with interactive analysis, visualization, and simulation
- Basic science: discovery through modeling, simulation, and analysis
The Traditional Scientific Discovery Workflow
Observe -> Postulate -> Model -> Predict -> Validate
The Emerging Scientific Discovery Workflow
Acquire, filter, preprocess -> Store -> Insights
The Fourth Paradigm (?)
1. Theory
2. Experiments
3. Computer Simulations
4. Data-Intensive and Data-Driven Discovery
The Intersection of HPC and BigData
- Traditional HPC tasks with BigData tools, e.g. (sparse) linear algebra with MLlib + Breeze and jblas
- Traditional BigData tasks on HPC systems, e.g. SPIDAL (Scalable Parallel Interoperable Data Analytics Library), NSF-funded: HPC numerical libraries for BigData
- BigData on Cray HPC systems (undisclosed project): standard analytics tools plus an optimized graph-analytics engine
- myHadoop: Hadoop on standard HPC systems
Multi-Scale, Equation-Free Modeling
Standard modeling:
- The system is described at the fine (microscopic) level
- The questions asked are at the macroscopic level
- Macroscopic behavior is derived from microscopic evolution equations
Equation-free modeling:
1. Run short bursts of simulation at the fine scale
2. Feed the results from (1) into a coarse-scale simulation
3. Use the results from (2) to initialize the next burst of fine-scale computation
4. Map between fine and coarse scales (with dimensionality reduction if needed)
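The four-step equation-free loop above can be sketched in a few lines. This is a toy illustration, not the paper's method: the "fine scale" is a scalar ODE integrated with Euler steps, restriction is a simple mean, and all names and parameters are illustrative.

```python
import numpy as np

def fine_step(u, dt):
    # Toy fine-scale dynamics: one explicit Euler step of du/dt = -u
    return u + dt * (-u)

def restrict(u):
    # Fine -> coarse mapping (here the mean, a crude dimensionality reduction)
    return u.mean()

def lift(U, n):
    # Coarse -> fine mapping: rebuild a fine state consistent with coarse value U
    return np.full(n, U)

def coarse_time_stepper(U0, n=64, dt=0.01, burst=10, big_dt=0.5, steps=5):
    """Equation-free loop: short fine-scale bursts plus coarse projective steps."""
    U = U0
    for _ in range(steps):
        u = lift(U, n)
        # 1. short burst of fine-scale simulation
        for _ in range(burst):
            u = fine_step(u, dt)
        # 2. restrict to coarse variables and estimate the coarse time derivative
        U_burst = restrict(u)
        dUdt = (U_burst - U) / (burst * dt)
        # 3. projective (coarse) step initializes the next fine burst
        U = U_burst + big_dt * dUdt
    return U

print(coarse_time_stepper(1.0))
```

The projective step `big_dt` is much larger than the fine `dt`, which is where the scheme saves fine-scale work.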
The Fusion of HPC and BigData
"Equation-Free, Coarse-Grained Multiscale Computation: Enabling Microscopic Simulators to Perform System-Level Analysis", I. G. Kevrekidis, C. W. Gear, J. M. Hyman, P. G. Kevrekidis, O. Runborg, and C. Theodoropoulos, Commun. Math. Sci. 1(4), 2003, 715-762
[Diagram: a coarse time stepper driving an HPC ensemble of fine-scale simulations, with ML and data-intensive/data-driven methods as candidates for the fine-to-coarse mapping]
The Fusion of HPC and BigData
Integrated simulation and data analysis: simulation (Amber MD, on the HPC side) coupled with averaging and pattern matching (on the BigData side).
Requirements
HPC and BigData
HPC:
- Fast single-node performance
- Fast interconnect (with RDMA)
- Homogeneous environment (w.r.t. software tools and languages)
- Simple job-scheduling policies
- Regular memory-access patterns
- Floating-point workloads
BigData:
- Reliability
- Heterogeneous environment
- Complex task-based scheduling
- Irregular memory-access patterns
- Integer workloads
Common to BigData and HPC: fast and scalable I/O
Benchmarks
Berkin Özisikyilmaz, Ramanathan Narayanan, Joseph Zambreno, Gokhan Memik, and Alok N. Choudhary. "An architectural characterization study of data mining and bioinformatics workloads." IISWC, pages 61-70, 2006.
Compute-Intensive vs Data-Intensive
Source: Synergistic Challenges in Data Intensive Science and Exascale Computing, DoE ASCAC Data Subcommittee Report, March 2013, Figure 2.2
Implementation
Components
- Storage
- I/O
- Network
- Computing hardware
- System software, e.g. job scheduler
- Software: middleware, libraries, programming languages
Storage
Scalable, fast, and reliable storage: Sonexion + DataWarp + TAS
- DataWarp: a file system local to the compute node, acting as an additional cache layer
- TAS (Tiered Adaptive Storage): transparent access to high-latency backup storage (e.g. tape); files are automatically moved between tape and hard drive, following the same concept as a regular CPU cache
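The tape/disk migration idea behind TAS follows the same logic as a CPU cache, which a few lines of Python can make concrete. This is a purely illustrative sketch (class and field names are invented, not a TAS API): a small fast tier backed by a large slow tier, with LRU demotion when the fast tier fills.

```python
class TieredStore:
    """Toy two-tier store: 'disk' is the small fast tier, 'tape' the slow one."""

    def __init__(self, disk_capacity):
        self.disk_capacity = disk_capacity
        self.disk = {}   # fast tier: name -> data
        self.tape = {}   # slow tier: name -> data (always holds a copy)
        self.lru = []    # least-recently-used order, coldest first

    def write(self, name, data):
        self.tape[name] = data          # everything lands on tape
        self._promote(name, data)       # and is cached on disk

    def read(self, name):
        if name not in self.disk:       # "tape hit": promote to disk
            self._promote(name, self.tape[name])
        self.lru.remove(name)
        self.lru.append(name)           # mark as most recently used
        return self.disk[name]

    def _promote(self, name, data):
        if name not in self.disk and len(self.disk) >= self.disk_capacity:
            victim = self.lru.pop(0)    # demote the coldest file (tape copy remains)
            del self.disk[victim]
        if name not in self.disk:
            self.lru.append(name)
        self.disk[name] = data
```

Reading a file that only lives on tape transparently pulls it back to disk, evicting the least-recently-used resident file, exactly the cache analogy on the slide.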
I/O
- Lustre parallel file system
- Parallel I/O libraries available (MPI-IO, HDF5, NetCDF)
- iobuf for fast serial access
Optimal access: each process accesses a different Object Storage Target (a hard drive)
Optimal access for BigData:
- Use Lustre in place of HDFS
- Stripe count = number of processes
- Stripe size = file size / stripe count
- Access data through a Cray-optimized parallel I/O library
- Consider striping at the directory level: files inherit striping from the parent directory
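The two striping rules above can be turned into a small helper that prints the corresponding `lfs setstripe` invocation. A sketch, with assumptions: flag spellings (`-c`, `-S`) vary across Lustre versions (check `lfs help setstripe` on your system), the 1 MiB rounding and the OST cap are my additions, and the path is illustrative.

```python
def stripe_settings(file_size_bytes, num_processes, max_osts=None):
    """Stripe count = number of processes (capped at available OSTs);
    stripe size = file size / stripe count, rounded up to a 1 MiB multiple."""
    count = num_processes if max_osts is None else min(num_processes, max_osts)
    raw = -(-file_size_bytes // count)   # ceiling division
    mib = 1 << 20
    size = -(-raw // mib) * mib          # round up to a 1 MiB multiple
    return count, size

# Example: a 64 GiB file read by 256 processes
count, size = stripe_settings(file_size_bytes=64 * (1 << 30), num_processes=256)
print(f"lfs setstripe -c {count} -S {size} /lustre/mydir")
```

Applied to a directory (as on the slide), the setting is inherited by every file created inside it.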
I/O
Explore the use of currently available HPC solutions, e.g. the ADIOS I/O abstraction framework:
- Automatic data distribution from a configuration file
- Change the I/O method on the fly
- Staging: separation of data access from storage
- Asynchronous I/O
- Indexing and querying (FastBit)
- Tested at scale
- Java API (accessible from JVM languages)
[Diagram: ADIOS architecture: an application-facing data-description interface on top of data-management services (buffering, data compression, scheduling, FastBit indexing, workflow and runtime engines, data movement), a hybrid staging area with analysis and visualization plugins, and back-end formats (ADIOS-BP, IDX, HDF5, pnetcdf, raw and image data) over a parallel distributed file system]
ADIOS
Source: "Using the Adaptable I/O System (ADIOS)", Norbert Podhorszki, Joint Facilities User Forum on Data-Intensive Computing, June 18, 2014
Network: Aries
- Cray's custom interconnect
- Dragonfly topology
- Adaptive routing
- Oriented toward all-to-all communication
- Chassis with 16 compute blades (128 sockets)
- Inter-Aries communication over the backplane
- Per-packet adaptive routing
Computing
- HW vendors are working on better support for data-intensive workloads, e.g. Intel TSX
- Research on new architectures for data-intensive computing, e.g. GoblinCore64, a RISC-V ISA extension
- Reconfigurable FPGAs
- Accelerators (useful for the massively parallel part of the analysis)
- Data-flow computers (e.g. Maxeler)
Job Scheduling System
[Figure: sample Taverna workflow for gene identification. Source: The Fourth Paradigm, page 139]
Job Scheduling System
- Task-based
- Dynamic: allocate resources as needed
- A dynamic job-scheduling solution is being tested on Cray systems
- Integration of workflow management with the job scheduler:
  - BigData: Mesos, YARN
  - HPC: Swift/K and Swift/T, which integrate with SLURM and PBS
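Task-based, dynamic scheduling as described above boils down to executing a workflow DAG, launching each task as soon as its dependencies finish. A minimal sketch using only the standard library (Python 3.9+ for `graphlib`); the task names are illustrative and this is not a Mesos/YARN/Swift API.

```python
from graphlib import TopologicalSorter
from concurrent.futures import ThreadPoolExecutor

def run_workflow(deps, actions, workers=4):
    """deps: task -> set of prerequisite tasks; actions: task -> callable."""
    ts = TopologicalSorter(deps)
    ts.prepare()
    completed = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while ts.is_active():
            ready = list(ts.get_ready())                    # deps satisfied
            futures = {t: pool.submit(actions[t]) for t in ready}
            for t, f in futures.items():                    # wait, then unlock successors
                f.result()
                ts.done(t)
                completed.append(t)
    return completed

order = run_workflow(
    deps={"simulate": set(), "filter": {"simulate"}, "analyze": {"filter"}},
    actions={t: (lambda: None) for t in ("simulate", "filter", "analyze")},
)
print(order)
```

Independent tasks in the same "ready" wave run concurrently on the pool, which is the resource-as-needed behavior the slide describes, scaled down to one node.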
Software: Middleware
- BigData applications rely on a vast and diverse set of runtime systems, storage options, software libraries, and programming languages
- It is hard to impossible to support everything through a single system-wide configuration
- Solutions: custom images (Cray), virtualization, Docker, OpenStack
Software: Libraries
Many BigData applications use the same algorithms found in HPC:
- Eigensolvers
- Factorization
- Optimization, e.g. (stochastic) gradient descent
- SVD
- FFT
BigData-optimized implementations do exist, e.g.:
- "Efficient Multilevel Eigensolvers with Applications to Data Analysis Tasks", Dan Kushnir, Meirav Galun, and Achi Brandt
- "Recent Developments in the Sparse Fourier Transform" [a compressed Fourier transform for BigData], IEEE Signal Processing Magazine, Sep. 2014
Use the right tool for the problem at hand: share libraries between HPC and BigData applications.
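As a concrete instance of the shared-algorithm point, here is a minimal minibatch stochastic gradient descent for least squares, the optimizer family the slide lists. A generic sketch (names, learning rate, and data are illustrative), not any particular library's implementation.

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.1, epochs=200, batch=8, seed=0):
    """Minimize ||X w - y||^2 with minibatch SGD."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        idx = rng.permutation(len(y))            # reshuffle each epoch
        for start in range(0, len(y), batch):
            b = idx[start:start + batch]
            # Gradient of the minibatch objective: 2 X_b^T (X_b w - y_b) / |b|
            grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

# Recover w_true = [2, -3] from noiseless synthetic data
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -3.0])
print(np.round(sgd_least_squares(X, y), 2))
```

The same loop, with the gradient evaluated over distributed partitions, is what MLlib-style BigData frameworks and HPC solvers both implement.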
Optimizing Data Access
Data-intensive/data-driven computing can greatly benefit from data-flow (vs control-flow) strategies:
- Configure the computation and the structure of the data
- Let the system optimally distribute the data and run the units of computation
- Similar to the Hadoop and Spark approach
Chapel: a high-level programming language for parallel distributed computing, with transparent handling of data exchange among compute nodes
Eiger: a compile-/run-time framework to minimize data movement between memory layers
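The data-flow idea above, declare the data partitioning and the unit of computation, then let an engine place the work, can be sketched with the standard library. A process pool stands in for a cluster here; function names are illustrative, and Hadoop/Spark generalize this pattern to distributed storage.

```python
from multiprocessing import Pool

def square_and_sum(chunk):
    # The unit of computation: runs on whichever worker the engine picks
    return sum(x * x for x in chunk)

def dataflow_sum_of_squares(data, partitions=4):
    data = list(data)
    # Configure the structure of the data: split it into partitions
    chunks = [data[i::partitions] for i in range(partitions)]
    with Pool(partitions) as pool:
        # The "engine" distributes chunks to workers and gathers partial results
        return sum(pool.map(square_and_sum, chunks))

if __name__ == "__main__":
    print(dataflow_sum_of_squares(range(1000)))  # sum of squares of 0..999
```

Note that the program never says *where* each chunk is processed, only *what* is computed on it; that inversion is the difference from control-flow code.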
Chapel: shared-memory, distributed-memory, and data-parallel "Hello, world!"

  use CyclicDist;                  // needed for the Cyclic distribution
  config const numIters = 100000;
  // distribute the index set cyclically across locales (compute nodes)
  const D = {1..numIters} dmapped Cyclic(startIdx=1);
  forall i in D do
    writeln("Hello, world! from iteration ", i, " of ", numIters);
Shasta
- Cray's next-generation HPC system, supporting both compute-intensive and data-intensive workloads
- Support for multiple infrastructures and software environments
- Adaptive supercomputing, with support for different building blocks:
  - Intel Xeon
  - Intel Xeon Phi
  - Intel Omni-Path network fabric
Thank you