Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner




Recent and Future Activities in HPC and Scientific Data Management
Siegfried Benkner
Research Group Scientific Computing
Faculty of Computer Science
University of Vienna, Austria
http://www.par.univie.ac.at

Research Group Scientific Computing
One of twelve research groups at the Faculty of Computer Science, University of Vienna.
Parallel Computing / HPC
- Programming Models and Languages
- Compiler and Runtime Technologies
- Programming Environments and Tools
- Vienna Fortran, HPF+, Hybrid Programming, Multicore, GPU, Phi
Grid/Cloud/Big Data
- Compute and Data Services
- Semantic Data Integration
- Big Data and HPC
- Vienna Cloud Environment, VPH-Share, ...

Recent Research Projects
- PEPPHER: Performance Portability and Programmability for Heterogeneous Many-core Architectures. European Commission, FP7, 2010-2014, Coordinator
- VPH-Share: Virtual Physiological Human - Sharing for Healthcare. European Commission, FP7, 2011-2015
- AutoTune: Automatic Online Tuning. European Commission, FP7, 2011-2015
- RETIDA: Real-Time Data Analytics for the Mobility Domain. FFG, 2014-2017

EU Project PEPPHER
http://www.peppher.eu
Performance Portability & Programmability for Heterogeneous Many-Core Architectures
FP7 ICT, Computing Systems; 2010-2014
Partners: UNIVIE, INRIA, LIU, Intel, Movidius, Codeplay, KIT, Chalmers, TUW
- Methodology & framework for development of performance-portable code
- Execute the same application efficiently on different heterogeneous architectures
- Support real hybrid execution to exploit all available computing units
- Focus: single-node heterogeneous architectures
[Figure labels: Application (C/C++); PEPPHER Framework; Many-core CPU; CPU+GPU; Intel MIC; PEPPHER Sim; PePU (Movidius)]

PEPPHER Framework
http://www.peppher.eu
[Figure labels, top to bottom]
- Applications: Embedded, General Purpose, HPC
- High-Level Coordination / Patterns / Skeletons
- Component Model; Platform Model (PDL); Platform Descriptor (PDL)
- Components: C/C++, OpenMP, CUDA, OpenCL, TBB, Offload; Autotuned Algorithms; Data Structures
- Transformation & Composition: Component Task Graph, Performance Models
- PEPPHER Run-time (StarPU): Scheduling Strategy, Performance feedback
- Drivers (CUDA, OpenCL, OpenMP)
- Targets: CPU, GPU, MIC, SIM, PePU
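The idea of one component with several implementation variants, chosen with the help of performance models, can be illustrated with a small sketch. This is not the PEPPHER or StarPU API: the classes Variant and Component, the target labels "cpu" and "gpu", and the linear cost models are all hypothetical, and both variants simply call Python's sum so the example stays runnable without an accelerator.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Variant:
    """One implementation of a component for one kind of processing unit."""
    name: str
    target: str                           # label only, e.g. "cpu" or "gpu"
    run: Callable[[List[float]], float]   # the actual implementation
    predict_cost: Callable[[int], float]  # hypothetical model: input size -> cost


class Component:
    """A functional interface with interchangeable implementation variants."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.variants: List[Variant] = []

    def add_variant(self, variant: Variant) -> None:
        self.variants.append(variant)

    def select(self, size: int, available: Set[str]) -> Variant:
        # Keep variants whose target exists on this platform, then take
        # the one with the lowest predicted cost for this input size.
        candidates = [v for v in self.variants if v.target in available]
        if not candidates:
            raise RuntimeError(f"no variant of {self.name} fits this platform")
        return min(candidates, key=lambda v: v.predict_cost(size))

    def __call__(self, data: List[float], available: Set[str]) -> float:
        return self.select(len(data), available).run(data)


vector_sum = Component("vector_sum")
# Assumed cost models: the "gpu" variant pays a fixed transfer overhead
# but has a lower per-element cost.
vector_sum.add_variant(Variant("sum_cpu", "cpu", sum, lambda n: 1.0 * n))
vector_sum.add_variant(Variant("sum_gpu", "gpu", sum, lambda n: 500.0 + 0.1 * n))

print(vector_sum.select(100, {"cpu", "gpu"}).name)     # sum_cpu
print(vector_sum.select(10_000, {"cpu", "gpu"}).name)  # sum_gpu
print(vector_sum.select(10_000, {"cpu"}).name)         # sum_cpu
```

The calling code never names a variant, so the same application text runs on a CPU-only and on a CPU+GPU platform; only the set of available targets and the cost models change.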

The AutoTune Project
http://www.autotune-project.eu
FP7 ICT, Computing Systems; 42 months, 2011-2015
Partners: TUM, UAB, CAPS, LRZ, ICHEC, UVIE
- Combine performance analysis and tuning into a single framework: the Periscope Tuning Framework (PTF)
- Extend Periscope with automatic tuning plugins for performance and energy efficiency tuning
- Execute the whole tuning process online (performance analysis and tuning during a single application run)
- Use expert knowledge to guide the search for performance properties and tuned versions
[Figure labels: Design Time: Coding, Performance Analysis, Tuning; Runtime: Performance Analysis, Tuning]

Periscope Tuning Framework (PTF)
Parallel architectures
- Multicore servers
- Supercomputers (SuperMUC)
- Accelerated systems (GPU, Xeon Phi)
Tuning (at different layers of the SW stack)
- High-level language (directives/annotations)
- Compilers / Transformation systems
- Runtime systems and libraries
- Operating system
Programming paradigms
- MPI, MPI/OpenMP
- OpenCL/CUDA
- Parallel Patterns (PEPPHER)
Developed Tuning Plugins
- MPI Parameter Tuning
- Pipeline Patterns for CPU/GPU
- DVFS Plugin
- Compiler Flags Selection
- MPI Master/Worker Tuning
- OpenCL Worksize Tuning
PTF available as open source: http://periscope.in.tum.de/releases/latest/tar/ptf-latest.tar.bz2
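What a tuning plugin does in principle, searching a space of tuning parameters for the best measured objective, can be sketched as follows. This is not PTF code or its plugin interface: exhaustive_tune, the WORK table, and the analytic runtime and energy models are assumptions that stand in for real measurements, so the example is deterministic. The two parameters loosely mirror the Compiler Flags Selection and DVFS plugins listed above.

```python
import itertools
from typing import Any, Callable, Dict, Sequence, Tuple

Config = Dict[str, Any]


def exhaustive_tune(space: Dict[str, Sequence[Any]],
                    measure: Callable[[Config], float]) -> Tuple[Config, float]:
    """Evaluate every point of the parameter space and keep the cheapest one."""
    names = list(space)
    best_cfg: Config = {}
    best_cost = float("inf")
    for values in itertools.product(*(space[n] for n in names)):
        cfg = dict(zip(names, values))
        cost = measure(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost


# Assumed amount of work left after compiling with each flag (arbitrary units).
WORK = {"-O2": 10.0, "-O3": 8.0}


def run_time(cfg: Config) -> float:
    # Synthetic model: runtime shrinks linearly with clock frequency.
    return WORK[cfg["flag"]] / cfg["freq_ghz"]


def energy(cfg: Config) -> float:
    # Synthetic model: static power plus a dynamic part that grows with f^3.
    power = 40.0 + 5.0 * cfg["freq_ghz"] ** 3
    return power * run_time(cfg)


space = {"flag": ["-O2", "-O3"], "freq_ghz": [1.2, 1.6, 2.0, 2.4]}

fastest, _ = exhaustive_tune(space, run_time)
greenest, _ = exhaustive_tune(space, energy)
print(fastest)   # {'flag': '-O3', 'freq_ghz': 2.4}
print(greenest)  # {'flag': '-O3', 'freq_ghz': 1.6}
```

Under these assumed models the fastest and the most energy-efficient configurations differ, which is why the objective has to be chosen explicitly. A real tuner would replace the analytic models with measurements and would usually prune the search space rather than enumerate it.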

EU Project VPH-Share
http://www.vph-share.eu
Virtual Physiological Human: Sharing for Healthcare (European Commission, FP7, 2011-2015, Coordinator: The University of Sheffield)
VPH-Share developed a Cloud infostructure to facilitate integration of / access to ...
- Patient data across different systems, hospitals, countries ...
- Information/models related to various parts and processes of the human body
- Knowledge (guidelines, standards, protocols) in research and clinical practice
Specific requirements
- Complex, distributed, heterogeneous data (multi-scale, multi-modal)
- Sensitive data (security, privacy, legal issues, data quality)
- Specialized analytics to integrate bioinformatics and systems biology information with clinical observations

[Figure: VPH-Share overview diagram. Draft 1v0, July 2012. p-medicine & VPH-Share: Sankt Augustin, Friday, 31st August 2012]
Project No: 269978
Co-ordinator: University of Sheffield, UK
Partners: CYFRONET, PL; Sheffield Teaching Hospitals, UK; ATOS Origin, ES; Kings College London, UK; Universitat Pompeu Fabra, ES; Empirica, DE; SCS SRL, IT; NHS IC, UK; INRIA, FR; IOR, IT; Open Univ., UK; Philips Elec., NL; TU Eindhoven, NL; Univ. Auckland, NZ; Uv Amsterdam, NL; UCL, UK; Univ. Vienna, AT; AATRM, ES; FCRB, ES
Diagram labels:
- Projects: euHeart, @neurIST, VPH OP, ViroLab
- Workflow steps: Retrieve Existing Data, Select Workflow, Infer missing items, Run simulation, Return Results & Support
- Users, Application, Patient Avatar, Personalised Model, Patient Data, Workflow Inputs, Workflow Outputs, Decision Support
- Infostructure: Patient Centred Computational Workflows, Knowledge Management, Knowledge Discovery, Data Inference, VPH Outreach
- Data Services: Patient/Population; Compute Services; Storage Services
- HPC Infrastructure (DEISA / PRACE); Cloud Platform (Public / Private)

VPH-Share Data Management
Data Publication Suite
- select & annotate data with semantic concepts
- de-identify selected data items
- expose data as Cloud services
Data Service Environment (VDSE)
- based on Vienna Cloud Environment (VCE)
- provision of data sets (RDBs) as Cloud services
- access via SQL or SPARQL
- provide on-demand customized views
- federation across multiple data sets
- preserve autonomy of underlying data sources
Semantic mechanisms to discover, link and search data
[Figure labels: Institution; Data Publication Suite; Internal Dataset; Publish; Cloud Infrastructure; Data Management; Deploy & Expose Datasets; Vienna Data Service Environment; Client; SQL/SPARQL; Query; Virtual Dataset Service; Dataset Service; DBS]
S. Benkner, Research Group Scientific Computing, University of Vienna.
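The combination of customized views, federation, and source autonomy can be sketched with Python's standard sqlite3 module. This is not the VDSE implementation: the classes DatasetService and VirtualDatasetService, the table and column names, the sample rows, and the reference year 2015 used to derive an age are all hypothetical. Each source keeps its own schema; a per-source view maps it onto a shared schema, and the virtual service sends the same SQL to every source and merges the answers.

```python
import sqlite3
from typing import List, Sequence, Tuple


class DatasetService:
    """Exposes one autonomous source through a customized view named 'shared'."""

    def __init__(self, create_sql: str, insert_sql: str,
                 rows: Sequence[Tuple], view_sql: str) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(create_sql)
        self.conn.executemany(insert_sql, rows)
        # The view is the only thing clients see; the base table is untouched.
        self.conn.execute(f"CREATE VIEW shared AS {view_sql}")

    def query(self, sql: str, params: Sequence = ()) -> List[Tuple]:
        return self.conn.execute(sql, params).fetchall()


class VirtualDatasetService:
    """Federates several dataset services behind one query interface."""

    def __init__(self, services: Sequence[DatasetService]) -> None:
        self.services = list(services)

    def query(self, sql: str, params: Sequence = ()) -> List[Tuple]:
        rows: List[Tuple] = []
        for service in self.services:
            rows.extend(service.query(sql, params))
        return sorted(rows)


# Two hypothetical institutions with different local schemas.
hospital_a = DatasetService(
    "CREATE TABLE patients (id TEXT, age INTEGER, sbp INTEGER)",
    "INSERT INTO patients VALUES (?, ?, ?)",
    [("A1", 54, 150), ("A2", 37, 118)],
    "SELECT id AS patient_id, age, sbp AS systolic FROM patients",
)
hospital_b = DatasetService(
    "CREATE TABLE cases (case_no TEXT, birth_year INTEGER, bp_sys INTEGER)",
    "INSERT INTO cases VALUES (?, ?, ?)",
    [("B7", 1950, 162), ("B8", 1990, 121)],
    "SELECT case_no AS patient_id, 2015 - birth_year AS age, "
    "bp_sys AS systolic FROM cases",
)

federation = VirtualDatasetService([hospital_a, hospital_b])
result = federation.query(
    "SELECT patient_id, age, systolic FROM shared WHERE systolic > ?", (140,)
)
print(result)  # [('A1', 54, 150), ('B7', 65, 162)]
```

Because the mapping lives in a view on each source, a source can change its internal schema and only its own view definition has to follow. A production system would also need access control and de-identification, which this sketch leaves out.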

VPH-Share Data Publication Suite

RETIDA
https://dts.ait.ac.at/projects/retida/
Real-time data analytics for the mobility domain
Austrian FFG, ICT of the Future, 2014-2016
- Real-time integration and analytics of large-scale heterogeneous data sources: mobile phone data, floating car data, GIS, weather, social media, ...
- Massively parallel, adaptive execution of generic data analytics workflows
- Support for heterogeneous architectures (GPUs, Xeon Phi, ...)
- Application-specific visualizations

RETIDA
https://dts.ait.ac.at/projects/retida/
Hadoop-based lambda architecture
- scalable and fault-tolerant processing
- real-time vs. batch
High performance data pipeline
- C++ re-configurable framework
- real-time processing capabilities
[Figure labels: Streaming Data Sources; Batch Processing Layer: Scalable Storage (HDFS), complete data set, Computation of batch views; Real-Time Processing Layer: Stream processing (current data), Increment views; Serving Layer: Batch views, Real-Time views, Queries]
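The split into batch, real-time, and serving layers can be illustrated with a single-process sketch. It is not RETIDA code and uses neither Hadoop nor HDFS: the class LambdaCounter, the in-memory master list, and the road-segment keys are hypothetical, and the batch run is assumed to finish before the next record arrives, which a real system cannot assume.

```python
from collections import Counter
from typing import List


class LambdaCounter:
    """Counts events per key using a batch view plus a real-time view."""

    def __init__(self) -> None:
        self.master: List[str] = []       # append-only complete data set
        self.batch_view: Counter = Counter()
        self.realtime_view: Counter = Counter()

    def ingest(self, key: str) -> None:
        # Every record goes to the master data set and, incrementally,
        # into the real-time view.
        self.master.append(key)
        self.realtime_view[key] += 1

    def run_batch(self) -> None:
        # Batch layer: recompute the view from the complete data set.
        # The real-time view only has to cover data newer than this run.
        self.batch_view = Counter(self.master)
        self.realtime_view = Counter()

    def query(self, key: str) -> int:
        # Serving layer: merge both views at query time.
        return self.batch_view[key] + self.realtime_view[key]


counts = LambdaCounter()
for segment in ["seg-1", "seg-2", "seg-1"]:
    counts.ingest(segment)
print(counts.query("seg-1"))  # 2, answered from the real-time view only

counts.run_batch()
counts.ingest("seg-1")
print(counts.query("seg-1"))  # 3, batch view (2) plus real-time view (1)
```

Recomputing the batch view from the full data set is what makes the design fault tolerant: an error in the incremental path is overwritten by the next batch run.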

Future Research
- Programming Support for Big Data Applications
- Taking Advantage of Heterogeneous Architectures
- Streaming and Real-Time Support
- Runtime systems for Big Data Applications
- Cloud-based Scientific Data Management (VDSE)