Recent Advances in Periscope for Performance Analysis and Tuning

1 Recent Advances in Periscope for Performance Analysis and Tuning
Isaias Compres, Michael Firbach, Michael Gerndt, Robert Mijakovic, Yury Oleynik, Ventsislav Petkov
Technische Universität München
Yury Oleynik

2 Outline
Periscope overview
Advances in Periscope Development
  I. PAThWay
  II. Performance Dynamics Analysis with Periscope
  III. Periscope Tuning Framework

3 Projects
LMAC: Leistungsdynamik massiv-paralleler Codes (Performance Dynamics of Massively Parallel Codes), BMBF project
AutoTune: Automatic Online Tuning, European Union FP7 project

4 Periscope overview
Distributed Architecture: analysis performed by multiple distributed hierarchical agents
Iterative Online Analysis: measurements are configured, obtained and evaluated on the fly
Automatic Analysis: based on formalized knowledge of performance optimization experts
Eclipse Integration: Eclipse-based integrated development and performance analysis environment
Measurement and Instrumentation: Score-P or MRIMonitor

5 Advances in Periscope Development
Performance Dynamics
  Cross-experiment performance dynamics: provide a tool for automating and organizing performance experiments during the optimization process
  Runtime performance dynamics: automatically search for runtime performance dynamics properties
Performance Tuning
  Automatically search for the application configuration that delivers the best performance according to a given objective

6 I. Cross-experiment performance dynamics: PAThWay

7 Problem statement: Performance Engineering
Performance engineering is an iterative cycle
Requires in-depth knowledge of hardware and software
Each step may involve many tools and different configurations
Repetitive and manual
Optimization spans months
Hard to organize data and results
No clear track of process evolution
Examples: scalability analysis, cross-platform analysis
[Figure: performance engineering cycle with the steps establish/update baseline, execute parallel application, monitor performance, analyze bottlenecks, optimize problematic code sections, verify]

8 PAThWay
Eclipse plug-in for structured and methodical performance engineering using workflows
Goals:
  Manage individual tasks as part of one workflow
  Automate performance engineering tasks where possible
  Keep track of and organize the process
  Abstract the complexity of the underlying software and hardware

10 Workflow Editor
[Screenshot: workflow editor and available workflow components]

11 Experiment Browser
The database also stores properties of the tools
[Screenshot: experiments view, standard output and environment configuration, experiment metadata]

12 Project Documentation
Accessible documentation is important
  Requirements
  Work progress
  Optimization ideas
Commonly spread across multiple documents
Wiki-based editor
  Completed experiments
  Links to other external resources
  Other wiki pages

13 Supportive Modules
Parallel Tools Platform Module
  Starting interactive/batch jobs
  Monitoring execution and accessing data
Code Management
  Keeps snapshots of the sources
  Based on Git
Environment Detection
  Detects loaded modules
  Copies defined environment variables
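
The environment detection step can be pictured with a minimal Python sketch. It assumes the loaded modules are exposed through the colon-separated LOADEDMODULES variable, as the Environment Modules system does; the slides do not say how PAThWay itself detects them, and the function name and record layout here are hypothetical.

```python
import os


def snapshot_environment(environ, tracked_variables):
    """Record loaded modules and selected environment variables.

    Assumption: modules are listed in LOADEDMODULES, separated by ':'.
    tracked_variables is a hypothetical list of variable names to copy.
    """
    loaded = [m for m in environ.get("LOADEDMODULES", "").split(":") if m]
    variables = {name: environ[name] for name in tracked_variables if name in environ}
    return {"modules": loaded, "variables": variables}


if __name__ == "__main__":
    # Snapshot the real process environment for one experiment record.
    print(snapshot_environment(os.environ, ["OMP_NUM_THREADS", "PATH"]))
```

Storing such a record next to each experiment would make runs with different module sets comparable later.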

14 PAThWay
Available as an Eclipse plugin from the update site:
Installation guide:

15 II. Performance dynamics at runtime: Automatic Performance Dynamics Analysis with Periscope

16 Automatic Performance Dynamics Analysis with Periscope
Motivation for Performance Dynamics Analysis
  Location and severity of performance bottlenecks are time-dependent
  Performance changes manifest themselves at various time scales
  Dimensionality of performance measurements makes manual investigation by the user tedious
Analysis goals:
  Automatically detect changes in temporal performance behavior
  Quantify the negative impact of performance changes
  Reduce complexity and size of time-dependent measurements
  Simplify comprehension (no graphical visualization)
  Group entities with similar temporal performance behavior

17 Automatic Performance Dynamics Analysis with Periscope
Helps to answer the following typical questions:
  Does the performance degrade over time?
  When is the degradation observed?
  What is the impact of a particular change?
  Which process/location is impacted by the performance degradation?
  Are there similar degradations in other processes or functions?
Approach
  Multi-scale analysis
  Qualitative abstraction of time series with quantitative information sufficient to characterize impact
  Representation mimics the human mental model of temporal behavior
  Automatic search for performance dynamics properties

18 Automatic Performance Dynamics Analysis with Periscope: Analysis Steps
1. Measurement
   a) Collect dynamic profile time series using Score-P
2. Preprocessing
   a) Perform scale-space filtering by filtering with a Gaussian
   b) Extract extrema and inflexion points
3. Qualitative Abstraction
   a) Track extrema and inflexion points from coarse to fine scales
   b) Label intervals between extrema and inflexion points
   c) Extract the maximum lifetime level of the resulting tree of intervals
4. Search for performance dynamics properties
   a) Search the maximum lifetime level for predefined patterns, both qualitatively and quantitatively
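
The preprocessing step (2a and 2b) can be sketched in plain Python as follows. This is an illustration under assumptions, not Periscope's implementation: the kernel is truncated at three standard deviations, borders are handled by clamping, extrema and inflexion points are approximated by sign changes of the first and second finite differences, and all function names are hypothetical. Tracking points across scales (step 3a) is not shown.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at 3 sigma (an assumption)."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(series, sigma):
    """Convolve the series with a Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(series)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)
            acc += w * series[idx]
        out.append(acc)
    return out


def sign_changes(values):
    """Indices i where values[i] and values[i + 1] have strictly opposite signs."""
    return [i for i in range(len(values) - 1) if values[i] * values[i + 1] < 0]


def extrema_and_inflexions(series):
    """Approximate extrema and inflexion point positions of a sampled series."""
    d1 = [series[i + 1] - series[i] for i in range(len(series) - 1)]
    d2 = [d1[i + 1] - d1[i] for i in range(len(d1) - 1)]
    extrema = [i + 1 for i in sign_changes(d1)]     # slope changes sign
    inflexions = [i + 1 for i in sign_changes(d2)]  # curvature changes sign
    return extrema, inflexions


def scale_space(series, sigmas):
    """Map each scale sigma to the extrema and inflexion points found at it."""
    return {s: extrema_and_inflexions(smooth(series, s)) for s in sigmas}
```

Calling scale_space with an increasing list of sigmas yields progressively fewer extrema for a noisy series, which is what makes a coarse-to-fine view of the signal possible.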

19 Automatic Performance Dynamics Analysis with Periscope: Analysis Steps
[Figure: tree of labeled intervals across scales; label sequence shown: DABCBCDABCDABCDABCDABC]
Labels:
  A - concave increase
  B - concave decrease
  C - convex decrease
  D - convex increase
  E - linear increase
  F - linear decrease
  G - constant
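
One way to assign the labels A to G to an interval is sketched below. It is an assumption-laden illustration: each interval is described only by its start, midpoint and end values, the direction comes from the net change, and the shape comes from whether the midpoint lies above the chord (concave) or below it (convex). The function names and the tolerance eps are hypothetical.

```python
def label_interval(start, mid, end, eps=1e-9):
    """Return the qualitative label (A-G) of one interval.

    start, mid, end: values at the beginning, midpoint and end of the interval.
    """
    rise = end - start                   # net change over the interval
    bulge = mid - (start + end) / 2.0    # > 0: above the chord, i.e. concave
    if abs(bulge) <= eps:
        if abs(rise) <= eps:
            return "G"                   # constant
        return "E" if rise > 0 else "F"  # linear increase / decrease
    if abs(rise) <= eps:
        # Intervals are delimited by extrema, so this should not occur.
        raise ValueError("interval spans an extremum")
    if rise > 0:
        return "A" if bulge > 0 else "D"  # concave / convex increase
    return "B" if bulge > 0 else "C"      # concave / convex decrease


def label_sequence(points):
    """Label consecutive intervals that share endpoints.

    points holds start, mid, end, mid, end, ... so its length must be odd.
    """
    return "".join(label_interval(points[i], points[i + 1], points[i + 2])
                   for i in range(0, len(points) - 2, 2))
```

Under this scheme, one oscillation that starts in a valley is labeled D, A, B, C, which matches the repeating DABC groups in the label sequence on the slide.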

20 Automatic Performance Dynamics Analysis with Periscope: Search for dynamics properties
Search for dynamic properties:
  Find all peaks (AB): DABCBCDABCDABCDABCDABC
  Find the most prominent valley (CD): DABCBCDABCDABCDABCDABC
  Find the highest increase (DA): DABCBCDABCDABCDABCDABC
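
Because the abstraction is a string of labels, the qualitative part of the search reduces to substring matching. The sketch below shows that idea together with one possible quantitative ranking. The per-interval value changes passed to highest_increase, and the choice to score a DA pair by its summed rise, are assumptions made for illustration; the slides do not define how impact is quantified.

```python
import re


def find_pattern(labels, pattern):
    """Start indices of every occurrence of a label pattern (overlaps allowed)."""
    lookahead = "(?=" + re.escape(pattern) + ")"
    return [m.start() for m in re.finditer(lookahead, labels)]


def highest_increase(labels, changes):
    """Return (start_index, total_rise) of the DA pair with the largest rise.

    changes[i] is the (hypothetical) value change over interval i.
    Returns None when the label string contains no DA pair.
    """
    best = None
    for i in find_pattern(labels, "DA"):
        rise = changes[i] + changes[i + 1]
        if best is None or rise > best[1]:
            best = (i, rise)
    return best


if __name__ == "__main__":
    sequence = "DABCBCDABCDABCDABCDABC"  # label sequence from the slide
    print("peaks (AB):", find_pattern(sequence, "AB"))
    print("valleys (CD):", find_pattern(sequence, "CD"))
    print("increases (DA):", find_pattern(sequence, "DA"))
```

For the slide's sequence this finds five AB peaks, four CD valleys and five DA increases.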

21 III. Performance tuning: Periscope Tuning Framework

22 Periscope Tuning Framework
Goals:
  Tune codes to improve performance and energy efficiency
  Combine analysis and tuning to speed up the tuning process
  Support multicore and GPU-accelerated parallel systems
Idea:
  Automatically evaluate the optimization space
  Produce a tuning recommendation
  Use it to improve production runs

23 PTF: Approach
Define tuning strategies combining the performance analysis infrastructure and tuning plugins
Plugins use measured performance and energy properties to navigate the search for the optimal configuration
Available tuning plugins focus on:
  Tuning of High-Level Patterns for GPGPU
  Tuning of HMPP Codelets
  Tuning of Energy Consumption via CPU Frequency
  Tuning of the Master-Worker Pattern in MPI
  Tuning of MPI Runtime
  Tuning of Compiler Flag Selection

25 Tuning of High-Level Patterns for GPGPU
Target applications
  Applications implemented in the pipeline patterns framework (developed in the PEPPHER project)
Tuning objective
  Optimize throughput of the pipeline
Tuning points and tuning actions
  Replication factors of individual stages
  Buffer sizes of input and output ports of individual stages
  Splitting and merging of the stages

26 Tuning of HMPP Codelets
Target applications
  OpenHMPP-annotated applications
  To be run on heterogeneous many-core architectures
Tuning objective
  Optimize HMPP codelet performance
Tuning points and tuning actions
  Static codelet tuning points: operations, transformations and algorithms used to implement a codelet, e.g. unrolling factor, the HMPP grid size
  Dynamic codelet tuning points: variables or callbacks available at runtime

27 Tuning of Energy Consumption via CPU Frequency
Target applications
  Any application running on the thin-node islands of SuperMUC
Tuning objective
  Minimize energy consumption of an application
Tuning points and tuning actions
  Available governors or direct frequency settings

28 Tuning of the Master-Worker Pattern in MPI
Target applications
  Applications implemented with the Master-Worker pattern
Tuning objective
  Improve load balancing
Tuning points and tuning actions
  Partition factor
  Number of workers

29 Tuning of MPI Runtime
Target applications
  Currently parallel applications built with IBM MPI
Tuning objective
  Optimize performance
Tuning points and tuning actions
  MPI environment parameters
  MPI application mapping
    Adapting tasks per node/core
    Adapting the affinity of the processes
  MPI communication buffer/protocol
    Adapting the sending/receiving buffer
    Analyzing the size pattern of the messages
    Adapting the communication protocol (eager/rendezvous)
  Code variants for MPI communication

30 Tuning of Compiler Flag Selection
Target applications
  Any application
Tuning objective
  Reduce the execution time of the application's phase region
Tuning points and tuning actions
  Individual compiler flags of the compiler
  Switching compiler switches ON or OFF during recompilation
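
The idea of switching flags on and off and keeping the fastest variant can be sketched as an exhaustive search. This is not the plugin's actual search strategy, which the slides do not describe. The measure callback stands in for "recompile with these flags and time the phase region", and the timing model in the example is made up for illustration.

```python
import itertools


def exhaustive_flag_search(flags, measure):
    """Try every ON/OFF combination of flags and return the fastest one.

    flags:   list of compiler switches to toggle.
    measure: callable mapping a tuple of enabled flags to an execution time;
             in a real setup it would recompile and run the phase region.
    """
    best_cfg, best_time = None, float("inf")
    for mask in itertools.product([False, True], repeat=len(flags)):
        cfg = tuple(flag for flag, enabled in zip(flags, mask) if enabled)
        elapsed = measure(cfg)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time


def fake_measure(cfg):
    """Hypothetical timing model: a 10 s baseline plus a fixed effect per flag."""
    effect = {"-funroll-loops": -1.0, "-ftree-vectorize": -2.0, "-fno-inline": 1.5}
    return 10.0 + sum(effect[flag] for flag in cfg)


if __name__ == "__main__":
    candidates = ["-funroll-loops", "-ftree-vectorize", "-fno-inline"]
    print(exhaustive_flag_search(candidates, fake_measure))
```

The search space doubles with every added flag, so an exhaustive sweep is only practical for a handful of switches; larger flag sets would need a sampling or heuristic strategy.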

31 Thank you! Questions?

Tools for Analysis of Performance Dynamics of Parallel Applications

Tools for Analysis of Performance Dynamics of Parallel Applications Tools for Analysis of Performance Dynamics of Parallel Applications Yury Oleynik Fourth International Workshop on Parallel Software Tools and Tool Infrastructures Technische Universität München Yury Oleynik,

More information

Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München

Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München Automatic Tuning of HPC Applications for Performance and Energy Efficiency. Michael Gerndt Technische Universität München SuperMUC: 3 Petaflops (3*10 15 =quadrillion), 3 MW 2 TOP 500 List TOTAL #1 #500

More information

Performance analysis with Periscope

Performance analysis with Periscope Performance analysis with Periscope M. Gerndt, V. Petkov, Y. Oleynik, S. Benedict Technische Universität München September 2010 Outline Motivation Periscope architecture Periscope performance analysis

More information

for High Performance Computing

for High Performance Computing Technische Universität München Institut für Informatik Lehrstuhl für Rechnertechnik und Rechnerorganisation Automatic Performance Engineering Workflows for High Performance Computing Ventsislav Petkov

More information

Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner

Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner Recent and Future Activities in HPC and Scientific Data Management Siegfried Benkner Research Group Scientific Computing Faculty of Computer Science University of Vienna AUSTRIA http://www.par.univie.ac.at

More information

FAKULTÄT FÜR INFORMATIK. Automatic Characterization of Performance Dynamics with Periscope

FAKULTÄT FÜR INFORMATIK. Automatic Characterization of Performance Dynamics with Periscope FAKULTÄT FÜR INFORMATIK DER TECHNISCHEN UNIVERSITÄT MÜNCHEN Dissertation Automatic Characterization of Performance Dynamics with Periscope Yury Oleynik Technische Universität München FAKULTÄT FÜR INFORMATIK

More information

Unified Performance Data Collection with Score-P

Unified Performance Data Collection with Score-P Unified Performance Data Collection with Score-P Bert Wesarg 1) With contributions from Andreas Knüpfer 1), Christian Rössel 2), and Felix Wolf 3) 1) ZIH TU Dresden, 2) FZ Jülich, 3) GRS-SIM Aachen Fragmentation

More information

AMD WHITE PAPER GETTING STARTED WITH SEQUENCEL. AMD Embedded Solutions 1

AMD WHITE PAPER GETTING STARTED WITH SEQUENCEL. AMD Embedded Solutions 1 AMD WHITE PAPER GETTING STARTED WITH SEQUENCEL AMD Embedded Solutions 1 Optimizing Parallel Processing Performance and Coding Efficiency with AMD APUs and Texas Multicore Technologies SequenceL Auto-parallelizing

More information

Performance Analysis and Optimization Tool

Performance Analysis and Optimization Tool Performance Analysis and Optimization Tool Andres S. CHARIF-RUBIAL andres.charif@uvsq.fr Performance Analysis Team, University of Versailles http://www.maqao.org Introduction Performance Analysis Develop

More information

StreamStorage: High-throughput and Scalable Storage Technology for Streaming Data

StreamStorage: High-throughput and Scalable Storage Technology for Streaming Data : High-throughput and Scalable Storage Technology for Streaming Data Munenori Maeda Toshihiro Ozawa Real-time analytical processing (RTAP) of vast amounts of time-series data from sensors, server logs,

More information

Unprecedented Performance and Scalability Demonstrated For Meter Data Management:

Unprecedented Performance and Scalability Demonstrated For Meter Data Management: Unprecedented Performance and Scalability Demonstrated For Meter Data Management: Ten Million Meters Scalable to One Hundred Million Meters For Five Billion Daily Meter Readings Performance testing results

More information

Enhance visibility into and control over software projects IBM Rational change and release management software

Enhance visibility into and control over software projects IBM Rational change and release management software Enhance visibility into and control over software projects IBM Rational change and release management software Accelerating the software delivery lifecycle Faster delivery of high-quality software Software

More information

IBM WebSphere DataStage Online training from Yes-M Systems

IBM WebSphere DataStage Online training from Yes-M Systems Yes-M Systems offers the unique opportunity to aspiring fresher s and experienced professionals to get real time experience in ETL Data warehouse tool IBM DataStage. Course Description With this training

More information

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic

BigData. An Overview of Several Approaches. David Mera 16/12/2013. Masaryk University Brno, Czech Republic BigData An Overview of Several Approaches David Mera Masaryk University Brno, Czech Republic 16/12/2013 Table of Contents 1 Introduction 2 Terminology 3 Approaches focused on batch data processing MapReduce-Hadoop

More information

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS

VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS VALAR: A BENCHMARK SUITE TO STUDY THE DYNAMIC BEHAVIOR OF HETEROGENEOUS SYSTEMS Perhaad Mistry, Yash Ukidave, Dana Schaa, David Kaeli Department of Electrical and Computer Engineering Northeastern University,

More information

MCA Standards For Closely Distributed Multicore

MCA Standards For Closely Distributed Multicore MCA Standards For Closely Distributed Multicore Sven Brehmer Multicore Association, cofounder, board member, and MCAPI WG Chair CEO of PolyCore Software 2 Embedded Systems Spans the computing industry

More information

Experiment design and administration for computer clusters for SAT-solvers (EDACC) system description

Experiment design and administration for computer clusters for SAT-solvers (EDACC) system description Journal on Satisfiability, Boolean Modeling and Computation 7 (2010) 77 82 Experiment design and administration for computer clusters for SAT-solvers (EDACC) system description Adrian Balint Daniel Gall

More information

Data Center and Cloud Computing Market Landscape and Challenges

Data Center and Cloud Computing Market Landscape and Challenges Data Center and Cloud Computing Market Landscape and Challenges Manoj Roge, Director Wired & Data Center Solutions Xilinx Inc. #OpenPOWERSummit 1 Outline Data Center Trends Technology Challenges Solution

More information

26 April (Next Friday)

26 April (Next Friday) MAXIMUM ADDITIONAL SCORE: 2 points Description: 1. Selection of a research paper of interest from a given list 2. Study of the selected paper and the referenced material 3. Presentation of the paper in

More information

10g versions followed on separate paths due to different approaches, but mainly due to differences in technology that were known to be huge.

10g versions followed on separate paths due to different approaches, but mainly due to differences in technology that were known to be huge. Oracle BPM 11g Platform Analysis May 2010 I was privileged to be invited to participate in "EMEA BPM 11g beta bootcamp" in April 2010, where I had close contact with the latest release of Oracle BPM 11g.

More information

Fast and Easy Delivery of Data Mining Insights to Reporting Systems

Fast and Easy Delivery of Data Mining Insights to Reporting Systems Fast and Easy Delivery of Data Mining Insights to Reporting Systems Ruben Pulido, Christoph Sieb rpulido@de.ibm.com, christoph.sieb@de.ibm.com Abstract: During the last decade data mining and predictive

More information

Spring 2011 Prof. Hyesoon Kim

Spring 2011 Prof. Hyesoon Kim Spring 2011 Prof. Hyesoon Kim Today, we will study typical patterns of parallel programming This is just one of the ways. Materials are based on a book by Timothy. Decompose Into tasks Original Problem

More information

Multi-GPU Load Balancing for Simulation and Rendering

Multi-GPU Load Balancing for Simulation and Rendering Multi- Load Balancing for Simulation and Rendering Yong Cao Computer Science Department, Virginia Tech, USA In-situ ualization and ual Analytics Instant visualization and interaction of computing tasks

More information

Cluster, Grid, Cloud Concepts

Cluster, Grid, Cloud Concepts Cluster, Grid, Cloud Concepts Kalaiselvan.K Contents Section 1: Cluster Section 2: Grid Section 3: Cloud Cluster An Overview Need for a Cluster Cluster categorizations A computer cluster is a group of

More information

Simplified Management With Hitachi Command Suite. By Hitachi Data Systems

Simplified Management With Hitachi Command Suite. By Hitachi Data Systems Simplified Management With Hitachi Command Suite By Hitachi Data Systems April 2015 Contents Executive Summary... 2 Introduction... 3 Hitachi Command Suite v8: Key Highlights... 4 Global Storage Virtualization

More information

Automating Big Data Benchmarking for Different Architectures with ALOJA

Automating Big Data Benchmarking for Different Architectures with ALOJA www.bsc.es Jan 2016 Automating Big Data Benchmarking for Different Architectures with ALOJA Nicolas Poggi, Postdoc Researcher Agenda 1. Intro on Hadoop performance 1. Current scenario and problematic 2.

More information

A QUICK OVERVIEW OF THE OMNeT++ IDE

A QUICK OVERVIEW OF THE OMNeT++ IDE Introduction A QUICK OVERVIEW OF THE OMNeT++ IDE The OMNeT++ 4.x Integrated Development Environment is based on the Eclipse platform, and extends it with new editors, views, wizards, and additional functionality.

More information

Private Public Partnership Project (PPP) Large-scale Integrated Project (IP)

Private Public Partnership Project (PPP) Large-scale Integrated Project (IP) Private Public Partnership Project (PPP) Large-scale Integrated Project (IP) D9.4.2: Application Testing and Deployment Support Tools Project acronym: FI-WARE Project full title: Future Internet Core Platform

More information

SOFTWARE TESTING TRAINING COURSES CONTENTS

SOFTWARE TESTING TRAINING COURSES CONTENTS SOFTWARE TESTING TRAINING COURSES CONTENTS 1 Unit I Description Objectves Duration Contents Software Testing Fundamentals and Best Practices This training course will give basic understanding on software

More information

Decomposition into Parts. Software Engineering, Lecture 4. Data and Function Cohesion. Allocation of Functions and Data. Component Interfaces

Decomposition into Parts. Software Engineering, Lecture 4. Data and Function Cohesion. Allocation of Functions and Data. Component Interfaces Software Engineering, Lecture 4 Decomposition into suitable parts Cross cutting concerns Design patterns I will also give an example scenario that you are supposed to analyse and make synthesis from The

More information

Sanjeev Kumar. contribute

Sanjeev Kumar. contribute RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a

More information

Automatic Performance Tuning for Multicore Architectures. Rudi Eigenmann Purdue University

Automatic Performance Tuning for Multicore Architectures. Rudi Eigenmann Purdue University Automatic Performance Tuning for Multicore Architectures Rudi Eigenmann Purdue University 1 Why Autotuning? my bias Ultimate goal: Dynamic Optimization Support For Compilers and More Runtime decisions

More information

Hardware Acceleration for Just-In-Time Compilation on Heterogeneous Embedded Systems

Hardware Acceleration for Just-In-Time Compilation on Heterogeneous Embedded Systems Hardware Acceleration for Just-In-Time Compilation on Heterogeneous Embedded Systems A. Carbon, Y. Lhuillier, H.-P. Charles CEA LIST DACLE division Embedded Computing Embedded Software Laboratories France

More information

Exploiting GPU Hardware Saturation for Fast Compiler Optimization

Exploiting GPU Hardware Saturation for Fast Compiler Optimization Exploiting GPU Hardware Saturation for Fast Compiler Optimization Alberto Magni School of Informatics University of Edinburgh United Kingdom a.magni@sms.ed.ac.uk Christophe Dubach School of Informatics

More information

Windchill Service Information Manager 10.1. Curriculum Guide

Windchill Service Information Manager 10.1. Curriculum Guide Windchill Service Information Manager 10.1 Curriculum Guide Live Classroom Curriculum Guide Building Information Structures with Windchill Service Information Manager 10.1 Building Publication Structures

More information

Performance Tuning Guidelines for Relational Database Mappings

Performance Tuning Guidelines for Relational Database Mappings Performance Tuning Guidelines for Relational Database Mappings 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying,

More information

ENEA BARE METAL PERFORMANCE TOOLS FOR NETLOGIC XLP AND CAVIUM OCTEON PLUS

ENEA BARE METAL PERFORMANCE TOOLS FOR NETLOGIC XLP AND CAVIUM OCTEON PLUS 1 Run Time Performance Visualization Tools for Optimization of Bare Metal IP Packet Processing Applications - Quickly and Easily Identify Performance Bottlenecks and Correct System Behavior Optimizing

More information

HP Application Lifecycle Management (ALM)

HP Application Lifecycle Management (ALM) HP Application Lifecycle Management (ALM) Knowledge Share Maheshwar Salendra Date : 12/02/2012 AGENDA: Introduction to ALM ALM Functionality by Edition ALM Home page Side bars: Management Requirements

More information

Equalizer. Parallel OpenGL Application Framework. Stefan Eilemann, Eyescale Software GmbH

Equalizer. Parallel OpenGL Application Framework. Stefan Eilemann, Eyescale Software GmbH Equalizer Parallel OpenGL Application Framework Stefan Eilemann, Eyescale Software GmbH Outline Overview High-Performance Visualization Equalizer Competitive Environment Equalizer Features Scalability

More information

SCADE System 17.0. Technical Data Sheet. System Requirements Analysis. Technical Data Sheet SCADE System 17.0 1

SCADE System 17.0. Technical Data Sheet. System Requirements Analysis. Technical Data Sheet SCADE System 17.0 1 SCADE System 17.0 SCADE System is the product line of the ANSYS Embedded software family of products and solutions that empowers users with a systems design environment for use on systems with high dependability

More information

Learn CUDA in an Afternoon: Hands-on Practical Exercises

Learn CUDA in an Afternoon: Hands-on Practical Exercises Learn CUDA in an Afternoon: Hands-on Practical Exercises Alan Gray and James Perry, EPCC, The University of Edinburgh Introduction This document forms the hands-on practical component of the Learn CUDA

More information

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems

David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems David Rioja Redondo Telecommunication Engineer Englobe Technologies and Systems About me David Rioja Redondo Telecommunication Engineer - Universidad de Alcalá >2 years building and managing clusters UPM

More information

Visualizing gem5 via ARM DS-5 Streamline. Dam Sunwoo (dam.sunwoo@arm.com) ARM R&D December 2012

Visualizing gem5 via ARM DS-5 Streamline. Dam Sunwoo (dam.sunwoo@arm.com) ARM R&D December 2012 Visualizing gem5 via ARM DS-5 Streamline Dam Sunwoo (dam.sunwoo@arm.com) ARM R&D December 2012 1 The Challenge! System-level research and performance analysis becoming ever so complicated! More cores and

More information

Hardware design for ray tracing

Hardware design for ray tracing Hardware design for ray tracing Jae-sung Yoon Introduction Realtime ray tracing performance has recently been achieved even on single CPU. [Wald et al. 2001, 2002, 2004] However, higher resolutions, complex

More information

SAP Data Services 4.X. An Enterprise Information management Solution

SAP Data Services 4.X. An Enterprise Information management Solution SAP Data Services 4.X An Enterprise Information management Solution Table of Contents I. SAP Data Services 4.X... 3 Highlights Training Objectives Audience Pre Requisites Keys to Success Certification

More information

Part I Courses Syllabus

Part I Courses Syllabus Part I Courses Syllabus This document provides detailed information about the basic courses of the MHPC first part activities. The list of courses is the following 1.1 Scientific Programming Environment

More information

LDPC Decoding on the Intel SCC

LDPC Decoding on the Intel SCC LDPC Decoding on the Intel SCC Andreas Diavastos, Panayiotis Petrides, Gabriel Falcao, Pedro Trancoso Computer Science Department University of Cyprus Department of Electrical and Computer Engineering

More information

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS

HIGH PERFORMANCE CONSULTING COURSE OFFERINGS Performance 1(6) HIGH PERFORMANCE CONSULTING COURSE OFFERINGS LEARN TO TAKE ADVANTAGE OF POWERFUL GPU BASED ACCELERATOR TECHNOLOGY TODAY 2006 2013 Nvidia GPUs Intel CPUs CONTENTS Acronyms and Terminology...

More information

Introduction to Dataflow Computing

Introduction to Dataflow Computing Introduction to Dataflow Computing Maxeler Dataflow Computing Workshop STFC Hartree Centre, June 2013 Programmable Spectrum Control-flow processors Dataflow processor GK110 Single-Core CPU Multi-Core Several-Cores

More information

The Complete Performance Solution for Microsoft SQL Server

The Complete Performance Solution for Microsoft SQL Server The Complete Performance Solution for Microsoft SQL Server Powerful SSAS Performance Dashboard Innovative Workload and Bottleneck Profiling Capture of all Heavy MDX, XMLA and DMX Aggregation, Partition,

More information

Scala Storage Scale-Out Clustered Storage White Paper

Scala Storage Scale-Out Clustered Storage White Paper White Paper Scala Storage Scale-Out Clustered Storage White Paper Chapter 1 Introduction... 3 Capacity - Explosive Growth of Unstructured Data... 3 Performance - Cluster Computing... 3 Chapter 2 Current

More information

Software for High Performance. Computing. Requirements & Research Directions. Marc Snir

Software for High Performance. Computing. Requirements & Research Directions. Marc Snir Software for High Performance Requirements & Research Directions Computing Marc Snir May 2006 Outline Petascale hardware Petascale operating system Programming models 2 Jun-06 Petascale Systems are Coming

More information

DELL s Oracle Database Advisor

DELL s Oracle Database Advisor DELL s Oracle Database Advisor Underlying Methodology A Dell Technical White Paper Database Solutions Engineering By Roger Lopez Phani MV Dell Product Group January 2010 THIS WHITE PAPER IS FOR INFORMATIONAL

More information

Integrity 10. Curriculum Guide

Integrity 10. Curriculum Guide Integrity 10 Curriculum Guide Live Classroom Curriculum Guide Integrity 10 Workflows and Documents Administration Training Integrity 10 SCM Administration Training Integrity 10 SCM Basic User Training

More information

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel

Parallel Databases. Parallel Architectures. Parallelism Terminology 1/4/2015. Increase performance by performing operations in parallel Parallel Databases Increase performance by performing operations in parallel Parallel Architectures Shared memory Shared disk Shared nothing closely coupled loosely coupled Parallelism Terminology Speedup:

More information

A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs

A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs A Data Structure Oriented Monitoring Environment for Fortran OpenMP Programs Edmond Kereku, Tianchao Li, Michael Gerndt, and Josef Weidendorfer Institut für Informatik, Technische Universität München,

More information

SQL Server 2012 Optimization, Performance Tuning and Troubleshooting

SQL Server 2012 Optimization, Performance Tuning and Troubleshooting 1 SQL Server 2012 Optimization, Performance Tuning and Troubleshooting 5 Days (SQ-OPT2012-301-EN) Description During this five-day intensive course, students will learn the internal architecture of SQL

More information

ANDROID DEVELOPER TOOLS TRAINING GTC 2014. Sébastien Dominé, NVIDIA

ANDROID DEVELOPER TOOLS TRAINING GTC 2014. Sébastien Dominé, NVIDIA ANDROID DEVELOPER TOOLS TRAINING GTC 2014 Sébastien Dominé, NVIDIA AGENDA NVIDIA Developer Tools Introduction Multi-core CPU tools Graphics Developer Tools Compute Developer Tools NVIDIA Developer Tools

More information

Delivering information you can trust December IBM Information Server FastTrack: The need for speed accelerating data integration projects

Delivering information you can trust December IBM Information Server FastTrack: The need for speed accelerating data integration projects December 2007 IBM Information Server FastTrack: The need for speed accelerating data integration projects Page 2 Contents 3 Creating a collaborative development environment 5 Optimizing data integration

More information

Chapter 17: Database System Architectures

Chapter 17: Database System Architectures Chapter 17: Database System Architectures Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 17: Database System Architectures Centralized and Client-Server Systems

More information

Scientific Computing Programming with Parallel Objects

Scientific Computing Programming with Parallel Objects Scientific Computing Programming with Parallel Objects Esteban Meneses, PhD School of Computing, Costa Rica Institute of Technology Parallel Architectures Galore Personal Computing Embedded Computing Moore

More information

Real-Time Operating Systems for ehealth Wearable Devices Mauro Marinoni, Gianluca Franchino and Giorgio Buttazzo

Real-Time Operating Systems for ehealth Wearable Devices Mauro Marinoni, Gianluca Franchino and Giorgio Buttazzo Real-Time Operating Systems for ehealth Wearable Devices Mauro Marinoni, Gianluca Franchino and Giorgio Buttazzo ReTiS Lab, TeCIP Institute Scuola superiore Sant Anna - Pisa 1 Outline Embedded systems

More information
